insitro | Machine Learning for Drug Discovery | South San Francisco, CA | Full Time | Onsite
insitro is reinventing drug discovery by bringing cutting-edge machine learning in a closed loop with our high throughput robotic biology data factory.
Well, the poster probably meant whatever GFS incarnation is around these days (Colossus is v2 or v3, depending on how you look at things). In the end, almost everything at Google is backed by "GFS": Bigtable, Blobstore, GCS, Spanner, Megastore, etc. Little else bypasses GFS and talks to D directly, at least among systems that have been mentioned in public. Still, none of it is user-facing/serving.
What's interesting about this approach (TIL expansion) relative to other leading cellular immunotherapy approaches (CAR-T/NK and TCR) is that it doesn't rely on gene editing. Long term, I think genetically "programmable" cellular immunotherapies are more likely to win (e.g. because they can be programmed to overcome tumor immune suppression), but it's impressive that a durable response is achieved here with clonally expanded TILs.
I think you mean Google is following the leadership of Chainer, like Facebook already does? PyTorch started as a Chainer fork. Its dynamic graph internals are all from Chainer.
It's not a bad thing. It's good for users. But give credit to the leaders in the field. If you make an iPod clone, you call it an iPod clone, not a clone of the Zune HD.
Chainer started it, was around years earlier, and it still has more users. So Google is not copying PyTorch, it's copying Chainer.
PyTorch uses the same backend as Torch (cutorch for GPU, etc.), but its Python API is almost the same as Chainer's. On that point, we can say PyTorch "copied" Chainer.
Based on your reasoning, PyTorch is copying TensorFlow's static-graph optimizations and production capability with JIT and ONNX, then? I've seen many folks requesting an imperative API.
You can't please everybody: whether or not you listen to users, people still complain.
If both are making an effort to improve, though, the community can only benefit from the competition.
I'm usually against this type of framework baiting, but being a tensorflow guy myself & having just spent the week coding with pytorch full time.... this is basically identical to pytorch
What are the strengths and weaknesses of each? I've been using keras but planning on diving into a real deal framework next. Tensorflow is appealing for the momentum it has in the community, but pytorch looks easier to learn.
Doing image classification, object localization, and homography (given an input image, which of my known template images matches it and in what orientation).
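For anyone unfamiliar with the homography part: template matching in that style usually comes down to estimating a 3x3 projective transform and applying it to points in homogeneous coordinates. A minimal numpy sketch (function and variable names are mine, not from any particular library):

```python
import numpy as np

def apply_homography(H, pts):
    """Map Nx2 points through a 3x3 homography via homogeneous coordinates."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])  # lift to homogeneous coords
    mapped = pts_h @ H.T                               # apply the projective transform
    return mapped[:, :2] / mapped[:, 2:3]              # normalize back to 2D

# A pure 90-degree rotation expressed as a homography (orientation recovery
# is a special case of this kind of transform)
theta = np.pi / 2
H_rot = np.array([[np.cos(theta), -np.sin(theta), 0],
                  [np.sin(theta),  np.cos(theta), 0],
                  [0.0,            0.0,           1.0]])
corners = np.array([[1.0, 0.0], [0.0, 1.0]])
warped = apply_homography(H_rot, corners)  # maps (1,0)->(0,1) and (0,1)->(-1,0)
```

In practice you'd estimate H from feature correspondences (e.g. RANSAC over matched keypoints) rather than write it by hand; the sketch only shows what the recovered transform does.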
I think Keras is a real deal framework. It provides a higher-level API than most other frameworks, but it has pretty sweet portability of models across frameworks and platforms and most research papers are implementable in Keras without too much trouble.
In my opinion, the real appeal of PyTorch and Chainer is that their APIs are similar to numpy's, so the learning curve is flat. The NN-construction and gradient parts are framework-specific, but all the glue is regular Python, unlike in Keras or TensorFlow.
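The "glue is regular Python" point is the define-by-run idea that Chainer pioneered: the computation graph is recorded as ordinary code executes, so plain `if`s and loops shape the graph. A toy pure-Python sketch of that style (this is an illustration I wrote, not how PyTorch or Chainer are actually implemented):

```python
# Minimal define-by-run reverse-mode autodiff: the graph is built by running
# ordinary Python, then gradients flow back through the recorded operations.
class Var:
    def __init__(self, value):
        self.value, self.parents, self.grad = value, (), 0.0

    def __mul__(self, other):
        out = Var(self.value * other.value)
        out.parents = ((self, other.value), (other, self.value))  # d(ab)/da = b, etc.
        return out

    def __add__(self, other):
        out = Var(self.value + other.value)
        out.parents = ((self, 1.0), (other, 1.0))
        return out

    def backward(self, grad=1.0):
        self.grad += grad
        for parent, local_grad in self.parents:
            parent.backward(grad * local_grad)

x = Var(3.0)
# Ordinary Python control flow decides the graph on the fly:
y = x * x if x.value > 0 else x + x
y.backward()
print(y.value, x.grad)  # 9.0 6.0
```

Static-graph frameworks instead make you describe the whole graph up front in a separate mini-language, which is the "glue" that PyTorch and Chainer avoid.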
(I'm author of the TF rnn api & tf.contrib.seq2seq)
There's a lot of work being done on this specific part. If you have a standard RNN architecture you want to run, you can probably use the cuDNN bindings in tf.contrib.cudnn_rnn to get a super fast implementation.
There is some performance work that needs to be done on properly caching weights between time steps of an RNN if you use a tf.nn.RNNCell. Currently, if you want to implement a custom architecture, a seq2seq decoder, or an RL agent, this is the API you would want to use. Several of the eager benchmarks are based on this API, so that performance will only improve.
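For readers wondering why weight caching across time steps matters: a vanilla RNN reuses the same weight matrices at every step of the unroll, so re-fetching them per step is pure overhead. A minimal numpy sketch of the per-step computation (shapes and names are mine, not the TF API):

```python
import numpy as np

rng = np.random.default_rng(0)
W_x = rng.standard_normal((4, 3)) * 0.1   # input-to-hidden weights, shared across steps
W_h = rng.standard_normal((4, 4)) * 0.1   # hidden-to-hidden weights, shared across steps
b = np.zeros(4)

def rnn_step(h, x):
    # One vanilla RNN cell step. W_x and W_h are the *same* arrays at every
    # time step, which is why caching them on-device across steps pays off.
    return np.tanh(W_x @ x + W_h @ h + b)

h = np.zeros(4)
for x in rng.standard_normal((5, 3)):     # unroll over 5 time steps
    h = rnn_step(h, x)
```

The fused cuDNN kernels get their speed partly from keeping those shared weights resident on the GPU across the whole unrolled sequence instead of treating each step independently.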
I'm hopeful that for the next major release, we'll also have support for eager in tf.contrib.seq2seq.
I would put > 50/50 odds on there being a human alive today with at least one CRISPR-edited germline variant. I think the question is now how can we regulate/control CRISPR germline editing; not how can we prevent it.
Human embryo gene editing was reported in May 2015 (http://dx.doi.org/10.1007/s13238-015-0153-5). It's reasonable to assume that the editing took place significantly before the submission date, especially given reports that the paper was first rejected from several other journals on ethical grounds (http://www.nature.com/news/chinese-scientists-genetically-mo...). I'm not trying to suggest that these particular scientists have performed experiments on viable embryos also, but I'd be very surprised if someone hasn't.
Key question for digital health / biotech startups is whether direct to consumer (D2C) advertising for testing services (e.g. genetic carrier screening; cancer risk screening) would be included.
Correct - their current v4 chip covers ~602K SNPs and they have > 1 M customers, so you couldn't uniquely identify all of them by genotype. But given the frequency of rare variation in humans, you'd expect to be able to reduce a given genotype to a much smaller set of customers.
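To put rough numbers on that intuition: if genotypes at different SNPs matched independently between two random people, the chance of a full-profile collision shrinks geometrically with the number of sites, so even a tiny subset of the chip narrows a profile to essentially one person. A back-of-envelope sketch (the per-SNP match probability and SNP count below are illustrative assumptions, not 23andMe's figures, and real SNPs are correlated via linkage disequilibrium):

```python
# Back-of-envelope genotype identifiability, under an independence assumption.
p_match = 0.5        # assumed probability two random people match at one SNP
n_snps = 100         # tiny subset of the ~602K sites on the chip
n_customers = 10**6

# Probability two random customers share genotypes at all n_snps sites
p_full_match = p_match ** n_snps
# Expected number of *other* customers matching a given profile
expected_matches = (n_customers - 1) * p_full_match
```

Even with these loose assumptions the expected number of coincidental matches is astronomically small, which is why rare variants cut the candidate set down so quickly.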
You're right that Ancestry.com is focused on mtDNA, but 23andMe actually uses a custom Illumina microarray with autosome, allosome, and mitochondrial targets. Here's a study where 23andMe used their (old version) chips to replicate a bunch of known genetic associations, most of which are autosomal: http://journals.plos.org/plosgenetics/article?id=10.1371/jou...
Current software roles include:
- Machine Learning Engineer
- Data Engineer
- Head of Data Engineering
See: http://insitro.com/jobs or feel free to email me: cprobert@insitro.com