Modern distributed machine learning (DML) and scientific computing toolkits

February 9 — by April Chen & Zev Nicolai-Scanio

Thoughts on the TensorFlow whitepaper?

Thoughts on the Dask and Dask vs. Spark papers?

Discussion ideas

  1. How does TensorFlow address the issue of flexible consistency and the trade-off between system efficiency and convergence rates?

  2. TF claims to build on timely dataflow and parameter server ideas. Are there any aspects of timely dataflow and/or the parameter server that appear missing from TF? Why might TF have left those features out?

  3. What are the differences between how Dask and TF assign nodes of their compute graphs to resources? What might be the advantages and disadvantages of the different systems?

  4. Dask is a Python system that implements its graph structure in Python itself. In fact, users can write their own graphs directly as Python dictionaries rather than using the Dask objects that generate them automatically (see the first sketch after this list). What are the pros and cons of this design?

  5. Dask and TensorFlow (and Spark) all use lazy evaluation for their compute graphs (see the second sketch after this list). Why do you think lazy evaluation is a dominant paradigm? Are there any cases you can think of where lazy evaluation might be a bad idea?

  6. Compare machine learning applications across the papers we’ve read so far. For which problems would you expect Dask and Spark to have significantly different performance?
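
A minimal sketch of question 4's point: a Dask graph is just a dict mapping keys to data or to task tuples of the form (function, arg1, ...). This hand-written graph is the same structure that dask.array and the other collections generate automatically.

```python
# A hand-written Dask task graph. Keys map either to data or to
# task tuples (function, arg1, arg2, ...); arguments that name
# other keys become edges in the graph.
from dask.threaded import get

def inc(x):
    return x + 1

def add(a, b):
    return a + b

dsk = {
    "x": 1,
    "y": (inc, "x"),       # y = inc(x)
    "z": (add, "y", 10),   # z = add(y, 10)
}

print(get(dsk, "z"))  # the threaded scheduler walks the graph -> 12
```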
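
And a sketch for question 5, using dask.delayed as one concrete lazy-evaluation API: building an expression only records tasks, and nothing executes until .compute() is called, which is what lets the scheduler see the whole graph before running any of it.

```python
# Lazy evaluation: each call below adds a node to a task graph
# instead of running immediately.
import dask

@dask.delayed
def square(x):
    return x * x

parts = [square(i) for i in range(4)]  # nothing has run yet
total = dask.delayed(sum)(parts)       # still nothing
print(total.compute())                 # graph executes here -> 14
```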

Takeaways: TensorFlow

Subgraphs: a single run call can execute an arbitrary subgraph of the full graph, with callers naming fetches (outputs to retrieve) and feeds (tensors to inject); the runtime prunes away everything not needed for the fetches. A sketch follows.
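
A minimal sketch of this partial execution, written against the TF1-style graph API (available as tf.compat.v1 in current releases); the op names are illustrative.

```python
import tensorflow as tf

tf1 = tf.compat.v1
tf1.disable_eager_execution()  # graph mode, as in the whitepaper

a = tf1.placeholder(tf.float32, name="a")
b = tf1.multiply(a, 2.0, name="b")
c = tf1.add(b, 1.0, name="c")  # not on the path to the fetch below

with tf1.Session() as sess:
    # Fetching only b prunes the graph down to {a, b}; c never runs.
    print(sess.run(b, feed_dict={a: 3.0}))  # 6.0
```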

EEG: TensorFlow's internal tracing and visualization tool, used in the whitepaper to collect and analyze performance traces of distributed executions.

Takeaways: Dask

Dask parallel collections: dask.array implements blocked algorithms over chunked NumPy arrays, expressed as a task graph and run by a dynamic scheduler (sketch below).
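
A minimal sketch of the blocked-algorithm idea, assuming dask.array's public API:

```python
import dask.array as da

# 16 chunks of 1000x1000; each operation below is recorded per chunk
x = da.ones((4000, 4000), chunks=(1000, 1000))
y = (x + x.T).sum()  # lazy: just extends the task graph
print(y.compute())   # the scheduler runs the blocked NumPy calls
```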

Grid search scheduler: hyperparameter grid search is embarrassingly parallel, so each parameter setting becomes an independent task for Dask's scheduler (sketch below).
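
A hedged sketch of that pattern; fit_and_score is a hypothetical stand-in for a real train-and-validate function (e.g. a scikit-learn fit plus a held-out score).

```python
import itertools
import dask

@dask.delayed
def fit_and_score(alpha, depth):
    # hypothetical model training; returns a validation score
    return -((alpha - 0.1) ** 2) - (depth - 5) ** 2

grid = list(itertools.product([0.01, 0.1, 1.0], [3, 5, 7]))
tasks = [fit_and_score(a, d) for a, d in grid]
scores = dask.compute(*tasks)  # all fits run in parallel
best_params, best_score = max(zip(grid, scores), key=lambda t: t[1])
print(best_params, best_score)
```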

Dask vs. Spark


Synthesis

Are these good papers?

What do you think are the most important contributions?