Accelerating the Dask Scheduler

To celebrate Coiled’s first birthday, we’re hosting a special livestream with Matthew Rocklin, Dask maintainer and Coiled CEO, to discuss ongoing work to accelerate the Dask scheduler.

What does it take to make Dask run at extreme speed?

The Dask scheduler is the beating heart of Dask’s distributed computing system. It decides which computers run which code, and when. For intense workloads the scheduler can become a bottleneck, limiting how effectively a Dask cluster performs at large scale. Over the last several months, engineers from Capital One, Coiled, and NVIDIA have collaborated to profile and accelerate the Dask scheduler using a wide variety of techniques.
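As a rough illustration of what "which computers run which code, and when" means, here is a toy sketch in plain Python. It is not Dask's scheduler or its API: the function `toy_schedule`, the task names, and the round-robin worker assignment are all assumptions made for this example, and it uses only the standard library's `graphlib` (Python 3.9+).

```python
from graphlib import TopologicalSorter


def toy_schedule(dependencies, workers):
    """Return a list of (step, worker, task) assignments.

    `dependencies` maps each task name to the set of tasks it depends on.
    Tasks whose dependencies are all finished become "ready" and are handed
    out to workers round-robin. This is a hypothetical toy policy, not the
    policy Dask uses.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises CycleError if the graph has a cycle

    plan = []
    step = 0
    while sorter.is_active():
        # Every task whose dependencies have completed; sorted for determinism.
        ready = sorted(sorter.get_ready())
        for i, task in enumerate(ready):
            plan.append((step, workers[i % len(workers)], task))
        # Pretend all tasks in this step ran successfully.
        sorter.done(*ready)
        step += 1
    return plan


# Hypothetical task graph: two loads feed an add, whose result is saved.
graph = {
    "load-a": set(),
    "load-b": set(),
    "add": {"load-a", "load-b"},
    "save": {"add"},
}

plan = toy_schedule(graph, ["worker-1", "worker-2"])
for step, worker, task in plan:
    print(step, worker, task)
```

Even in this toy version, every task passes through the scheduling loop, which hints at why a central scheduler can become the bottleneck when a workload contains a very large number of small tasks.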

Unfortunately, understanding and optimizing performance in a system like the Dask scheduler is complex. The scheduler is an interconnected system of state machines, network communication, and eventing systems, and no single magic bullet speeds up all of it. This talk will lay out the problem and then highlight several of the approaches the team has used to profile and optimize different parts of Dask and of Python itself. It also serves as a teaser for a talk track at the upcoming Dask User Summit.

After attending, you’ll know:

  • How to think about task scheduling
  • Which advanced profiling and tracing techniques are available in Python
  • How performance optimizations like Cython-in-pure-Python, GIL tracing, and socket tuning were applied, and more
