Dask Clusters

Dask is a general purpose library for parallel computing. Dask can be used on its own to parallelize Python code, or with integrations to other popular libraries to scale out common workflows.

Dask has its own docs and examples, but we’ll include a few typical use cases below.

Simpler examples work well with a local Dask cluster if you just want to experiment.

from dask.distributed import LocalCluster

cluster = LocalCluster(processes=False)
client = cluster.get_client()

In many cases you’ll likely want a Coiled cluster set up.

from coiled import Cluster

cluster = Cluster(n_workers=20)
client = cluster.get_client()

Once you have a Dask cluster you can then run Python code on that cluster. Here is the simplest code you could run:

def inc(x):
    return x + 1

future = client.submit(inc, 10)
future.result() # returns 11

Use Cases

Here are some more examples for how you can use Dask: