FAQ

What is Dask?

Dask is a Python library for parallel computing. It can be used on its own, where it’s kinda like multiprocessing on steroids, or it can be used with other PyData libraries. Dask + pandas is big pandas, like Spark. Dask + NumPy is big NumPy. And so on with PyTorch, XGBoost, Prefect, Airflow, etc.

How do I set up Dask to run on a cluster?

Actually, that’s what we do! Dask is fully open source and you can deploy it yourself with any technology. Check out the Dask docs to get started.

This ends up being easy to start, but kinda hard to actually use seriously in a corporate setting. Coiled makes this super easy on the cloud. See our Build vs. Buy page for more details.

Can I pay to get help with Dask?

If you’re using Coiled you’ll find that our engineers reach out frequently. We’re constantly tracking failures and engaging with users to see how we can make Dask and Coiled better. If you want to ask questions we’re happy to help.

If you’re not using Coiled these engagements are less efficient and so it’s harder to help. For a few large partner organizations with heavy use of Dask on-prem we do sell Enterprise Dask Support.

How does Dask compare to Apache Spark?

Spark is great. If Spark does what you want, use Spark.

Dask is lower level and lighter weight than Spark. It is more flexible and can do more things. If you’re doing primarily SQL queries or common dataframe operations, Spark will likely outperform Dask. People tend to choose Dask for the following two reasons:

They like Python.
Dask is more Python native. For example, Dask dataframes follow the pandas API, and integration with the rest of the Python ecosystem is more natural. Debugging is easier.

They need more flexibility.
Many people using Python need to do things that are more complicated than SQL. This includes complex workflows, ML, etc. Python teams tend to do a lot of crazy unstructured stuff.

Check out our blog post Dask vs. Spark for more details.

What does Coiled add on top of Dask?

Coiled makes it easy to set up and use Dask in the cloud. This isn’t one big thing. It’s dozens of small things. For example Coiled does the following:

  • Replicates your local software environment to your workers
  • Forwards data access credentials to your workers
  • Starts up reliably in any region in a couple minutes
  • Turns itself off automatically and cleans up if clusters are left on
  • Tracks usage across a team
  • Offers cost-saving measures like Spot, ARM, and others
  • Gives visibility to historical jobs and errors

What is Coiled?

Dask is an open source project, like pandas or Jupyter. Coiled is a for-profit company built around Dask. We work on Dask and we also sell a cloud platform to make it easy to deploy Dask in the cloud. Do you know Databricks? We're like that. Request a demo to give it a try.

Does this run in your cloud or mine?

Coiled manages resources in your cloud account. Most of our customers don’t trust us with their sensitive data. We work hard to manage cloud resources for you without ever having direct access to sensitive data. For more information on our security posture, see our Coiled Security page.

Can I get a free trial?

Yes! You don’t even have to ask:
pip install coiled
coiled setup

Coiled is free to use for the first 10,000 CPU hours per month (you still have to pay your cloud provider). You don’t need to give us a credit card or anything. See our documentation to get started or reach out to us. We'd love to chat and are happy to help.

How much does this cost me?

Nothing until you use 10,000 CPU hours per month. After that we do usage-based pricing. If we’re managing $1M of spend on AWS/GCP then we expect to receive hundreds of thousands of dollars. If we’re managing hundreds of dollars then we don’t care and are happy to give away the product for free. In general we find that our cost-saving measures save customers more money than we charge. More details on our Pricing and Build vs. Buy pages.
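To get a feel for how far 10,000 CPU hours goes, here is a rough back-of-the-envelope sketch. The cluster size and schedule are made-up numbers, and it assumes a CPU hour means one CPU running for one hour. Check the Pricing page for how usage is actually metered.

```python
FREE_CPU_HOURS_PER_MONTH = 10_000  # free tier stated in this FAQ


def monthly_cpu_hours(workers, cpus_per_worker, hours_per_day, days_per_month):
    """Estimate CPU hours as workers x CPUs per worker x hours x days."""
    return workers * cpus_per_worker * hours_per_day * days_per_month


# Hypothetical team: 10 workers with 4 CPUs each, 8 hours a day, 20 days a month.
usage = monthly_cpu_hours(workers=10, cpus_per_worker=4, hours_per_day=8, days_per_month=20)
remaining = FREE_CPU_HOURS_PER_MONTH - usage
within_free_tier = usage <= FREE_CPU_HOURS_PER_MONTH

print(f"Estimated usage: {usage} CPU hours, free tier remaining: {remaining}")
```

In this made-up scenario the team uses 6,400 CPU hours, which stays inside the free tier. Cloud provider charges still apply either way.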

Can you help me run on-prem?

You can deploy Dask anywhere with technologies like HPC job schedulers, Kubernetes, or cloud deployment APIs. See the Dask documentation. Coiled manages Dask clusters in the cloud, deploying within your own cloud provider account (we never see your data).

If you are truly on-prem (like on OpenStack) then we can connect you to other companies and contractors that can set things up for you. Send us a note and we’ll connect you to groups that we trust to do excellent work.

What if Coiled fails? Will the service still be around?

Yeah, we're definitely an early stage startup, so this is a pretty reasonable question to ask.

Most startups fail. That being said, we're on pretty solid ground as far as startups go. We're a small team of about twenty, have real revenue, and have three years of runway with plenty of opportunities for more money if we want it.

We're also based on Dask, which is pretty mature and has a broad user base. There's definitely risk, but probably far less risk than is typical for startups.

If the company does fail, we'd love to see the service continue in some form, but honestly it's hard to predict that far out.