An increasing number of organizations need to scale data science to larger datasets and larger models. However, deploying distributed data science frameworks in secure enterprise environments can be surprisingly challenging because we need to simultaneously satisfy multiple sets of stakeholders within the organization: data scientists, IT, and management.
Solving simultaneously for all sides of this problem is a cultural and political challenge as much as a technical one. This is the problem that we’re passionate about solving at Coiled, and that we recently spoke about in our PyCon 2020 talk.
Here at Coiled, we have been speaking with data scientists, management, IT, and open source developers (among others) about the challenges of scaling data science to both the cloud and on-premises clusters.
To open up this necessary conversation, we’ve published posts detailing the challenges encountered by data scientists, IT, and team leads. This post lists all the challenges together so that you, the reader, can dive into the posts that interest you the most (hint: all of them).
We often see the pain points felt by data scientists boil down to three main challenges (for more depth, read our detailed post here):
We often see the pain points felt by team leads and management boil down to three main challenges (for more depth, read our detailed post here):
We often see the pain points felt by IT departments boil down to three main challenges (for more depth, read our detailed post here):
These are the types of data science problems we’re building products to solve here at Coiled. We’re excited to be building products for scaling data science in Python to larger datasets and larger models, particularly for data scientists and teams that want a seamless transition from working with small data to big data. If the challenges we’ve outlined resonate with you, we’d love for you to get in touch with us to discuss our product development.