Science Thursday: Scaling up Geospatial Data Science with Distributed Computing

Brendan Collins (co-founder at makepath), who has created and/or contributed to libraries including Datashader, Bokeh and xarray-spatial, joins Matt Rocklin and Hugo Bowne-Anderson to discuss how to scale xarray-spatial with Dask. 

xarray-spatial is a Python library that implements common raster analysis functions using Numba and provides an easy-to-install, easy-to-extend codebase for raster analysis. It is free of GDAL / GEOS dependencies and was created for general-purpose spatial processing for the GIS community.

For large-scale Geo projects, such as Natural Resource Management and Conservation efforts, a connection between the GIS and PyData communities goes a long way. With tools like xarray-spatial, coupled with Dask and Numba, GIS professionals can scale their analyses to larger datasets and more compute, and get results faster.

After attending, you’ll understand how:

  • The xarray-spatial library works and how to scale it with Dask
  • To apply xarray-spatial to domain-specific use cases
  • To use Python libraries for Geo applications without being a software developer, by abstracting over the details you don't need to know and interfacing with familiar APIs

Level up your Dask using Coiled

Coiled makes it easy to scale Dask maturely in the cloud