Privacy-Preserving Machine Learning

Hugo Bowne-Anderson
July 20, 2020

Katharine Jarmul, privacy activist and Head of Product at Cape Privacy, joins Matt Rocklin and Hugo Bowne-Anderson to chat about how distributed computing and privacy support one another, especially in today’s data science and machine learning problems.

We’ll cover both the opportunities and the challenges of protecting distributed data science workloads using the newly released Cape Python open source package:

  1. Learn about privacy-enhancing techniques and when to use them;
  2. See how to write policy for privacy-enhancing techniques and apply them to a pandas DataFrame (a sketch of this idea follows the list);
  3. Explore when transformations might be important during distributed data processing, and how distributed computing in machine learning could be a harbinger of advanced privacy techniques such as federated learning and secure multi-party computation.
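
To make item 2 concrete, here is a minimal sketch of the kinds of transformations a privacy policy can describe, applied by hand to a pandas DataFrame. The column names, key, and helper functions are made up for illustration and are not the Cape Python API; the actual policy format is covered in the session.

```python
import hashlib

import pandas as pd

# Toy records with an identifying field; these columns are hypothetical.
df = pd.DataFrame({
    "name": ["Ada", "Grace", "Katherine"],
    "salary": [101_250.0, 98_700.0, 123_400.0],
})

def tokenize(series: pd.Series, key: str) -> pd.Series:
    """Replace raw values with keyed, irreversible tokens."""
    return series.map(
        lambda v: hashlib.sha256((key + str(v)).encode()).hexdigest()[:10]
    )

def round_numeric(series: pd.Series, nearest: float) -> pd.Series:
    """Coarsen numeric values so a single row reveals less."""
    return (series / nearest).round() * nearest

# Apply the transformations column by column, as a policy might direct.
masked = df.assign(
    name=tokenize(df["name"], key="secret-key"),
    salary=round_numeric(df["salary"], nearest=1_000),
)
print(masked)
```

The point of a policy-driven approach is that these column-to-transformation mappings live in configuration rather than scattered through analysis code, so the masked DataFrame can be used downstream as usual.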

If you know a bit of machine learning, you’ll learn how to reason about privacy policy using Cape Python.

If you’re comfortable with Dask or other forms of distributed compute, you’ll learn how distributed pipeline tasks can incorporate privacy-enhancing transformations as one step of preprocessing (see the sketch below), and what the future of fully distributed machine learning workflows and pipelines might look like!
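
As a hedged sketch of that preprocessing step, a masking function like the one above can be applied per partition of a Dask DataFrame with `map_partitions`, so raw values are transformed on the workers before any downstream feature engineering or training. The data, column names, and key here are hypothetical.

```python
import hashlib

import dask.dataframe as dd
import pandas as pd

def mask_partition(part: pd.DataFrame) -> pd.DataFrame:
    """Apply privacy transformations to one pandas partition."""
    key = "secret-key"  # assumed shared masking key, for illustration only
    part = part.copy()
    part["name"] = part["name"].map(
        lambda v: hashlib.sha256((key + str(v)).encode()).hexdigest()[:10]
    )
    part["salary"] = (part["salary"] / 1_000).round() * 1_000
    return part

# Hypothetical raw data; in practice this might come from dd.read_parquet(...).
raw = pd.DataFrame({
    "name": ["Ada", "Grace", "Katherine", "Dorothy"],
    "salary": [101_250.0, 98_700.0, 123_400.0, 88_900.0],
})
ddf = dd.from_pandas(raw, npartitions=2)

# The masking runs partition by partition on the workers, so raw values never
# need to be gathered in one place before the rest of the pipeline runs.
masked = ddf.map_partitions(mask_partition)
print(masked.compute())
```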
