Parallel Processing and Distributed Data Structures Using the Dask Package for Python

Presentation | April 16 | 3:30-6 p.m. | Dwinelle Hall, Academic Innovation Studio (117 Dwinelle)

 Data Sciences

Dask allows you to set up parallel computations on one or more machines (or Savio nodes), including computations on large datasets distributed across multiple Savio nodes. Berkeley Research Computing and Research IT will cover the different ways to set up and run parallel computations using Dask.

Topics will include:
-Parallelizing loops using delayed evaluation
-Distributed data structures (including parallel I/O)
-Parallelization on one or more machines
-Using Dask in the context of SLURM job submissions
-Random number generation
-Nested parallelization, memory use, and load-balancing

After the training, there will be an informal get-together with snacks and drinks.

Register for the event here: https://docs.google.com/forms/d/e/1FAIpQLSdVZnop1Nl0DM6D-lxl3zL0ISAJZ_GHzh63rEbSXXazfi0Y0Q/viewform

 aneeser@berkeley.edu