This is a fork of https://github.com/dask/dask-tutorial
Dask is a parallel and distributed computing library that scales the existing Python and PyData ecosystem. Dask can scale up to your full laptop capacity and out to a cloud cluster.
git clone http://github.com/esarrazin/dask-tutorial
Install pixi
In the main repo directory
cd dask-tutorial
pixi install
pixi shell
From the repo directory
jupyter lab
You are welcome to use Jupyter Notebook if you prefer, but we'll be using JupyterLab in the live tutorial.
- Reference
- Ask for help
- dask tag on Stack Overflow, for usage questions
- GitHub issues for bug reports and feature requests
- discourse forum for general, non-bug, questions and discussion
- Attend a live tutorial
- Overview - Dask's place in the universe.
- Dataframe - parallelized operations on many pandas dataframes spread across your cluster.
- Array - blocked NumPy-like functionality with a collection of NumPy arrays spread across your cluster.
- Delayed - the single-function way to parallelize general Python code.
- Deployment/Distributed - Dask's scheduler for clusters, with details of how to view the UI.
- Distributed Futures - non-blocking results that compute asynchronously.
- Machine learning - use Dask for machine learning.
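
To give a feel for the Delayed item in the list above, here is a minimal sketch, assuming Dask is installed in the pixi environment. The helper functions inc and add are hypothetical and exist only for illustration; the synchronous scheduler is chosen here just to keep the run deterministic, and the tutorial notebooks may well use a different one.

```python
from dask import delayed


def inc(x):
    # Hypothetical helper: stands in for any ordinary Python function.
    return x + 1


def add(x, y):
    # Hypothetical helper: combines two intermediate results.
    return x + y


# Wrapping a function with delayed() records the call in a task graph
# instead of running it immediately.
a = delayed(inc)(1)
b = delayed(inc)(2)
total = delayed(add)(a, b)

# Nothing has run yet; compute() executes the graph and returns the value.
# scheduler="synchronous" runs everything in the current thread.
result = total.compute(scheduler="synchronous")
print(result)
```

The two inc calls do not depend on each other, so a threaded or distributed scheduler would be free to run them in parallel.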