Running Jupyter notebooks in parallel

This page summarizes the projects mentioned and recommended in the original post on dev.to

  • african_microbiome_portal_data

    Raw and corrected data with correction python notebook

  • Here we share the results of testing and evaluating some of these tools. To keep the comparison fair, every run uses the same code, and execution time is measured with Python's time module. The notebooks used for benchmarking can be found here and come from the african_microbiome_portal_data repository. Serial execution (each notebook run one after another) is evaluated first, followed by parallel execution.
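
    The timing approach described above can be sketched with the standard library alone. This is a minimal sketch, not the original post's benchmark: `fake_notebook` is a hypothetical stand-in that sleeps instead of executing a notebook, and the thread pool is an assumption (the post does not show how it parallelized runs; a process pool is a common alternative for real notebook executions).

    ```python
    import time
    from concurrent.futures import ThreadPoolExecutor


    def fake_notebook(seconds):
        """Hypothetical stand-in for one notebook run: just sleeps."""
        time.sleep(seconds)
        return seconds


    def timed(fn, *args):
        """Call fn(*args) and return (result, elapsed_seconds)."""
        start = time.perf_counter()
        result = fn(*args)
        return result, time.perf_counter() - start


    def run_serial(task, items):
        """Run the task on each item, one after another."""
        return [task(item) for item in items]


    def run_parallel(task, items, max_workers=4):
        """Run the task on all items concurrently, keeping input order."""
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            return list(pool.map(task, items))
    ```

    Calling `timed(run_serial, fake_notebook, durations)` and `timed(run_parallel, fake_notebook, durations)` with the same list of durations gives two elapsed times that can be compared directly, because both paths execute the same task.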

  • papermill

    📚 Parameterize, execute, and analyze notebooks

  • As a first option, we will use Papermill, which has a Python API that allows us to run different notebooks using some functions:
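
    A serial Papermill run might look like the sketch below. `papermill.execute_notebook(input_path, output_path)` is Papermill's documented entry point; the helper names, the `output` directory, and the notebook file names are assumptions made for illustration, not taken from the original post.

    ```python
    from pathlib import Path


    def output_path_for(notebook, out_dir="output"):
        """Return where the executed copy of `notebook` will be written."""
        return Path(out_dir) / Path(notebook).name


    def run_notebooks_serially(notebooks, out_dir="output"):
        """Execute each notebook in order with Papermill."""
        # Imported lazily so the helper above works without Papermill installed.
        import papermill as pm

        Path(out_dir).mkdir(parents=True, exist_ok=True)
        for nb in notebooks:
            # Runs every cell of `nb` and saves the executed notebook.
            pm.execute_notebook(str(nb), str(output_path_for(nb, out_dir)))
    ```

    For example, `run_notebooks_serially(["first.ipynb", "second.ipynb"])` (hypothetical file names) would write the executed copies to `output/first.ipynb` and `output/second.ipynb`.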

  • ploomber

    The fastest ⚡️ way to build data pipelines. Develop iteratively, deploy anywhere. ☁️

  • As a second option, we will use Ploomber with serial execution. It also has a Python API, which lets us execute notebooks through its NotebookRunner task:
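
    A serial Ploomber pipeline could be assembled roughly as follows. This is a sketch under assumptions: it uses `DAG`, `Serial`, `File`, and `NotebookRunner` from Ploomber's Python API, while the helper names, output directory, and notebook names are hypothetical, and the original post's exact arguments are not shown here.

    ```python
    from pathlib import Path


    def task_name(notebook):
        """Derive a DAG task name from the notebook file name."""
        return Path(notebook).stem


    def build_serial_dag(notebooks, out_dir="output"):
        """Create a Ploomber DAG with one NotebookRunner task per notebook."""
        # Imported lazily so task_name() works without Ploomber installed.
        from ploomber import DAG
        from ploomber.executors import Serial
        from ploomber.products import File
        from ploomber.tasks import NotebookRunner

        dag = DAG(executor=Serial())  # tasks run one after another
        for nb in notebooks:
            NotebookRunner(
                Path(nb),  # source notebook
                File(str(Path(out_dir) / Path(nb).name)),  # executed copy
                dag=dag,
                name=task_name(nb),
            )
        return dag
    ```

    Calling `build_serial_dag(["first.ipynb", "second.ipynb"]).build()` would then execute the (hypothetical) notebooks in sequence.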

  • ploomber-engine

    A toolbox 🧰 for Jupyter notebooks 📙: testing, experiment tracking, debugging, profiling, and more!

  • As a third option, we will use Papermill again, this time with ploomber-engine, which adds debugging and profiling features to Papermill:
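
    With ploomber-engine installed, Papermill can be pointed at one of its engines through the `engine_name` argument of `execute_notebook`. The sketch below assumes the engine names `"profiling"` and `"debuglater"`; treat both as assumptions to check against the ploomber-engine documentation, since the original post's choice of engine is not shown here. The helper name and file names are hypothetical.

    ```python
    # Engine names assumed to be registered by ploomber-engine (verify locally).
    ASSUMED_ENGINES = {"profiling", "debuglater"}


    def run_with_engine(notebook, output, engine_name="profiling"):
        """Execute a notebook with Papermill using a ploomber-engine engine."""
        if engine_name not in ASSUMED_ENGINES:
            raise ValueError(f"unexpected engine name: {engine_name!r}")

        # Imported lazily; requires papermill and ploomber-engine.
        import papermill as pm

        return pm.execute_notebook(
            str(notebook), str(output), engine_name=engine_name
        )
    ```

    For instance, `run_with_engine("first.ipynb", "output/first.ipynb")` would run the notebook through the assumed profiling engine.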
