[Project] package Hub: store, stream, and access large datasets in seconds

This page summarizes the projects mentioned and recommended in the original post on /r/Python

  • n5

    Not HDF5

  • For readers' context: zarr is a specification for a self-describing, n-dimensional array hierarchy format that can sit on top of more or less any key-value store. If you've ever used HDF5, it's basically that, but the array chunks are exploded over the file system or cloud store, and all the metadata is JSON. It's gaining traction in the biological imaging and geo/meteorological data communities, among other places. Work on the v3 specification is in progress; it aims to abstract the format into a generic protocol and to fold in the community behind N5, an almost identical format used by a small but vocal number of bio-imaging labs.

  • Activeloop Hub

    Discontinued. Data Lake for Deep Learning. Build, manage, query, version, & visualize datasets. Stream data in real time to PyTorch/TensorFlow. https://activeloop.ai [Moved to: https://github.com/activeloopai/deeplake] (by activeloopai)

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.
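
To make the zarr description above more concrete, here is a toy sketch of the idea it describes: an array split into chunks, each chunk stored under its own key in a key-value store, with the metadata kept as JSON. It does not use the zarr library. The function names `write_array` and `read_element` are hypothetical, and the `.zarray` metadata key and `row.col` chunk keys are only loosely modeled on the zarr v2 layout. A plain Python dict stands in for the file system or cloud store.

```python
import json
import struct


def write_array(store, data, chunks):
    """Write a 2-D list of ints into `store`, one key per chunk.

    `store` is any dict-like key-value mapping (hypothetical stand-in
    for a directory or a cloud bucket).
    """
    rows, cols = len(data), len(data[0])
    crow, ccol = chunks
    # All metadata lives in a small JSON document under a well-known key.
    meta = {"shape": [rows, cols], "chunks": [crow, ccol], "dtype": "<i4"}
    store[".zarray"] = json.dumps(meta).encode("utf-8")
    for ci in range(0, rows, crow):
        for cj in range(0, cols, ccol):
            block = []
            for i in range(ci, ci + crow):
                for j in range(cj, cj + ccol):
                    # Pad edge chunks with 0 so every chunk has the same size.
                    inside = i < rows and j < cols
                    block.append(data[i][j] if inside else 0)
            key = f"{ci // crow}.{cj // ccol}"
            store[key] = struct.pack(f"<{len(block)}i", *block)


def read_element(store, i, j):
    """Read one element by fetching only the chunk that contains it."""
    meta = json.loads(store[".zarray"])
    crow, ccol = meta["chunks"]
    raw = store[f"{i // crow}.{j // ccol}"]
    offset = (i % crow) * ccol + (j % ccol)  # row-major position in the chunk
    return struct.unpack_from("<i", raw, offset * 4)[0]


# A 4x5 array stored in 2x2 chunks.
data = [[r * 10 + c for c in range(5)] for r in range(4)]
store = {}
write_array(store, data, (2, 2))
value = read_element(store, 3, 4)
```

Because each chunk is an independent value in the store, a reader only has to fetch the chunks it needs, which is what makes this kind of layout a good fit for object storage.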


Related posts