What are good alternatives to zip files when working with large online image datasets?

This page summarizes the projects mentioned and recommended in the original post on /r/datascience.

  • Activeloop Hub

    Status: discontinued. Data Lake for Deep Learning. Build, manage, query, version, & visualize datasets. Stream data real-time to PyTorch/TensorFlow. https://activeloop.ai [Moved to: https://github.com/activeloopai/deeplake] (by activeloopai)

  • What solution have you used and liked as a data scientist when working with large datasets? Is there a standard Python API to access the data, or some other solution? If anyone has used https://github.com/activeloopai/Hub or a similar API, I'd be interested to hear about your experience working with it!

  • labelflow

    The open platform for image labelling

  • We are hosting image datasets on our platform. Until recently the stored datasets were relatively small (several hundred images, a few GB), so we only offered the option to export zip files containing images and labels in the COCO or YOLO format. As the average size of the datasets grows, exporting a single zip is no longer convenient.
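
    To make the export problem above concrete, here is a minimal sketch of one possible alternative to a single large zip: splitting the files into several smaller tar shards that can be downloaded or streamed independently. This is not taken from the original post and is not how LabelFlow or Hub is known to work. The function name `write_tar_shards`, the `shard-00000.tar` naming scheme, and the size threshold are all hypothetical choices, and only the Python standard library is used.

```python
import tarfile
from pathlib import Path
from typing import Iterable, List


def write_tar_shards(files: Iterable[Path], out_dir: Path, max_shard_bytes: int) -> List[Path]:
    """Pack files into numbered, uncompressed tar shards (hypothetical helper).

    A new shard is started whenever adding the next file would push the
    current shard's payload above max_shard_bytes. A single file larger
    than the limit still gets a shard of its own.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    shards: List[Path] = []
    current = None       # currently open tarfile.TarFile, if any
    current_size = 0     # payload bytes written to the current shard

    for path in files:
        size = path.stat().st_size
        overflow = current_size > 0 and current_size + size > max_shard_bytes
        if current is None or overflow:
            if current is not None:
                current.close()
            shard_path = out_dir / f"shard-{len(shards):05d}.tar"
            current = tarfile.open(shard_path, "w")
            shards.append(shard_path)
            current_size = 0
        # Assumption: file names are unique, so the flat arcname is unambiguous.
        current.add(path, arcname=path.name)
        current_size += size

    if current is not None:
        current.close()
    return shards
```

    Each shard can then be served as its own file, so a client can fetch only the shards it needs or resume after a failed download. Whether this fits a given platform depends on how the labels (COCO or YOLO) are stored alongside the images, which this sketch does not address.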

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.

Related posts