TensorFlow Datasets (TFDS): a collection of ready-to-use datasets

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • datasets

    TFDS is a collection of datasets ready to use with TensorFlow, Jax, ... (by tensorflow)

    I tried Librispeech, a very common dataset for speech recognition, in both HF and TFDS.

    TFDS performed extremely badly.

    First, it failed because the official hosting server only allows 5 simultaneous connections; TFDS totally ignored that limit and opened up to 50 simultaneous downloads, which broke the download. I wonder if anyone actually tested this?

    Then you need a machine with 30GB to do the preparation, so it might fail on your computer. This is where I stopped: https://github.com/tensorflow/datasets/issues/3887. It might be fixed now, but it took them 8 months to respond to my issue.

    On HF, it just worked. There was a minor issue with how the dataset was split up, but that is fixed now, and their response was very fast and great.
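
The connection limit described in the comment above can be respected by capping the number of downloads in flight. The following is only a minimal sketch of that general idea, not how TFDS or HF actually download files: `download_all`, `fake_fetch`, and the `example.invalid` URLs are hypothetical names made up for illustration, and the limit of 5 is taken from the comment.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_CONNECTIONS = 5  # limit reported in the comment; adjust per server


def download_all(urls, fetch, max_connections=MAX_CONNECTIONS):
    """Call fetch(url) for every URL with at most max_connections running at once."""
    # The pool size is the cap: no more than max_connections workers exist.
    with ThreadPoolExecutor(max_workers=max_connections) as pool:
        return list(pool.map(fetch, urls))


# Hypothetical stand-in for a real HTTP download.
# It records how many calls overlap so the cap can be checked.
_lock = threading.Lock()
_active = 0
peak_active = 0


def fake_fetch(url):
    global _active, peak_active
    with _lock:
        _active += 1
        peak_active = max(peak_active, _active)
    time.sleep(0.01)  # simulate network latency
    with _lock:
        _active -= 1
    return f"downloaded {url}"


# 50 placeholder URLs, mirroring the 50 simultaneous downloads mentioned above.
urls = [f"https://example.invalid/part-{i}.tar.gz" for i in range(50)]
results = download_all(urls, fake_fetch)
```

In a real script, `fake_fetch` would be replaced by an actual download function; the pool size alone keeps the number of open connections at or below the server's limit.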

  • blackjack-basic-strategy

    A computer vision Blackjack basic strategy app powered by Roboflow.

    For computer vision, there are 100k+ open source classification, object detection, and segmentation datasets available on Roboflow Universe: https://universe.roboflow.com
