Ingest, store, & analyze all types of time series data in a fully-managed, purpose-built database. Keep data forever with low-cost storage and superior data compression. Learn more →
Activeloop Hub Alternatives
Similar projects and alternatives to Activeloop Hub
-
dvc
🦉 Data Version Control | Git for Data & Models | ML Experiments Management
-
petastorm
Petastorm library enables single machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as Tensorflow, Pytorch, and PySpark and can be used from pure Python code.
-
InfluxDB
Access the most powerful time series database as a service. Ingest, store, & analyze all types of time series data in a fully-managed, purpose-built database. Keep data forever with low-cost storage and superior data compression.
-
CKAN
CKAN is an open-source DMS (data management system) for powering data hubs and data portals. CKAN makes it easy to publish, share and use data. It powers catalog.data.gov, open.canada.ca/data, data.humdata.org among many other sites.
-
-
datasets
TFDS is a collection of datasets ready to use with TensorFlow, Jax, ... (by tensorflow)
-
caer
High-performance Vision library in Python. Scale your research, not boilerplate.
-
-
Sonar
Write Clean Python Code. Always.. Sonar helps you commit clean code every time. With over 225 unique rules to find Python bugs, code smells & vulnerabilities, Sonar finds the issues while you focus on the work.
-
typedb-ml
TypeDB-ML is the Machine Learning integrations library for TypeDB
-
transformers
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
-
pytorch-image-models
PyTorch image models, scripts, pretrained weights -- ResNet, ResNeXT, EfficientNet, EfficientNetV2, NFNet, Vision Transformer, MixNet, MobileNet-V3/V2, RegNet, DPN, CSPNet, and more
-
-
dwc
Darwin Core standard for sharing of information about biological diversity.
-
cryptoCMD
Cryptocurrency historical price data library in Python. Data from https://coinmarketcap.com.
-
-
-
sqlmodel
SQL databases in Python, designed for simplicity, compatibility, and robustness.
-
-
Transformers-Tutorials
This repository contains demos I made with the Transformers library by HuggingFace.
-
django-ninja
💨 Fast, Async-ready, Openapi, type hints based framework for building APIs
-
evidently
Evaluate and monitor ML models from validation to production. Join our Discord: https://discord.com/invite/xZjKRaNp8b
-
CodiumAI
TestGPT | Generating meaningful tests for busy devs. Get non-trivial tests (and trivial, too!) suggested right inside your IDE, so you can code smart, create more value, and stay confident when you push.
Activeloop Hub reviews and mentions
-
[D] NLP has HuggingFace, what does Computer Vision have?
u/Remote_Cancel_7977 we just launched 100+ computer vision datasets via Activeloop Hub yesterday on r/ML (#1 post for the day!). Note: we do not intend to compete with HuggingFace (we're building the database for AI). Accessing computer vision datasets via Hub is much faster than via HuggingFace though, according to some third-party benchmarks. :)
-
[N] [P] Access 100+ image, video & audio datasets in seconds with one line of code & stream them while training ML models with Activeloop Hub (more at docs.activeloop.ai, description & links in the comments below)
u/gopietz good question. htype="class_label" will work, but querying doesn't support multi-dimensional labels yet. Would you mind opening an issue requesting that feature?
We've recently added a Huggingface integration that allows ingestion of HuggingFace datasets.
-
[P] Database for AI: Visualize, version-control & explore image, video and audio datasets
Hub, our open-source package, lets you stream datasets while training to PyTorch/TensorFlow. Check out how we achieved 95% GPU utilization while training on ImageNet at 50% less cost. We're building the Database for AI, with everything it should contain. If there's an adjacent feature that would make it more useful for your workflow, do let us know!
Our early users love the tool and I hope you'll love it too. We have many more features other than visualization on the roadmap (the current feature list includes querying, version control UI, and integrates through our open-source package Hub (dataset format for AI) with TensorFlow, PyTorch, Sagemaker, other tools on the roadmap.
Please take a look at our open-source dataset format https://github.com/activeloopai/hub and a tutorial on htypes https://docs.activeloop.ai/how-hub-works/visualization-and-htype
The platform allows to: - Inspect the data with all its bounding boxes, masks, etc, and have important stats such as distribution of the labels (adding more stuff in the future to fight bias and improve data quality). - Query datasets to create new, highly specific ones - Version control datasets (while visualizing the changes). I'm confident that if you've ever worked on iteratively improving your models, dataset versioning is probably something you've done. - Stream computer vision datasets while training in PyTorch/Tensorflow via Hub, our open source package (we might add an even more straightforward way to the UI). - For larger organizations access management is important, and we do take care of that.
The visualization interfaces with our open-source dataset format for AI, enabling workflows such as querying/filtering to create datasets/inspect subsamples, tracking changes to the data with data version control visualization (e.g. cross-referencing if the transformations applied had intended effects), and will have integrations with other tools (e.g. experiment tracking, labelling) very soon.
Yes, we're not entirely relevant for your use case, especially if the data is not that big/complex, and benefits that you'd get from switching to Hub format are not as pronounced in case of text as they are in case of computer vision datasets (actually, we still have a couple of diehard NLP community members, but they have ridiculously big text datasets). I presume your university system doesn't use unstructured data like videos/images/audio, either, so our product wouldn't be very helpful in that regard. I do wish you tons of luck and patience though (>10ˆ6?! good Lord...)
-
The hand-picked selection of the best Python libraries released in 2021
Hub.
-
A note from our sponsor - InfluxDB
www.influxdata.com | 31 May 2023
Stats
activeloopai/Hub is an open source project licensed under Mozilla Public License 2.0 which is an OSI approved license.
The primary programming language of Activeloop Hub is Python.