Snorkel Alternatives
Similar projects and alternatives to snorkel
-
cleanlab
The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
-
OpenRefine
OpenRefine is a free, open source power tool for working with messy data and improving it
-
ydata-profiling
1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.
-
refinery
The data scientist's open-source choice to scale, assess and maintain natural language data. Treat training data like a software artifact.
-
awesome-production-machine-learning
A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning
-
argilla
Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets
-
BotLibre
An open platform for artificial intelligence, chat bots, virtual agents, social media automation, and live chat automation.
-
grape
GRAPE is a Rust/Python Graph Representation Learning library for Predictions and Evaluations (by AnacletoLAB)
snorkel discussion
snorkel reviews and mentions
-
Harnessing Weak Supervision to Isolate Sign Language in Crowded News Videos
Hello everyone, we are trying to make a large dataset for Sign Language translation, inspired by BSL-1K [1]. As part of cleaning our collected videos, we use a nice technique for aggregating heuristic labels [2]. We thought it was interesting enough to share with people on here.
[1] https://www.robots.ox.ac.uk/~vgg/research/bsl1k/
[2] https://github.com/snorkel-team/snorkel
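To give a rough idea of what "aggregating heuristic labels" means, here is a minimal sketch using a plain majority vote. It is not the post authors' pipeline and it does not use Snorkel's API: the heuristics, the clip fields (`hands_visible`, `has_inset_box`, `seconds`) and the label values are all hypothetical, made up for illustration. Snorkel's label model goes further by estimating how accurate each heuristic is instead of counting every vote equally.

```python
from collections import Counter

ABSTAIN = -1  # a heuristic returns this when it has no opinion


def majority_vote(votes, abstain=ABSTAIN):
    """Combine one example's heuristic votes into a single label.

    Abstentions are ignored; a tie or an all-abstain row yields `abstain`.
    """
    counts = Counter(v for v in votes if v != abstain)
    if not counts:
        return abstain
    ranked = counts.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return abstain  # top two labels are tied
    return ranked[0][0]


# Hypothetical heuristics for a news clip: 1 = signer present, 0 = no signer.
def lf_hands_visible(clip):
    return 1 if clip["hands_visible"] else ABSTAIN


def lf_inset_box(clip):
    return 1 if clip["has_inset_box"] else 0


def lf_too_short(clip):
    return 0 if clip["seconds"] < 2 else ABSTAIN


HEURISTICS = [lf_hands_visible, lf_inset_box, lf_too_short]

# Made-up clips, only to exercise the three possible outcomes.
clips = [
    {"hands_visible": True, "has_inset_box": True, "seconds": 30},
    {"hands_visible": False, "has_inset_box": False, "seconds": 1},
    {"hands_visible": True, "has_inset_box": False, "seconds": 10},
]

labels = [majority_vote([lf(clip) for lf in HEURISTICS]) for clip in clips]
print(labels)
```

Clips that end up with the abstain value would typically be left unlabeled or sent for manual review.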
-
[P] We are building a curated list of open source tooling for data-centric AI workflows, looking for contributions.
The paid product came out of an open source tool: https://github.com/snorkel-team/snorkel
- [Discussion] - "data sourcing will be more important than model building in the era of foundational model fine-tuning"
-
Can't use load_data from utils
Actually, I referenced it in my issue as well. There seem to be different utils.py files in different folders under the snorkel-tutorials repo, but the utils module we get after importing snorkel is a different [file](https://github.com/snorkel-team/snorkel/blob/master/snorkel/utils/core.py), i.e. the utils file in the main snorkel repo is not the same one.
- [D] A hand-picked selection of the best Python ML Libraries of 2021
-
[Discussion] Methods for enhancing high-quality dataset A with low-quality dataset
Snorkel (https://github.com/snorkel-team/snorkel) might provide you exactly what you are looking for. From the docs:
Stats
snorkel-team/snorkel is an open source project licensed under the Apache License 2.0, which is an OSI-approved license.
The primary programming language of snorkel is Python.
Review 8/10