dataset-viewer
tasksource
Metric | dataset-viewer | tasksource
---|---|---
Mentions | 6 | 3
Stars | 622 | 125
Growth | 4.0% | -
Activity | 9.8 | 6.6
Latest commit | 1 day ago | 14 days ago
Language | Python | Python
License | Apache License 2.0 | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
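The page does not give the exact formula behind the activity number, so the following is only a rough sketch of how such a score could work. It assumes an exponential decay with a hypothetical 30-day half-life for weighting commits, and a simple percentile rank scaled to 0-10. The function names and the `half_life_days` parameter are illustrative and are not taken from the site.

```python
from bisect import bisect_left


def recency_weighted_commits(commit_ages_days, half_life_days=30.0):
    """Sum per-commit weights; a commit loses half its weight every
    `half_life_days` (an assumed value, not the site's real setting)."""
    return sum(0.5 ** (age / half_life_days) for age in commit_ages_days)


def activity_score(raw, all_raw_scores):
    """Map a raw score to a 0-10 scale using the share of tracked
    projects whose raw score is strictly lower."""
    ranked = sorted(all_raw_scores)
    below = bisect_left(ranked, raw)  # count of projects scoring lower
    return round(10.0 * below / len(ranked), 1)


# Hypothetical data: commit ages in days for two projects.
recent_project = recency_weighted_commits([1, 2, 3, 5])
stale_project = recency_weighted_commits([90, 120, 200, 400])
print(recent_project > stale_project)  # recent commits count for more
```

Under this sketch, a project that outranks 90% of the tracked projects gets a score of 9.0, which matches the "top 10%" reading described above.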
- [D] What are notable advances in NLU?
  Technically, BERT (bert-base) is not sota anymore. deberta+MTT-DNN (multi-task learning on many datasets) https://ibm.github.io/model-recycling/ is arguably sota.
- [Discussion] ChatGPT and language understanding benchmarks
  LAMA, truthfulQA, MMLU, and many others
- [R] tasksource: Structured Dataset Preprocessing Annotations for Frictionless Extreme Multi-Task Learning and Evaluation (480 tasks+ sota encoder)
  Found relevant code at https://github.com/sileod/tasksource + all code implementations here
What are some alternatives?
aphrodite-engine - PygmalionAI's large-scale inference engine
Neural-Scam-Artist - Web Scraping, Document Deduplication & GPT-2 Fine-tuning with a newly created scam dataset.
datasets - TFDS is a collection of datasets ready to use with TensorFlow, Jax, ...
beir - A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.
diffgram - The AI Datastore for Schemas, BLOBs, and Predictions. Use with your apps or integrate built-in Human Supervision, Data Workflow, and UI Catalog to get the most value out of your AI Data.
datasets - 🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
squirrel-core - A Python library that enables ML teams to share, load, and transform data in a collaborative, flexible, and efficient way 🌰