tasksource
zeroshot_topics
Metric | tasksource | zeroshot_topics
---|---|---
Mentions | 3 | 3
Stars | 129 | 60
Growth | - | -
Activity | 6.6 | 0.0
Last commit | 21 days ago | 12 months ago
Language | Python | Python
License | Apache License 2.0 | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
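The exact formula behind the activity number is not given here, so the following is only a minimal sketch of one way such a score could be computed. It assumes an exponential recency weight with a 30-day half-life and a percentile rank scaled to 0-10; the function names `weighted_activity` and `activity_score` and the `half_life_days` parameter are hypothetical and chosen for illustration.

```python
from bisect import bisect_left


def weighted_activity(commit_ages_days, half_life_days=30.0):
    """Sum per-commit weights; a commit's weight halves every half_life_days.

    ASSUMPTION: exponential decay is an illustrative choice, not the
    tracker's documented formula.
    """
    return sum(0.5 ** (age / half_life_days) for age in commit_ages_days)


def activity_score(raw, all_raw):
    """Map a raw activity value to a 0-10 scale by percentile rank.

    The score is 10 times the fraction of tracked projects whose raw
    value is strictly lower than `raw`.
    """
    ranked = sorted(all_raw)
    below = bisect_left(ranked, raw)  # count of projects strictly below
    return round(10.0 * below / len(ranked), 1)


# Hypothetical data: commit ages in days for two projects.
recent_project = weighted_activity([1, 3, 7, 21])
stale_project = weighted_activity([365, 400])

# Rank the most active of ten hypothetical tracked projects.
tracked = list(range(10))
top_score = activity_score(9, tracked)
```

Under these assumptions, a project that outranks 90% of the tracked projects scores 9.0, which matches the "top 10%" reading described above, and a project with only year-old commits gets a raw value close to zero.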
tasksource
- [D] What are notable advances in NLU?
  Technically, BERT (bert-base) is no longer SOTA. DeBERTa + MT-DNN (multi-task learning on many datasets), see https://ibm.github.io/model-recycling/, is arguably SOTA.
- [Discussion] ChatGPT and language understanding benchmarks
  LAMA, TruthfulQA, MMLU, and many others
- [R] tasksource: Structured Dataset Preprocessing Annotations for Frictionless Extreme Multi-Task Learning and Evaluation (480 tasks+ sota encoder)
  Found relevant code at https://github.com/sileod/tasksource
zeroshot_topics
What are some alternatives?
Neural-Scam-Artist - Web Scraping, Document Deduplication & GPT-2 Fine-tuning with a newly created scam dataset.
BERTopic - Leveraging BERT and c-TF-IDF to create easily interpretable topics.
beir - A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.
TabFormer - Code & Data for "Tabular Transformers for Modeling Multivariate Time Series" (ICASSP, 2021)
dataset-viewer - Lightweight web API for visualizing and exploring any dataset - computer vision, speech, text, and tabular - stored on the Hugging Face Hub
kogpt - KakaoBrain KoGPT (Korean Generative Pre-trained Transformer)
cleanlab - The standard package for machine learning with noisy labels and finding mislabeled data. Works with most datasets and models. [Moved to: https://github.com/cleanlab/cleanlab]
frame-semantic-transformer - Frame Semantic Parser based on T5 and FrameNet
awesome-open-data-annotation - Open Source Data Annotation & Labeling Tools
cappr - Completion After Prompt Probability. Make your LLM make a choice
dgl-ke - High performance, easy-to-use, and scalable package for learning large-scale knowledge graph embeddings.
prosodic - Prosodic: a metrical-phonological parser, written in Python. For English and Finnish, with flexible language support.