Neural-Scam-Artist vs tasksource
Metric | Neural-Scam-Artist | tasksource
---|---|---
Mentions | 2 | 3
Stars | 23 | 122
Growth | - | -
Activity | 0.0 | 7.8
Last commit | over 2 years ago | 3 months ago
Language | Python | Python
License | MIT License | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
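One way to picture such an index is sketched below. This is only an illustration of the two ideas described above (recent commits count more, and the final number reflects rank among tracked projects), not the site's actual formula, which is not given here. The function names, the exponential decay, and the 90-day half-life are all assumptions made for the example.

```python
from bisect import bisect_left
from datetime import date, timedelta


def weighted_commit_score(commit_dates, today, half_life_days=90.0):
    """Sum the commits, discounting each one by its age.

    Hypothetical weighting: a commit loses half its weight every
    `half_life_days` days, so recent commits dominate the score.
    """
    return sum(0.5 ** ((today - d).days / half_life_days) for d in commit_dates)


def activity_index(score, all_scores):
    """Map a raw score to a 0-10 scale by percentile rank.

    The index is 10 times the fraction of tracked projects with a
    strictly lower score, so 9.0 means the project is in the top 10%.
    """
    ranked = sorted(all_scores)
    below = bisect_left(ranked, score)  # projects scoring strictly lower
    return round(10.0 * below / len(ranked), 1)


today = date(2024, 1, 1)
# One commit today (weight 1.0) and one 90 days ago (weight 0.5).
score = weighted_commit_score([today, today - timedelta(days=90)], today)

# Ten tracked projects with raw scores 0..9: the best one ranks above 9 of 10.
top_index = activity_index(9, list(range(10)))
```

Under these assumptions, a project with no recent commits drifts toward a raw score of zero, which would be consistent with an activity of 0.0 for a repository last updated over two years ago.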
tasksource
- [D] What are notable advances in NLU?
  Technically, BERT (bert-base) is no longer SOTA. DeBERTa + MT-DNN (multi-task learning on many datasets, https://ibm.github.io/model-recycling/) is arguably SOTA.
- [Discussion] ChatGPT and language understanding benchmarks
  LAMA, TruthfulQA, MMLU, and many others
- [R] tasksource: Structured Dataset Preprocessing Annotations for Frictionless Extreme Multi-Task Learning and Evaluation (480 tasks+ sota encoder)
  Found relevant code at https://github.com/sileod/tasksource
What are some alternatives?
datasketch - MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW
beir - A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.
bertviz - BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)
dataset-viewer - Lightweight web API for visualizing and exploring any dataset - computer vision, speech, text, and tabular - stored on the Hugging Face Hub
LSH - Locality Sensitive Hashing using MinHash in Python/Cython to detect near duplicate text documents
transformers - 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
Transformers4Rec - Transformers4Rec is a flexible and efficient library for sequential and session-based recommendation and works with PyTorch.
Extracting-Training-Data-from-Large-Langauge-Models - A re-implementation of the "Extracting Training Data from Large Language Models" paper by Carlini et al., 2020
intertext - Detect and visualize text reuse