ThoughtSource vs PLOD-AbbreviationDetection

| | ThoughtSource | PLOD-AbbreviationDetection |
|---|---|---|
| Mentions | 1 | 1 |
| Stars | 844 | 9 |
| Growth | 1.5% | - |
| Activity | 8.4 | 0.0 |
| Last commit | 10 months ago | over 1 year ago |
| Language | Jupyter Notebook | Jupyter Notebook |
| License | MIT License | Creative Commons Attribution Share Alike 4.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
Clustering to find abbreviations
Finally, the main problem with unsupervised learning is that you won't be able to reliably measure system performance or improvement. In my view, any time you can spend annotating and collecting data for a (semi-)supervised solution will be well spent. Existing datasets, such as https://github.com/surrey-nlp/PLOD-AbbreviationDetection, can also get you started with model development; once you have a good model on a conventional dataset, you should be able to generalize it to your specific task/dataset.
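If you go that route, a minimal sketch of pulling a PLOD-style dataset and inspecting its token-level labels could look like the following. The Hugging Face dataset id and column names here are assumptions based on the PLOD-AbbreviationDetection project; check the repository for the exact dataset variants and label scheme.

```python
# Minimal sketch: start from an existing annotated abbreviation-detection
# dataset rather than unsupervised clustering, so progress can be measured.
# NOTE: the dataset id and column names are assumptions taken from the
# PLOD-AbbreviationDetection project; verify them against the repo.
from datasets import load_dataset

ds = load_dataset("surrey-nlp/PLOD-filtered")  # assumed Hub id for the PLOD data

print(ds)                    # splits and example counts
print(ds["train"].features)  # confirm the actual column names and label set

# The task is typically framed as token classification (NER-style): each token
# is tagged as part of an abbreviation (short form), a long form, or neither.
example = ds["train"][0]
for token, tag in zip(example["tokens"], example["ner_tags"]):  # assumed columns
    print(f"{token}\t{tag}")
```

From there, the usual path is to fine-tune a token-classification model on these labels before adapting it to your own documents.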
What are some alternatives?
medmcqa - A large-scale (194k), Multiple-Choice Question Answering (MCQA) dataset designed to address real-world medical entrance exam questions.
converse - Conversational text Analysis using various NLP techniques
hate-speech-and-offensive-language - Repository for the paper "Automated Hate Speech Detection and the Problem of Offensive Language", ICWSM 2017
goodreads - code samples for the goodreads datasets
nlp - Repository for all things Natural Language Processing
datasets - 🎁 5,400,000+ Unsplash images made available for research and machine learning
transformers-interpret - Model explainability that works seamlessly with 🤗 transformers. Explain your transformers model in just 2 lines of code.
adaptnlp - An easy-to-use Natural Language Processing library and framework for predicting, training, fine-tuning, and serving up state-of-the-art NLP models.