masakhane-web
adaptnlp
Metric | masakhane-web | adaptnlp
---|---|---
Mentions | 2 | 2
Stars | 35 | 414
Growth | - | 0.0%
Activity | 3.5 | 0.0
Last commit | 10 months ago | over 2 years ago
Language | Jupyter Notebook | Jupyter Notebook
License | MIT License | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
masakhane-web
- NLP Communities for Data Professionals to Join
Masakhane is working to build datasets and tools that facilitate natural language processing in African languages and to pose new research problems that enrich the NLP research landscape. It began as an open-source, continent-wide research effort, distributed online, focused on machine translation for African languages. It aims to build, connect, and grow a community of NLP researchers, spurring and sharing further research to enable language preservation and tool building and to increase its global visibility and relevance.
- THE AMAZING WORKS DONE BY MASAKHANE IN NLP SPACE
Machine translation for African languages: Masakhane has an open-source web platform that offers machine translation exclusively for African languages. Masakhane Web hosts machine translation models already trained by the Masakhane community and lets users contribute new data for retraining and improving those models. If you would like to contribute to this project, train a model in your language, or collaborate with Masakhane, find out how at https://github.com/dsfsi/masakhane-web or reach out to any of the Masakhane Web contributors. The project now covers 52 African languages with benchmarks, which can be seen on the Masakhane project's GitHub page.
adaptnlp
- Tools to use for Semantic-searching Question Answering System
Check out adaptnlp
- Case Sensitivity using HuggingFace & Google's T5 model (base)
Yes, there are capitals in the tokenizer vocabulary of t5-base and t5-small, so both support capitalization. A few days ago I was using t5-small through adaptnlp for extractive summarization and capitalization was working fine (https://github.com/Novetta/adaptnlp). AdaptNLP is basically just a transformers wrapper, so if you can't figure out a solution, you could just dissect their source code.
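If you want to check the capitalization claim yourself, something like the following should work. This is only a rough sketch, not code from adaptnlp: it assumes the `transformers` and `sentencepiece` packages are installed and that the `t5-small` tokenizer can be downloaded from the Hugging Face Hub. The helper names `cased_tokens` and `inspect_t5_vocab` are made up for illustration.

```python
def cased_tokens(vocab):
    """Return the tokens that contain at least one uppercase character."""
    return [tok for tok in vocab if any(ch.isupper() for ch in tok)]


def inspect_t5_vocab(model_name="t5-small"):
    """Load a T5 tokenizer and report how many vocabulary entries are cased.

    Assumption: `transformers` and `sentencepiece` are installed, and the
    tokenizer files are reachable (network access or a local cache).
    """
    from transformers import AutoTokenizer  # imported lazily; optional dependency

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    vocab = tokenizer.get_vocab()  # dict mapping token string -> token id
    cased = cased_tokens(vocab)
    print(f"{len(cased)} of {len(vocab)} tokens contain an uppercase letter")

    # Compare how the same words are split with and without capitals.
    print(tokenizer.tokenize("Hello World"))
    print(tokenizer.tokenize("hello world"))
    return cased
```

Calling `inspect_t5_vocab()` (or `inspect_t5_vocab("t5-base")`) prints the count and the two tokenizations. If the cased and lowercased inputs produce different tokens, the tokenizer is preserving capitalization.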
What are some alternatives?
masakhane-wazobia-dataset - Some Nigerian Parallel Corpora: Yoruba, Igbo, Hausa, Urhobo
Basic-UI-for-GPT-J-6B-with-low-vram - A repository to run gpt-j-6b on low vram machines (4.2 gb minimum vram for 2000 token context, 3.5 gb for 1000 token context). Model loading takes 12gb free ram.
keytotext - Keywords to Sentences
fastai - The fastai deep learning library
gector - Official implementation of the papers "GECToR – Grammatical Error Correction: Tag, Not Rewrite" (BEA-20) and "Text Simplification by Tagging" (BEA-21)
browser-ml-inference - Edge Inference in Browser with Transformer NLP model
Transformers-Tutorials - This repository contains demos I made with the Transformers library by HuggingFace.
ML-Workspace - 🛠 All-in-one web-based IDE specialized for machine learning and data science.
Deep-Learning-Experiments - Videos, notes and experiments to understand deep learning
BLOOM-fine-tuning - Finetune BLOOM
transformers - 🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
newsapi - Web-scraping API that gets news from different websites and serves it as an API