| | OPUS-MT-train | Opus-MT |
|---|---|---|
| Mentions | 1 | 3 |
| Stars | 304 | 530 |
| Growth | 3.6% | 4.9% |
| Activity | 1.7 | 4.8 |
| Last commit | about 2 months ago | 9 days ago |
| Language | Makefile | Python |
| License | MIT License | MIT License |
Stars - the number of stars that a project has on GitHub.
Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
OPUS-MT-train
-
Amazon releases 51-language dataset for language understanding
https://translatelocally.com/ is a nice gui around marian/bergamot. So far not very many bundled pairs, though I would guess any of the models from https://github.com/Helsinki-NLP/Opus-MT-train/tree/master/mo... and https://github.com/Helsinki-NLP/Tatoeba-Challenge/blob/maste... should be usable.
There is also Apertium, a rule-based system. It is very good for some closely related pairs that have had a lot of work put into them (especially translation between Romance languages, e.g. Spanish→Catalan, and Norwegian Bokmål→Nynorsk), and it is the only OK translator for some lesser-resourced languages (e.g. Northern Saami→Norwegian Bokmål), but it is very underdeveloped for anything to/from English. It feels a bit pointless writing rules for English, where there is so much available data; RBMT shines where there's not enough available data, i.e. most of the languages of the world.
Opus-MT
-
“sync, corrected by elderman” issue in ML translation datasets spread across the internet
- mention on GitHub repo of a translation model https://github.com/Helsinki-NLP/Opus-MT/issues/62
I'm curious to see if anyone else has had interesting encounters with this.
-
How worried are you about AI taking over music?
Yes, most models these days, except the exceptionally large ones, can be trained on a laptop. It helps if your laptop has an Nvidia CUDA GPU, but even if it doesn't, you can rent an AWS 4-core/16GB GPU instance for 0.5 cents an hour. Twenty-four hours of training time would be quite a lot for most models, unless you're trying to train an FB-style any-to-any language model; but typically the huge models are not the most interesting ones, and you can get very good results and interesting models with substantially smaller sets of data. Opus-MT models translate only one language to one language, but they're about 300MB per model, the quality rivals FB's models, and they're substantially faster. I don't have as many examples from the music space, as it's still a fairly underexplored area, but Google has released Magenta, a pretrained TensorFlow music model (actually a group of 3-4 models).
- Helsinki-NLP/Opus-MT: Open neural machine translation models and web services
What are some alternatives?
NLP-progress - Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.
OpenNMT-py - Open Source Neural Machine Translation and (Large) Language Models in PyTorch
Tatoeba-Challenge
fastText - Library for fast text representation and classification.
tensor2tensor - Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.
Neural-Machine-Translated-communication-system - The model trains a single large neural network to predict the correct translation of a given sentence.
Face-Recognition_Flutter - A sample Face recognition app using Flutter and Firebase ML Kit
klpt - The Kurdish Language Processing Toolkit
Pytorch - Tensors and Dynamic neural networks in Python with strong GPU acceleration