Tatoeba-Challenge Alternatives
Similar projects and alternatives to Tatoeba-Challenge
-
edenai-apis
Eden AI: simplify the use and deployment of AI technologies by providing a unique API that connects to the best possible AI engines
-
fastseq
An efficient implementation of the popular sequence models for text generation, summarization, and translation tasks. https://arxiv.org/pdf/2106.04718.pdf
Tatoeba-Challenge reviews and mentions
-
OpenAI GPT-3 vs Other Models [Benchmark] - Should AI companies be really worried ?
Automatically translate a text from a language A to a language B. 1/ Dataset: we chose a dataset from the Tatoeba Translation Challenge, published by the Language Technology Research Group at the University of Helsinki. We took 100 examples from each of five Latin-script language pairs (deu-fra, eng-fra, fra-ita, deu-spa, deu-swe), which gives a 500-example test dataset.
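The sampling described in this mention can be sketched roughly as follows. This is a minimal illustration, not the benchmark authors' code: the `build_test_set` helper, the dictionary layout of `corpus`, the fixed seed, and the toy sentences are all assumptions made for the example; only the five language pairs and the 100-per-pair count come from the post.

```python
import random

# Language pairs and per-pair sample size as stated in the post.
PAIRS = ["deu-fra", "eng-fra", "fra-ita", "deu-spa", "deu-swe"]
PER_PAIR = 100


def build_test_set(corpus, pairs=PAIRS, per_pair=PER_PAIR, seed=0):
    """Draw `per_pair` (source, target) examples from each language pair.

    `corpus` is a hypothetical mapping from a pair code such as "deu-fra"
    to a list of (source_sentence, target_sentence) tuples.
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    test_set = []
    for pair in pairs:
        examples = corpus[pair]
        if len(examples) < per_pair:
            raise ValueError(f"{pair}: need {per_pair} examples, got {len(examples)}")
        # Sample without replacement so no sentence pair appears twice.
        for source, target in rng.sample(examples, per_pair):
            test_set.append({"pair": pair, "source": source, "target": target})
    return test_set


# Placeholder data standing in for the real Tatoeba test files (assumption).
toy_corpus = {
    pair: [(f"{pair} source {i}", f"{pair} target {i}") for i in range(250)]
    for pair in PAIRS
}

test_set = build_test_set(toy_corpus)
print(len(test_set))  # 5 pairs x 100 examples = 500
```

Sampling without replacement and with a fixed seed keeps the 500 examples distinct and lets the same test set be rebuilt later when comparing translation engines.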
-
Amazon releases 51-language dataset for language understanding
https://translatelocally.com/ is a nice GUI around Marian/Bergamot. So far it has not very many bundled pairs, though I would guess any of the models from https://github.com/Helsinki-NLP/Opus-MT-train/tree/master/mo... and https://github.com/Helsinki-NLP/Tatoeba-Challenge/blob/maste... should be usable.
There is also Apertium, a rule-based system which is very good for some closely-related pairs that have had a lot of work put into them (especially translation between Romance languages, e.g. Spanish→Catalan, and Norwegian Bokmål→Nynorsk), and the only OK translator for some lesser-resourced languages (e.g. Northern Saami→Norwegian Bokmål), but very underdeveloped for anything to/from English. It feels a bit pointless writing rules for English where there is so much available data; RBMT shines where there's not enough available data, i.e. most of the languages of the world.
-
[P] What we learned by accelerating by 5X Hugging Face generative language models
#1: University of Helsinki language technology professor Jörg Tiedemann has released a dataset with over 500 million translated sentences in 188 languages
#2: The NLP Index: 3,000+ code repos for hackers and researchers. [self-promotion]
#3: A Python library to boost T5 models speed up to 5x & reduce the model size by 3x.
-
Labelling of Text (NLP)
#1: Matching GPT-3's performance with just 0.1% of its parameters
#2: University of Helsinki language technology professor Jörg Tiedemann has released a dataset with over 500 million translated sentences in 188 languages
#3: Trained a Markov Chain on a bunch of r/WSB posts and comments. Only 2-word conditional probabilities but honestly, that's all that's necessary 🚀🚀
- Helsinki professor Jörg Tiedemann – 500M translations in 188 languages
- Thought it could be useful to someone
- University of Helsinki language technology professor Jörg Tiedemann has released a dataset with over 500 million translated sentences in 188 languages
- Translated language database released by Helsinki scientist
- 500 million sentences in 188 languages
Stats
Helsinki-NLP/Tatoeba-Challenge is an open source project licensed under the GNU General Public License v3.0 or later, which is an OSI-approved license.
The primary programming language of Tatoeba-Challenge is Makefile.