machine-translation

Top 23 machine-translation Open-Source Projects

  • NLP-progress

    Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.

  • NeMo

    NeMo: a framework for generative AI

  • Project mention: [P] Making a TTS voice, HK-47 from Kotor using Tortoise (Ideally WaveRNN) | /r/MachineLearning | 2023-07-06

    I haven't tested WaveRNN, but of the open-source options I know, FastPitch is the best. It's also easy to use; here is the tutorial for voice cloning.

  • espnet

    End-to-End Speech Processing Toolkit

  • Project mention: WhisperSpeech – An Open Source text-to-speech system built by inverting Whisper | news.ycombinator.com | 2024-01-17

    You might check out this list from espnet. They list the different corpuses they use to train their models sorted by language and task (ASR, TTS etc):

    https://github.com/espnet/espnet/blob/master/egs2/README.md

  • OpenNMT-py

    Open Source Neural Machine Translation and (Large) Language Models in PyTorch

  • Project mention: Making a custom Google Translate equivalent / web translation filter for my conlang? | /r/conlangs | 2023-04-26

    I already tried this with OpenNMT.

  • manga-image-translator

    Translate manga/image: one-click translation of text in all kinds of images. https://cotrans.touhou.ai/

  • Project mention: [DISC] - The angel who came to pick me up is a Gal (Oneshot by Shiraishi Kouhei) | /r/manga | 2023-09-06

    OCR works pretty well. ocr.space, ocr.best and cotrans.touhou.ai/ are all pretty nice.

  • spark-nlp

    State of the Art Natural Language Processing

  • Project mention: Spark NLP 5.1.0: Introducing state-of-the-art OpenAI Whisper speech-to-text, OpenAI Embeddings and Completion transformers, MPNet text embeddings, ONNX support for E5 text embeddings, new multi-lingual BART Zero-Shot text classification, and much more! | /r/Python | 2023-09-06
  • argos-translate

    Open-source offline translation library written in Python

  • Project mention: Fast and secure translation on your local machine with a GUI | news.ycombinator.com | 2024-04-13

    Interestingly, I think this is actually related to the offline translation features built into Firefox. Both are products of "Project Bergamot", but the Mozilla-maintained version was later merged into the Firefox application:

    https://browser.mt/

    https://blog.mozilla.org/en/mozilla/local-translation-add-on...

    https://hacks.mozilla.org/2022/06/training-efficient-neural-...

    https://github.com/mozilla/firefox-translations

    https://firefox-source-docs.mozilla.org/toolkit/components/t...

    Extra webpage with screenshot and links, impossible to search for normally:

    https://translatelocally.com/downloads/

    Does one thing and does it well.

    Oh— For downloading models, it's much easier to pipe/`xargs` `translateLocally --available-models` into `translateLocally -d` than go through the GUI.

    ---

    Other self-hostable translation tools:

    https://www.apertium.org/index.eng.html

    - Traditional rule-based translation. Seems to work pretty well, but no good desktop frontend.

    https://www.argosopentech.com/

    - Works, but crashy desktop app.

    https://libretranslate.com/

    - API wrapping Argos Translate.

    https://lingva.thedaviddelta.com/

    - Google Translate scraper/privacy frontend.

    https://euroglot.com/

    - Proprietary, subscription trialware.

  • lingvo

    Lingvo

  • CTranslate2

    Fast inference engine for Transformer models

  • Project mention: Distil-Whisper: distilled version of Whisper that is 6 times faster, 49% smaller | news.ycombinator.com | 2023-10-31

    Just a point of clarification - faster-whisper references it but ctranslate2[0] is what's really doing the magic here.

    Ctranslate2 is a sleeper powerhouse project that enables a lot. They should be up front and center and get the credit they deserve.

    [0] - https://github.com/OpenNMT/CTranslate2

  • RL4LMs

    A modular RL library to fine-tune language models to human preferences

  • Project mention: How To Setup a Model With Guardrails? | /r/LocalLLaMA | 2023-05-12

    I think of guardrails as another dimension of human preferences: whether you are training a model to answer questions more gooder or avoid saying horrifying stuff, you are teaching the model a preference. So I think it's a straightforward RLHF problem but from a different perspective.

  • BartyCrouch

    Localization/I18n: Incrementally update/translate your Strings files from .swift, .h, .m(m), .storyboard or .xib files.

  • fastText_multilingual

    Multilingual word vectors in 78 languages

  • Project mention: Ask HN: What's the coolest non standard application of LLMs you've seen? | news.ycombinator.com | 2023-12-23

    (6 years ago)

    Aligning the fastText vectors of 78 languages

    https://github.com/babylonhealth/fastText_multilingual/blob/...

  • nematus

    Open-Source Neural Machine Translation in Tensorflow

  • Opus-MT

    Open neural machine translation models and web services

  • COMET

    A Neural Framework for MT Evaluation (by Unbabel)

  • edenai-apis

    Eden AI: simplify the use and deployment of AI technologies by providing a unique API that connects to the best possible AI engines

  • Project mention: We're Building an Open-Source LLM/AI API Wrapper: Here's Why | news.ycombinator.com | 2023-08-28

    HackerNoon featured our latest article in the "Future of AI" category

    We explain how Eden AI contributes to the AI ecosystem in structuring AI and LLM APIs by creating the most accomplished Open-Source wrapper possible.

    You can support us in reaching 1000 stars on Github here: https://github.com/edenai/edenai-apis

  • dsnote

    Speech Note Linux app. Note taking, reading and translating with offline Speech to Text, Text to Speech and Machine translation.

  • Project mention: Speech Note: offline Linux app for note taking, reading and translating | news.ycombinator.com | 2023-08-30
  • OPUS-MT-train

    Training open neural machine translation models

  • bergamot-translator

    Cross-platform C++ library focusing on optimized machine translation on consumer-grade devices.

  • Project mention: Fast and secure translation on your local machine with a GUI | news.ycombinator.com | 2024-04-13

  • bitextor

    Bitextor generates translation memories from multilingual websites

  • masakhane-mt

    Machine Translation for Africa

  • pantran.nvim

    Use your favorite machine translation engines without having to leave your favorite editor.

  • cybertron

    Cybertron: the home planet of the Transformers in Go (by nlpodyssey)

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2024-04-13.

Index

What are some of the best open-source machine-translation projects? This list will help you:

Rank Project Stars
1 NLP-progress 22,290
2 NeMo 9,951
3 espnet 7,852
4 OpenNMT-py 6,550
5 manga-image-translator 4,127
6 spark-nlp 3,667
7 argos-translate 3,208
8 lingvo 2,781
9 CTranslate2 2,750
10 RL4LMs 2,074
11 BartyCrouch 1,353
12 fastText_multilingual 1,186
13 nematus 796
14 Opus-MT 515
15 COMET 391
16 edenai-apis 355
17 dsnote 321
18 OPUS-MT-train 302
19 bergamot-translator 297
20 bitextor 277
21 masakhane-mt 266
22 pantran.nvim 265
23 cybertron 258