language-identification

Top 12 language-identification Open-Source Projects

  • mlkit

    A collection of sample apps to demonstrate how to use Google's ML Kit APIs on Android and iOS

  • lingua-go

    The most accurate natural language detection library for Go, suitable for short text and mixed-language text

  • lingua-py

    The most accurate natural language detection library for Python, suitable for short text and mixed-language text

  • Project mention: Typos — automatic language recognition and error detection in Word and Excel documents | /r/microsoftoffice | 2023-10-27

    ✅ Recognition of 75 languages

  • lingua-rs

    The most accurate natural language detection library for Rust, suitable for short text and mixed-language text

  • Project mention: I created a program that finds out which anki cards out of 50_000 are in english and deletes them in 2 minutes | /r/rust | 2023-10-23

    Discovery of Lingua: While working on a different project, I discovered the Lingua library.

  • lingua

    The most accurate natural language detection library for Java and the JVM, suitable for long and short text alike

  • simplemma

    Simple multilingual lemmatizer for Python, especially useful for speed and efficiency

  • efficient-language-detector

    Fast and accurate natural language detection. Detector written in PHP. Nito-ELD, ELD.

  • Project mention: My first public Repository: Efficient Language Detector. | /r/PHP | 2023-06-16

    Same with copyright: Having a legal.md or licence.md is sufficient. If you really feel like it you can add ONE line to each file pointing to the readme /** legal info: https://github.com/nitotm/efficient-language-detector/blob/main/readme.md */

  • fastlangid

    fastlangid, the only language identification package that supports Cantonese (zh-yue), Simplified Chinese (zh-hans), and Traditional Chinese (zh-hant)

  • py3langid

    Faster, modernized fork of the language identification tool langid.py

  • efficient-language-detector-js

    Fast and accurate natural language detection. Detector written in JavaScript. Nito-ELD, ELD.

  • Project mention: My first Node Package: Efficient Language Detector | /r/node | 2023-10-06
  • language-detection-cld2

    Natural language detection, Java bindings for CLD2

  • Language_Identifier

    Language Identification classification using XGBoost

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

language-identification related posts

  • Typos — automatic language recognition and error detection in Word and Excel documents

    1 project | /r/microsoftoffice | 27 Oct 2023
  • Lingua 1.2.0 - The most accurate natural language detection library for Go, now with support for detecting multiple languages in mixed-language text

    1 project | /r/golang | 12 Dec 2022
  • Lingua 1.1.0 - The most accurate natural language detection library for Go, suitable for long and short text alike

    1 project | /r/golang | 21 Nov 2022
  • Hacker News top posts: Feb 12, 2022

    5 projects | /r/hackerdigest | 12 Feb 2022
  • The most accurate natural language detection library for Go, suitable for long and short text alike

    1 project | /r/golang | 12 Feb 2022
  • Lingua-Go, the most accurate language detection for Go

    7 projects | news.ycombinator.com | 11 Feb 2022
  • Language Identification using XGBoost. Code for training and application of a language identification model. Trained on the WiLI-2018 database, the classifier achieves an accuracy of 85.97% on the WiLI test dataset for 235 languages.

    1 project | /r/LanguageTechnology | 17 Mar 2021

Index

What are some of the best open-source language-identification projects? This list will help you:

 #  Project                          Stars
 1  mlkit                            3,344
 2  lingua-go                        1,103
 3  lingua-py                          925
 4  lingua-rs                          824
 5  lingua                             660
 6  simplemma                          125
 7  efficient-language-detector         38
 8  fastlangid                          35
 9  py3langid                           33
10  efficient-language-detector-js      21
11  language-detection-cld2             13
12  Language_Identifier                  1
