Python Linguistics

Open-source Python projects categorized as Linguistics

Top 11 Python Linguistics Projects

  • rime-cantonese

    Rime Cantonese input schema | 粵語拼音輸入方案

  • Project mention: How to type Jyutcitzi? 【RIME keyboard installation manual】? | /r/CantoneseScriptReform | 2023-12-07

    Please follow instructions at https://github.com/rime/rime-cantonese/wiki and https://github.com/rime/rime-cantonese/wiki/新手安裝教程

    In a nutshell, download and install using the following files:

    Mac: mac-2021.05.16-installer.pkg
    Windows: windows-sfx-2021.05.16-installer.exe
    Linux: Download and run ibus-install.sh

    Please check to ensure that RIME Cantonese is properly installed before proceeding to Step 3.

  • wikipron

    Massively multilingual pronunciation mining

  • ambuda

    Main application code for Ambuda, a breakthrough Sanskrit library (ambuda.org)

  • Project mention: The Theorist Who Sees Math in Art, Music and Writing | news.ycombinator.com | 2024-03-04

    >"Thousands of years ago in India, poets were trying to think about the possible meters. In Sanskrit poetry, you have long and short syllables. Long is twice as long as short. If you want to work out how many there are that take a length of time of three, you can have short, short, short, or long, short, or short, long. There are three ways to make three. There are five ways to make a length-four phrase. And there are eight ways to make a length-five phrase. This sequence you’re getting is one where every term is the sum of the previous two. You exactly reproduce what we nowadays call the Fibonacci sequence. But this was centuries before Fibonacci."

    Related:

    Ambuda: "Building the world's largest Sanskrit library":

    https://ambuda.org/

  • langstats

    A visual color bar of the programming languages in your directory, with percentages and labels

  • zeroshot_topics

    Topic Inference with Zeroshot models

  • iso639

    ISO 639 language codes (by jacksonllee)

  • google-books-ngram-frequency

    Word/n-gram frequency lists for the Google Books Ngram Corpus (v3, all languages) with Python code

  • syn

    🌾 Get synonyms and antonyms of words from Thesaurus.com and other sources in your terminal, with rich output. (by agmmnn)

  • grzegorz

    A command-line phonetics tool for finding minimal pairs

  • top-open-subtitles-sentences

    Most common sentences and words for all languages in the OpenSubtitles2018 corpus with Python code

  • Project mention: A colloquial (عامیانه) frequency list! Our prayers have been answered. | /r/farsi | 2023-08-03
  • loquax

    NLP framework for phonology

  • Project mention: Seeking your insights on "Loquax": A tool for phonological analysis | /r/latin | 2023-05-30

    Lovely - thanks so much for the feedback u/christmas_fan1 - it means a lot. I've created an issue with it linking back to your original comment: https://github.com/mattlianje/loquax/issues/11
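The meter-counting argument quoted in the ambuda mention above (short syllables last one beat, long syllables last two) can be checked with a small sketch. The function names `count_meters` and `list_meters` are hypothetical and chosen for illustration only; they are not part of Ambuda or any project on this list.

```python
def count_meters(length: int) -> int:
    """Count the ways to fill `length` beats with short (1) and long (2) syllables."""
    if length < 0:
        raise ValueError("length must be non-negative")
    prev, curr = 1, 1  # counts for lengths 0 and 1
    for _ in range(length):
        # Each step applies: ways(n) = ways(n - 1) + ways(n - 2)
        prev, curr = curr, prev + curr
    return prev


def list_meters(length: int) -> list[list[str]]:
    """Enumerate every pattern explicitly (only practical for small lengths)."""
    if length == 0:
        return [[]]
    patterns = []
    for name, beats in (("short", 1), ("long", 2)):
        if beats <= length:
            # Put this syllable first, then fill the remaining beats.
            patterns.extend([name] + rest for rest in list_meters(length - beats))
    return patterns


if __name__ == "__main__":
    print([count_meters(n) for n in range(1, 6)])  # [1, 2, 3, 5, 8]
    for pattern in list_meters(3):
        print(", ".join(pattern))
```

For a length of three this lists short, short, short; short, long; and long, short, matching the three patterns named in the quote. The counts for lengths four and five come out as five and eight.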

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source Linguistics projects in Python? This list will help you:

#   Project                        Stars
1   rime-cantonese                 492
2   wikipron                       289
3   ambuda                         79
4   langstats                      61
5   zeroshot_topics                60
6   iso639                         27
7   google-books-ngram-frequency   26
8   syn                            26
9   grzegorz                       12
10  top-open-subtitles-sentences   12
11  loquax                         2
