Python Text processing

Open-source Python projects categorized as Text processing

Top 23 Python Text processing Projects

  • fuzzywuzzy

    Fuzzy String Matching in Python

    Latest mention: Comparing Strings Is Easy With FuzzyWuzzy | dev.to | 2021-01-13

    The code implemented by each of the functions described above, as well as other useful FuzzyWuzzy functions, can be found here.

  • pydantic

    Data parsing and validation using Python type hints

  • diff-match-patch

    Diff Match Patch is a high-performance library in multiple languages that manipulates plain text.

    Latest mention: Get Diff and Patch Html | dev.to | 2021-01-24

    Diff.Match.Patch is based on the Google library.

  • python-pinyin

    Convert Chinese characters to pinyin (pypinyin)

  • python-ftfy

    Fixes mojibake and other glitches in Unicode text, after the fact.

  • python-phonenumbers

    Python port of Google's libphonenumber

  • sqlparse

    A non-validating SQL parser module for Python

  • lark

    Lark is a parsing toolkit for Python, built with a focus on ergonomics, performance and modularity.

    Latest mention: JSON parser | reddit.com/r/Python | 2021-01-14

    Writing parsers by hand is fun, but it's much easier to use a parsing library such as Lark: https://github.com/lark-parser/lark/blob/master/examples/json_parser.py

  • textdistance

    Compute distance between sequences. 30+ algorithms, pure python implementation, common interface, optional external libs usage.

  • ply

    Python Lex-Yacc

    Latest mention: Good Resources for creating a programming language | dev.to | 2021-01-02

    dabeaz / ply

  • chardet

    Python character encoding detector

  • shortuuid

    A generator library for concise, unambiguous and URL-safe UUIDs.

    Latest mention: GUIDs Are Not the Only Answer | news.ycombinator.com | 2021-01-05

    https://github.com/skorokithakis/shortuuid#usage

  • jellyfish

    🎐 a python library for doing approximate and phonetic matching of strings.

  • python-user-agents

    A Python library that provides an easy way to identify devices like mobile phones, tablets and their capabilities by parsing (browser) user agent strings.

  • pyparsing

    Python library for creating PEG parsers

  • pyparsing

    Python library for creating PEG parsers

    Latest mention: Perform mathematical operations based on a string from user - Best way? Any existing library? | reddit.com/r/learnpython | 2021-01-01

  • python-slugify

    Returns unicode slugs

    Latest mention: Simple Cli Tool To View Trending Repositories And | reddit.com/r/Python | 2020-12-28

  • pyfiglet

    An implementation of figlet written in Python

  • xpinyin

    Translate Chinese hanzi to pinyin (拼音) with Python

  • construct

    Construct: Declarative data structures for python that allow symmetric parsing and building

  • awesome-slugify

    Python flexible slugify function

  • python-nameparser

    A simple Python module for parsing human names into their individual components

  • unicode-slugify

    A slugifier that works in unicode
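
Several projects on this list (fuzzywuzzy, textdistance, jellyfish) deal with approximate string matching. As a rough illustration of the idea only, the sketch below scores string similarity with the standard library's difflib instead of any of the listed packages. The helper names `similarity` and `best_match` are hypothetical and are not part of any library above.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> int:
    """Return a 0-100 similarity score for two strings (case-insensitive)."""
    # SequenceMatcher.ratio() returns a float in [0.0, 1.0].
    ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
    return round(ratio * 100)


def best_match(query: str, choices: list[str]) -> str:
    """Return the choice with the highest similarity score to the query."""
    return max(choices, key=lambda choice: similarity(query, choice))


if __name__ == "__main__":
    names = ["fuzzywuzzy", "pydantic", "lark"]
    # A misspelled query should still resolve to the closest name.
    print(best_match("fuzzy wuzy", names))
```

The dedicated libraries add more scoring strategies (token-based, phonetic, edit-distance variants) and faster implementations, so this is a starting point for understanding rather than a replacement.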

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2021-01-24.

Index

What are some of the best open-source Text processing projects in Python? This list will help you:

Rank Project Stars
1 fuzzywuzzy 7,779
2 pydantic 5,128
3 diff-match-patch 3,967
4 python-pinyin 3,074
5 python-ftfy 2,877
6 python-phonenumbers 2,607
7 sqlparse 2,305
8 lark 2,196
9 textdistance 1,866
10 ply 1,829
11 chardet 1,424
12 shortuuid 1,404
13 jellyfish 1,386
14 python-user-agents 1,113
15 pyparsing 995
16 pyparsing 990
17 python-slugify 968
18 pyfiglet 755
19 xpinyin 702
20 construct 625
21 awesome-slugify 454
22 python-nameparser 452
23 unicode-slugify 293