ftfy vs TextDistance
| | ftfy | TextDistance |
|---|---|---|
| Mentions | 2 | 6 |
| Stars | 3,724 | 3,308 |
| Growth | 1.0% | 0.7% |
| Activity | 5.5 | 6.1 |
| Latest commit | 2 days ago | 29 days ago |
| Language | Python | Python |
| License | GNU General Public License v3.0 or later | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
ftfy
- You can't just assume UTF-8
  If you’re actually in a position where you need to guess the encoding, something like “ftfy” <https://github.com/rspeer/python-ftfy> (webapp: <https://ftfy.vercel.app/>) is a perfectly reasonable choice. But, you should always do your absolute utmost not to be put in a situation where guessing is your only choice.
- 7 Useful Python Libraries You Should Use in Your Next Project
  ftfy
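To make the encoding-guessing point above concrete, here is a minimal standard-library sketch of the kind of damage ftfy is meant to undo: text written as UTF-8 but read back as Latin-1. The helper name `repair_mojibake` and its default encodings are assumptions for illustration and are not part of ftfy; ftfy's own `fix_text` function applies heuristics to decide when and how to repair, where this sketch only reverses one known mistake.

```python
def repair_mojibake(text: str, wrong: str = "latin-1", right: str = "utf-8") -> str:
    """Undo one round of 'decoded with the wrong codec' damage.

    Hypothetical helper: it assumes the text was originally encoded as
    `right` and mistakenly decoded as `wrong`. If reversing that step
    fails, the input is returned unchanged.
    """
    try:
        return text.encode(wrong).decode(right)
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The text was not damaged in the way we assumed; leave it alone.
        return text


original = "café"
# Simulate the bug: UTF-8 bytes read back as Latin-1.
mojibake = original.encode("utf-8").decode("latin-1")
repaired = repair_mojibake(mojibake)
```

The fallback matters: correctly decoded text such as "café" does not survive the reverse round trip, so the helper returns it untouched. It only works when the wrong codec is known in advance, which is exactly the situation the quoted comment recommends arranging for.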
TextDistance
- textdistance: Compute distance between sequences. 30+ algorithms, pure python implementation, common interface, optional external libs usage.
- Near duplicate image detection
  Now let's compare the hash to all the image hashes using Levenshtein distance. We'll use the textdistance library for that.
- life4/textdistance: Compute distance between sequences. 30+ algorithms, pure python implementation, common interface, optional external libs usage.
- Textdistance: Compute distance between sequences with 30 algorithms
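The hash comparison described in the near-duplicate image mention can be sketched as follows. This is a pure-Python Levenshtein distance written for illustration, not the textdistance implementation; with the library installed, `textdistance.levenshtein(a, b)` should give the same number. The function `near_duplicates`, the sample hashes and the `max_distance` threshold are all assumptions.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn `a` into `b`."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def near_duplicates(target_hash, known_hashes, max_distance=4):
    """Return (name, distance) pairs within `max_distance`, closest first.

    `known_hashes` maps an image name to its hash string; the threshold
    is a hypothetical default and would need tuning for a real hash.
    """
    scored = [(name, levenshtein(target_hash, h)) for name, h in known_hashes.items()]
    return sorted((m for m in scored if m[1] <= max_distance), key=lambda m: m[1])


# Hypothetical hashes, for illustration only.
hashes = {"a.png": "ffd8e0c0", "b.png": "ffd8e0c1", "c.png": "00112233"}
matches = near_duplicates("ffd8e0c0", hashes, max_distance=2)
```

For fixed-length perceptual hashes, Hamming distance is a common alternative; Levenshtein is used here because it is the measure the mention names.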
What are some alternatives?
fuzzywuzzy - Fuzzy String Matching in Python
jellyfish - 🪼 a python library for doing approximate and phonetic matching of strings.
chardet - Python character encoding detector
xpinyin - Translate Chinese hanzi to pinyin (拼音) by Python, 汉字转拼音
Levenshtein - The Levenshtein Python C extension module contains functions for fast computation of Levenshtein distance and string similarity
pyfiglet - An implementation of figlet written in Python
pydantic - Data validation using Python type hints
Charset Normalizer - Truly universal encoding detector in pure Python
Python Left-Right Parser - Python Parser
pangu.py - Paranoid text spacing in Python
ceja - PySpark phonetic and string matching algorithms