fuzzywuzzy
pangu.py
Our great sponsors
fuzzywuzzy | pangu.py | |
---|---|---|
20 | 0 | |
9,067 | 234 | |
0.2% | - | |
0.0 | 1.9 | |
about 1 year ago | 12 months ago | |
Python | Python | |
GNU General Public License v2.0 only | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
fuzzywuzzy
-
Thanks to this sub, we now have an Anki deck for Persona 5 Royal. Spreadsheet with Jp and Eng side by side too.
Convert the original lines to full furigana and do a fuzzy match. (For reference, the original line is 貴方がこれまでに得てきた力、存分に発揮してくださいね。) You can do a regional search using the initial scene data (E60) first, and if the confidence is low, go for a slower full search.
-
Fuzzy search
It's now known as "thefuzz", see https://github.com/seatgeek/fuzzywuzzy
-
import fuzzywuzzy
fuzzywuzzy is actually just called the thefuzz now.
-
The Levenshtein Distance in Production
I used fuzzywuzzy [1], a python-based fuzzy string matching calculator that is based on Levenshtein's edit-distance for a parking enforcement product I built.
The product used an iOS client to capture license plates. The app would capture a single plate many times, de-duplicate using an edit-distance threshold matching plates up to a time period lookback. Then among those plates, send the one with the highest likely correct read up to the server.
The web app would look to see if that plate had been seen before or a plate similar had been seen before. If so, it would join a "plate group" which let you see the history of vehicle sightings.
A set of rules could be created to show vehicle in violation of a posted parking time limit. It worked when a person on patrol walked the lot twice and the app saw the "same" vehicle (thanks to edit distance)
I had Cummings Properties in New England and National Cathedral in DC as customers, but sold the product last year.
I wrote some about Vladimir Losifovich Levenshtein in a final blog entry high level blog entry about edit distance. There's a 2002 photo of Levenshtein at a conference if you're curious. [2]
-
[D] String matching
seatgeek/fuzzywuzzy: Fuzzy String Matching in Python (github.com)
pangu.py
We haven't tracked posts mentioning pangu.py yet.
Tracking mentions began in Dec 2020.
What are some alternatives?
jellyfish - 🪼 a python library for doing approximate and phonetic matching of strings.
thefuzz - Fuzzy String Matching in Python
Levenshtein - The Levenshtein Python C extension module contains functions for fast computation of Levenshtein distance and string similarity
ftfy - Fixes mojibake and other glitches in Unicode text, after the fact.
TextDistance - 📐 Compute distance between sequences. 30+ algorithms, pure python implementation, common interface, optional external libs usage.
RapidFuzz - Rapid fuzzy string matching in Python using various string metrics
chardet - Python character encoding detector
pyfiglet - An implementation of figlet written in Python
汉字拼音转换工具(Python 版) - 汉字转拼音(pypinyin)
gensim - Topic Modelling for Humans
xpinyin - Translate Chinese hanzi to pinyin (拼音) by Python, 汉字转拼音