StringDistances.jl
mudderjs
Metric | StringDistances.jl | mudderjs
---|---|---
Mentions | 2 | 1
Stars | 135 | 111
Growth | - | -
Activity | 2.3 | 0.0
Last commit | 27 days ago | over 1 year ago
Language | Julia | JavaScript
License | GNU General Public License v3.0 or later | The Unlicense
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
StringDistances.jl
-
How to do a fuzzy join of two dataframes in Julia?
You'll need:
* StringDistances.jl to calculate the distance between the strings.
* FlexiJoins.jl to join the dataframes based on a predicate.
* A custom function that takes two strings as arguments and returns a boolean indicating whether the strings are "close enough" based on their distance.
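The three ingredients above (a string distance, a predicate, and a join driven by that predicate) can be sketched in a few lines. The following is a minimal Python illustration of the idea, not the StringDistances.jl or FlexiJoins.jl API: the names `levenshtein`, `close_enough`, `fuzzy_join`, the threshold of 2, and the sample rows are all assumptions made for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn `a` into `b`."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def close_enough(a: str, b: str, max_distance: int = 2) -> bool:
    """Hypothetical predicate: case-insensitive edit distance within a threshold."""
    return levenshtein(a.lower(), b.lower()) <= max_distance


def fuzzy_join(left, right, key, predicate=close_enough):
    """Naive nested-loop join: keep every (left, right) pair whose keys match."""
    return [(l, r) for l in left for r in right if predicate(l[key], r[key])]


# Made-up sample rows standing in for two dataframes.
customers = [{"name": "Jon Smith"}, {"name": "Alice Wong"}]
orders = [{"name": "John Smith", "total": 40}, {"name": "Bob Lee", "total": 15}]

matches = fuzzy_join(customers, orders, key="name")
```

The nested loop compares every pair of rows, so it is only reasonable for small inputs; a dedicated join library would be expected to do better on large tables.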
-
Getting the difference of two strings
If you need to know exactly what the diff is, you might want to use something like github.com/google/diff-match-patch. Otherwise, a simple Levenshtein distance would suffice. StringDistances.jl seems to have a whole bunch of string distances implemented. Hope this helps!
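To illustrate the distinction between "what exactly changed" and "how far apart are the strings", here is a small sketch of the first case. It uses Python's standard `difflib` module as a stand-in, not diff-match-patch itself, and the helper name `diff_ops` is an assumption for this example.

```python
import difflib


def diff_ops(a: str, b: str):
    """Return the edit operations that turn `a` into `b` as
    (tag, old_text, new_text) tuples. Tags come from difflib:
    'equal', 'replace', 'delete', or 'insert'."""
    matcher = difflib.SequenceMatcher(None, a, b)
    return [(tag, a[i1:i2], b[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()]


ops = diff_ops("cat", "cut")
# Keep only the parts that actually differ.
changes = [op for op in ops if op[0] != "equal"]
print(changes)  # [('replace', 'a', 'u')]
```

A distance function collapses all of this into a single number, which is enough when only a "how similar" score is needed.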
mudderjs
-
The surprisingly difficult problem of user-defined order in SQL
I have solved this problem for my own purposes (no claims of grand scalability or high efficiency) by storing indices or "ranks" of items as strings (Postgres TEXT) using a library called mudderjs[0] and a thin wrapper around it[1]. Sorted lexicographically (in dictionary order), arbitrary-length strings have arbitrary precision. You can always find a string between any two strings; for instance, between "a" and "b" is "am" and between "a" and "ab" is "aam". You do have to have the entire ordered collection in scope to generate a new rank for an item, but reordering an item only requires updating one row and isn't subject to floating point precision.
[0] https://github.com/fasiha/mudderjs
[1] https://github.com/pubpub/pubpub/blob/master/utils/rank.ts
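The "string between two strings" idea can be sketched by reading each rank as a base-26 fraction and averaging. The code below is a minimal Python illustration under stated assumptions, not mudderjs's algorithm: the lowercase `ALPHABET` and the function name `midpoint` are chosen for this example, and the strings it returns can differ from the ones quoted above (it gives "an" where the post shows "am"). Any string that sorts strictly between the two neighbours works as a rank.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz"  # assumed rank alphabet; 'a' acts as digit 0


def midpoint(lo: str, hi: str) -> str:
    """Return a string that sorts strictly between `lo` and `hi`."""
    if not lo < hi:
        raise ValueError("lo must sort before hi")
    base = len(ALPHABET)
    width = max(len(lo), len(hi)) + 1  # one extra digit leaves room for a midpoint

    def to_int(s: str) -> int:
        # Right-pad with the lowest character, then read as a base-26 integer.
        n = 0
        for ch in s.ljust(width, ALPHABET[0]):
            n = n * base + ALPHABET.index(ch)
        return n

    a, b = to_int(lo), to_int(hi)
    if a == b:
        # Happens for pairs like "a" and "aa": nothing sorts between them.
        raise ValueError("no string fits between these two ranks")

    mid = (a + b) // 2
    digits = []
    for _ in range(width):
        mid, d = divmod(mid, base)
        digits.append(ALPHABET[d])
    # Trailing 'a's carry no value, so drop them to keep ranks short.
    return "".join(reversed(digits)).rstrip(ALPHABET[0])


print(midpoint("a", "b"))   # an
print(midpoint("a", "ab"))  # aan
```

Note the edge case the sketch rejects: with this alphabet nothing sorts between "a" and "aa", so a real scheme needs to avoid generating ranks that end in the lowest character or otherwise guard against such neighbours.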
What are some alternatives?
diff-match-patch - Diff Match Patch is a high-performance library in multiple languages that manipulates plain text.
Awesome Nested Set - An awesome replacement for acts_as_nested_set and better_nested_set.
Cadmium - Natural Language Processing (NLP) library for Crystal
radbag-wallet - Radbag Wallet Github Repository
Quickenshtein - Making the quickest and most memory efficient implementation of Levenshtein Distance with SIMD and Threading support
raddish-wallet - Radbag Wallet Github Repository [Moved to: https://github.com/ZeroPointThree17/radbag-wallet]
Java String Similarity - Implementation of various string similarity and distance algorithms: Levenshtein, Jaro-winkler, n-Gram, Q-Gram, Jaccard index, Longest Common Subsequence edit distance, cosine similarity ...
query-string - Parse and stringify URL query strings
NLP-Model-for-Corpus-Similarity - An NLP algorithm I developed to determine the similarity or relation between two documents/Wikipedia articles. Inspired by the cosine similarity algorithm and built from WordNet.
EsoMath.js - Esoteric Mathematic Library for Javascript (past names: more-math-for-JS, mostly_math, NTML.js)
go-edlib - 📚 String comparison and edit distance algorithms library, featuring: Levenshtein, LCS, Hamming, Damerau-Levenshtein (OSA and Adjacent transpositions algorithms), Jaro-Winkler, Cosine, etc...
radix-component-library - Sample component library built with Radix UI