Top 22 string-matching Open-Source Projects
-
StringZilla
Up to 10x faster strings for C, C++, Python, Rust, and Swift, leveraging SWAR and SIMD on Arm Neon and x86 AVX2 & AVX-512-capable chips to accelerate search, sort, edit distances, alignment scores, etc 🦖
-
go-edlib
📚 String comparison and edit distance algorithms library, featuring: Levenshtein, LCS, Hamming, Damerau-Levenshtein (OSA and adjacent transpositions algorithms), Jaro-Winkler, Cosine, etc.
-
strutil-go
Golang metrics for calculating string similarity and other string utility functions (by adrg)
-
JaroWinkler
Python library for fast approximate string matching using Jaro and Jaro-Winkler similarity
-
multi_string_replace
A fast multiple string replace library for Ruby. Uses a C implementation of the Aho–Corasick algorithm based on https://github.com/morenice/ahocorasick while adding support for on-the-fly multiple string replacement. A faster alternative to String.gsub when dealing with non-regex (exact match) use cases.
-
boyermoore
Boyer-Moore in pure Python; search for Unicode strings in large files quickly (by eriknyquist)
-
STS-Crawler
A Python Reddit bot for /r/SlayTheSpire that automatically (soft) finds cards and relics mentioned in post titles and comments, and replies with descriptions to help new players
-
Name-QuickSearch
Find the best fuzzy match for a natural language string in a set of hundreds of thousands of strings in a split second.
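Several of the projects above (go-edlib and StringZilla, for example) list edit distance among their features. As a rough illustration of what that metric computes, here is a minimal pure-Python sketch of the textbook Levenshtein distance. It is not taken from any of the listed libraries, which use far more optimized implementations, and the function name `levenshtein` is only an illustrative choice.

```python
def levenshtein(a: str, b: str) -> int:
    """Textbook dynamic-programming edit distance.

    Counts the minimum number of single-character insertions,
    deletions, and substitutions needed to turn `a` into `b`.
    Illustrative only; not the code of any listed project.
    """
    if len(a) < len(b):
        a, b = b, a  # iterate over the longer string in the outer loop

    # previous[j] holds the distance between a[:i-1] and b[:j]
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or match for free
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))
```

Keeping only two rows of the table holds memory proportional to the shorter string, which is one common refinement of the full-matrix version.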
Project mention: RapidFuzz: Rapid fuzzy string matching in Python | news.ycombinator.com | 2024-02-14
Project mention: Measuring energy usage: regular code vs. SIMD code | news.ycombinator.com | 2024-02-19
The 3.5x energy-efficiency gap between serial and SIMD code becomes even larger when
A. you do byte-level processing instead of float words;
B. you use embedded, IoT, and other low-energy devices.
A few years ago I compared an Nvidia Jetson Xavier (long before the Orin release), an Intel-based MacBook Pro with a Core i9, and AVX-512-capable CPUs on substring search benchmarks.
On Xavier one can quite easily disable/enable cores and reconfigure power usage. At peak I got to 4.2 GB/J, which was an 8.3x improvement in efficiency over LibC in substring search operations. The comparison table is still available in the older README: https://github.com/ashvardanian/StringZilla/tree/v2.0.2?tab=...
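The two figures quoted in the comment can be related with a little arithmetic. This is a minimal sketch, assuming that GB/J means gigabytes searched per joule and that the 8.3x factor is a plain ratio against the LibC result on the same device. The helper name is hypothetical, and the derived baseline is an inference, not a number stated in the comment.

```python
def implied_baseline(peak_gb_per_joule: float, improvement: float) -> float:
    """Back out the baseline efficiency from a peak value and a speedup ratio.

    Assumes improvement = peak / baseline (an assumption, see lead-in).
    """
    return peak_gb_per_joule / improvement


# Figures quoted in the comment: 4.2 GB/J at peak, 8.3x over LibC.
baseline = implied_baseline(4.2, 8.3)
print(f"Implied LibC efficiency: {baseline:.2f} GB/J")
```

Under those assumptions the LibC substring search would have processed roughly half a gigabyte per joule on the same hardware.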
_name=jarowinkler
pkgname=python-$_name
pkgver=1.2.3
pkgrel=2
pkgdesc='A library for fast approximate string matching using Jaro and Jaro-Winkler similarity'
arch=(x86_64)
url='https://github.com/maxbachmann/JaroWinkler'
license=(MIT)
depends=(python)
makedepends=(jarowinkler-cpp python-rapidfuzz-capi python-scikit-build)
Project mention: Show HN: A fast, accurate and multilingual fuzzy search lib for the front end | news.ycombinator.com | 2024-02-14
string-matching related posts
- RapidFuzz: Rapid fuzzy string matching in Python
- Manjaro Package Installation Error
- OVOS migration with docker containers ...
- Map columns from 2 data sources when colums are named differently
- finding common strings
- Show HN: An Excel Wordle Solver
- Snecko + pyramid has been completed, meteor strike is the mvp of the run. (Update from earlier post)
Index
What are some of the best open-source string-matching projects? This list will help you:
# | Project | Stars |
---|---|---|
1 | RapidFuzz | 2,348 |
2 | StringZilla | 1,776 |
3 | PolyFuzz | 716 |
4 | go-edlib | 444 |
5 | closestmatch | 416 |
6 | strutil-go | 276 |
7 | simplematch | 173 |
8 | trrex | 134 |
9 | LGenerics | 102 |
10 | wildmatch | 68 |
11 | JaroWinkler | 52 |
12 | ATGValidator | 51 |
13 | multi_string_replace | 21 |
14 | boyermoore | 19 |
15 | STS-Crawler | 18 |
16 | wordlexcel | 13 |
17 | libaca | 7 |
18 | Name-QuickSearch | 4 |
19 | wildmatch-go | 4 |
20 | mscs-thesis-project | 3 |
21 | wordle-solver | 3 |
22 | fuzzy-search | 0 |