Top 11 text-search Open-Source Projects
-
ugrep
NEW ugrep 5.1: an ultra fast, user-friendly, compatible grep. Ugrep combines the best features of other grep tools, adds new features, and searches fast. It includes a TUI and adds Google-like search, fuzzy search, and hexdumps, and it searches nested archives (zip, 7z, tar, pax, cpio), compressed files (gz, Z, bz2, lzma, xz, lz4, zstd, brotli), PDFs, docs, and more.
-
usearch
Fast Open-Source Search & Clustering engine × for Vectors & 🔜 Strings × in C++, C, Python, JavaScript, Rust, Java, Objective-C, Swift, C#, GoLang, and Wolfram 🔍
-
qlever
Very fast SPARQL Engine, which can handle very large knowledge graphs like the complete Wikidata, offers context-sensitive autocompletion for SPARQL queries, and allows combination with text search. It's faster than engines like Blazegraph or Virtuoso, especially for queries involving large result sets.
-
hotpdf
hotpdf is a fast PDF parsing library, built on top of pdfminer.six, for extracting and finding text within PDF documents.
-
powershell-grep
PowerShell-Grep brings the power and flexibility of the Linux grep command to Windows PowerShell. It's an essential tool for those who need advanced search capabilities in a familiar, command-line interface.
Project mention: Introducing pgzx: create PostgreSQL extensions using Zig | news.ycombinator.com | 2024-03-21
And lots of interesting extensions use it, like
https://github.com/tembo-io/pgmq
Project mention: Character and Subsector generators for Classic Traveller, with TAS Forms! | /r/traveller | 2023-12-07
I wrote an online catalog a while back (and I need to get back on adding graphics and products at some point). It’s written using Eleventy and the minisearch library. The source and data are available on Github if you want to see how I did things. I’m not a professional web designer either, but it was a fun project.
Project mention: [D] Is it better to create a different set of Doc2Vec embeddings for each group in my dataset, rather than generating embeddings for the entire dataset? | /r/MachineLearning | 2023-10-28
I'm using Top2Vec with Doc2Vec embeddings to find topics in a dataset of ~4000 social media posts. This dataset has three groups:
Project mention: Ugrep – a more powerful, ultra fast, user-friendly, compatible grep | news.ycombinator.com | 2023-12-30
Project mention: USearch SQLite Extensions for Vector and Text Search | news.ycombinator.com | 2024-02-22
Project mention: Show HN: Hotpdf – Search and Extract text within PDFs | news.ycombinator.com | 2024-02-27
Project mention: Introducing Grep and Which for Powershell: Essential Tools for Windows, Linux, and Azure | /r/developers | 2023-05-24
Grep for Powershell: https://github.com/The-da-vinci/powershell-grep Which for Powershell: [link]
text-search related posts
- Introducing pgzx: create PostgreSQL extensions using Zig
- QLever – Fast Sparql Engine
- Distributed RDF Query Processing
- Integrate PostgreSQL and Elasticsearch – ZomboDB
- ZomboDB: Making Postgres and Elasticsearch work together like it's 2022
- State of the art for serde-compatible CBOR encoding/decoding?
- Full text search PG and elastic [High-level questions]
Index
What are some of the best open-source text-search projects? This list will help you:
Rank | Project | Stars |
---|---|---|
1 | zombodb | 4,608 |
2 | minisearch | 4,055 |
3 | Top2Vec | 2,833 |
4 | ugrep | 2,422 |
5 | usearch | 1,611 |
6 | fuzzysearch | 280 |
7 | qlever | 272 |
8 | hotpdf | 162 |
9 | mongo-search | 10 |
10 | powershell-grep | 4 |
11 | kawadi | 3 |