Top 17 C++ Search Projects

C-Plus-Plus

3 29,094 6.4 C++

Collection of various algorithms in mathematics, machine learning, computer science and physics implemented in C++ for educational purposes.

Typesense

129 17,876 9.8 C++

Open Source alternative to Algolia + Pinecone and an Easier-to-Use alternative to ElasticSearch ⚡ 🔍 ✨ Fast, typo tolerant, in-memory fuzzy Search Engine for building delightful search experiences

Project mention: Website Search Hurts My Feelings | news.ycombinator.com | 2023-12-26

There are actually plenty of non-ES products that are way easier to integrate and tune (and get better results with less effort).
- Typesense (https://github.com/typesense/typesense)
- Algolia
- Google Programmable Search Engine (https://programmablesearchengine.google.com/about/)

manticoresearch

33 8,289 9.9 C++

Easy to use open source fast database for search | Good alternative to Elasticsearch now | Drop-in replacement for E in the ELK soon

Project mention: Building and testing Manticore Search | dev.to | 2024-03-05

Note, you need to do it in the root folder of a clone from https://github.com/manticoresoftware/manticoresearch

ugrep

24 2,429 9.1 C++

NEW ugrep 5.1: an ultra fast, user-friendly, compatible grep. Ugrep combines the best features of other grep, adds new features, and searches fast. Includes a TUI and adds Google-like search, fuzzy search, hexdumps, searches nested archives (zip, 7z, tar, pax, cpio), compressed files (gz, Z, bz2, lzma, xz, lz4, zstd, brotli), pdfs, docs, and more

Project mention: Ugrep – a more powerful, ultra fast, user-friendly, compatible grep | news.ycombinator.com | 2023-12-30

usearch

20 1,629 9.8 C++

Fast Open-Source Search & Clustering engine × for Vectors & 🔜 Strings × in C++, C, Python, JavaScript, Rust, Java, Objective-C, Swift, C#, GoLang, and Wolfram 🔍

Project mention: USearch SQLite Extensions for Vector and Text Search | news.ycombinator.com | 2024-02-22

pisa

1 855 8.2 C++

PISA: Performant Indexes and Search for Academia

Project mention: A Compressed Indexable Bitset | news.ycombinator.com | 2023-07-01

The EF core algorithm implemented in folly [3] may be a bit faster, and implementing partitioning on top of that is relatively easy.
It would definitely compress much better than roaring bitmaps. In terms of performance, it depends on the access patterns. If very sparse (large jumps) PEF would likely be faster, if dense (visit a large fraction of the bitmap) it'd be slower.
It is possible to squeeze a bit more compression out of PEF by introducing a chunk type for Elias-Fano of the chunk complement (for very dense chunks), but you lose the operation of skipping to a given position, which is however not needed in inverted indexes (you only need to skip past a given id, and that can be supported efficiently). That is not mentioned in the paper because at the time I thought the skip-to-position operation was a non-negotiable.
[1] https://github.com/ot/ds2i/
[2] https://github.com/pisa-engine/pisa
[3] https://github.com/facebook/folly/blob/main/folly/experiment...
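The comment above refers to Elias-Fano (EF) coding and to the "skip past a given id" operation that inverted indexes need. As a rough illustration only, here is a minimal sketch of plain (non-partitioned) Elias-Fano in C++. It is not the folly, ds2i, or PISA implementation: the class name EliasFanoSketch and its methods are hypothetical, the low bits are stored unpacked, and lookups use linear scans where a real implementation would use select structures over the high-bits vector.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <stdexcept>
#include <vector>

// Toy Elias-Fano encoding of a non-decreasing sequence of ids below `universe`.
// Hypothetical illustration: names and layout are assumptions, not a library API.
class EliasFanoSketch {
public:
    EliasFanoSketch(const std::vector<std::uint64_t>& sorted_ids, std::uint64_t universe)
        : n_(sorted_ids.size()) {
        assert(std::is_sorted(sorted_ids.begin(), sorted_ids.end()));
        if (n_ == 0) return;
        assert(sorted_ids.back() < universe);
        // Low bits per element: floor(log2(universe / n)).
        for (std::uint64_t ratio = universe / n_; ratio > 1; ratio >>= 1) ++low_bits_;
        const std::uint64_t low_mask = (std::uint64_t{1} << low_bits_) - 1;
        // High parts are unary coded: element i sets bit (high_i + i).
        high_.assign(n_ + (universe >> low_bits_) + 1, false);
        low_.reserve(n_);
        for (std::size_t i = 0; i < n_; ++i) {
            low_.push_back(sorted_ids[i] & low_mask);
            high_[(sorted_ids[i] >> low_bits_) + i] = true;
        }
    }

    // Random access: the i-th set bit sits at position (high_i + i).
    std::uint64_t at(std::size_t i) const {
        std::size_t ones = 0;
        for (std::size_t pos = 0; pos < high_.size(); ++pos) {
            if (!high_[pos]) continue;
            if (ones == i) {
                return (static_cast<std::uint64_t>(pos - i) << low_bits_) | low_[i];
            }
            ++ones;
        }
        throw std::out_of_range("EliasFanoSketch::at");
    }

    // Smallest stored id >= target, or nullopt if there is none.
    // Buckets whose high part is below the target's are skipped without
    // touching their low bits; a real implementation jumps there via select0.
    std::optional<std::uint64_t> next_geq(std::uint64_t target) const {
        const std::uint64_t target_high = target >> low_bits_;
        std::size_t index = 0;     // set bits (elements) seen so far
        std::uint64_t bucket = 0;  // zero bits seen so far = current high part
        for (std::size_t pos = 0; pos < high_.size(); ++pos) {
            if (!high_[pos]) { ++bucket; continue; }
            if (bucket >= target_high) {
                const std::uint64_t value = (bucket << low_bits_) | low_[index];
                if (value >= target) return value;
            }
            ++index;
        }
        return std::nullopt;
    }

private:
    std::size_t n_;
    unsigned low_bits_ = 0;
    std::vector<std::uint64_t> low_;  // low bits, one word per element (unpacked)
    std::vector<bool> high_;          // unary-coded high parts
};
```

For example, encoding the ids 3, 4, 7, 13, 14, 15, 21, 43 with a universe of 64 uses 3 low bits per element, and next_geq(16) skips the first two buckets before returning 21. The partitioned variant (PEF) discussed in the comment would split the sequence into chunks and encode each chunk separately; that step is not shown here.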

clp

2 715 9.3 C++

Compressed Log Processor (CLP) is a free tool capable of compressing text logs and searching the compressed logs without decompression. (by y-scope)

ustore

15 486 9.6 C++

Multi-Modal Database replacing MongoDB, Neo4J, and Elastic with 1 faster ACID solution, with NetworkX and Pandas interfaces, and bindings for C 99, C++ 17, Python 3, Java, GoLang 🗄️

VanitySearch

3 390 0.0 C++

Bitcoin Address Prefix Finder

fccf

7 345 0.0 C++

fccf: A command-line tool that quickly searches through C/C++ source code in a directory based on a search string and prints relevant code snippets that match the query.

grab

2 257 0.0 C++

experimental and very fast implementation of a grep (by stealth)

Project mention: Ugrep – a more powerful, ultra fast, user-friendly, compatible grep | news.ycombinator.com | 2023-12-30

Also look at https://github.com/stealth/grab from Sebastian Krahmer.

hypergrep

5 164 9.3 C++

Recursively search directories for a regex pattern

Project mention: Ugrep – a more powerful, ultra fast, user-friendly, compatible grep | news.ycombinator.com | 2023-12-30

Another issue with Hyperscan is that if you enable HS_FLAG_UTF8[1], which hypergrep does[2,3], and then search invalid UTF-8, then the result is UB.
> This flag instructs Hyperscan to treat the pattern as a sequence of UTF-8 characters. The results of scanning invalid UTF-8 sequences with a Hyperscan library that has been compiled with one or more patterns using this flag are undefined.
That's another issue you'll need to grapple with if you use Hyperscan. PCRE2 used to have this issue[4], but they've since defined the semantics of searching invalid UTF-8 with Unicode mode enabled. ripgrep 14 uses that new mode, but I haven't updated that FAQ answer yet.
[1]: https://intel.github.io/hyperscan/dev-reference/api_files.ht...
[2]: https://github.com/p-ranav/hypergrep/blob/ee85b713aa84e0050a...
[3]: https://github.com/p-ranav/hypergrep/blob/ee85b713aa84e0050a...
[4]: https://github.com/BurntSushi/ripgrep/blob/master/FAQ.md#why...
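Since the comment above notes that scanning invalid UTF-8 in UTF-8 mode is undefined, one defensive option is to check a buffer for well-formed UTF-8 before handing it to such a matcher. The following is a minimal sketch of that check in C++. The function name is_valid_utf8 is hypothetical; this is not how hypergrep, Hyperscan, PCRE2, or ripgrep handle the problem, and what to do with a buffer that fails the check (skip it, or fall back to a byte-oriented search) is left to the caller.

```cpp
#include <cstddef>
#include <string_view>

// Returns true if `bytes` is well-formed UTF-8: no stray or missing
// continuation bytes, no overlong encodings, no surrogates, nothing
// above U+10FFFF. Hypothetical helper for illustration only.
bool is_valid_utf8(std::string_view bytes) {
    const std::size_t n = bytes.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char b0 = static_cast<unsigned char>(bytes[i]);
        if (b0 < 0x80) {  // ASCII
            ++i;
            continue;
        }
        std::size_t len = 0;      // total bytes in this sequence
        unsigned int min_cp = 0;  // smallest code point allowed for this length
        unsigned int cp = 0;      // decoded code point
        if ((b0 & 0xE0) == 0xC0) {
            len = 2; min_cp = 0x80; cp = b0 & 0x1F;
        } else if ((b0 & 0xF0) == 0xE0) {
            len = 3; min_cp = 0x800; cp = b0 & 0x0F;
        } else if ((b0 & 0xF8) == 0xF0) {
            len = 4; min_cp = 0x10000; cp = b0 & 0x07;
        } else {
            return false;  // stray continuation byte or invalid lead byte
        }
        if (n - i < len) return false;  // sequence truncated at end of buffer
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(bytes[i + k]);
            if ((b & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (b & 0x3Fu);
        }
        if (cp < min_cp) return false;                    // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;   // UTF-16 surrogate
        if (cp > 0x10FFFF) return false;                  // beyond Unicode range
        i += len;
    }
    return true;
}
```

A caller might run this check once per file or per chunk and only use a UTF-8-mode pattern when it passes. Validation costs one extra pass over the data, which is the trade-off against relying on a matcher whose behavior on invalid input is undefined.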

ds2i

1 141 0.0 C++

A library of inverted index data structures

Project mention: A Compressed Indexable Bitset | news.ycombinator.com | 2023-07-01

Katalog

2 56 9.2 C++

Katalog is an application to manage catalogs of disks and files to search and get statistics.

faiss-mobile

1 34 5.5 C++

FAISS library compiled for iOS, macOS, tvOS, watchOS

Project mention: FAISS Mobile for iOS / macOS | news.ycombinator.com | 2024-03-19

looqs

5 20 5.2 C++

FTS desktop file search with previews

Spectacle

1 12 0.0 C++

Spectacle is the first global search for Unreal Engine 4 specifiers. Check out the finished product at https://unrealistic.dev/spectacle. (by UnrealisticDev)

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

C++ Search related posts

Remote Machine Learning and Searching on a Raspberry Pi 5
2 projects | /r/immich | 11 Dec 2023
You Shouldn't Invest in Vector Databases?
4 projects | news.ycombinator.com | 25 Nov 2023
Win-Vind: Vim powers with speed of thought in Windows 11
5 projects | news.ycombinator.com | 11 Nov 2023
ugrep 4.3.2 with updated TUI
3 projects | /r/linux | 7 Nov 2023
DNS record "hn.algolia.com" is gone
3 projects | news.ycombinator.com | 9 Oct 2023
A Compressed Indexable Bitset
6 projects | news.ycombinator.com | 1 Jul 2023
Obsidian Publish full text search
1 project | /r/ObsidianMD | 28 Jun 2023

Index

What are some of the best open-source Search projects in C++? This list will help you:

	Project	Stars
1	C-Plus-Plus	29,094
2	Typesense	17,876
3	manticoresearch	8,289
4	ugrep	2,429
5	usearch	1,629
6	pisa	855
7	clp	715
8	ustore	486
9	VanitySearch	390
10	fccf	345
11	grab	257
12	hypergrep	164
13	ds2i	141
14	Katalog	56
15	faiss-mobile	34
16	looqs	20
17	Spectacle	12