Top 23 Python Search Projects
-
-
searxng
SearXNG is a free internet metasearch engine which aggregates results from various search services and databases. Users are neither tracked nor profiled.
Project mention: Leta – privacy focused search engine from Mullvad | news.ycombinator.com | 2025-05-28
What would be the difference from duckduckgo lite?
https://docs.searxng.org/
-
txtai
💡 All-in-one open-source AI framework for semantic search, LLM orchestration and language model workflows
-
Project mention: Whoogle Search no longer available after recent Google JavaScript requirement | news.ycombinator.com | 2025-01-18
-
Project mention: Show HN: Trieve CLI – Terminal-Based LLM Agent Loop with Search Tool for PDFs | news.ycombinator.com | 2025-06-18
https://github.com/Future-House/paper-qa?tab=readme-ov-file#... :
> PaperQA2 is engineered to be the best agentic RAG model for working with scientific papers.
> [ Semantic Scholar, CrossRef, ]
paperqa-zotero: https://github.com/lejacobroy/paperqa-zotero
The Oracle of Zotero is a fork of paperqa-zotero fork FAISS and langchain:
-
R2R
SoTA production-ready AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.
Project mention: Show HN: Toller – A Python library for robust async calls | news.ycombinator.com | 2025-05-13
I built this after a painful incident with one of my R2R (https://github.com/SciPhi-AI/R2R) clients where Azure OpenAI went down unexpectedly. While we were technically propagating errors correctly, we lacked clean, accessible error patterns that would allow the client to implement proper mitigation strategies. They were fully dependent on our infrastructure to handle the outage, with no way to gracefully degrade or implement custom fallbacks.
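The situation in this comment (an upstream provider goes down and the client has no way to degrade gracefully) can be illustrated with a minimal sketch in plain asyncio. It does not use Toller's or R2R's API; `ProviderDown`, `call_primary`, `call_backup` and `call_with_fallback` are hypothetical names chosen for illustration.

```python
import asyncio


class ProviderDown(Exception):
    """Hypothetical typed error signalling that an upstream provider is unavailable."""


async def call_primary(prompt: str) -> str:
    # Stand-in for a remote LLM call; it always fails here to simulate an outage.
    raise ProviderDown("primary provider unavailable")


async def call_backup(prompt: str) -> str:
    # Stand-in for a degraded but working alternative.
    return f"[backup] {prompt}"


async def call_with_fallback(prompt: str, retries: int = 2, delay: float = 0.01) -> str:
    # Retry the primary a few times with exponential backoff, then fall back.
    for attempt in range(retries):
        try:
            return await call_primary(prompt)
        except ProviderDown:
            await asyncio.sleep(delay * (2 ** attempt))
    return await call_backup(prompt)


result = asyncio.run(call_with_fallback("summarize this document"))
print(result)
```

Raising a dedicated exception type, rather than a generic error, is what lets the caller decide between retrying, falling back, or surfacing the failure.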
-
-
Project mention: Netflix will show generative AI ads midway through streams in 2026 | news.ycombinator.com | 2025-05-15
https://github.com/Jackett/Jackett | https://github.com/qbittorrent/search-plugins/wiki/How-to-co...
-
-
-
-
-
Windrecorder
Windrecorder is a memory search app that records everything on your screen at a small storage size, letting you rewind what you have seen, query through OCR text or image descriptions, and get activity statistics. (by yuka-friends)
-
-
swirl-search
Swirl is an open-source search platform that uses AI to search multiple content and data sources simultaneously and return AI-ranked results. It also provides summaries of your answers from searches using LLMs. It's a one-click, easy-to-use Retrieval Augmented Generation (RAG) solution.
Project mention: How These Free Open Source Projects Can Jumpstart Your Career (No Experience? No Problem!) | dev.to | 2024-12-13
Give SWIRL a try: https://github.com/swirlai/swirl-search
-
datasketch
MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW
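As a rough illustration of the first technique in that list, here is a minimal MinHash sketch written with the standard library only. It is not datasketch's API; `minhash_signature` and `estimate_jaccard` are hypothetical helpers, and salted BLAKE2b hashes stand in for the random permutations.

```python
import hashlib


def minhash_signature(tokens: set, num_perm: int = 128) -> list:
    # One salted hash function per "permutation"; keep the minimum value over all tokens.
    signature = []
    for seed in range(num_perm):
        salt = seed.to_bytes(4, "little")
        signature.append(
            min(
                int.from_bytes(
                    hashlib.blake2b(t.encode(), digest_size=8, salt=salt).digest(), "big"
                )
                for t in tokens
            )
        )
    return signature


def estimate_jaccard(sig_a: list, sig_b: list) -> float:
    # The fraction of matching positions estimates the Jaccard similarity of the sets.
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)


a = {"open", "source", "search", "engine"}
b = {"search", "engine", "vector", "index"}  # true Jaccard with a is 2/6

same = estimate_jaccard(minhash_signature(a), minhash_signature(a))
partial = estimate_jaccard(minhash_signature(a), minhash_signature(b))
print(same, partial)
```

More permutations give a tighter estimate at the cost of a longer signature; LSH then buckets signatures so that similar sets can be found without comparing every pair.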
-
twikit
Twitter API Scraper | Without an API key | Twitter Internal API | Free | Twitter scraper | Twitter Bot
Project mention: Show HN: I made a free tool that analyzes SEC filings and posts detailed reports | news.ycombinator.com | 2025-04-14
Unpopular opinion here... If you tread carefully you'll most likely not succeed. I am not American and I know you guys like to sue each other for putting cats in microwaves and stuff so maybe this is not a great opinion to have in America at the current moment.
I would go for it and put a disclaimer, or I would just incorporate in a country where there's no issues with these things.
All this is hard, of course, to provide good value, but worthwhile.
Twitter's cost is insane right now. I had quite a few ideas for Twitter integrations, but they would easily cost thousands per month just to access their API.
I looked into https://github.com/d60/twikit - might not be suitable but you can definitely play around with it. Just don't use your official account as I got shadow banned using it unfortunately.
-
openrecall
OpenRecall is a fully open-source, privacy-first alternative to proprietary solutions like Microsoft's Windows Recall. With OpenRecall, you can easily access your digital history, enhancing your memory and productivity without compromising your privacy.
Another similar project: https://github.com/openrecall/openrecall
-
Project mention: Show HN: Scraper for job listings directly from company websites | news.ycombinator.com | 2024-12-07
jobfunnel is FOSS and accepting contributions: https://github.com/PaulMcInnis/JobFunnel
Currently supports indeed, in the past supported glassdoor and others.
-
-
RecoverPy
Interactively find and recover deleted or 👉 overwritten 👈 files from your terminal
-
Project mention: Show HN: Trieve CLI – Terminal-Based LLM Agent Loop with Search Tool for PDFs | news.ycombinator.com | 2025-06-18
https://github.com/neuml/paperai :
> paperai is a combination of a txtai embeddings index and a SQLite database with the articles. Each article is parsed into sentences and stored in SQLite along with the article metadata. Embeddings are built over the full corpus.
paperai has a
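The storage layout described in the quoted paperai comment (articles parsed into sentences and kept in SQLite next to the article metadata) can be sketched roughly as follows. This is an assumption-laden illustration, not paperai's actual schema: the table names, columns, and the naive `split_sentences` helper are hypothetical, and the txtai embeddings index is left out entirely.

```python
import re
import sqlite3


def split_sentences(text: str) -> list:
    # Naive splitter for illustration: break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def build_store(articles: list) -> sqlite3.Connection:
    # Hypothetical schema: one row per article, one row per sentence.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, source TEXT)")
    db.execute(
        "CREATE TABLE sentences ("
        "id INTEGER PRIMARY KEY, article_id INTEGER REFERENCES articles(id), text TEXT)"
    )
    for article in articles:
        cur = db.execute(
            "INSERT INTO articles (title, source) VALUES (?, ?)",
            (article["title"], article["source"]),
        )
        rows = [(cur.lastrowid, s) for s in split_sentences(article["body"])]
        db.executemany("INSERT INTO sentences (article_id, text) VALUES (?, ?)", rows)
    db.commit()
    return db


db = build_store(
    [{"title": "Example", "source": "hypothetical", "body": "First sentence. Second one? Third!"}]
)
count = db.execute("SELECT COUNT(*) FROM sentences").fetchone()[0]
# Join each sentence back to its article metadata.
joined = db.execute(
    "SELECT a.title, s.text FROM sentences s JOIN articles a ON a.id = s.article_id ORDER BY s.id"
).fetchall()
print(count, joined[0])
```

In a full pipeline, an embeddings index would be built over the sentence rows so that a semantic match can be resolved back to its article through `article_id`.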
-
Python Search related posts
-
Show HN: Trieve CLI – Terminal-Based LLM Agent Loop with Search Tool for PDFs
-
Leta – privacy focused search engine from Mullvad
-
Show HN: Toller – A Python library for robust async calls
-
Ingest (almost) any non-PDF document in a vector database, effortlessly
-
Kagi Is Bringing Orion Web Browser to Linux
-
Chunking your data for RAG
-
Analyzing LinkedIn Company Posts with Graphs and Agents
-
Index
What are some of the best open-source Search projects in Python? This list will help you:
| # | Project | Stars |
|---|---|---|
| 1 | algorithms | 24,571 |
| 2 | searxng | 19,666 |
| 3 | txtai | 11,078 |
| 4 | whoogle-search | 10,755 |
| 5 | paper-qa | 7,477 |
| 6 | R2R | 6,973 |
| 7 | buku | 6,728 |
| 8 | search-plugins | 5,217 |
| 9 | tribler | 5,005 |
| 10 | elasticsearch-py | 4,303 |
| 11 | elasticsearch-dsl-py | 3,868 |
| 12 | django-haystack | 3,653 |
| 13 | Windrecorder | 3,259 |
| 14 | image-match | 2,965 |
| 15 | swirl-search | 2,790 |
| 16 | datasketch | 2,704 |
| 17 | twikit | 2,718 |
| 18 | openrecall | 2,208 |
| 19 | JobFunnel | 2,026 |
| 20 | twitter-api-client | 1,795 |
| 21 | RecoverPy | 1,498 |
| 22 | paperai | 1,414 |
| 23 | Memacs | 1,059 |