Top 23 Python search-engine Projects
Privacy-respecting metasearch engine
Project mention: Little Tricky: Fetching Youtube URLs of List of Song Titles WITHOUT Youtube API? | reddit.com/r/webscraping | 2023-01-29
Scrape the search page? Check Searx
A free & open modern, fast email client with user-friendly encryption and privacy features
Project mention: My slow progression towards and away from NextCloud | reddit.com/r/selfhosted | 2022-11-12
Have a look at mailpile if you are after a web interface; or, the ever-dependable Thunderbird if you are fine with a desktop application.
👑 Easy-to-use and powerful NLP library with 🤗 Awesome model zoo, supporting a wide range of NLP tasks from research to industrial applications, including 🗂 Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis and 🖼 Diffusion AIGC systems.
Project mention: The 10 Trending Python Repositories on GitHub (May 2022) | dev.to | 2022-06-23
A self-hosted, ad-free, privacy-respecting metasearch engine
Project mention: Tell us about the most underrated & useful FOSS Apps you are using!! | reddit.com/r/fossdroid | 2023-01-27
Haystack is an open source NLP framework that leverages pre-trained Transformer models. It enables developers to quickly implement production-ready semantic search, question answering, summarization and document ranking for a wide range of NLP applications.
Project mention: New free tool that uses fine-tuned BERT model to surface answers from research papers | reddit.com/r/LanguageTechnology | 2022-10-28
Some cool tools like HayStack that would be useful in putting some of these together.
Search plugins for the search feature
Project mention: RARBG website not showing magnet symbol, thus enabling to download from RARBG | reddit.com/r/torrents | 2023-01-14
So you're saying this doesn't exist?
Elasticsearch with BERT for advanced document search.
Maryam: Open-source Intelligence(OSINT) Framework
An open source, non-profit search engine implemented in Python
Project mention: Introduction! | reddit.com/r/u_mwmbl | 2022-12-14
Lightweight package to query popular search engines and scrape for result titles, links and descriptions
Yuno is a context-based search engine for anime.
NeoVerse/HyperTag - Intuitive Knowledge Management WebApp & CLI for Humans using Deep Learning & Tags
Natural Language Search Engine for your Org-Mode and Markdown notes, Beancount transactions and Photos
Project mention: AI model for retrieving files from Org-Roam directory? | reddit.com/r/emacs | 2023-01-25
You might want to have a look at Khoj (https://github.com/debanjum/khoj) and the post about it in this subreddit.
An advanced graphical search engine for Exploit-DB
Search through all your personal data efficiently like web search.
PatZilla is a modular patent information research platform and data integration toolkit with a modern user interface and access to multiple data sources.
The Openverse API allows programmatic access to search for CC-licensed and public domain digital media.
Project mention: Recommend Django Great Projects | news.ycombinator.com | 2022-12-03
Project mention: Getting started with Jina AI | dev.to | 2022-02-19
Financial Question Answering System
🐍 Python binding for the Hora Approximate Nearest Neighbor Search Algorithm library
domhttpx is a Google search engine dorker with an HTTP toolkit, built with Python, that makes it easier to find many URLs/IPs at once, quickly.
SWIRL queries any number of data sources (search engines, databases, NoSQL engines, cloud/SaaS services with APIs, etc.) and uses Large Language Models to re-rank the unified results without extracting and indexing anything. Includes connectors to Apache Solr, Elastic, PostgreSQL and generic web/JSON.
Project mention: I wrote a federated search engine called SWIRL SEARCH http://swirl.today/ | reddit.com/r/Python | 2022-11-05
BTW here is an overview of Federated Search, how it differs from traditional indexing approaches - and why it can solve multi-silo search problems in a fraction of the time *without* moving data... https://github.com/sidprobstein/swirl-search/wiki
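The federated approach described here (send one query to several sources, merge what comes back, re-rank, and never build an index) can be sketched roughly as follows. This is not SWIRL's code or API: the `Result` type, the in-memory `make_list_source` connectors, and the term-overlap score are all hypothetical stand-ins chosen for illustration, whereas SWIRL is described as re-ranking with Large Language Models.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Result:
    source: str
    title: str
    score: float = 0.0


# A connector is any callable that takes a query and returns results.
Connector = Callable[[str], List[Result]]


def make_list_source(name: str, titles: List[str]) -> Connector:
    """Hypothetical connector backed by an in-memory list of titles."""
    def search(query: str) -> List[Result]:
        terms = query.lower().split()
        return [Result(name, t) for t in titles
                if any(term in t.lower() for term in terms)]
    return search


def overlap_score(query: str, title: str) -> float:
    """Fraction of query terms that appear as words in the title."""
    q = set(query.lower().split())
    t = set(title.lower().split())
    return len(q & t) / len(q) if q else 0.0


def federated_search(query: str, connectors: List[Connector]) -> List[Result]:
    # Fan the query out to every source concurrently; nothing is indexed.
    with ThreadPoolExecutor() as pool:
        batches = list(pool.map(lambda c: c(query), connectors))
    # Merge the per-source results, then re-rank them with one shared score.
    merged = [r for batch in batches for r in batch]
    for r in merged:
        r.score = overlap_score(query, r.title)
    return sorted(merged, key=lambda r: r.score, reverse=True)


wiki = make_list_source("wiki", ["Federated search overview", "Indexing basics"])
tickets = make_list_source("tickets", ["Search latency bug", "Login page typo"])
results = federated_search("federated search", [wiki, tickets])
for r in results:
    print(f"{r.score:.2f}  {r.source}: {r.title}")
```

Because each connector is just a callable, a real source (an HTTP API, a SQL query) could be swapped in without touching the merge and re-rank step.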
searchmysite.net is an open source search engine and search as a service
Project mention: Almost all searches on my independent search engine are now from SEO spam bots | news.ycombinator.com | 2022-05-16
Thanks V. I'm seeing a similar number of problem search requests (although nowhere near as many real search requests:-), so it is probably the same "SEO practitioners" running the same "scraping footprints" against different search engines around the same time.
I was kind-of hoping that somewhere in this discussion there would be an "And the answer to your problem is...", but I suppose it is a very specific problem which only a search engine would encounter. I think the Cloudflare solution you have is probably the best to block the requests as early as possible. The reverse proxy config I've got seems to be mostly holding out for now though.
Identifies and collects data on CC-licensed content across web crawl data and public APIs.
Project mention: In Over My Head | dev.to | 2022-11-20
Like with any other issue, I kind of look at it at large and think either "This seems do-able" or "Pass", this one was in the first category: openverse-catalog. I saw that I just had to add a string to some header and thought maybe this is something I can actually do. Maybe it was, I won't be able to find out because I could not get the project to run.
Python search-engine related posts
Little Tricky: Fetching Youtube URLs of List of Song Titles WITHOUT Youtube API?
1 project | reddit.com/r/webscraping | 29 Jan 2023
RARBG website not showing magnet symbol, thus enabling to download from RARBG
1 project | reddit.com/r/torrents | 14 Jan 2023
Manually add a custom theme to Searxng?
1 project | reddit.com/r/Searx | 6 Jan 2023
Seeking advice for a little-more-than-beginner python guy
1 project | reddit.com/r/learnpython | 26 Dec 2022
21 December 2022 - Daily Chat Thread
2 projects | reddit.com/r/indonesia | 20 Dec 2022
Introduction!
1 project | reddit.com/r/u_mwmbl | 14 Dec 2022
Is it just me or is it next to impossible to search for anything on Google or duckduckgo that doesn't have a left leaning bias?
1 project | reddit.com/r/conspiracy | 6 Dec 2022
What are some of the best open-source search-engine projects in Python? This list will help you:
| 10 | Search Engine Parser | 369 |