Python search-engine

Open-source Python projects categorized as search-engine

Top 23 Python search-engine Projects

  • Searx

    Privacy-respecting metasearch engine

    Project mention: Little Tricky: Fetching Youtube URLs of List of Song Titles WITHOUT Youtube API? | reddit.com/r/webscraping | 2023-01-29

    Scrape the search page? Check Searx

  • Mailpile

    A free & open modern, fast email client with user-friendly encryption and privacy features

    Project mention: My slow progression towards and away from NextCloud | reddit.com/r/selfhosted | 2022-11-12

    Have a look at mailpile if you are after a web interface; or, the ever-dependable Thunderbird if you are fine with a desktop application.

  • PaddleNLP

    👑 Easy-to-use and powerful NLP library with 🤗 Awesome model zoo, supporting a wide range of NLP tasks from research to industrial applications, including 🗂 Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis and 🖼 Diffusion AIGC systems.

    Project mention: The 10 Trending Python Repositories on GitHub (May 2022) | dev.to | 2022-06-23

    PaddleNLP

  • haystack

    🔍 Haystack is an open source NLP framework that leverages pre-trained Transformer models. It enables developers to quickly implement production-ready semantic search, question answering, summarization and document ranking for a wide range of NLP applications.

    Project mention: New free tool that uses fine-tuned BERT model to surface answers from research papers | reddit.com/r/LanguageTechnology | 2022-10-28

    Some cool tools like HayStack that would be useful in putting some of these together.

  • search-plugins

    Search plugins for the search feature

    Project mention: RARBG website not showing magnet symbol, thus enabling to download from RARBG | reddit.com/r/torrents | 2023-01-14

    So you're saying this doesn't exist?

  • bertsearch

    Elasticsearch with BERT for advanced document search.

  • Maryam

    Maryam: Open-source Intelligence (OSINT) Framework

  • mwmbl

    An open source, non-profit search engine implemented in Python

    Project mention: Introduction! | reddit.com/r/u_mwmbl | 2022-12-14

  • Search Engine Parser

    Lightweight package to query popular search engines and scrape for result titles, links and descriptions

  • Yuno

    Yuno is a context-based search engine for anime.

  • HyperTag

    NeoVerse/HyperTag - Intuitive Knowledge Management WebApp & CLI for Humans using Deep Learning & Tags

  • khoj

    Natural Language Search Engine for your Org-Mode and Markdown notes, Beancount transactions and Photos

    Project mention: AI model for retrieving files from Org-Roam directory? | reddit.com/r/emacs | 2023-01-25

    You might want to have a look at Khoj (https://github.com/debanjum/khoj) and the post about it in this subreddit.

  • houndsploit

    An advanced graphical search engine for Exploit-DB

  • achoz

    Search through all your personal data efficiently like web search.

  • PatZilla

    PatZilla is a modular patent information research platform and data integration toolkit with a modern user interface and access to multiple data sources.

  • openverse-api

    The Openverse API allows programmatic access to search for CC-licensed and public domain digital media.

    Project mention: Recommend Django Great Projects | news.ycombinator.com | 2022-12-03

  • horapy

    🐍 Python binding for the Hora Approximate Nearest Neighbor Search Algorithm library

  • domhttpx

    domhttpx is a Google search engine dorker with an HTTP toolkit, built with Python, that makes it easier to find many URLs/IPs at once, quickly.

  • searchmysite.net

    searchmysite.net is an open source search engine and search as a service

    Project mention: Almost all searches on my independent search engine are now from SEO spam bots | news.ycombinator.com | 2022-05-16

    Thanks V. I'm seeing a similar number of problem search requests (although nowhere near as many real search requests:-), so it is probably the same "SEO practitioners" running the same "scraping footprints" against different search engines around the same time.

    I was kind-of hoping that somewhere in this discussion there would be an "And the answer to your problem is...", but I suppose it is a very specific problem which only a search engine would encounter. I think the Cloudflare solution you have is probably the best to block the requests as early as possible. The reverse proxy config[0] I've got seems to be mostly holding out for now though.

    [0] https://github.com/searchmysite/searchmysite.net/issues/55

  • openverse-catalog

    Identifies and collects data on CC-licensed content across web crawl data and public APIs.

    Project mention: In Over My Head | dev.to | 2022-11-20

    Like with any other issue, I kind of look at it at large and think either "This seems do-able" or "Pass", this one was in the first category: openverse-catalog. I saw that I just had to add a string to some header and thought maybe this is something I can actually do. Maybe it was, I won't be able to find out because I could not get the project to run.

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2023-01-29.

Index

What are some of the best open-source search-engine projects in Python? This list will help you:

#   Project                   Stars
1   Searx                     12,465
2   Mailpile                  8,683
3   PaddleNLP                 7,177
4   whoogle-search            6,955
5   haystack                  6,515
6   search-plugins            2,605
7   bertsearch                843
8   Maryam                    744
9   mwmbl                     652
10  Search Engine Parser      369
11  Yuno                      355
12  HyperTag                  167
13  khoj                      159
14  houndsploit               99
15  achoz                     70
16  PatZilla                  70
17  openverse-api             64
18  jina-financial-qa-search  61
19  horapy                    59
20  domhttpx                  58
21  swirl-search              57
22  searchmysite.net          55
23  openverse-catalog         44
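
For readers who want to reproduce the ordering used in this index, here is a minimal Python sketch, assuming star counts arrive as comma-formatted strings like those in the index. The names `rows` and `rank_by_stars` are illustrative only and do not come from any project listed here; the sample data is a handful of rows copied from the index.

```python
# Star counts copied from a few rows of the index; `rows` and
# `rank_by_stars` are hypothetical names used only for illustration.
rows = [
    ("khoj", "159"),
    ("Searx", "12,465"),
    ("haystack", "6,515"),
    ("Mailpile", "8,683"),
]


def rank_by_stars(rows):
    """Return (rank, name, stars) tuples sorted by star count, highest first."""
    # Strip the thousands separator so the counts compare as integers.
    parsed = [(name, int(stars.replace(",", ""))) for name, stars in rows]
    parsed.sort(key=lambda item: item[1], reverse=True)
    return [
        (rank, name, stars)
        for rank, (name, stars) in enumerate(parsed, start=1)
    ]


for rank, name, stars in rank_by_stars(rows):
    print(f"{rank} {name} {stars:,}")
```

Python's sort is stable, so projects with equal star counts keep their input order in this sketch; how the site itself breaks ties is not stated.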