Python Search

Open-source Python projects categorized as Search

Top 23 Python Search Projects

  • algorithms

    Minimal examples of data structures and algorithms in Python

  • whoogle-search

  • Project mention: So I deployed Whoogle on my NAS.... | /r/selfhosted | 2023-12-08
  • searxng

    SearXNG is a free internet metasearch engine which aggregates results from various search services and databases. Users are neither tracked nor profiled.

  • Project mention: Mobile Ad Blocker Will No Longer Stop YouTube's Ads | news.ycombinator.com | 2024-04-16

    Don't use YouTube without going through a proxy like Invidious [1] or NewPipe

    Don't use {site} Search without going through a proxy like SearxNG [2]

    Don't use TwiXXer without going through a proxy like Nitter [3] - this has gotten more difficult lately but it still works as long as you feed the daemon some registered accounts. Video does not work at the moment but that seems to be fixable.

    Don't use Reddit without going through a proxy like libreddit [4]

    Start noticing the pattern? Maybe it is time to start producing promotional posters:

    The only thing to come between you and ADS could be a proxy / ADS. It's just not worth the risk

    ADS / New rules for a sane net / Sane net protects you, your partner and your community

    A proxy here and a filter there, ADS nowhere

    The more you tighten your grip, ${site}, the more viewers will slip through your fingers

    [1] https://github.com/iv-org/invidious

    [2] https://github.com/searxng/searxng

    [3] https://github.com/zedeus/nitter

    [4] https://github.com/libreddit/libreddit

  • txtai

    💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows

  • Project mention: Build knowledge graphs with LLM-driven entity extraction | dev.to | 2024-02-21

    txtai is an all-in-one embeddings database for semantic search, LLM orchestration and language model workflows.

  • buku

    🔖 Personal mini-web in text

  • Project mention: Buku: Personal Mini-Web in Text | news.ycombinator.com | 2024-01-29
  • tribler

    Privacy enhanced BitTorrent client with P2P content discovery

  • Project mention: Tribler: An attack-resilient micro-economy for media | news.ycombinator.com | 2024-04-25

    Indeed, it's not about the tech. Changing the business model is key.

    It might be hard to re-imagine the content industry without the current monopolists. Linux showed how disruptive an open model can be.

    See here a description + full implementation of a music industry based on Creative Commons content. Artists release their music and receive direct Bitcoin donations from fans. 100% artists, 0% music label, 0% Big Tech, 0% credit card fee. It's a Bitcoin DAO with Spotify-inspired music discovery.

    [1] https://github.com/Tribler/tribler/files/11814767/First.Depl...

  • elasticsearch-py

    Official Python client for Elasticsearch

  • elasticsearch-dsl-py

    High level Python client for Elasticsearch

  • django-haystack

    Modular search for Django

  • search-plugins

    Search plugins for the search feature

  • Project mention: Whats the best browser for torrenting? | /r/torrents | 2023-12-10

  • image-match

    🎇 Quickly search over billions of images

  • datasketch

    MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW
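
    As a rough illustration of the first technique named above, the following is a minimal pure-Python sketch of MinHash, which estimates the Jaccard similarity of two sets from short fixed-length signatures. It does not use datasketch and does not mirror its API: the function names (minhash_signature, estimate_jaccard) and the seeded BLAKE2b hashing are assumptions made for this example only.

```python
import hashlib
from typing import Iterable, List


def _seeded_hash(token: str, seed: int) -> int:
    """Deterministic 64-bit hash of a token; each seed acts as one hash function."""
    data = f"{seed}:{token}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(tokens: Iterable[str], num_perm: int = 128) -> List[int]:
    """Return a MinHash signature: the minimum hash value under each seeded hash."""
    token_set = set(tokens)
    if not token_set:
        raise ValueError("cannot build a signature for an empty set")
    return [min(_seeded_hash(t, seed) for t in token_set) for seed in range(num_perm)]


def estimate_jaccard(sig_a: List[int], sig_b: List[int]) -> float:
    """Estimate Jaccard similarity as the fraction of matching signature slots."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def exact_jaccard(a: Iterable[str], b: Iterable[str]) -> float:
    """Exact Jaccard similarity, for comparison with the estimate."""
    set_a, set_b = set(a), set(b)
    return len(set_a & set_b) / len(set_a | set_b)


if __name__ == "__main__":
    doc_a = "open source search projects written in python".split()
    doc_b = "open source search libraries written in rust".split()
    sig_a = minhash_signature(doc_a)
    sig_b = minhash_signature(doc_b)
    print("estimated:", estimate_jaccard(sig_a, sig_b))
    print("exact:    ", exact_jaccard(doc_a, doc_b))
```

    More hash functions (a larger num_perm) give a tighter estimate at the cost of a longer signature. A library such as datasketch would normally be preferred over a hand-rolled version like this one.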

  • JobFunnel

    Scrape job websites into a single spreadsheet with no duplicates.

  • swirl-search

  • Project mention: GitHub - swirlai/swirl-search: Swirl is an open-source search platform that uses AI to search multiple content and data sources simultaneously, finds the best results using a reader LLM, then prompts Generative AI, enabling you to get answers based on your data. | /r/programming | 2023-12-05
  • twitter-api-client

    Implementation of X/Twitter v1, v2, and GraphQL APIs (by trevorhobenshield)

  • Project mention: Reverse Engineering Twitter Spaces - Capture 500 Audio Streams/Live Transcripts per IP | /r/programming | 2023-06-11
  • paperai

    📄 🤖 Semantic search and workflows for medical/scientific papers

  • Project mention: Oracle of Zotero: LLM QA of Your Research Library | news.ycombinator.com | 2023-11-26

    Nice project!

    I've spent quite a lot of time in the medical/scientific literature space. With regard to LLMs, specifically RAG, how the data is chunked is quite important. With that, I have a couple of projects that might be beneficial additions.

    paperetl (https://github.com/neuml/paperetl) - supports parsing arXiv, PubMed and integrates with GROBID to handle parsing metadata and text from arbitrary papers.

    paperai (https://github.com/neuml/paperai) - builds embeddings databases of medical/scientific papers. Supports LLM prompting, semantic workflows and vector search. Built with txtai (https://github.com/neuml/txtai).

    While arbitrary chunking/splitting can work, I've found that integrating parsing that has knowledge of medical/scientific paper structure increases the overall accuracy and experience of downstream applications.
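
    To make the contrast in the comment above concrete, here is a small sketch comparing arbitrary fixed-size chunking with chunking that follows a paper's section headings. It is not how paperetl or paperai work internally: the heading list, the function names and the "preamble" label are all hypothetical, and real papers would need far more robust parsing (which is what tools like GROBID are for).

```python
import re
from typing import Dict, List

# Hypothetical heading list for illustration; real papers vary widely.
SECTION_HEADINGS = (
    "abstract", "introduction", "methods", "results", "discussion", "conclusion",
)

# A heading is a line that consists solely of one of the known names.
_HEADING_RE = re.compile(
    r"^(%s)[ \t]*$" % "|".join(SECTION_HEADINGS),
    re.IGNORECASE | re.MULTILINE,
)


def chunk_fixed(text: str, size: int) -> List[str]:
    """Naive baseline: cut every `size` characters, ignoring document structure."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [text[i:i + size] for i in range(0, len(text), size)]


def chunk_by_section(text: str) -> List[Dict[str, str]]:
    """Split text at known section headings, keeping each section's name."""
    matches = list(_HEADING_RE.finditer(text))
    chunks: List[Dict[str, str]] = []

    # Anything before the first heading (title, authors) is kept as a preamble.
    first_start = matches[0].start() if matches else len(text)
    preamble = text[:first_start].strip()
    if preamble:
        chunks.append({"section": "preamble", "text": preamble})

    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        body = text[match.end():end].strip()
        chunks.append({"section": match.group(1).lower(), "text": body})
    return chunks


if __name__ == "__main__":
    paper = "Abstract\nWe study X.\n\nMethods\nWe did Y.\n\nResults\nZ improved.\n"
    for chunk in chunk_by_section(paper):
        print(chunk["section"], "->", chunk["text"])
```

    Each section-aware chunk carries its section name, so a downstream retriever could, for instance, weight results from "methods" differently from "abstract". The fixed-size baseline has no such metadata and can cut sentences in half.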

  • R2R

    The framework for fast development and deployment of RAG backends. (by SciPhi-AI)

  • Project mention: Show HN: R2R – Open-source framework for production-grade RAG | news.ycombinator.com | 2024-02-26
  • RecoverPy

    Interactively find and recover deleted or 👉 overwritten 👈 files from your terminal

  • Project mention: RecoverPy 2.1.3: A Linux tool to recover deleted or overwritten files | /r/opensource | 2023-10-23
  • Memacs

    What did I do on February 14th 2007? Visualize your (digital) life in Org-mode

  • Project mention: Show HN: Khoj – Chat Offline with Your Second Brain Using Llama 2 | news.ycombinator.com | 2023-07-30

    Might look into some of the tools like novoid's Memacs. The notion here is to build tools that push feeds and history data into Emacs. Using Org in your use case with the Khoj tool could be the "glue" you need to tie it all together. https://github.com/novoid/Memacs#readme.

  • notion-search-alfred-workflow

    An Alfred workflow to search Notion with instant results

  • pysolr

    Pysolr — Python Solr client

  • twikit

    Twitter API Scraper | Without an API key | Twitter Internal API | Free | Twitter scraper | Twitter Bot

  • Project mention: Show HN: Twitter API Wrapper for Python – No API Keys Needed | news.ycombinator.com | 2024-02-03
  • stweet

    Advanced Python library to scrape Twitter (tweets, users) from the unofficial API

  • Project mention: Failed using the new twitter API or alternatives | /r/learnpython | 2023-05-11
NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source Search projects in Python? This list will help you:

 #  Project                         Stars
 1  algorithms                     23,540
 2  whoogle-search                  8,789
 3  searxng                         8,263
 4  txtai                           6,953
 5  buku                            6,136
 6  tribler                         4,476
 7  elasticsearch-py                4,136
 8  elasticsearch-dsl-py            3,767
 9  django-haystack                 3,543
10  search-plugins                  3,443
11  image-match                     2,911
12  datasketch                      2,348
13  JobFunnel                       1,740
14  swirl-search                    1,509
15  twitter-api-client              1,334
16  paperai                         1,194
17  R2R                             1,180
18  RecoverPy                       1,168
19  Memacs                            963
20  notion-search-alfred-workflow     815
21  pysolr                            659
22  twikit                            600
23  stweet                            568
