Need help with an OD indexer that I am writing in Python

This page summarizes the projects mentioned and recommended in the original post on /r/opendirectories

  1. spider

    spider is an OD crawler that crawls through opendirectories and indexes the urls (by pyDiablo)

    If any of you are willing to help, I've just uploaded the code to GitHub. I've added as many comments as I can to help you understand the code.

  3. ODmovieindexer

    Extract and index movie information of movies found in open directories posted on r/opendirectories.

    For my indexer (https://github.com/LaundroMat/ODmovieindexer) I tried crawling myself too, but I gave up because there were too many special cases to take into account. I used the text files generated by ODScanner as the basis for the URLs to index.
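As a rough illustration of indexing from a scanner's text output rather than crawling, here is a minimal sketch. It assumes the text file holds one URL per line, which may not match ODScanner's actual output; `movie_candidates` and `VIDEO_EXTENSIONS` are hypothetical names, not taken from ODmovieindexer.

```python
import re
from urllib.parse import unquote, urlparse

# Assumed set of video extensions; extend as needed.
VIDEO_EXTENSIONS = {".mkv", ".mp4", ".avi"}


def movie_candidates(lines):
    """Yield (url, guessed_title) for lines that look like video file URLs.

    Assumes one URL per line; anything else is skipped.
    """
    for line in lines:
        url = line.strip()
        if not url.startswith(("http://", "https://")):
            continue  # skip blank lines, headers, and anything that is not a URL
        # Take the last path segment and decode percent-escapes such as %20.
        filename = unquote(urlparse(url).path.rsplit("/", 1)[-1])
        stem, dot, ext = filename.rpartition(".")
        if not dot or "." + ext.lower() not in VIDEO_EXTENSIONS:
            continue  # no extension, or not a video file
        # Replace dots and underscores with spaces to get a rough title guess.
        title = re.sub(r"[._]+", " ", stem).strip()
        yield url, title
```

The guessed title is only a starting point; looking it up against a movie database would still be needed to get reliable metadata.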

  4. open-directory-downloader

    A NodeJS wrapper around KoalaBear84/OpenDirectoryDownloader

    I also wrote a NodeJS wrapper for ODD (https://github.com/Chaphasilor/open-directory-downloader) so that I could easily use ODD in my other projects. You might wanna do the same with Python? This way everyone who knows Python could make use of ODD's edge-case handling and stability!
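A Python wrapper along these lines could start as a thin layer over `subprocess`. The sketch below is an assumption about how such a wrapper might look, not the NodeJS wrapper's design: `build_odd_command` and `scan` are hypothetical names, the executable path is supplied by the caller, and the `--url` flag should be checked against the ODD version you have installed.

```python
import subprocess


def build_odd_command(executable, url, extra_args=()):
    """Build the argument list for one ODD scan.

    `--url` is assumed from ODD's documented usage; verify it (and any
    flags passed through `extra_args`) against your installed version.
    """
    return [executable, "--url", url, *extra_args]


def scan(executable, url, extra_args=(), timeout=None):
    """Run ODD on `url` and return its captured standard output as text."""
    completed = subprocess.run(
        build_odd_command(executable, url, extra_args),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout
```

ODD may wait for input once a scan finishes, so a real wrapper would probably pass whichever non-interactive options your version documents through `extra_args`, and set a `timeout` as a safety net.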

  5. calishot

    This way you can also evolve your application to become async. As you're using requests rather than aiohttp, may I suggest you use gevent with a pool of parallel requests (not too many, around 10). You can look at this file as an example.
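The idea of a small, bounded pool of parallel requests can be sketched as follows. To stay dependency-free, this sketch swaps in the standard library's thread pool and `urllib` instead of gevent and requests; `gevent.pool.Pool(10)` offers a similar `map` method. `fetch_status` and `fetch_all` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch_status(url, timeout=10):
    """Return the HTTP status code for `url` (blocking call)."""
    with urlopen(url, timeout=timeout) as response:
        return response.status


def fetch_all(urls, fetch=fetch_status, pool_size=10):
    """Apply `fetch` to every URL with at most `pool_size` calls in flight.

    Returns a dict mapping each URL to its result, or to the exception
    raised while fetching it.
    """
    urls = list(urls)  # allow generators; we iterate twice below

    def safe_fetch(url):
        try:
            return fetch(url)
        except Exception as exc:  # keep one bad URL from aborting the batch
            return exc

    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        # map() preserves input order, so results line up with `urls`.
        return dict(zip(urls, pool.map(safe_fetch, urls)))
```

Keeping the pool at around 10 workers, as suggested above, limits the load placed on the server hosting the open directory.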

  6. OpenDirectoryDownloader

    Indexes open directories

    See: https://github.com/KoalaBear84/OpenDirectoryDownloader/tree/master/OpenDirectoryDownloader.Tests/Samples

  7. odcrawler-scanner

    A Reddit bot that scans ODs over at /r/OpenDirectories and submits the results to the ODCrawler discovery server

  8. DiskCache

    Python disk-backed cache (Django-compatible). Faster than Redis and Memcached. Pure-Python.

    Do you know this project, which covers most of your needs? http://www.grantjenks.com/docs/diskcache/
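The comment does not say which needs are meant. Assuming one of them is remembering which URLs have already been indexed, a disk-backed cache could be used roughly like this. `index_once` is a hypothetical helper written against any dict-like object, so a plain `dict` works for testing and a `diskcache.Cache` instance could be passed in to persist results between runs.

```python
def index_once(url, cache, index_fn):
    """Index `url` unless a result is already stored in `cache`.

    `cache` can be any dict-like object supporting `in`, item lookup,
    and item assignment. `index_fn` is the caller's own indexing function.
    """
    if url in cache:
        return cache[url]  # already indexed: reuse the stored result
    result = index_fn(url)
    cache[url] = result
    return result
```

With DiskCache installed, `cache = diskcache.Cache("./od-cache")` (the directory name is an arbitrary example) should be usable in place of the dict.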

