spider is an OD crawler that crawls through open directories and indexes the URLs (by pyDiablo)
If any of you are willing to help, I've just uploaded the code to GitHub. I've added as many comments as I can to help you understand the code.
Extracts and indexes information about movies found in open directories posted on r/opendirectories.
For my indexer (https://github.com/LaundroMat/ODmovieindexer) I tried crawling by myself too, but I gave up because there were too many special cases to take into account. I used the text files generated by ODScanner as the basis for the URLs to index.
A NodeJS wrapper around KoalaBear84/OpenDirectoryDownloader
I also wrote a NodeJS wrapper for ODD (https://github.com/Chaphasilor/open-directory-downloader) so that I could easily use ODD in my other projects. You might want to do the same with Python. That way everyone who knows Python could make use of ODD's edge-case handling and stability!
This way you can also evolve your application to become async. As you're using requests rather than aiohttp, may I suggest using gevent with a pool of parallel requests (not too many, about 10). You can look at this file as an example.
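To make the "small pool of parallel requests" idea concrete, here is a minimal sketch. It swaps gevent for the standard library's thread pool so it has no hard dependencies; the bounded-pool idea is the same. The names `fetch_status` and `crawl_in_parallel` are hypothetical and do not come from the spider project, and the default fetcher assumes requests is installed.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def fetch_status(url: str) -> int:
    """Hypothetical default fetcher: return the HTTP status code for a URL."""
    import requests  # third-party; imported lazily so the pool logic runs without it

    return requests.head(url, timeout=10, allow_redirects=True).status_code


def crawl_in_parallel(
    urls: Iterable[str],
    fetch: Callable[[str], object] = fetch_status,
    pool_size: int = 10,  # keep the pool small, as suggested above
) -> Dict[str, object]:
    """Run fetch over urls with at most pool_size requests in flight.

    Returns a dict mapping each URL to its result, or to the exception
    raised while fetching it, so one bad directory does not stop the crawl.
    """
    url_list = list(urls)

    def safe_fetch(url: str) -> object:
        try:
            return fetch(url)
        except Exception as exc:  # record the failure instead of aborting
            return exc

    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        outcomes = list(pool.map(safe_fetch, url_list))
    return dict(zip(url_list, outcomes))
```

With gevent the shape would be similar: a pool of about 10 greenlets mapping a fetch function over the URL list, after monkey-patching so that requests cooperates.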
Indexes open directories
A Reddit bot that scans ODs over at /r/OpenDirectories and submits the results to the ODCrawler discovery server
Python disk-backed cache (Django-compatible). Faster than Redis and Memcached. Pure-Python.
Do you know this project, which covers most of your needs? http://www.grantjenks.com/docs/diskcache/
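As a rough illustration of how a disk-backed cache could fit into a crawler, the sketch below remembers what was fetched for each URL so a restarted crawl does not fetch it again. The helper names `open_disk_cache` and `fetch_once` are hypothetical; the only diskcache features assumed are `diskcache.Cache(directory)` and its dict-style access (`in`, `cache[key]`, `cache[key] = value`).

```python
from typing import Callable, MutableMapping


def open_disk_cache(directory: str):
    """Open a persistent cache stored under directory (needs diskcache installed)."""
    import diskcache  # third-party; imported lazily so fetch_once works with a plain dict too

    return diskcache.Cache(directory)


def fetch_once(
    url: str,
    seen: MutableMapping,
    fetch: Callable[[str], object],
) -> object:
    """Return the stored result for url, calling fetch only on a cache miss."""
    if url in seen:
        return seen[url]
    result = fetch(url)
    seen[url] = result  # with diskcache.Cache this is written to disk
    return result
```

Passing `open_disk_cache("./crawl-cache")` as `seen` would keep results between runs, while a plain `dict` gives the same behaviour in memory for testing.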