scrapyrt vs jarchive-clues
| scrapyrt | jarchive-clues |
---|---|---|
Mentions | 3 | 2 |
Stars | 814 | 29 |
Growth | 0.0% | - |
Activity | 6.8 | 0.0 |
Last commit | 3 months ago | over 2 years ago |
Language | Python | Python |
License | BSD 3-clause "New" or "Revised" License | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
scrapyrt
- New to Python and Scrapy, but I need this project to work so that I can do my data research easily in the future.
- Scrap data and create a Rest API
  Alternatively, if you want to use Scrapy, there's a brilliant addition called ScrapyRT, which wraps an HTTP API around your Scrapy project.
- Scraping name and location info from Linkedin Profile URL using Apps scripts
  Put ScrapyRT in place to expose the scraper via a web service.
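As a rough illustration of what "wrapping an HTTP API around a Scrapy project" looks like from the client side, here is a minimal sketch in Python. It assumes ScrapyRT was started with the `scrapyrt` command inside a Scrapy project and is listening on its default port (9080), and that the project contains a spider named `quotes`; the spider name and target URL are hypothetical placeholders.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: ScrapyRT runs locally on its default port, 9080.
SCRAPYRT_BASE = "http://localhost:9080"


def build_crawl_url(spider_name, start_url, base=SCRAPYRT_BASE):
    """Build a GET URL for ScrapyRT's crawl.json endpoint."""
    query = urlencode({"spider_name": spider_name, "url": start_url})
    return f"{base}/crawl.json?{query}"


def fetch_items(spider_name, start_url, base=SCRAPYRT_BASE, timeout=60):
    """Ask ScrapyRT to run the spider on one URL and return the scraped items."""
    with urlopen(build_crawl_url(spider_name, start_url, base), timeout=timeout) as resp:
        payload = json.load(resp)
    # The JSON response carries the scraped items under the "items" key.
    return payload.get("items", [])


if __name__ == "__main__":
    # "quotes" is a hypothetical spider name; replace it with one from your project.
    print(build_crawl_url("quotes", "http://quotes.toscrape.com/"))
```

Calling `fetch_items("quotes", "http://quotes.toscrape.com/")` only works while a ScrapyRT server is actually running; `build_crawl_url` can be checked without one.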
jarchive-clues
- Jeopardy! recap for Tue., Jul. 13
  I can't remember the source of the j-archive scrape I used. It might have been this. I then made a Jupyter notebook using Python to make an interface for quizzing. I may clean this up and host it somewhere, someday... That didn't help much, but let me know if you try building something!
- sqlite database of 400,000+ Jeopardy! clues from j-archive.com
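A quizzing interface over a SQLite clue database, like the one described above, could be sketched roughly as follows. The table name `clues` and its columns (`category`, `clue`, `answer`) are assumptions made for illustration and are not taken from the jarchive-clues project; the two rows are made-up demo data in an in-memory database.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with a hypothetical `clues` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE clues (category TEXT, clue TEXT, answer TEXT)")
    conn.executemany(
        "INSERT INTO clues VALUES (?, ?, ?)",
        [
            ("CAPITALS", "This city is the capital of France", "Paris"),
            ("SCIENCE", "This planet is known as the Red Planet", "Mars"),
        ],
    )
    return conn


def random_clue(conn):
    """Return one (category, clue, answer) row chosen at random."""
    return conn.execute(
        "SELECT category, clue, answer FROM clues ORDER BY RANDOM() LIMIT 1"
    ).fetchone()


def is_correct(guess, answer):
    """Compare a guess to the answer, ignoring case and surrounding spaces."""
    return guess.strip().lower() == answer.strip().lower()


if __name__ == "__main__":
    category, clue, answer = random_clue(make_demo_db())
    print(f"[{category}] {clue}")
    guess = input("Your answer: ")
    print("Correct!" if is_correct(guess, answer) else f"No, it was: {answer}")
```

To try this against a real database file, replace `make_demo_db()` with `sqlite3.connect(path)` and adjust the query to that file's actual schema.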
What are some alternatives?
twisted-iocpsupport - `twisted-iocpsupport` is an extension module for the Twisted `iocp` reactor to use the Windows I/O Completion Ports (IOCP) networking API. You should not need to install it directly or interact with its API; it is a dependency of Twisted on Windows platforms.
scrapy-yle-kuntavaalit2021 - Fetch YLE kuntavaalit 2021 data
scrapy-proxycrawl-middleware - Scrapy middleware interface to scrape using ProxyCrawl proxy service
limnoria-plugins - Limnoria plugins I wrote or forked.
cryptoCMD - Cryptocurrency historical price data library in Python. Data from https://coinmarketcap.com.
google-play-scraper - Google play scraper for Python inspired by <facundoolano/google-play-scraper>
courlan - Clean, filter and sample URLs to optimize data collection – includes spam, content type and language filters
amazon_price_tracker - A cool Scrapy spider that notifies you of a price drop on a product you crave to buy!
newspaperjs - News extraction and scraping. Article Parsing
alltheplaces - A set of spiders and scrapers to extract location information from places that post their location on the internet.
Spidey - A multi-threaded web crawler library that is generic enough to allow different engines to be swapped in.
OpenWebCrawler - An open source Python web crawler meant to crawl the entire internet starting from a single URL. The goal of this project is to make an efficient, open source, powerful internet-scale web crawler that can be used in any application and forked in any way, as long as the forked project is also open source. Enjoy!