webscraping-from-0-to-hero vs scrapyrt
Metric | webscraping-from-0-to-hero | scrapyrt |
---|---|---|
Mentions | 1 | 3 |
Stars | 1,467 | 816 |
Growth | 2.0% | 0.2% |
Activity | 5.8 | 6.8 |
Last commit | 11 months ago | 3 months ago |
Language | Python | |
License | - | BSD 3-clause "New" or "Revised" License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
webscraping-from-0-to-hero
scrapyrt
- New to Python and Scrapy, but I need this project to work so that I can do my data research easily in the future.
- Scrap data and create a Rest API
  Alternatively, if you want to use Scrapy, there's a brilliant API addition called ScrapyRT, which wraps an HTTP API around your Scrapy project.
- Scraping name and location info from Linkedin Profile URL using Apps scripts
  Put ScrapyRT in place to expose the scraper via a web service.
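As a rough illustration of what "exposing the scraper via a web service" can look like from the client side, here is a minimal sketch that queries a locally running ScrapyRT server. It assumes ScrapyRT was started inside a Scrapy project and is listening on `localhost:9080`, and it uses ScrapyRT's `crawl.json` endpoint with the `spider_name` and `url` query parameters. The spider name `quotes` and the helper names `build_crawl_url` and `fetch_items` are hypothetical, chosen only for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_crawl_url(spider_name, start_url, host="localhost", port=9080):
    """Return the ScrapyRT crawl.json URL for one spider and one start URL.

    host and port are assumptions about where ScrapyRT is running.
    """
    query = urlencode({"spider_name": spider_name, "url": start_url})
    return f"http://{host}:{port}/crawl.json?{query}"


def fetch_items(spider_name, start_url, timeout=60):
    """Ask a running ScrapyRT server to crawl start_url and return its items.

    Needs a live ScrapyRT instance, so it is defined here but not called.
    """
    with urlopen(build_crawl_url(spider_name, start_url), timeout=timeout) as resp:
        payload = json.load(resp)
    # The JSON response carries a "status" field and the scraped "items".
    if payload.get("status") != "ok":
        raise RuntimeError(f"crawl failed: {payload}")
    return payload.get("items", [])
```

With the server running, a call such as `fetch_items("quotes", "http://quotes.toscrape.com/")` would return whatever items that spider yields for the page. The URL builder is kept separate so the request can be inspected or reused with another HTTP client.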
What are some alternatives?
scrapy-fake-useragent - Random User-Agent middleware based on fake-useragent
twisted-iocpsupport - `twisted-iocpsupport` is an extension module for the Twisted `iocp` reactor to use the Windows I/O Completion Ports (IOCP) networking API. You should not need to install it directly or interact with its API; it is a dependency of Twisted on Windows platforms.
advertools - online marketing productivity and analysis tools
scrapy-proxycrawl-middleware - Scrapy middleware interface to scrape using ProxyCrawl proxy service
GoodreadsScraper - Scrape data from Goodreads using Scrapy and Selenium
cryptoCMD - Cryptocurrency historical price data library in Python. Data from https://coinmarketcap.com.
Webscraping Open Project - The web scraping open project repository aims to share knowledge and experiences about web scraping with Python [Moved to: https://github.com/TheWebScrapingClub/webscraping-from-0-to-hero]
google-play-scraper - Google play scraper for Python inspired by <facundoolano/google-play-scraper>
courlan - Clean, filter and sample URLs to optimize data collection – includes spam, content type and language filters
jarchive-clues - Web crawler to collect Jeopardy! clues from https://j-archive.com
amazon_price_tracker - A cool Scrapy spider that notifies price drop in a product you crave to buy!
newspaperjs - News extraction and scraping. Article Parsing