scrapy-fake-useragent vs web-poet
| | scrapy-fake-useragent | web-poet |
|---|---|---|
| Mentions | 3 | 1 |
| Stars | 683 | 90 |
| Growth | - | - |
| Activity | 2.3 | 8.7 |
| Last commit | 8 months ago | 2 months ago |
| Language | Python | Python |
| License | MIT License | BSD 3-clause "New" or "Revised" License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
scrapy-fake-useragent
- Looking for suggestions for a web scraper
  User-Agents: Your user-agent list is pretty small, and you aren't adding the other headers that real browsers typically send. For a bigger list of user-agents you could use the scrapy-fake-useragent middleware.
- Apple AppStore Apps Dataset with 1.2 million apps
  Use the following setup: Scrapy + https://github.com/aivarsk/scrapy-proxies + https://github.com/alecxe/scrapy-fake-useragent with a free random proxy list. Be careful to secure your database, though, since databases like MongoDB are prone to ransomware attacks.
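The setup suggested in the two posts above could look roughly like the following `settings.py` fragment. This is a minimal sketch, not a vetted configuration: the middleware paths and setting names follow the READMEs of scrapy-fake-useragent and scrapy-proxies as far as I know them, while the proxy list path, the fallback user-agent string, and the header values are placeholders chosen for illustration.

```python
# settings.py (fragment): rotate user-agents and proxies in a Scrapy project.

DOWNLOADER_MIDDLEWARES = {
    # Disable Scrapy's built-in user-agent middleware so it does not
    # overwrite the randomly chosen one.
    "scrapy.downloadermiddlewares.useragent.UserAgentMiddleware": None,
    # scrapy-proxies: pick a proxy before the stock proxy middleware runs.
    "scrapy_proxies.RandomProxy": 100,
    "scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware": 110,
    # scrapy-fake-useragent: set a random User-Agent on each request.
    "scrapy_fake_useragent.middleware.RandomUserAgentMiddleware": 400,
}

# Providers are tried in order; the fixed provider falls back to USER_AGENT.
FAKEUSERAGENT_PROVIDERS = [
    "scrapy_fake_useragent.providers.FakeUserAgentProvider",
    "scrapy_fake_useragent.providers.FakerProvider",
    "scrapy_fake_useragent.providers.FixedUserAgentProvider",
]
# Placeholder fallback string (assumption, replace with a real browser UA).
USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko)"

# Extra headers sent with every request; values are illustrative only.
DEFAULT_REQUEST_HEADERS = {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.9",
}

# scrapy-proxies options. The path is hypothetical; the file holds one
# proxy URL per line.
PROXY_LIST = "/path/to/proxy/list.txt"
PROXY_MODE = 0  # 0 = use a different random proxy for every request

# Free proxies fail often, so retrying more than the default is common.
RETRY_TIMES = 10
```

Both packages need to be installed separately, and free proxy lists tend to be unreliable, so the retry count here is a guess to tune per project.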
web-poet
- Is there a method to web scrape similar type of information from hundreds of websites with a single code or application?
  Check out the web-poet pattern: https://github.com/scrapinghub/web-poet
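The idea behind the pattern recommended above is the page object: extraction logic for each site lives in its own small class, and every class returns the same item type, so one crawler can serve many websites. The sketch below illustrates that idea with the standard library only. It does not use web-poet's actual API; the class names, domains, and regular expressions are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Product:
    """Common output shape shared by every site-specific page object."""
    name: str
    price: str


class ProductPage:
    """Base page object: holds raw HTML and knows nothing about crawling."""

    name_pattern = r""
    price_pattern = r""

    def __init__(self, html: str):
        self.html = html

    def _first(self, pattern: str) -> str:
        # Return the first captured group, or an empty string if no match.
        match = re.search(pattern, self.html, re.DOTALL)
        return match.group(1).strip() if match else ""

    def to_item(self) -> Product:
        return Product(
            name=self._first(self.name_pattern),
            price=self._first(self.price_pattern),
        )


class ShopAProductPage(ProductPage):
    # Hypothetical markup for the first site.
    name_pattern = r"<h1>(.*?)</h1>"
    price_pattern = r'<span class="price">(.*?)</span>'


class ShopBProductPage(ProductPage):
    # Hypothetical markup for the second site.
    name_pattern = r'<div id="title">(.*?)</div>'
    price_pattern = r"<b data-price>(.*?)</b>"


# One registry maps each domain to its page object class.
PAGE_OBJECTS = {
    "shop-a.example": ShopAProductPage,
    "shop-b.example": ShopBProductPage,
}


def extract(domain: str, html: str) -> Product:
    """Pick the page object for a domain and return the common item."""
    return PAGE_OBJECTS[domain](html).to_item()
```

Supporting another website then means adding one more subclass and registry entry, while the crawling code stays unchanged. A real project would more likely use a proper HTML parser than regular expressions; they are used here only to keep the sketch dependency-free.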
What are some alternatives?
scrapy-playwright - 🎭 Playwright integration for Scrapy
dude - dude uncomplicated data extraction: A simple framework for writing web scrapers using Python decorators
scrapy-splash - Scrapy+Splash for JavaScript integration
autoscraper - A Smart, Automatic, Fast and Lightweight Web Scraper for Python
scrapy-rotating-proxies - use multiple proxies with Scrapy
Grab - Web Scraping Framework
WikiMapper - Create maps of wiki links on how they interconnect with each other.
selenium-python-helium - Lighter web automation for Python [Moved to: https://github.com/mherrmann/helium]
hltv-scraping - Scraping data from hltv.org
google-search-results-python - Google Search Results via SERP API pip Python Package
viviner - 🍷 Scraps data from Vivino and collects outstanding wine-based meta-data.
helium - Selenium-python but lighter: Helium is the best Python library for web automation. [Moved to: https://github.com/mherrmann/selenium-python-helium]