Top 23 Python web-scraping Projects
A Smart, Automatic, Fast and Lightweight Web Scraper for Python
Project mention: Scrapping - How to deal with page changes Ai | reddit.com/r/webscraping | 2022-03-25
It depends on the website, but autoscraper was used to calculate similar nodes given the text to search. Not sure how it works now but it's open source.
Selenium-python but lighter: Helium is the best Python library for web automation.
Project mention: Automating some process with PyAutoGui? | reddit.com/r/automation | 2022-07-21
You can, though it might not be the best tool for this. Automation of web entry is better done with selenium, or my favorite variation helium.
Web Scraping Framework
Snoop: an open-source intelligence (OSINT) tool (OSINT world)
Project mention: Tool that lists all accounts linked to an e-mail address? | reddit.com/r/de_EDV | 2022-06-22
Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments
Project mention: Testing fast installation in tear-down environment | reddit.com/r/learnpython | 2022-07-06
I want to test how easy it is to install a package plus special extra dependencies to run a certain script in that package: https://github.com/adbar/trafilatura
Random User-Agent middleware based on fake-useragent
Project mention: Looking for suggestions for a web scraper | reddit.com/r/learnpython | 2022-09-01
User-Agents: Your user-agent list is pretty small, and you aren't adding the other headers that real browsers typically have. For a bigger list of user-agents you could use the scrapy-fake-user-agent middleware.
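The advice in the comment above can be illustrated with a minimal sketch that uses only the Python standard library rather than the Scrapy middleware itself. The `USER_AGENTS` pool, the helper names `build_headers` and `build_request`, and the specific header values are assumptions chosen for illustration; a real scraper would likely want a larger, regularly refreshed list.

```python
import random
import urllib.request

# Hypothetical pool of browser User-Agent strings (illustrative only).
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/16.1 Safari/605.1.15",
    "Mozilla/5.0 (X11; Linux x86_64; rv:107.0) Gecko/20100101 Firefox/107.0",
]


def build_headers(rng=random):
    """Return a header dict with a randomly chosen User-Agent plus a few
    other headers that real browsers typically send."""
    return {
        "User-Agent": rng.choice(USER_AGENTS),
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
        "Accept-Language": "en-US,en;q=0.5",
    }


def build_request(url, rng=random):
    """Build (but do not send) a urllib request carrying the rotated headers."""
    return urllib.request.Request(url, headers=build_headers(rng))
```

Calling `build_request("https://example.com/")` once per page fetch picks a fresh User-Agent each time; the request is only sent when passed to `urllib.request.urlopen`.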
Detailed web scraping tutorials for dummies with financial data crawlers on Reddit WallStreetBets, CME (both options and futures), US Treasury, CFTC, LME, MacroTrends, SHFE and alternative data crawlers on Tomtom, BBC, Wall Street Journal, Al Jazeera, Reuters, Financial Times, Bloomberg, CNN, Fortune, The Economist
Project mention: web-scraping: NEW Data - star count:324.0 | reddit.com/r/algoprojects | 2022-08-13
A command-line utility and Scrapy middleware for scraping time series data from Archive.org's Wayback Machine.
Project mention: Anyone have a simple useful guide so I can get this scraper working . | reddit.com/r/github | 2022-11-06
Bulk Downloader for Reddit
Automate the creation of Python Selenium Scripts by drawing coloured boxes on webpage elements
Project mention: Can anyone share some cool projects done with Python? | reddit.com/r/Python | 2022-02-13
Scraping publicly-accessible Letterboxd data and creating a movie recommendation model with it that can generate recommendations when provided with a Letterboxd username
Project mention: Any tools for adding a list of recommendations? | reddit.com/r/radarr | 2022-07-17
Picking stocks through various screening methods. Focus on Northern Europe.
Compensation analysis on the posts scraped from leetcode.com/discuss/compensation. At present, the reports have been generated only for Indian cities.
Scrape data from Goodreads using Scrapy and Selenium
Project mention: [OC] Top 10 Fantasy Books of the 21st Century According to GoodReads | reddit.com/r/dataisbeautiful | 2022-05-20
Used this code to get the data: https://github.com/havanagrawal/GoodreadsScraper
Python package to scrape Twitter's front end easily
Scrapes the front end of Facebook pages with no limitations & provides a feature to turn data into structured JSON or CSV
Fast and robust date extraction from web pages, with Python or on the command-line
Project mention: How does Firefox's Reader View work? | news.ycombinator.com | 2022-03-30
Web scraping Page Objects core library
Project mention: Is there a method to web scrape similar type of information from hundreds of websites with a single code or application? | reddit.com/r/webscraping | 2022-07-21
Check out the web-poet pattern: https://github.com/scrapinghub/web-poet
Python script for creating a Mobile Phones Dataset from the GSMArena website.
Project mention: Need web scraping script for phones specifications and phone dataset | reddit.com/r/learnpython | 2022-04-11
Searching GitHub for Python-specific GSMArena scrapers turns up gsmarena-scraper and Mobile-Phone-Dataset-GSMArena; look at those for reference on how to scrape the site.
CLI for evaluating CSS and XPath selectors
Project mention: Web Scraping With Python (An Ultimate Guide) | reddit.com/r/Python | 2022-09-15
I like it so much that I even wrote a REPL for it parsel-cli :) (it's a bit of a Frankenstein though as I'm working on a 2.0 release)
Web Scraping, Document Deduplication & GPT-2 Fine-tuning with a newly created scam dataset.
Builds a Tagalog dictionary by collecting Tagalog words from tagalog.pinoydictionary.com
A micro-framework for asynchronous deep crawls and web scraping with Python
Python web-scraping related posts
Anyone have a simple useful guide so I can get this scraper working .
1 project | reddit.com/r/github | 6 Nov 2022
Looking for suggestions for a web scraper
1 project | reddit.com/r/learnpython | 1 Sep 2022
Any tools for adding a list of recommendations?
3 projects | reddit.com/r/radarr | 17 Jul 2022
Testing fast installation in tear-down environment
1 project | reddit.com/r/learnpython | 6 Jul 2022
wayback-machine-scraper: NEW Data - star count:295.0
1 project | reddit.com/r/algoprojects | 26 Jun 2022
Advice on standard design pattern for comparison test script
1 project | reddit.com/r/learnpython | 24 May 2022
Automate dependency installation
1 project | reddit.com/r/learnpython | 9 Apr 2022
What are some of the best open-source web-scraping projects in Python? This list will help you: