| | webtric | dude |
|---|---|---|
| Mentions | 1 | 28 |
| Stars | 11 | 412 |
| Growth | - | - |
| Activity | 3.4 | 9.0 |
| Last commit | about 1 year ago | 10 days ago |
| Language | Shell | Python |
| License | - | GNU Affero General Public License v3.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
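The exact formula behind the activity number is not given here. As a rough illustration only, the sketch below assumes an exponential decay for commit weights (a hypothetical 90-day half-life) and a 0 to 10 score based on the share of tracked projects a project is at or above. The function names and parameters are invented for this example and do not describe the site's real method.

```python
from datetime import date


def activity_weight(commit_dates, today, half_life_days=90):
    """Sum per-commit weights; a commit's weight halves every `half_life_days`.

    The half-life decay is an assumption made for illustration.
    """
    return sum(0.5 ** ((today - d).days / half_life_days) for d in commit_dates)


def relative_score(raw, all_raw):
    """Map a raw value to a 0-10 score.

    The score is 10 times the fraction of tracked projects whose raw value
    is less than or equal to `raw` (an assumed ranking rule).
    """
    at_or_below = sum(1 for r in all_raw if r <= raw)
    return round(10 * at_or_below / len(all_raw), 1)


if __name__ == "__main__":
    today = date(2024, 1, 1)
    # Two hypothetical projects with the same number of commits.
    recent = [date(2023, 12, 20), date(2023, 12, 28)]
    stale = [date(2022, 12, 20), date(2022, 12, 28)]
    print(activity_weight(recent, today) > activity_weight(stale, today))
```

Under these assumptions, a project whose raw value is at or above 90% of all tracked projects would score 9.0, which matches the "top 10%" reading described above.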
webtric
-
Giving up a scraping script in Docker
I've made this script for myself and then posted it on Medium and people seem to use it for their own good. As this is my first somewhat successful attempt in open source, I'd love to share it with a broad community: https://github.com/destilabs/webtric
dude
-
Webscraping beginner here ready to start leveling up to intermediate. Looking for some good webscraping repositories (e.g any of your GitHub repos/projects) that I can use as learning tools, and general recommendations for what to do next
Please check https://github.com/roniemartinez/dude
-
Need help with downloading a section of multiple sites as pdf files.
You can use my library which also uses Playwright. I have an example here: https://github.com/roniemartinez/dude/discussions/116
-
Why do you use python for web scraping?
I also built a framework so I can easily switch between these libraries with less code change (still on hiatus for a few months before going back to it): https://github.com/roniemartinez/dude
-
Thank GOD for Poetry!
There are a lot of options, but I am quite happy with GitHub Actions workflows + Poetry, as it handles tests and publishing to PyPI. As an example, my workflows deploy to TestPyPI and PyPI here: https://github.com/roniemartinez/dude/tree/master/.github/workflows
-
What stack or tools are you using for ensuring code quality and best practices in medium and large codebases ?
But for documentation, I use mkdocs-material, as it works with only minor customization and changes can be easily deployed on GitHub: https://roniemartinez.github.io/dude/
-
Is there any thing Beautifulsoup can do that Scrapy can not?
-
Screenshotting site, but remove all popups.
Add an adblocker. I implemented Dude/pydude with this, and page results are clean without ads and pop-ups. For the screenshot, here is an example: https://github.com/roniemartinez/dude/discussions/116
-
which Python Library is best for scraping?
You can also use my library if you want things to be simpler:) https://github.com/roniemartinez/dude
-
For those of you using Python, what is your go to library to build your scraper?
I use my own library, Dude! https://github.com/roniemartinez/dude
-
Building a (relatively) easily adaptable, flexible web scraper (seeking conceptual advice)
I built a simple web scraper that is simple to use but this is still a work-in-progress - https://github.com/roniemartinez/dude
What are some alternatives?
Edu-Mail-Generator - Generate Free Edu Mail(s) within minutes
python-web-scraping-primjeri - web scraping of the sites posta.hr, konzum.hr, index.hr, njuskalo.hr, neostar.com, DasWeltAuto.hr, ...
scrapy-playwright - 🎠 Playwright integration for Scrapy
FastDepends - FastDepends - FastAPI Dependency Injection system extracted from FastAPI and cleared of all HTTP logic. Async and sync modes are both supported.
HomeHarvest - Python package for real estate scraping of MLS listing data [Moved to: https://github.com/Bunsly/HomeHarvest]
dnd-roll-parser - Python project that will take the saved html chat log and calculate the average rolls per player.
Cascadia.jl - A CSS Selector library in Julia
Slowly_Letter_Downloader - Automates the process of downloading letters from slowly in PDF form.
HomeHarvest - Python package for real estate scraping of MLS listing data
olx-web-scraper - Use Python to scrape listings on olx.in based on a search query.
web-poet - Web scraping Page Objects core library
g2-scraper - G2 Scraper helps you collect G2 product data, including names, product descriptions, reviews, ratings, comparisons, alternatives, and more.