scrapyteer
awesome-web-scraping
| scrapyteer | awesome-web-scraping |
---|---|---|
Mentions | 1 | 6 |
Stars | 18 | 6,345 |
Growth | - | - |
Activity | 4.0 | 4.6 |
Latest commit | 2 months ago | 8 days ago |
Language | TypeScript | Makefile |
License | MIT License | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
scrapyteer
- Low-code Node.js web scraping tool
Hi guys, I've created an open-source, low-code Node.js web scraping tool built on top of Puppeteer: https://github.com/miroshnikov/scrapyteer. It offers a small set of functions that are combined into pipelines to define a crawling workflow and the shape of the output data. Maybe somebody will find it useful.
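To make the "functions combined in pipelines" idea concrete, here is a minimal TypeScript sketch of the general pattern. It is not scrapyteer's actual API: `pipe`, `field`, `trim`, `toNumber`, `shape`, and the sample data are all hypothetical names invented for illustration, and the sketch works on in-memory objects rather than a browser page.

```typescript
// Hypothetical names throughout: this is NOT scrapyteer's actual API.

type Step = (input: any) => any;

// Compose steps left to right; each step receives the previous step's output.
function pipe(...steps: Step[]): Step {
  return (input: any) => steps.reduce((acc, step) => step(acc), input);
}

// Small reusable steps.
const field = (name: string): Step => (item: any) => item[name];
const trim: Step = (s: string) => s.trim();
const toNumber: Step = (s: string) => Number(s.replace(/[^0-9.]/g, ""));

// Apply an object of named sub-pipelines to every item, so the keys of
// `spec` define the shape of each output record.
function shape(spec: Record<string, Step>): Step {
  return (items: any[]) =>
    items.map((item) => {
      const record: Record<string, unknown> = {};
      Object.keys(spec).forEach((key) => {
        record[key] = spec[key](item);
      });
      return record;
    });
}

// Stand-in for scraped elements; a real tool would read these from a page.
const rawItems = [
  { title: "  Widget ", price: "$9.50" },
  { title: "Gadget", price: "$12" },
];

const parse = pipe(
  shape({
    title: pipe(field("title"), trim),
    price: pipe(field("price"), toNumber),
  })
);

const result: Array<Record<string, unknown>> = parse(rawItems);
console.log(result);
```

The object passed to `shape` doubles as a description of the output: each key becomes a field in the resulting records, and each value is a small pipeline that produces that field. For the real function names and configuration format, see the project's repository.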
awesome-web-scraping
- Ask HN: LinkedIn sent me a cease and desist for my Chrome extension. Help?
>I can scrape linkedin with a python script. That doesn't mean linkedin can shut down python.
Well said!
Also, what about copy-and-paste? The last time I checked, data could be highlighted in the browser, copied, and pasted...
Does that mean that LinkedIn can shut down the copy-and-paste capability of your browser and/or operating system?
What about "Save Page As..." functionality (the ability of a browser to save a page offline?)
Can LinkedIn shut down "Save Page As..." ?
Also, what about the Print Screen (take a screen snapshot) capabilities of your operating system?
Can LinkedIn shut down that?
Finally, there's literally oodles of software that can be used for web scraping; what follows below is just one non-canonical list:
https://github.com/lorien/awesome-web-scraping
Is LinkedIn going to shut down all of that, at the same time?
Anyway, an excellent point about Python!
- Awesome-web-scraping – List of libraries, tools and APIs for web scraping
- How does webscraping a website work and putting the data into my website?
Because, at least for the scraping part, there are open-source and paid services that will probably get you the data today if you need it (unless you're targeting some really hard-to-scrape websites). But if you are keen on learning it yourself, just scroll down this subreddit; you will find many guides that users have shared over the years...
- Russian Flag in Readme
E.g. how would a Ukrainian dev feel having his project showcased in this list, under the Russian flag?
[0] https://github.com/lorien/awesome-web-scraping/issues/136
- A central repository for scraping scripts
What are some alternatives?
Philia - An easy to use imageboard scraper.
proxy-list - A list of free, public, forward proxy servers. UPDATED DAILY!
outlook-account-generator - Outlook Account Generator helps you create outlook accounts.
Proxyman - Modern. Native. Delightful Web Debugging Proxy for macOS, iOS, and Android ⚡️
crawler - Library for Rapid (Web) Crawler and Scraper Development
awesome-micropython - A curated list of awesome MicroPython libraries, frameworks, software and resources.
squirm - This was the night of the crawling terror!
TabNine - AI Code Completions
Dataflow kit - Extract structured data from web sites. Web sites scraping.
Awesome-Warez - All your base are belong to us!
ayakashi - ⚡ Ayakashi.io - The next generation web scraping framework
cookiecutter-poetry-pypackage - Cookiecutter template for poetry managed python package