webtric
Universal Python script to scrape many typical websites (by destilabs)
document-dl
Command line program to download documents from web portals (by heeplr)
Metric | webtric | document-dl
---|---|---
Mentions | 1 | 1
Stars | 11 | 21
Growth | - | -
Activity | 3.4 | 6.0
Last commit | about 1 year ago | 7 months ago
Language | Shell | Python
License | - | The Unlicense
The number of mentions is the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
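To make the idea of recency weighting concrete, here is a minimal sketch of how such a score could be computed. It is not the formula used for the Activity numbers on this page, which is not published here. The exponential decay, the `half_life_days` parameter, and both function names are assumptions made for illustration only.

```python
from datetime import date


def weighted_commit_score(commit_dates, today, half_life_days=90.0):
    """Sum one weight per commit; a commit's weight halves every
    `half_life_days` days (the decay shape is an assumption)."""
    return sum(
        0.5 ** ((today - d).days / half_life_days) for d in commit_dates
    )


def activity_rating(score, all_scores):
    """Map a raw score to a 0-10 scale using the share of tracked
    projects that score strictly lower (hypothetical mapping)."""
    below = sum(1 for s in all_scores if s < score)
    return round(10.0 * below / len(all_scores), 1)


if __name__ == "__main__":
    today = date(2024, 3, 31)
    recent = weighted_commit_score([date(2024, 3, 30)], today)
    old = weighted_commit_score([date(2023, 3, 31)], today)
    print(recent > old)  # a recent commit counts for more than an old one
    print(activity_rating(9, list(range(10))))
```

Under this mapping, a project that outscores 90% of the tracked projects gets a rating of 9.0, which is consistent with the "top 10%" reading given above.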
webtric
Posts with mentions or reviews of webtric.
We have used some of these posts to build our list of alternatives and similar projects.
- Giving up a scraping script in Docker
I've made this script for myself and then posted it on Medium and people seem to use it for their own good. As this is my first somewhat successful attempt in open source, I'd love to share it with a broad community: https://github.com/destilabs/webtric
document-dl
Posts with mentions or reviews of document-dl.
We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2021-07-13.
What are some alternatives?
When comparing webtric and document-dl you can also consider the following projects:
Scrapster - Scrape images from google for your next ML project.
FirstCyclingAPI - An unofficial Python API wrapper for firstcycling.com
trafilatura - Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments
PyInotify - An efficient and elegant inotify (Linux filesystem activity monitor) library for Python. Python 2 and 3 compatible.