w3lib
Python library of web-related functions (by scrapy)
universal_pathlib
pathlib api extended to use fsspec backends (by fsspec)
Metric | w3lib | universal_pathlib
---|---|---
Mentions | 1 | 1
Stars | 382 | 184
Growth | 0.3% | 4.3%
Activity | 6.7 | 7.8
Last commit | about 1 month ago | about 1 month ago
Language | Python | Python
License | BSD 3-clause "New" or "Revised" License | MIT License
Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
w3lib
Posts with mentions or reviews of w3lib.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2024-03-16.
- Parsing URLs in Python
A great initiative!
We need a better URL parser in Scrapy, for similar reasons. Speed and WHATWG standard compliance (i.e. do the same as web browsers) are the main things.
It's possible to get closer to WHATWG behavior by using urllib and some hacks. This is what https://github.com/scrapy/w3lib does, which Scrapy currently uses. But it's still not quite compliant.
Also, surprisingly, on some crawls URL parsing can take CPU amounts similar to HTML parsing.
Ada / can_ada look very promising!
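To make the "urllib and some hacks" approach from the comment above concrete, here is a minimal sketch using only the standard library. It is not w3lib's actual implementation: `normalize_url`, `DEFAULT_PORTS`, and the two `SAFE_*` character sets are hypothetical names chosen for illustration, and the sets only approximate the WHATWG percent-encode sets.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Default ports that browsers omit when serializing a URL.
DEFAULT_PORTS = {"http": 80, "https": 443}

# Characters left unescaped. Illustrative only, not the exact WHATWG
# percent-encode sets. "%" is kept so existing escapes are not re-encoded.
SAFE_PATH = "/%:@!$&'()*+,;="
SAFE_QUERY = "/%:@!$&()*+,;=?"


def normalize_url(url: str) -> str:
    """Hypothetical helper: parse with urllib, then patch the result."""
    # Browsers ignore surrounding whitespace and drop tabs and newlines.
    url = url.strip()
    for ch in "\t\r\n":
        url = url.replace(ch, "")

    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = parts.hostname or ""  # .hostname is already lowercased
    if ":" in host:  # IPv6 literal: .hostname strips the brackets
        host = f"[{host}]"

    # Rebuild the netloc, dropping a port that is the scheme's default.
    # Note: .port raises ValueError on a malformed port; not handled here.
    netloc = host
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(scheme):
        netloc += f":{parts.port}"
    if parts.username is not None:
        userinfo = parts.username
        if parts.password is not None:
            userinfo += ":" + parts.password
        netloc = f"{userinfo}@{netloc}"

    # Percent-encode spaces and non-ASCII characters as UTF-8.
    path = quote(parts.path, safe=SAFE_PATH)
    if not path and scheme in DEFAULT_PORTS:
        path = "/"  # browsers serialize an empty http(s) path as "/"
    query = quote(parts.query, safe=SAFE_QUERY)
    fragment = quote(parts.fragment, safe=SAFE_QUERY)

    return urlunsplit((scheme, netloc, path, query, fragment))


print(normalize_url("  HTTP://Example.COM:80/a b?q=caf\u00e9#frag "))
```

Even this much leaves gaps a spec-compliant parser would cover, such as non-ASCII (IDNA) hosts, backslashes in paths, and dot-segment removal, which is consistent with the comment's point that the urllib-based route is "still not quite compliant".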
universal_pathlib
Posts with mentions or reviews of universal_pathlib.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2024-03-16.
- Parsing URLs in Python
You might be interested in https://github.com/fsspec/universal_pathlib