voyager
crawl and scrape web pages in rust (by mattsse)
select.rs
A Rust library to extract useful data from HTML documents, suitable for web scraping. (by utkarshkukreti)
 | voyager | select.rs |
---|---|---|
Mentions | 3 | 2 |
Stars | 704 | 937 |
Growth | - | - |
Activity | 3.0 | 3.9 |
Last commit | 11 months ago | 4 months ago |
Language | Rust | Rust |
License | Apache License 2.0 | MIT License |
The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
voyager
Posts with mentions or reviews of voyager.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2021-03-02.
- Any decent web crawler?
  Have you had a look at voyager yet?
- Show HN: Voyager – write your own web crawler/scraper as a state machine in rust
- voyager 0.1 - write your own web crawler/scraper
select.rs
Posts with mentions or reviews of select.rs.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2021-11-21.
- Oops, I Did It Again...I Made A Rust Web API And It Was Not That Difficult
  Once we have our string response, we can use the select.rs library to ensure the structure matches our intent. In this case, we are asserting we've received an h1 element with a text body matching the string NOT FOUND!.
- Show HN: Voyager – write your own web crawler/scraper as a state machine in rust
  Standalone html5ever can be cumbersome to work with directly. scraper is basically an implementation of html5ever's `TreeSink` trait, whereas `select.rs` uses the html5ever `RcDom` to parse the document but stores it in a more convenient way. If you are looking for a minimal approach, you should look at select.rs, which basically only depends on html5ever.
  [0] https://github.com/utkarshkukreti/select.rs
What are some alternatives?
When comparing voyager and select.rs you can also consider the following projects:
Crawler4j - Open Source Web Crawler for Java
todo-mvp - The non-SPA version of the todo list app
colly - Elegant Scraper and Crawler Framework for Golang
lazy-static.rs - A small macro for defining lazy evaluated static variables in Rust.
anyhow - Flexible concrete Error type built on std::error::Error
fenix - Rust toolchains and rust-analyzer nightly for Nix [maintainer=@figsoda]