Web scraping with Playwright?

This page summarizes the projects mentioned and recommended in the original post on /r/rust

  • chromiumoxide

    Chrome DevTools Protocol Rust API

  • Re rust-headless-chrome: yes, being handwritten it probably has a nicer API/DX but is missing some features. I would give it a go if you are struggling with chromiumoxide. The flakiness could be due to chromiumoxide going faster than the browser, i.e. trying to get elements before they have loaded. Most web automation tools have ways to slow themselves down, e.g. wait for 1 second, or keep checking whether an element exists. This is what things like Playwright provide; I'm not sure what chromiumoxide does. I'm also not sure whether this is part of the devtools spec, and thus should already be in chromiumoxide, or a feature implemented on top, in which case chromiumoxide might not have it (though I suspect it probably does; e.g. see https://github.com/mattsse/chromiumoxide/issues/42 ). In hindsight chromiumoxide might not be the best choice for you if it is lacking these kinds of features. I'm just using it for executing JavaScript, not actually interacting with the DOM. I prefer it because I think the codegen approach is best and it is the project most likely to succeed long term.

  • rust-headless-chrome

    A high-level API to control headless Chrome or Chromium over the DevTools Protocol. It is the Rust equivalent of Puppeteer, a Node library maintained by the Chrome DevTools team.

  • Thanks, I was looking into that as well and got their example up and running. I also saw that chromiumoxide mentions rust-headless-chrome in the references section of its README, which has also been updated recently. Are there any differences between the two? It seems chromiumoxide is async with codegen whereas rust-headless-chrome is not, is that right?
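
The "keep checking whether an element exists" idea from the chromiumoxide comment above can be sketched in plain Rust. This is only a rough illustration using the standard library: `poll_until` and `probe` are hypothetical names, not part of chromiumoxide or rust-headless-chrome, and the closure stands in for whatever element lookup your browser library provides.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Calls `probe` repeatedly until it yields `Some(value)` or `timeout` elapses.
/// `probe` is a stand-in for a real lookup (e.g. "find this element"); it is an
/// assumption for illustration, not an API of any browser automation crate.
pub fn poll_until<T>(
    mut probe: impl FnMut() -> Option<T>,
    timeout: Duration,
    interval: Duration,
) -> Option<T> {
    let deadline = Instant::now() + timeout;
    loop {
        // Try the lookup first so an already-present element returns at once.
        if let Some(value) = probe() {
            return Some(value);
        }
        // Give up once the deadline has passed.
        if Instant::now() >= deadline {
            return None;
        }
        // Back off briefly so we do not outrun the browser.
        sleep(interval);
    }
}
```

With a real library you would likely wrap its element query in the closure and map "not found" to `None`. In async code such as chromiumoxide's, a blocking `sleep` like this one would probably need to be swapped for the runtime's async sleep.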

