Top 18 browser-automation Open-Source Projects
-
playwright-go
Playwright for Go: a browser automation library to control Chromium, Firefox and WebKit with a single API.
-
xk6-browser
k6 extension that adds support for browser automation and end-to-end web testing via the Chrome DevTools Protocol
-
demo.playwright
This repo is used to demo various testing scenarios with Playwright 🎭, using the official test-runner and scripts authored in TypeScript.
-
BrowserBruter
BrowserBruter is a powerful web form fuzzing automation tool designed for web security professionals and penetration testers. This Python-based tool leverages Selenium and Selenium-Wire to automate web form fuzzing, making it easier to identify potential vulnerabilities in web applications.
-
VBAChromeDevProtocol
VBA (Excel) based wrapper for Chrome Developer Protocol (CDP) - sorta a VBA version of Puppeteer/Selenium
-
insta_delete
Selenium-powered script to delete old Instagram posts, upload images, and like posts in the feed.
Project mention: Automa – Automate the browser by connecting blocks | news.ycombinator.com | 2023-11-10
Project mention: LaVague: Open-source Large Action Model to automate Selenium browsing | news.ycombinator.com | 2024-03-13
Project mention: The Browser Bruter – First Ever Browser based web application fuzzing tool | news.ycombinator.com | 2024-04-08
Project mention: Ask HN: Is anybody getting value from AI Agents? How so? | news.ycombinator.com | 2024-03-31
Full disclaimer up top: I have been working on agents for about a year now building what would eventually become HDR [1][2].
The first issue is that agents have extremely high failure rates. Agents really don't have the capacity to learn from either success or failure since their internal state is fixed after training. If you ask an agent to repeatedly do some task, it has a chance of failing every single time. We have been able to largely mitigate this by modeling agentic software as a state machine. At every step we have the model choose the inputs to the state machine and then we record them. We then 'compile' the resulting state-transition table down into a program that we can execute deterministically. This isn't totally foolproof since the world state can change between program runs, so we have methods that allow the LLM to make slight modifications to the program as needed. The idea here is that agents should never have to solve the same problem twice. The cool thing about this approach is that smarter models make the entire system work better. If you have a particularly complex task, you can call out to gpt-4-turbo or claude3-opus to map out the correct action sequence and then fall back to less complex models like mistral 7b.
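To make the record-then-compile idea concrete, here is a minimal Python sketch. It is not HDR's implementation: the toy `TRANSITIONS` table, the `Recorder` class, `compile_trace`, and the `fake_llm` stand-in for a model call are all hypothetical names invented for illustration. The sketch assumes each state needs exactly one recorded action.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

State = str
Action = str

# Hypothetical toy "world": (state, action) -> next state.
TRANSITIONS: Dict[Tuple[State, Action], State] = {
    ("home", "open_search"): "search",
    ("search", "submit_query"): "results",
    ("results", "click_first"): "done",
}


def step(state: State, action: Action) -> State:
    """Apply one action to the toy world and return the next state."""
    return TRANSITIONS[(state, action)]


@dataclass
class Recorder:
    """Runs a task once, asking the model at every step and recording its choices."""
    choose: Callable[[State], Action]  # stand-in for an LLM call
    trace: List[Tuple[State, Action]] = field(default_factory=list)

    def run(self, start: State, goal: State) -> State:
        state = start
        while state != goal:
            action = self.choose(state)
            self.trace.append((state, action))
            state = step(state, action)
        return state


def compile_trace(trace: List[Tuple[State, Action]]):
    """Turn a recorded trace into a deterministic program (a lookup table)."""
    table: Dict[State, Action] = dict(trace)

    def program(start: State, goal: State,
                fallback: Optional[Callable[[State], Action]] = None) -> State:
        state = start
        while state != goal:
            action = table.get(state)
            if action is None:
                # Unseen state: the world changed, so patch the program if allowed.
                if fallback is None:
                    raise RuntimeError(f"no recorded action for state {state!r}")
                action = fallback(state)
                table[state] = action
            state = step(state, action)
        return state

    return program


calls: List[State] = []  # tracks how often the "model" is consulted


def fake_llm(state: State) -> Action:
    """Hypothetical model stub that always picks the correct action."""
    calls.append(state)
    return {"home": "open_search",
            "search": "submit_query",
            "results": "click_first"}[state]


recorder = Recorder(choose=fake_llm)
recorder.run("home", "done")          # the model is consulted at every step
replay = compile_trace(recorder.trace)
final_state = replay("home", "done")  # replay makes no model calls
```

In this sketch the model is consulted three times during recording and not at all during replay; the `fallback` argument is one simple way to let a model patch the table when the program meets a state it has not seen.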
The second issue is that almost all software is designed for people, not LLMs. What is intuitive for human users may not be intuitive for non-human users. We're focused on making agents reliably interact with the internet so I'll use web pages as an example. Web pages contain tons of visually encoded information in things like the layout hierarchy, images, etc. But most LLMs rely on purely text inputs. You can try exposing the underlying HTML or the DOM to the model, but this doesn't work so well in practice. We get around this by treating LLMs as if they were visually impaired users. We give them a purely text interface by using ARIA trees. This interface is much more compact than either the DOM or HTML so responses come back faster and cost way less.
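As a rough illustration of why an ARIA tree is more compact than raw markup, the Python sketch below flattens an accessibility tree into indented text that could be placed in a prompt. The nested-dict shape (`role`, `name`, `children`), the `render` function, and the sample `tree` are assumptions made for this example, not the format HDR uses; real trees would come from a browser's accessibility API.

```python
from typing import Any, Dict, List

Node = Dict[str, Any]

# Roles that carry no meaning on their own; unnamed nodes with these roles are dropped.
SKIP_ROLES = {"generic", "none", "presentation"}


def render(node: Node, depth: int = 0) -> List[str]:
    """Flatten an accessibility tree into indented '- role "name"' lines."""
    role = node.get("role", "")
    name = node.get("name", "")
    children = node.get("children", [])

    lines: List[str] = []
    if role in SKIP_ROLES and not name:
        # Hoist the children of a meaningless wrapper up to the current depth.
        for child in children:
            lines.extend(render(child, depth))
        return lines

    label = f'{role} "{name}"' if name else role
    lines.append("  " * depth + "- " + label)
    for child in children:
        lines.extend(render(child, depth + 1))
    return lines


# Hypothetical tree for a small reservation form.
tree: Node = {
    "role": "main",
    "name": "",
    "children": [
        {"role": "heading", "name": "Book a table"},
        {
            "role": "generic",
            "name": "",
            "children": [
                {"role": "textbox", "name": "Party size"},
                {"role": "button", "name": "Find a time"},
            ],
        },
    ],
}

text = "\n".join(render(tree))
print(text)
# - main
#   - heading "Book a table"
#   - textbox "Party size"
#   - button "Find a time"
```

Layout wrappers, styling, and attributes disappear, and only roles and accessible names remain, which is the kind of text a screen-reader user would navigate.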
The third issue I see with people building agents is they go after the wrong class of problem. I meet a lot of people who want to use agents for big-ticket items such as planning an entire trip + doing all the booking. The cost of a trip can run into the thousands of dollars and be a nightmare to undo if something goes wrong. You really don't want to throw agents at this kind of problem, at least not yet, because the downside to failure is so high. Users generally want expensive things to be done well and agents can't do that yet.
However, there are a ton of things I would like someone to do for me that would cost less than five dollars of someone's time and where the stakes for things going wrong are low. My go-to example is making reservations. I really don't want to spend the time sorting through the hundreds of nearby restaurants. I just want to give something the general parameters of what I'm looking for and have reservations show up in my inbox. These are the kinds of tasks that agents are going to accelerate.
[1] https://github.com/hdresearch/hdr-browser
Project mention: Beachpatrol CLI tool to replace and automate your everyday Linux web browser | news.ycombinator.com | 2023-12-29
Project mention: NPi – An Open Source project for enhancing AI Agents in taking action | news.ycombinator.com | 2024-05-02
browser-automation related posts
-
LaVague: Open-source Large Action Model to automate Selenium browsing
-
Get Started with xk6-browser
-
Top 3 things you've made on Excel in 1 sentence
-
A modern load testing tool, using Go and JavaScript !!
-
Puppeteersharp - Captcha solving within an iFrame not working
-
How to use rotating proxies with Puppeteer
-
Index
What are some of the best open-source browser-automation projects? This list will help you:
# | Project | Stars |
---|---|---|
1 | automa | 9,835 |
2 | playwright-go | 1,797 |
3 | Mink | 1,595 |
4 | stealth | 992 |
5 | awesome-playwright | 704 |
6 | browserpilot | 356 |
7 | xk6-browser | 315 |
8 | Python-Scripts | 299 |
9 | demo.playwright | 224 |
10 | Puppeteer-sharp-extra | 167 |
11 | BrowserBruter | 128 |
12 | VBAChromeDevProtocol | 53 |
13 | nolita | 50 |
14 | beachpatrol | 40 |
15 | insta_delete | 24 |
16 | webdriver-w3c | 23 |
17 | selenium-wrapper-vba | 17 |
18 | npi | 15 |