Show HN: Skyvern – open-source browser automation tool

skyvern

7 3,517 9.3 Python

Automate browser-based workflows with LLMs and Computer Vision

https://github.com/Skyvern-AI/skyvern/blob/d0935755963b017ed...
We also report the cost of each step in the visualizer. Click on any task > Steps, and there is a column dedicated to how much each step cost to run.
https://github.com/Skyvern-AI/skyvern/issues/70
2. We have a roadmap item to "cache" or "memorize" specific tasks, so you pay the cost once and then just run the task over and over again. We're going to get to it soon!
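The comment does not say how the planned "cache" or "memorize" feature will work. As a rough illustration of the general idea only (pay for planning once, then replay), here is a minimal sketch. The names `PlanCache` and `stub_planner` are hypothetical and are not part of Skyvern; the planner stands in for whatever expensive LLM call produces the list of browser actions.

```python
import hashlib
import json


class PlanCache:
    """Memoizes the action plan for a (url, goal) pair.

    Hypothetical sketch: the planner is any callable that turns a URL and a
    goal into a list of actions. It is not Skyvern's API.
    """

    def __init__(self, planner):
        self._planner = planner
        self._plans = {}
        self.planner_calls = 0  # counts how often the expensive call ran

    @staticmethod
    def _key(url, goal):
        # Stable key so the same task always maps to the same cache entry.
        payload = json.dumps({"url": url, "goal": goal}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get_plan(self, url, goal):
        key = self._key(url, goal)
        if key not in self._plans:
            # Cache miss: pay the planning cost once and store the result.
            self.planner_calls += 1
            self._plans[key] = self._planner(url, goal)
        return self._plans[key]


def stub_planner(url, goal):
    # Stand-in for an LLM call; returns a fixed plan for illustration.
    return [{"action": "click", "element_id": 1}]
```

A real implementation would also need to notice when the page has changed enough that a stored plan no longer applies; this sketch ignores that.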

LaVague

4 3,906 9.5 Python

Large Action Model framework to turn natural language into browser actions

We're quite different from LaVague. LaVague passes the entire HTML DOM to the LLM to help it generate XPaths and valid Selenium code. (https://github.com/lavague-ai/LaVague/blob/main/src/lavague/...)
Try this at your own risk: any reasonable website would result in extraordinarily high input token costs.
We spend quite a bit of our time building a layer between the HTML and the LLM call that distills the important pieces of information down to actions the LLM can take, so we can better weigh cost against output. We're still not at 100% coverage.
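To make the idea of a layer between the HTML and the LLM call concrete, here is a minimal sketch using only the Python standard library. It is not Skyvern's implementation: the tag list, the kept attributes, and the `distill` helper are all assumptions chosen for illustration. It keeps only elements a user could interact with and drops the rest of the markup, which is what reduces the input token count.

```python
from html.parser import HTMLParser

# Tags treated as interactable in this sketch (an assumption, not Skyvern's list).
INTERACTABLE_TAGS = {"a", "button", "input", "select", "textarea"}
# Attributes kept for the model; every other attribute is dropped.
KEPT_ATTRS = ("id", "name", "type", "href", "placeholder", "aria-label")


class InteractableExtractor(HTMLParser):
    """Collects a compact description of each interactable element."""

    def __init__(self):
        super().__init__()
        self.elements = []

    def handle_starttag(self, tag, attrs):
        if tag not in INTERACTABLE_TAGS:
            return
        attr_map = dict(attrs)
        # element_id is the element's position among interactable elements.
        entry = {"element_id": len(self.elements), "tag": tag}
        for key in KEPT_ATTRS:
            if attr_map.get(key):
                entry[key] = attr_map[key]
        self.elements.append(entry)


def distill(html: str) -> list[dict]:
    """Returns the interactable elements of an HTML string, in document order."""
    parser = InteractableExtractor()
    parser.feed(html)
    return parser.elements
```

The resulting list can be serialized into the prompt in place of the raw DOM. Styles, layout wrappers, and plain text never reach the model in this sketch; a real layer would likely keep some visible text for context.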

self-operating-computer

14 7,020 9.8 Python

A framework to enable multimodal models to operate a computer.

This is quite different from https://github.com/OthersideAI/self-operating-computer
Self-operating-computer uses pixel mapping to control your computer. This is a very good approach, but it's extremely unreliable. GPT-4V frequently hallucinates pixel outputs, causing it to miss interactions or enter failure loops.
> The approach by AI Jason
AI Jason is using image-only methods to interact with the browser. This is a great first step, but this approach tends to be rife with hallucinations or errors. We do DOM parsing in addition to image analysis to help GPT-4V correlate information in the image with the interactable elements within the DOM. This dramatically boosts its ability to perform the same task over and over again reliably (which proved impossible with the image-only approach).
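One way to picture the difference from pixel-only control: instead of trusting coordinates produced by the model, the model picks an element ID from a list built from the DOM, and the click position is computed from that element's bounding box. The sketch below is an assumption about how such grounding could look, not Skyvern's code; `Element` and `click_point` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """An interactable element taken from the DOM (hypothetical structure)."""
    element_id: int
    tag: str
    box: tuple[int, int, int, int]  # left, top, width, height in pixels


def click_point(elements: list[Element], chosen_id: int) -> tuple[int, int]:
    """Resolves a model-chosen element ID to the center of its bounding box.

    An ID that is not in the DOM-derived list is rejected instead of being
    clicked, so a hallucinated target fails loudly rather than missing.
    """
    by_id = {e.element_id: e for e in elements}
    if chosen_id not in by_id:
        raise ValueError(f"unknown element_id: {chosen_id}")
    left, top, width, height = by_id[chosen_id].box
    return (left + width // 2, top + height // 2)
```

Because the coordinates come from the DOM rather than from the model, the same ID resolves to the same point on every run as long as the page layout is unchanged.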

vimGPT

6 2,437 7.4 Python

Browse the web with GPT-4V and Vimium

OpenAdapt

20 473 9.3 Python

AI-First Process Automation with Large ([Language (LLMs) / Action (LAMs) / Multimodal (LMMs)] / Visual Language (VLMs)) Models

Congratulations on shipping!
Check out https://github.com/OpenAdaptAI/OpenAdapt for an open-source (MIT-licensed) alternative that also works on desktop (including Citrix!)

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.

Related posts

Adapter between LMMs and traditional desktop and web GUI
1 project | news.ycombinator.com | 1 May 2024

I Witnessed the Future of AI, and It's a Broken Toy
1 project | news.ycombinator.com | 30 Apr 2024

Memary is a cutting-edge long-term memory system based on a knowledge graph
2 projects | news.ycombinator.com | 29 Apr 2024

Rabbit r1 source code [part 1]
3 projects | news.ycombinator.com | 23 Apr 2024

Survey Study on AI Agents Architectures(2024)
1 project | news.ycombinator.com | 22 Apr 2024

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com
process-automation Python Transformers
Post date: 14 Mar 2024