Shot-scraper Alternatives

Similar projects and alternatives to shot-scraper

Playwright

379 61,568 9.9 TypeScript shot-scraper VS Playwright

Playwright is a framework for Web Testing and Automation. It allows testing Chromium, Firefox and WebKit with a single API.
uptime-kuma

351 49,253 9.8 JavaScript shot-scraper VS uptime-kuma

A fancy self-hosted monitoring tool
k3s

291 26,405 9.6 Go shot-scraper VS k3s

Lightweight Kubernetes
Healthchecks

208 7,291 9.7 Python shot-scraper VS Healthchecks

Open-source cron job and background task monitoring service, written in Python & Django
datasette

187 8,881 9.2 Python shot-scraper VS datasette

An open source multi-tool for exploring and publishing data
intellij-community

101 16,567 10.0 shot-scraper VS intellij-community

IntelliJ IDEA Community Edition & IntelliJ Platform
ImageOptim

83 8,918 7.9 HTML shot-scraper VS ImageOptim

GUI image optimizer for Mac
Traccar

38 4,794 9.6 Java shot-scraper VS Traccar

Traccar GPS Tracking System
playwright-python

31 10,675 9.0 Python shot-scraper VS playwright-python

Python version of the Playwright testing and automation library.
awesome-django

22 8,471 7.8 HTML shot-scraper VS awesome-django

A curated list of awesome things related to Django
sbts-aru

22 92 9.4 Shell shot-scraper VS sbts-aru

Low cost Raspberry Pi sound localizing portable Autonomous Recording Unit (ARU)
Nokogiri

20 6,105 9.4 C shot-scraper VS Nokogiri

Nokogiri (鋸) makes it easy and painless to work with XML and HTML from Ruby.
SeleniumBase

9 4,215 9.8 Python shot-scraper VS SeleniumBase

📊 Python's all-in-one framework for web crawling, scraping, testing, and reporting. Supports pytest. UC Mode provides stealth. Includes many tools.
oxipng

14 2,618 8.0 Rust shot-scraper VS oxipng

Multithreaded PNG optimizer written in Rust
VimMode.spoon

14 657 0.0 Lua shot-scraper VS VimMode.spoon

Adds vim keybindings to all OS X inputs
scrape-hacker-news-by-domain

4 33 9.9 JavaScript shot-scraper VS scrape-hacker-news-by-domain

Scrape HN to track links from specific domains
gmail-sidebar-drive

4 4 0.0 JavaScript shot-scraper VS gmail-sidebar-drive

A simple gmail add on to display all the drive folders and files in sidebar.
scrape-san-mateo-fire-dispatch

1 1 0.0 HTML shot-scraper VS scrape-san-mateo-fire-dispatch
zettelkasten

5 87 0.0 Shell shot-scraper VS zettelkasten

Creating notes with the zettelkasten note taking method and storing all notes on github
openalternative

5 322 9.3 TypeScript shot-scraper VS openalternative

A community driven list of open source alternatives to proprietary software and applications.

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a better shot-scraper alternative or higher similarity.

shot-scraper reviews and mentions

Posts with mentions or reviews of shot-scraper. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-04-15.

I want to create IMDB for Open source projects
6 projects | news.ycombinator.com | 15 Apr 2024

I had one of these recently! https://github.com/simonw/shot-scraper/pull/133/files
They're /incredibly/ rare though.
2024-03-01 listening in on the neighborhood
5 projects | news.ycombinator.com | 2 Mar 2024
If anyone wants the raw data, it's available in window._Flourish_data variable on https://flo.uri.sh/visualisation/16818696/embed
Which means you can extract it with my https://shot-scraper.datasette.io/ tool like this:
```
    shot-scraper javascript \
```
Web Scraping in Python – The Complete Guide
11 projects | news.ycombinator.com | 20 Feb 2024

I strongly recommend adding Playwright to your set of tools for Python web scraping. It's by far the most powerful and best designed browser automation tool I've ever worked with.
I use it for my shot-scraper CLI tool: https://shot-scraper.datasette.io/ - which lets you scrape web pages directly from the command line by running JavaScript against pages to extract JSON data: https://shot-scraper.datasette.io/en/stable/javascript.html
A command-line utility for taking automated screenshots of websites
1 project | news.ycombinator.com | 15 Dec 2023
Don’t Build a General Purpose API to Power Your Own Front End (2021)
3 projects | news.ycombinator.com | 20 Aug 2023

This is exactly what the `Accept` HTTP header is for https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Ac...
I think the author is generally correct that all JSON should be provided in a single request, but if you want to prove it, then you should be able to change your Accept header between `application/json` and `text/html` and see nearly identical data.
In fact, this is what both GitLab and GitHub do. Try it out!
`curl -L https://github.com/simonw/shot-scraper` (text/html)
`curl --header "Accept: application/json" -L https://github.com/simonw/shot-scraper` (application/json)
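The content negotiation idea in the comment above can be sketched in a few lines of Python. This is a minimal illustration only, not how GitHub or GitLab implement it: the `render` function and the sample data are hypothetical, and the header matching ignores the q-values that a real server would honour.

```python
import json
from html import escape


def render(data: dict, accept: str) -> tuple[str, str]:
    """Return (content_type, body) for the same data, chosen from the Accept header."""
    # Naive matching: strip parameters such as ";q=0.9" and compare media types.
    media_types = [part.split(";")[0].strip().lower() for part in accept.split(",")]
    if "application/json" in media_types:
        return "application/json", json.dumps(data, sort_keys=True)
    # Fall back to an HTML definition list built from the same keys and values.
    rows = "".join(
        f"<dt>{escape(str(key))}</dt><dd>{escape(str(value))}</dd>"
        for key, value in sorted(data.items())
    )
    return "text/html", f"<dl>{rows}</dl>"


if __name__ == "__main__":
    repo = {"name": "shot-scraper"}  # hypothetical sample data
    print(render(repo, "application/json"))
    print(render(repo, "text/html"))
```

Both calls receive the same dictionary, so the two responses differ only in representation, which is the property the comment suggests checking with `curl`.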
Git scraping: track changes over time by scraping to a Git repository
18 projects | news.ycombinator.com | 10 Aug 2023

Git is a key technology in this approach, because the value you get out of this form of scraping is the commit history - it's a way of turning a static source of information into a record of how that information changed over time.
I think it's fine to use the term "scraping" to refer to downloading a JSON file.
These days an increasing number of websites work by serving up JSON which is then turned into HTML by a client-side JavaScript app. The JSON often isn't a formally documented API, but you can grab it directly to avoid the extra step of processing the HTML.
I do run Git scrapers that process HTML as well. A couple of examples:
scrape-san-mateo-fire-dispatch https://github.com/simonw/scrape-san-mateo-fire-dispatch scrapes the HTML from http://www.firedispatch.com/iPhoneActiveIncident.asp?Agency=... and records both the original HTML and converted JSON in the repository.
scrape-hacker-news-by-domain https://github.com/simonw/scrape-hacker-news-by-domain uses my https://shot-scraper.datasette.io/ browser automation tool to convert an HTML page on Hacker News into JSON and save that to the repo. I wrote more about how that works here: https://simonwillison.net/2022/Dec/2/datasette-write-api/
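A minimal sketch of the Git scraping loop described above, assuming the data has already been fetched and parsed. The names `save_snapshot` and `commit_snapshot` are hypothetical and are not taken from the linked repositories. Serialising with sorted keys and fixed indentation is an assumption made here so that each commit shows only real changes in the diff.

```python
import json
import subprocess
from pathlib import Path


def save_snapshot(path: Path, data) -> bool:
    """Write data as stable, pretty-printed JSON; return True if the file changed."""
    new_text = json.dumps(data, indent=2, sort_keys=True) + "\n"
    old_text = path.read_text() if path.exists() else None
    if new_text == old_text:
        return False  # nothing new to record
    path.write_text(new_text)
    return True


def commit_snapshot(path: Path, message: str) -> None:
    """Commit the snapshot; assumes the current directory is inside a Git repository."""
    subprocess.run(["git", "add", str(path)], check=True)
    subprocess.run(["git", "commit", "-m", message], check=True)
```

A scheduled job would call `save_snapshot` after each fetch and call `commit_snapshot` only when it returns `True`, so the commit history becomes the record of how the source changed over time.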
Web Scraping via JavaScript Runtime Heap Snapshots (2022)
1 project | news.ycombinator.com | 8 Aug 2023
Need help with downloading a section of multiple sites as pdf files.
2 projects | /r/webscraping | 25 Mar 2023

You can use shot-scraper: https://github.com/simonw/shot-scraper
Ask HN: Small scripts, hacks and automations you're proud of?
49 projects | news.ycombinator.com | 12 Mar 2023

I have a neat Hacker News scraping setup that I'm really pleased with.
The problem: I want to know when content from one of my sites is submitted to Hacker News, and keep track of the points and comments over time. I also want to be alerted when it happens.
Solution: https://github.com/simonw/scrape-hacker-news-by-domain/
This repo does a LOT of things.
It's an implementation of my Git scraping pattern - https://simonwillison.net/2020/Oct/9/git-scraping/ - in that it runs a script once an hour to check for more content.
It scrapes https://news.ycombinator.com/from?site=simonwillison.net (scraping the HTML because this particular feature isn't supported by the Hacker News API) using shot-scraper - a tool I built for command-line browser automation: https://shot-scraper.datasette.io/
The scraper works by running this JavaScript against the page and recording the resulting JSON to the Git repository: https://github.com/simonw/scrape-hacker-news-by-domain/blob/...
That solves the "monitor and record any changes" bit.
But... I want alerts when my content shows up.
I solve that using three more tools I built: https://datasette.io/ and https://datasette.io/plugins/datasette-atom and https://datasette.cloud/
This script here runs to push the latest scraped JSON to my SQLite database hosted using my in-development SaaS platform, Datasette Cloud: https://github.com/simonw/scrape-hacker-news-by-domain/blob/...
I defined this SQL view https://simon.datasette.cloud/data/hacker_news_posts_atom which shows the latest data in the format required by the datasette-atom plugin.
Which means I can subscribe to the resulting Atom feed (add .atom to that URL) in NetNewsWire and get alerted when my content shows up on Hacker News!
I wrote a bit more about how this all works here: https://simonwillison.net/2022/Dec/2/datasette-write-api/
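The SQL view step can be illustrated with a small, self-contained sketch using Python's built-in `sqlite3` module. Everything here is an assumption for illustration: the `posts` table, its columns, and the sample row are hypothetical, and the `atom_id`, `atom_title`, and `atom_updated` column names reflect my understanding of what the datasette-atom plugin expects, so check the plugin documentation before relying on them. The actual view linked above may look quite different.

```python
import sqlite3


def build_feed_view(conn: sqlite3.Connection) -> None:
    """Create a hypothetical posts table and a view shaped for an Atom feed."""
    conn.execute(
        "create table posts ("
        "id integer primary key, title text, url text, points integer, submitted text)"
    )
    conn.execute(
        "insert into posts values (?, ?, ?, ?, ?)",
        (1, "Example post", "https://example.com/", 10, "2023-03-12T00:00:00+00:00"),
    )
    # Rename columns into the atom_* shape (assumed, see the note above).
    conn.execute(
        """
        create view posts_atom as
        select
          'hn-' || id as atom_id,
          title as atom_title,
          submitted as atom_updated,
          url as atom_link,
          points || ' points' as atom_content
        from posts
        order by submitted desc
        """
    )


conn = sqlite3.connect(":memory:")
build_feed_view(conn)
rows = conn.execute(
    "select atom_id, atom_title, atom_updated, atom_link, atom_content from posts_atom"
).fetchall()
print(rows)
```

Keeping the feed logic in a view means the scraper only has to insert rows; the feed shape can be changed later without touching the scraping code.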
Show HN: Plus – Self Updating Screenshots
3 projects | news.ycombinator.com | 17 Jan 2023

Sounds a lot like Simon Willison's open source project shot-scraper
https://github.com/simonw/shot-scraper