map-of-github VS shot-scraper

Compare map-of-github and shot-scraper and see how they differ.

               map-of-github    shot-scraper
Mentions       16               16
Stars          968              1,535
Growth         -                -
Activity       6.7              7.1
Last commit    7 months ago     about 1 month ago
Language       Vue              Python
License        -                Apache License 2.0
The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
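
As a rough illustration of how the Growth and Activity figures could be read, here is a minimal Python sketch. The function names and the linear percentile-to-score mapping are assumptions made for this example; the page does not publish its actual formulas, and the recency weighting of commits is not modelled here.

```python
from bisect import bisect_left


def month_over_month_growth(stars_last_month: int, stars_now: int) -> float:
    """Percentage change in stars over one month."""
    if stars_last_month <= 0:
        raise ValueError("stars_last_month must be positive")
    return (stars_now - stars_last_month) * 100.0 / stars_last_month


def activity_score(raw: float, all_raw: list[float]) -> float:
    """Map a raw activity value to a 0-10 score by percentile rank.

    Assumption: the score is 10 times the fraction of tracked projects
    that are strictly less active, so 9.0 means "top 10%".
    """
    ranked = sorted(all_raw)
    below = bisect_left(ranked, raw)  # count of projects strictly less active
    return round(10.0 * below / len(ranked), 1)
```

Under this assumed mapping, a project that is more active than nine out of ten tracked projects scores 9.0, which matches the example given in the text.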

map-of-github

Posts with mentions or reviews of map-of-github. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-04-15.
  • I want to create IMDB for Open source projects
    6 projects | news.ycombinator.com | 15 Apr 2024
    You may appreciate the map of GitHub [0]; it's a fantastic piece of work that trended here a while back.

    [0]: https://anvaka.github.io/map-of-github/

  • Ask HN: Favourite Data Visualization Examples?
    4 projects | news.ycombinator.com | 30 Jan 2024
    https://www.reddit.com/r/dataisbeautiful/comments/i8saks/ive...

    https://www.reddit.com/r/dataisbeautiful/comments/ic2i0k/oc_...

    Stock chart as a landscape image

    And I also like these "races", basically a timelapse of a chart

    https://www.reddit.com/r/dataisbeautiful/comments/f68fzr/oc_...

    And whatever anvaka does, all can be tried out in the browser

    https://github.com/anvaka?tab=repositories&q=&type=&language...

    e.g. map of reddit:

    https://github.com/anvaka/map-of-reddit

    Video Demo: https://www.reddit.com/r/dataisbeautiful/comments/12pem68/oc...

    Similarly map of github

    https://github.com/anvaka/map-of-github

    Or package manager visualizations:

    https://github.com/anvaka/pm

    Google search completion visualizations

    https://github.com/anvaka/vs

  • A Confession Exposes India’s Hacking Industry
    1 project | news.ycombinator.com | 6 Jun 2023
    I don't find these arguments convincing

    - hackers in Shenzhen are underrepresented b/c of the language barrier. There is actually a huge Chinese language ecosystem (a lot of it off Github on Gitee) that is sort of impenetrable but you come across it all the time. I just typically have no way to use their work b/c it's under-documented in English. If you zip around this map you will see lots of "Zh" islands: https://anvaka.github.io/map-of-github/

    - While I understand that it's not as financially comfortable for a programmer in India compared with the US or Europe, Chinese and Russian hackers aren't significantly better off. Your typical Chinese programmer is also of course worried about working for a big company and making money and all of that jazz

    - "escape the hell that is 'unpaid open source work' " - I think maybe this is the crux of it... Most people work on open source b/c they enjoy it. I don't personally know anyone who got hired from a deliberate effort to build an open source portfolio. I wouldn't recommend it to anyone as a way to try to find a job

  • Map of GitHub - can you find Tailwind here? :)
    1 project | /r/tailwindcss | 19 May 2023
  • [OC] I made a map of GitHub. It lets you find related projects with ease
    1 project | /r/u_Key_Young_9228 | 14 May 2023
    5 projects | /r/programming | 14 May 2023
    The source code with description of the method is available here: https://github.com/anvaka/map-of-github
  • [OC] A new map of GitHub made from 350M stars, shows 460,000 projects
    2 projects | /r/dataisbeautiful | 14 May 2023
    https://anvaka.github.io/map-of-github/ - here it is
  • Hundreds of millions of stars turned into a map of GitHub projects
    1 project | /r/patient_hackernews | 13 May 2023
    1 project | /r/hackernews | 13 May 2023
    10 projects | news.ycombinator.com | 13 May 2023
    I'm quite unhappy with SBCL located in Schemaria:

    https://anvaka.github.io/map-of-github/#12/13.469/-8.175

    It should be Lispaña.

shot-scraper

Posts with mentions or reviews of shot-scraper. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-04-15.
  • I want to create IMDB for Open source projects
    6 projects | news.ycombinator.com | 15 Apr 2024
    I had one of these recently! https://github.com/simonw/shot-scraper/pull/133/files

    They're /incredibly/ rare though.

  • 2024-03-01 listening in on the neighborhood
    5 projects | news.ycombinator.com | 2 Mar 2024
    If anyone wants the raw data, it's available in window._Flourish_data variable on https://flo.uri.sh/visualisation/16818696/embed

    Which means you can extract it with my https://shot-scraper.datasette.io/ tool like this:

        shot-scraper javascript \
  • Web Scraping in Python – The Complete Guide
    11 projects | news.ycombinator.com | 20 Feb 2024
    I strongly recommend adding Playwright to your set of tools for Python web scraping. It's by far the most powerful and best designed browser automation tool I've ever worked with.

    I use it for my shot-scraper CLI tool: https://shot-scraper.datasette.io/ - which lets you scrape web pages directly from the command line by running JavaScript against pages to extract JSON data: https://shot-scraper.datasette.io/en/stable/javascript.html

  • A command-line utility for taking automated screenshots of websites
    1 project | news.ycombinator.com | 15 Dec 2023
  • Don’t Build a General Purpose API to Power Your Own Front End (2021)
    3 projects | news.ycombinator.com | 20 Aug 2023
    This is exactly what the `Accept` HTTP header is for https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Ac...

    I think the author is generally correct that all JSON should be provided in a single request, but if you want to prove it, then you should be able to switch your Accept header between `application/json` and `text/html` and see nearly identical data.

    In fact, this is what both GitLab and GitHub do. Try it out!

    `curl -L https://github.com/simonw/shot-scraper` (text/html)

    `curl --header "Accept: application/json" -L https://github.com/simonw/shot-scraper` (application/json)

  • Git scraping: track changes over time by scraping to a Git repository
    18 projects | news.ycombinator.com | 10 Aug 2023
    Git is a key technology in this approach, because the value you get out of this form of scraping is the commit history - it's a way of turning a static source of information into a record of how that information changed over time.

    I think it's fine to use the term "scraping" to refer to downloading a JSON file.

    These days an increasing number of websites work by serving up JSON which is then turned into HTML by a client-side JavaScript app. The JSON often isn't a formally documented API, but you can grab it directly to avoid the extra step of processing the HTML.

    I do run Git scrapers that process HTML as well. A couple of examples:

    scrape-san-mateo-fire-dispatch https://github.com/simonw/scrape-san-mateo-fire-dispatch scrapes the HTML from http://www.firedispatch.com/iPhoneActiveIncident.asp?Agency=... and records both the original HTML and converted JSON in the repository.

    scrape-hacker-news-by-domain https://github.com/simonw/scrape-hacker-news-by-domain uses my https://shot-scraper.datasette.io/ browser automation tool to convert an HTML page on Hacker News into JSON and save that to the repo. I wrote more about how that works here: https://simonwillison.net/2022/Dec/2/datasette-write-api/

  • Web Scraping via JavaScript Runtime Heap Snapshots (2022)
    1 project | news.ycombinator.com | 8 Aug 2023
  • Need help with downloading a section of multiple sites as pdf files.
    2 projects | /r/webscraping | 25 Mar 2023
    You can use shot-scraper: https://github.com/simonw/shot-scraper
  • Ask HN: Small scripts, hacks and automations you're proud of?
    49 projects | news.ycombinator.com | 12 Mar 2023
    I have a neat Hacker News scraping setup that I'm really pleased with.

    The problem: I want to know when content from one of my sites is submitted to Hacker News, and keep track of the points and comments over time. I also want to be alerted when it happens.

    Solution: https://github.com/simonw/scrape-hacker-news-by-domain/

    This repo does a LOT of things.

    It's an implementation of my Git scraping pattern - https://simonwillison.net/2020/Oct/9/git-scraping/ - in that it runs a script once an hour to check for more content.

    It scrapes https://news.ycombinator.com/from?site=simonwillison.net (scraping the HTML because this particular feature isn't supported by the Hacker News API) using shot-scraper - a tool I built for command-line browser automation: https://shot-scraper.datasette.io/

    The scraper works by running this JavaScript against the page and recording the resulting JSON to the Git repository: https://github.com/simonw/scrape-hacker-news-by-domain/blob/...

    That solves the "monitor and record any changes" bit.

    But... I want alerts when my content shows up.

    I solve that using three more tools I built: https://datasette.io/ and https://datasette.io/plugins/datasette-atom and https://datasette.cloud/

    This script here runs to push the latest scraped JSON to my SQLite database hosted using my in-development SaaS platform, Datasette Cloud: https://github.com/simonw/scrape-hacker-news-by-domain/blob/...

    I defined this SQL view https://simon.datasette.cloud/data/hacker_news_posts_atom which shows the latest data in the format required by the datasette-atom plugin.

    Which means I can subscribe to the resulting Atom feed (add .atom to that URL) in NetNewsWire and get alerted when my content shows up on Hacker News!

    I wrote a bit more about how this all works here: https://simonwillison.net/2022/Dec/2/datasette-write-api/

  • Show HN: Plus – Self Updating Screenshots
    3 projects | news.ycombinator.com | 17 Jan 2023
    Sounds a lot like Simon Willison's open source project shot-scraper

    https://github.com/simonw/shot-scraper
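
Several of the posts above describe the Git scraping pattern: fetch data on a schedule, write it into a repository, and let the commit history record how it changed. A minimal Python sketch of that idea follows. It is not the code used in the repositories linked above; `save_if_changed`, `commit`, `scrape_once`, and the `fetch` callable are hypothetical names for this illustration, and `commit` assumes the `git` command is installed and that the file lives inside an existing repository.

```python
import json
import subprocess
from pathlib import Path
from typing import Any, Callable


def save_if_changed(data: Any, path: Path) -> bool:
    """Write data as stable, key-sorted JSON; return True only if the file changed."""
    new_text = json.dumps(data, indent=2, sort_keys=True) + "\n"
    if path.exists() and path.read_text(encoding="utf-8") == new_text:
        return False  # nothing new, so no commit is needed
    path.write_text(new_text, encoding="utf-8")
    return True


def commit(path: Path, message: str) -> None:
    """Stage and commit one file (assumes path is inside a Git repository)."""
    subprocess.run(["git", "add", path.name], cwd=path.parent, check=True)
    subprocess.run(["git", "commit", "-m", message], cwd=path.parent, check=True)


def scrape_once(fetch: Callable[[], Any], path: Path) -> bool:
    """Run one scrape: fetch, save, and commit only when the data differs."""
    changed = save_if_changed(fetch(), path)
    if changed:
        commit(path, "Latest data")
    return changed
```

Sorting keys and fixing the indentation keeps the file byte-identical when the underlying data has not changed, so each commit in the history corresponds to a real change. Running `scrape_once` from a scheduler (for example once an hour, as one post describes) completes the pattern.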

What are some alternatives?

When comparing map-of-github and shot-scraper you can also consider the following projects:

gmail-sidebar-drive - A simple gmail add on to display all the drive folders and files in sidebar.

github-explorer - Everything You Always Wanted To Know About GitHub (But Were Afraid To Ask)

zettelkasten - Creating notes with the zettelkasten note taking method and storing all notes on github

city-roads - Visualization of all roads within any city

scrape-san-mateo-fire-dispatch

Anime-Girls-Holding-Programming-Books - Anime Girls Holding Programming Books

bbcrss - Scrapes the headlines from BBC News indexes every five minutes

Comcast - Simulating shitty network connections so you can build better systems.

scrape-hacker-news-by-domain - Scrape HN to track links from specific domains

map-of-github-data - This repository contains tiles and data for the map of github

SeleniumBase - 📊 Python's all-in-one framework for web crawling, scraping, testing, and reporting. Supports pytest. UC Mode provides stealth. Includes many tools.