scrape-hacker-news-by-domain vs dragon

| | scrape-hacker-news-by-domain | dragon |
|---|---|---|
| Mentions | 4 | 26 |
| Stars | 35 | 1,209 |
| Growth | - | - |
| Activity | 9.9 | 0.0 |
| Last commit | 2 days ago | about 1 year ago |
| Language | JavaScript | C |
| License | - | GNU General Public License v3.0 only |
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
scrape-hacker-news-by-domain
-
London Street Trees
Yeah I have a bunch of these using pretty-printed JSON - here's one that scrapes Hacker News for mentions of my site, for example: https://github.com/simonw/scrape-hacker-news-by-domain/blob/...
-
Git scraping: track changes over time by scraping to a Git repository
Git is a key technology in this approach, because the value you get out of this form of scraping is the commit history - it's a way of turning a static source of information into a record of how that information changed over time.
I think it's fine to use the term "scraping" to refer to downloading a JSON file.
These days an increasing number of websites work by serving up JSON which is then turned into HTML by a client-side JavaScript app. The JSON often isn't a formally documented API, but you can grab it directly to avoid the extra step of processing the HTML.
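As a sketch of grabbing such JSON directly, here is a minimal example using Hacker News's public Algolia search API as a stand-in for a site's undocumented endpoint (the endpoint and output filename are illustrative):

```shell
# Sketch: fetch the JSON an endpoint serves and pretty-print it so that
# diffs between scrapes stay readable. The Algolia Hacker News API stands
# in for a site's undocumented JSON endpoint here.
curl -s "https://hn.algolia.com/api/v1/search?query=git+scraping" \
  | python3 -m json.tool > results.json
```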
I do run Git scrapers that process HTML as well. A couple of examples:
scrape-san-mateo-fire-dispatch https://github.com/simonw/scrape-san-mateo-fire-dispatch scrapes the HTML from http://www.firedispatch.com/iPhoneActiveIncident.asp?Agency=... and records both the original HTML and converted JSON in the repository.
scrape-hacker-news-by-domain https://github.com/simonw/scrape-hacker-news-by-domain uses my https://shot-scraper.datasette.io/ browser automation tool to convert an HTML page on Hacker News into JSON and save that to the repo. I wrote more about how that works here: https://simonwillison.net/2022/Dec/2/datasette-write-api/
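The Git scraping pattern behind both examples boils down to a few shell commands run on a schedule; a minimal sketch, with the URL, filename and commit message as illustrative stand-ins:

```shell
# Minimal Git scraping loop body: fetch, pretty-print for clean diffs,
# then commit only when the content actually changed.
curl -s "https://example.com/incidents.json" | python3 -m json.tool > incidents.json
git add incidents.json
# Commit only if the staged file differs from the last commit
git diff --quiet --cached || git commit -m "Latest data: $(date -u +%Y-%m-%dT%H:%M:%SZ)"
```

In Simon's repos the schedule is typically a GitHub Actions cron, but the same commands work from any scheduler.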
-
Ask HN: Small scripts, hacks and automations you're proud of?
I have a neat Hacker News scraping setup that I'm really pleased with.
The problem: I want to know when content from one of my sites is submitted to Hacker News, and keep track of the points and comments over time. I also want to be alerted when it happens.
Solution: https://github.com/simonw/scrape-hacker-news-by-domain/
This repo does a LOT of things.
It's an implementation of my Git scraping pattern - https://simonwillison.net/2020/Oct/9/git-scraping/ - in that it runs a script once an hour to check for more content.
It scrapes https://news.ycombinator.com/from?site=simonwillison.net (scraping the HTML because this particular feature isn't supported by the Hacker News API) using shot-scraper - a tool I built for command-line browser automation: https://shot-scraper.datasette.io/
The scraper works by running this JavaScript against the page and recording the resulting JSON to the Git repository: https://github.com/simonw/scrape-hacker-news-by-domain/blob/...
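The real script lives at the truncated URL above; as a hedged sketch of the shape of such an invocation, shot-scraper's `javascript` command evaluates an expression in headless Chrome and prints the result as JSON (the selectors below are illustrative, not the repo's actual script):

```shell
# Hedged sketch: evaluate a JS expression against the page in headless
# Chrome and capture the resulting JSON. Selectors are illustrative only.
shot-scraper javascript \
  'https://news.ycombinator.com/from?site=simonwillison.net' "
  Array.from(document.querySelectorAll('.athing'), el => ({
    id: el.id,
    title: el.querySelector('.titleline a')?.innerText
  }))
" > hacker-news.json
```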
That solves the "monitor and record any changes" bit.
But... I want alerts when my content shows up.
I solve that using three more tools I built: https://datasette.io/ and https://datasette.io/plugins/datasette-atom and https://datasette.cloud/
This script here runs to push the latest scraped JSON to my SQLite database hosted using my in-development SaaS platform, Datasette Cloud: https://github.com/simonw/scrape-hacker-news-by-domain/blob/...
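The push step is essentially a `curl -X POST` against a table's `/-/insert` endpoint with a bearer token, per Datasette's 1.0-alpha write API; a hedged sketch, with the table name and row shape as assumptions:

```shell
# Hedged sketch of Datasette's JSON write API: POST rows to a table's
# /-/insert endpoint, authenticated with an API token. The table name
# and columns here are assumptions, not the repo's actual schema.
curl -s -X POST \
  "https://simon.datasette.cloud/data/hacker_news_posts/-/insert" \
  -H "Authorization: Bearer $DATASETTE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"rows": [{"id": 34051115, "title": "Example post", "points": 12}], "replace": true}'
```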
I defined this SQL view https://simon.datasette.cloud/data/hacker_news_posts_atom which shows the latest data in the format required by the datasette-atom plugin.
Which means I can subscribe to the resulting Atom feed (add .atom to that URL) in NetNewsWire and get alerted when my content shows up on Hacker News!
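The datasette-atom plugin looks for columns named `atom_id`, `atom_title` and `atom_updated` (plus an optional `atom_link`), so the view just aliases existing columns into that shape. A hedged sketch with assumed table and column names:

```shell
# Hedged sketch of a view in the shape datasette-atom expects. The
# underlying table and column names are assumptions.
sqlite3 data.db <<'SQL'
CREATE VIEW IF NOT EXISTS hacker_news_posts_atom AS
SELECT
  'hn:' || id       AS atom_id,
  title             AS atom_title,
  submitted_at      AS atom_updated,
  url               AS atom_link
FROM hacker_news_posts
ORDER BY submitted_at DESC
LIMIT 40;
SQL
```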
I wrote a bit more about how this all works here: https://simonwillison.net/2022/Dec/2/datasette-write-api/
-
Datasette’s new JSON write API: The first alpha of Datasette 1.0
I'm really pleased with the Hacker News scraping demo in this - it's an extension of the scraper I wrote back in March, using shot-scraper to execute JavaScript in headless Chrome and write the resulting JSON back to a Git repo: https://simonwillison.net/2022/Mar/14/scraping-web-pages-sho...
My new demo also then pipes that data up to Datasette using curl -X POST - this script here: https://github.com/simonw/scrape-hacker-news-by-domain/blob/...
dragon
-
Drag and drop support for gokcehan lf file manager
https://www.reddit.com/r/suckless/comments/13hr5zy/comment/jmlxizk https://github.com/mwh/dragon
-
Is there any way or kitten to drag and drop from kitten
https://github.com/mwh/dragon https://github.com/nik012003/ripdrag
-
Drag and drop support for st?
Have a look at dragon
-
Ask HN: Small scripts, hacks and automations you're proud of?
I write a lot of extremely simple but handy shell functions.
This one lets me drag and drop things out of a terminal session (kind of) into applications with https://github.com/mwh/dragon and I use it way too often!
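A function in that spirit might look like this; the name and flag choices are guesses, not the commenter's actual function:

```shell
# Hypothetical helper: hand the given files to dragon as a drag source.
# --all offers them all at once; --and-exit closes the window after the
# first successful drag. Name and flags are guesses, not the original.
drag() {
  dragon --all --and-exit "$@"
}
```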
-
[OC] XFiles: A modular X11 file browser (WIP)
I'm used to a terminal workflow (the ranger file manager in the past, now switched to lf) on a desktopless WM, and I prefer it that way; the only thing missing is drag-and-drop functionality, mainly for web apps. There is dragon, but I'm considering installing a light GUI file manager for the job.
-
"Super Buffer File" and Dragon integration
Yeah, some amount of extra explanation would have helped. I'm using this with a local program (https://github.com/mwh/dragon) that creates a pop-up GUI window (independent of Emacs) for "drag and drop" functionality. It only works with files on the local system, so the purpose of super-buffer-file is to create a local file associated with a buffer if one doesn't already exist, and return the name of that file.
-
Is there a way to use an external file picker on Linux?
Not a direct answer, but maybe still useful… The way I handle this is using a drag-and-drop tool.
-
TUI file manager killer functionality that never gets implemented!
I know there is dragon and the feature would require a terminal that supports it, but being able to simply select files and drag-and-drop them into a browser upload without requiring an additional window would be awesome.
-
How to copy files from ranger into clipboard?
You can use Dragon
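One common way to wire this up is a ranger keybinding; a hedged sketch of an rc.conf line (the key chord is arbitrary, and `%p` is ranger's macro for the full paths of the selected files):

```
# Hypothetical ~/.config/ranger/rc.conf binding: press "dr" to hand the
# selected files to dragon for drag-and-drop into another application.
map dr shell dragon --all --and-exit %p
```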
-
Dragon – simple drag-and-drop source/sink for X or Wayland
What are some alternatives?
scrape-san-mateo-fire-dispatch
mpv - 🎥 Command line video player
shot-scraper - A command-line utility for taking automated screenshots of websites
warpd - A modal keyboard-driven virtual pointer
zettelkasten - Creating notes with the zettelkasten note taking method and storing all notes on github
activate-linux - The "Activate Windows" watermark ported to Linux
hun_law_rs - Tool for parsing Hungarian laws (Rust version)
applications
sf-tree-history - Tracking the history of trees in San Francisco
ranger_udisk_menu - This script draws menu to choose, mount and unmount drives using udisksctl and ncurses for ranger file manager
queensland-traffic-conditions - A scraper that tracks changes to the published Queensland traffic incidents data
stretchly - The break time reminder app