ArchiveBox
22120
Metric | ArchiveBox | 22120 |
---|---|---|
Mentions | 248 | 13 |
Stars | 19,737 | 2,638 |
Growth | 3.1% | - |
Activity | 9.7 | 9.7 |
Last commit | 8 days ago | over 2 years ago |
Language | Python | JavaScript |
License | MIT | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
ArchiveBox
- Ask HN: What Underrated Open Source Project Deserves More Recognition?
Two projects I greatly appreciate, allowing me to easily archive my bandcamp and GOG purchases (after the initial setup anyways):
https://github.com/easlice/bandcamp-downloader
https://github.com/Kalanyr/gogrepoc
And I recently learned about archivebox, which I think is going to be a fast favorite and finally let me clear out my mess of tabs/bookmarks: https://github.com/ArchiveBox/ArchiveBox
- YaCy, a distributed Web Search Engine, based on a peer-to-peer network
- Vice website is shutting down
If you really want to save the content for yourself, use something like https://archivebox.io/
I've been running a local instance for a few years now and download/save tech articles all the time. I can search and find them as needed.
- An Introduction to the WARC File
API is coming soon (relatively, it's still a one-man project)! Stay tuned https://github.com/ArchiveBox/ArchiveBox/issues/496
I have an event-sourcing refactor in progress now to allow us to pluginize functionality like the API (similar to Home Assistant with a plugin app store); it will take a month or two. Next up is the REST API using the new plugin system.
- Ask HN: How can I back up an old vBulletin forum without admin access?
I guess your best chance is to use something like https://archivebox.io/.
- ArchiveBox – open-source self-hosted web archiving
Yeah this is a cool project but it was discussed 2 days ago.
As mentioned by the maintainer there, they even maintain a list of alternatives, very classy:
https://github.com/ArchiveBox/ArchiveBox/wiki/Web-Archiving-...
- ArchiveBox: Open-source self-hosted web archiving
- Linkhut: A Social Bookmarking Site
- Show HN: Rem: Remember Everything (open source)
- Bookmark manager with a focus on organization?
22120
- Is there a browser addon which locally archives every website I visit?
Here. An archivist browser controller that caches everything you browse, a library server with full text search to serve your archive.
- Show HN: Irchiver, your full-resolution personal web archive
- Ask HN: Full text search engine in JavaScript for English and Chinese?
Following your "hilarious" and disrespectful answer here https://github.com/i5ik/22120/issues/63#issuecomment-7275272..., I would prefer that you remove any reference to SingleFile in the description of your project. I could not open an issue because you blocked me. And please don't accuse people without proof.
- 22120: self-host the Internet with an Offline Archive. Similar to ArchiveBox, SingleFile and WebMemex. Works well with WorldBrain/Memex to give you full-text search. Why not WARC? Uses Chrome DevTools protocol to intercept all requests, and caches responses against a key of (method, URL)
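The caching scheme mentioned in that description, responses stored against a key of (method, URL), can be pictured with a small in-memory sketch. This is not 22120's code: the names `ResponseCache`, `CachedResponse`, `save`, and `lookup` are hypothetical, the choice to uppercase the method is an assumption, and the request interception over the Chrome DevTools protocol is left out entirely.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Hypothetical record of one archived response (not a 22120 type).
@dataclass(frozen=True)
class CachedResponse:
    status: int
    headers: Tuple[Tuple[str, str], ...]
    body: bytes


class ResponseCache:
    """Illustrative in-memory cache keyed by (method, URL)."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[str, str], CachedResponse] = {}

    @staticmethod
    def make_key(method: str, url: str) -> Tuple[str, str]:
        # Assumption: methods are compared case-insensitively,
        # URLs are compared exactly as given.
        return (method.upper(), url)

    def save(self, method: str, url: str, response: CachedResponse) -> None:
        # A later response for the same (method, URL) overwrites the earlier one.
        self._store[self.make_key(method, url)] = response

    def lookup(self, method: str, url: str) -> Optional[CachedResponse]:
        # Returns None when nothing was archived under this key.
        return self._store.get(self.make_key(method, url))
```

With a key like this, a `GET` and a `POST` to the same URL are archived as separate entries, while repeated requests with the same method and URL resolve to a single stored response.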
- Request: Proxy caching all visited websites text in DB, making history searchable
https://github.com/i5ik/22120 is a tool that archives as you browse that you can then view offline later
- Is there a way I can cache videos (reddit.4chan) I watch in browser (Linux)?
- So you want to write a GUI framework
My solution to this (it's been done before) is to use the existing installed browser engine (not the system webview). So far I only utilize Chrome, but since I connect to it over the Chrome DevTools protocol, which is somewhat compatible with the Remote Debugging Protocol[0] that Firefox is implementing, this is a reasonable approach.
So far my "tool" to do this is simply a template repository with some conveniences, providing in essence a skeleton for these types of apps. I hope to flesh this out a little more, and expose a much richer API, as well as convert some of my existing popular apps (like 22120[1]) to the "framework".
The benefit of this is Graderjs has a built-in 'app builder' that can create a cross-platform binary (excluding or ignoring the necessity (on macOS) and near-necessity (on Windows) to sign your executable somehow) that lets you display your UI in JS/HTML/CSS using the already installed browser engine, as well as run code in NodeJS and use the rich APIs[2] of the browser engine itself. I'm really happy with this project and think that, even tho it's small now, it will in time become my most popular and powerful one: even bigger than my remote browser and popular web archiver.
Just give it time! :)
[0]: https://firefox-source-docs.mozilla.org/remote/index.html
[1]: https://github.com/i5ik/22120
[2]: https://chromedevtools.github.io/devtools-protocol/tot/Brows...
The GraderJS: https://github.com/i5ik/graderjs
- Ask HN: Why saving webpages on hard disk has not got better?
I use this to back up pages automatically
https://github.com/i5ik/22120
- Saving all browsed websites automatically
Does this potentially help? https://github.com/c9fe/22120
- Make Your Own Internet Archive with Archive Box
From the blog comments, I think this is what you’re after https://github.com/c9fe/22120
What are some alternatives?
Wallabag - wallabag is a self hostable application for saving web pages: Save and classify articles. Read them later. Freely.
asciidoctor-latex - Add LaTeX features to AsciiDoc & convert AsciiDoc to LaTeX
paimon-moe - Your best Genshin Impact companion! Help you plan what to farm with ascension calculator and database. Also track your progress with todo and wish counter.
pywb - Core Python Web Archiving Toolkit for replay and recording of web archives
SingleFile - Web Extension for saving a faithful copy of a complete web page in a single HTML file
ArchivesSpace - The ArchivesSpace archives management tool
notes - A zero dependency shell script that makes it really simple to manage your text notes.
grab-site - The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns
DownloadNet - 💾 DownloadNet - All content you browse online available offline. Search through the full-text of all pages in your browser history. ⭐️ Star to support our work!
Archivematica - Free and open-source digital preservation system designed to maintain standards-based, long-term access to collections of digital objects.
linux-surface - Linux Kernel for Surface Devices