ArchiveBox
Archivematica
 | ArchiveBox | Archivematica |
---|---|---|
Mentions | 248 | 4 |
Stars | 19,433 | 400 |
Growth | 3.3% | 0.5% |
Activity | 9.7 | 9.0 |
Last commit | 8 days ago | 2 days ago |
Language | Python | Python |
License | MIT | GNU Affero General Public License v3.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
ArchiveBox
- Ask HN: What Underrated Open Source Project Deserves More Recognition?
Two projects I greatly appreciate, allowing me to easily archive my bandcamp and GOG purchases (after the initial setup anyways):
https://github.com/easlice/bandcamp-downloader
https://github.com/Kalanyr/gogrepoc
And I recently learned about archivebox, which I think is going to be a fast favorite and finally let me clear out my mess of tabs/bookmarks: https://github.com/ArchiveBox/ArchiveBox
- YaCy, a distributed Web Search Engine, based on a peer-to-peer network
- An Introduction to the WARC File
API is coming soon (relatively, it's still a one-man project)! Stay tuned https://github.com/ArchiveBox/ArchiveBox/issues/496
I have an event-sourcing refactor in progress now to allow us to pluginize functionality like the API (similar to Home Assistant with a plugin app store); it will take a month or two. Next up is the REST API using the new plugin system.
The ArchiveBox project (which gets reposted on the regular, e.g. https://news.ycombinator.com/item?id=38954189 ) also saves in WARC ( https://github.com/ArchiveBox/ArchiveBox#output-formats ), although I've not personally used it, so I can't comment further.
- Ask HN: How can I back up an old vBulletin forum without admin access?
I guess your best chance is to use something like https://archivebox.io/.
- ArchiveBox – open-source self-hosted web archiving
Yeah this is a cool project but it was discussed 2 days ago.
As mentioned by the maintainer there, they even maintain a list of alternatives, very classy:
https://github.com/ArchiveBox/ArchiveBox/wiki/Web-Archiving-...
- ArchiveBox: Open-source self-hosted web archiving
Actually closer to 7 years ago :)
You can learn about the origin story / motivation here:
https://github.com/ArchiveBox/ArchiveBox#background--motivat...
https://2020.pycon.co/en/talks/5/ (a conference talk I gave about it)
Direct link: https://3xn.nl/projects/2022/02/17/archivebox-root-issue-in-...
Note that you no longer need to create a user manually, so this shouldn't be an issue anymore. Just set the ADMIN_USERNAME and ADMIN_PASSWORD env vars and it'll auto-create the user and collection on first run.
https://github.com/ArchiveBox/ArchiveBox/wiki/Configuration#...
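As a rough illustration of the comment above, the two variables can be placed in the environment of whatever process starts ArchiveBox. This is only a sketch: the variable names ADMIN_USERNAME and ADMIN_PASSWORD come from the comment, while the helper `admin_env`, the placeholder credentials, and the commented-out launch command are assumptions made for the example, not something the project prescribes.

```python
import os


def admin_env(username, password, base_env=None):
    """Return a copy of an environment mapping with the admin variables set.

    `admin_env` is a hypothetical helper for this example; only the two
    variable names are taken from the comment above.
    """
    if not username or not password:
        raise ValueError("ADMIN_USERNAME and ADMIN_PASSWORD must both be non-empty")
    # Copy so the caller's mapping (or os.environ) is never mutated.
    env = dict(os.environ if base_env is None else base_env)
    env["ADMIN_USERNAME"] = username
    env["ADMIN_PASSWORD"] = password
    return env


if __name__ == "__main__":
    env = admin_env("admin", "change-me")  # placeholder credentials
    # Assumption: how you launch ArchiveBox depends on your install
    # (Docker, pip, etc.). With a local install it might look like:
    #   import subprocess
    #   subprocess.run(["archivebox", "server"], env=env, check=True)
    print("ADMIN_USERNAME" in env and "ADMIN_PASSWORD" in env)
```

Building a separate mapping rather than editing `os.environ` keeps the password out of the parent process's environment; in practice you would read it from a secret store instead of hard-coding it.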
I may add an opt-in federation option at some point in the far future. It would be great to figure out a way to link willing donors' ArchiveBox instances together for public benefit.
Follow here for progress: https://github.com/ArchiveBox/ArchiveBox/issues/50
Archivematica
What are some alternatives?
Wallabag - wallabag is a self hostable application for saving web pages: Save and classify articles. Read them later. Freely.
paimon-moe - Your best Genshin Impact companion! Help you plan what to farm with ascension calculator and database. Also track your progress with todo and wish counter.
ArchivesSpace - The ArchivesSpace archives management tool
SingleFile - Web Extension for saving a faithful copy of a complete web page in a single HTML file
grab-site - The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns
Access to Memory (AtoM) - Open-source, web application for archival description and public access.
knowledge - Everything I know
Collective Access: Providence - Cataloguing and data/media management application
logseq - A local-first, non-linear, outliner notebook for organizing and sharing your personal knowledge base. Use it to organize your todo list, to write your journals, or to record your unique life.
CKAN - CKAN is an open-source DMS (data management system) for powering data hubs and data portals. CKAN makes it easy to publish, share and use data. It powers catalog.data.gov, open.canada.ca/data, data.humdata.org among many other sites.