ArchiveBox
CKAN
Metric | ArchiveBox | CKAN
---|---|---
Mentions | 248 | 6
Stars | 19,433 | 4,222
Growth | 3.3% | 1.6%
Activity | 9.7 | 9.8
Last commit | 7 days ago | 7 days ago
Language | Python | Python
License | MIT | GNU General Public License v3.0 or later
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
ArchiveBox
- Ask HN: What Underrated Open Source Project Deserves More Recognition?
Two projects I greatly appreciate, allowing me to easily archive my bandcamp and GOG purchases (after the initial setup anyways):
https://github.com/easlice/bandcamp-downloader
https://github.com/Kalanyr/gogrepoc
And I recently learned about archivebox, which I think is going to be a fast favorite and finally let me clear out my mess of tabs/bookmarks: https://github.com/ArchiveBox/ArchiveBox
- YaCy, a distributed Web Search Engine, based on a peer-to-peer network
- An Introduction to the WARC File
API is coming soon (relatively, it's still a one-man project)! Stay tuned https://github.com/ArchiveBox/ArchiveBox/issues/496
I have an event-sourcing refactor in progress now to allow us to pluginize functionality like the API (similar to Home Assistant, with a plugin app store); it will take a month or two. Next up is the REST API using the new plugin system.
The ArchiveBox project (which gets reposted regularly, e.g. https://news.ycombinator.com/item?id=38954189 ) also saves in WARC ( https://github.com/ArchiveBox/ArchiveBox#output-formats ), although I haven't used it personally, so I can't comment further.
- Ask HN: How can I back up an old vBulletin forum without admin access?
I guess your best chance is to use something like https://archivebox.io/.
- ArchiveBox – open-source self-hosted web archiving
Yeah this is a cool project but it was discussed 2 days ago.
As mentioned by the maintainer there, they even maintain a list of alternatives, very classy:
https://github.com/ArchiveBox/ArchiveBox/wiki/Web-Archiving-...
- ArchiveBox: Open-source self-hosted web archiving
Actually closer to 7 years ago :)
You can learn about the origin story / motivation here:
https://github.com/ArchiveBox/ArchiveBox#background--motivat...
https://2020.pycon.co/en/talks/5/ (a conference talk I gave about it)
Direct link: https://3xn.nl/projects/2022/02/17/archivebox-root-issue-in-...
Note that you no longer need to create a user manually, so this shouldn't be an issue anymore. Just set the ADMIN_USERNAME and ADMIN_PASSWORD env vars and it'll autocreate the user and collection on first run.
https://github.com/ArchiveBox/ArchiveBox/wiki/Configuration#...
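The environment-variable setup described in that comment could be scripted roughly as follows. This is a minimal sketch: only the `ADMIN_USERNAME` and `ADMIN_PASSWORD` names come from the comment; the helper `archivebox_env` and the example credentials are hypothetical.

```python
import os


def archivebox_env(username, password, base_env=None):
    """Return a copy of the environment with the admin credentials set.

    ADMIN_USERNAME and ADMIN_PASSWORD are the variable names quoted in the
    comment; this helper itself is a hypothetical convenience wrapper.
    """
    if not username or not password:
        raise ValueError("both username and password are required")
    # Copy so the caller's environment (or os.environ) is never mutated.
    env = dict(os.environ if base_env is None else base_env)
    env["ADMIN_USERNAME"] = username
    env["ADMIN_PASSWORD"] = password
    return env


if __name__ == "__main__":
    # Placeholder credentials, for illustration only.
    env = archivebox_env("admin", "change-me", base_env={})
    print(sorted(env))
```

The returned mapping can be passed as `env=` to `subprocess.run` when launching the server process, which keeps the credentials out of shell profiles; how the process is launched is left to the deployment and is not specified in the source.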
I may add an opt-in federation option at some point in the far future; it would be great to figure out a way to link willing donors' ArchiveBox instances together for public benefit.
Follow here for progress: https://github.com/ArchiveBox/ArchiveBox/issues/50
CKAN
- Open Source Flask-based web applications
CKAN The Open Source Data Portal Software
- Metadata Store - Which one to Choose? OpenMetadata vs Datahub?
We use Kubernetes as our deployment platform. Any feedback on one of these open source data catalogs?
https://atlas.apache.org/#/
https://opendatadiscovery.org/
https://open-metadata.org/
https://marquezproject.github.io/marquez/
https://datahubproject.io/
https://www.amundsen.io/
https://ckan.org/
https://magda.io/
- How to start Data Science and Machine Learning Career?
Ckan
- We are digitisers at the Natural History Museum in London, on a mission to digitise 80 million specimens and free their data to the world. Ask us anything!
We publish all our data on the [Data Portal](https://data.nhm.ac.uk), a Museum project that's been running since 2014. Instead of MediaWiki it runs on an open-source Python framework called [CKAN](https://ckan.org), which is designed for hosting datasets - though we've had to adapt it in various ways so that it can handle such large amounts of data.
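CKAN portals expose their datasets through a JSON Action API, so a portal like the one described in that comment can in principle be queried programmatically. The sketch below builds a `package_search` request and unpacks a response. The helper names are illustrative, the sample response is made up for the example, and whether a given portal allows anonymous queries is an assumption.

```python
import json
from urllib.parse import urlencode


def package_search_url(portal, query, rows=5):
    """Build a CKAN Action API (v3) package_search URL for a portal."""
    params = urlencode({"q": query, "rows": rows})
    return f"{portal.rstrip('/')}/api/3/action/package_search?{params}"


def dataset_titles(response_text):
    """Extract dataset titles from a package_search JSON response."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise RuntimeError("CKAN reported an unsuccessful request")
    # Fall back to the dataset's URL name when no title is present.
    return [pkg.get("title", pkg.get("name")) for pkg in payload["result"]["results"]]


if __name__ == "__main__":
    print(package_search_url("https://data.nhm.ac.uk", "beetles"))
    # Made-up response with the general shape returned by package_search.
    sample = json.dumps({
        "success": True,
        "result": {
            "count": 1,
            "results": [{"name": "example", "title": "Example dataset"}],
        },
    })
    print(dataset_titles(sample))
```

To query a live portal, the URL can be fetched with `urllib.request.urlopen` and the decoded body passed to `dataset_titles`; the network call is left out here so the sketch runs offline.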
What are some alternatives?
Wallabag - wallabag is a self hostable application for saving web pages: Save and classify articles. Read them later. Freely.
paimon-moe - Your best Genshin Impact companion! Help you plan what to farm with ascension calculator and database. Also track your progress with todo and wish counter.
SingleFile - Web Extension for saving a faithful copy of a complete web page in a single HTML file
ArchivesSpace - The ArchivesSpace archives management tool
grab-site - The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns
Archivematica - Free and open-source digital preservation system designed to maintain standards-based, long-term access to collections of digital objects.
knowledge - Everything I know
logseq - A local-first, non-linear, outliner notebook for organizing and sharing your personal knowledge base. Use it to organize your todo list, to write your journals, or to record your unique life.
Access to Memory (AtoM) - Open-source, web application for archival description and public access.
Shiori - Simple bookmark manager built with Go