CKAN
ArchiveBox
| | CKAN | ArchiveBox |
|---|---|---|
| Mentions | 9 | 275 |
| Stars | 5,044 | 27,614 |
| Growth | 0.5% | 1.1% |
| Activity | 9.9 | 9.6 |
| Last commit | 2 days ago | 2 days ago |
| Language | Python | Python |
| License | GNU General Public License v3.0 or later | MIT |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
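The page does not publish the exact formulas behind these metrics, so the following is only a rough sketch of how they could be computed. The function names, the 30-day half-life, and the percentile mapping are all assumptions made for illustration, not the site's actual method.

```python
from bisect import bisect_left


def month_over_month_growth(stars_last_month: int, stars_now: int) -> float:
    """Percentage change in stars relative to the previous month."""
    if stars_last_month <= 0:
        raise ValueError("previous star count must be positive")
    return (stars_now - stars_last_month) / stars_last_month * 100.0


def weighted_commit_score(commit_ages_days, half_life_days=30.0):
    """Sum of commit weights; a commit loses half its weight every
    half_life_days (assumed decay), so recent commits count more."""
    return sum(0.5 ** (age / half_life_days) for age in commit_ages_days)


def activity_from_rank(score, all_scores):
    """Map a raw score to a 0-10 scale by percentile rank (assumed mapping):
    9.0 means 90% of tracked projects score lower, i.e. top 10%."""
    ordered = sorted(all_scores)
    below = bisect_left(ordered, score)  # number of projects scoring lower
    return 10.0 * below / len(ordered)
```

Under these assumptions, a project that moves from 5,000 to 5,025 stars in a month shows about 0.5% growth, and a project whose raw score beats 90 out of 100 tracked projects gets an activity of 9.0.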
CKAN
- CKAN – an open-source DMS (data management system)
- CKAN – The open source data management system
- Open Source takes center stage at United Nations
- Open Source Flask-based web applications
  CKAN The Open Source Data Portal Software
- Metadata Store - Which one to Choose? OpenMetadata vs Datahub?
  We use Kubernetes as our deployment platform. Any feedback on one of these open source data catalogs?
  - https://atlas.apache.org/#/
  - https://opendatadiscovery.org/
  - https://open-metadata.org/
  - https://marquezproject.github.io/marquez/
  - https://datahubproject.io/
  - https://www.amundsen.io/
  - https://ckan.org/
  - https://magda.io/
- What 'tool' is used to build OpenData sites?
  CKAN (https://ckan.org/) is what data.gov and most state governments use.
- Software and tools for (non-human) genomics data platform
  Our first instinct is to use [CKAN](https://ckan.org) for cataloging (and storage, with modifications), especially since we know it and know that it has been used successfully elsewhere. However, we suspect that more specialized or better tools exist for this, which is why I kindly ask for your insights.
- How to start Data Science and Machine Learning Career?
  CKAN
- We are digitisers at the Natural History Museum in London, on a mission to digitise 80 million specimens and free their data to the world. Ask us anything!
  We publish all our data on the [Data Portal](https://data.nhm.ac.uk), a Museum project that's been running since 2014. Instead of MediaWiki it runs on an open-source Python framework called [CKAN](https://ckan.org), which is designed for hosting datasets - though we've had to adapt it in various ways so that it can handle such large amounts of data.
ArchiveBox
- The Speed of Prototyping in the Age of AI
- Wikipedia bans Archive.today after site executed DDoS and altered web captures
  A bit off topic, but are there any self hosted open source archiving servers people are using for personal usage?
  I think ArchiveBox[1] is the most popular. I will give it a shot, but it's a shame they don't support URL rewriting[2], which would be pretty important to me. I read a lot of blog and news articles that are split across multiple pages, and it's quite annoying to have to individually search through the archive for each page one by one instead of the "next page" button going to the next archived page.
  1: https://archivebox.io/
  2: https://github.com/ArchiveBox/ArchiveBox/discussions/1395
- Internet Increasingly Becoming Unarchivable
  I run an ArchiveBox instance locally. Recommended! https://archivebox.io/
- Adguard DNS received suspicious pressure to block archive.is
  Friendly reminder that ArchiveBox exists to let you self-host your own archive service.
  https://github.com/ArchiveBox/ArchiveBox
  I dream of a day where ArchiveBox becomes a fleet of homelabs all over the world, making it drastically harder to block them all.
- Perkeep lets you permanently keep your stuff, for life
- YouTube downloaders (and how Google silenced the press)
  https://archivebox.io/ could be a solution for that
- Linkwarden: FOSS self-hostable bookmarking with AI-tagging and page archival
  I've used https://historio.us since 2011 and still pay for it to keep access to all the pages I've archived over the years. The price has been kept low enough that I can't bring myself to cancel it even though I've been using self-hosted https://archivebox.io/ for the last few years.
  I always include an archived link whenever I reference something in documentation. That's my main use at the moment.
  However, I also feel like I've gotten a lot of really good value when trying to learn a new development topic. Whenever I find something that looks like it might be useful, I archive it and, because everything is searchable, I end up with a searchable index of really high quality content once I actually know what I'm doing.
  I find it hard to rediscover content via web search these days and there's so much churn that having a personal archive of useful content is going to increase in value, at least in my opinion.
- Links copied from project READMEs now add "?tab=readme-ov-file" query parameter
  The links the reporter is trying to use already don't work on mobile. If you want to link to the README file, link to the README file, e.g. https://github.com/ArchiveBox/ArchiveBox/blob/dev/README.md
  I'll concede that this latter link is much longer than it perhaps should be, but I don't think the links the reporter used previously should ever have been used, as they don't work for a lot of people.
- Small Archives
- Ask HN: How Do You Bookmark?
  2. Drop the link into my instance of ArchiveBox [0] and will return to it a few weeks/months later or, more often than not, never again
  [0] https://archivebox.io/
What are some alternatives?
ArchivesSpace - ArchivesSpace, the archives management tool
browsertrix-crawler - Run a high-fidelity browser-based web archiving crawler in a single Docker container
Access to Memory (AtoM) - Open-source, web application for archival description and public access.
Wallabag - wallabag is a self hostable application for saving web pages: Save and classify articles. Read them later. Freely.
Collective Access: Providence - Cataloguing and data/media management application
SingleFile - Web Extension for saving a faithful copy of a complete web page in a single HTML file