Top 23 Archive Open-Source Projects
-
DownloadNet
💾 DownloadNet - All content you browse online available offline. Search through the full-text of all pages in your browser history. ⭐️ Star to support our work!
-
SingleFileZ
Web Extension to save a faithful copy of an entire web page in a self-extracting ZIP file
-
web-archives
Browser extension for viewing archived and cached versions of web pages, available for Chrome, Edge and Safari
-
webscrapbook
A browser extension that captures web pages to local device or backend server for future retrieval, organization, annotation, and edit. This project inherits from legacy Firefox add-on ScrapBook X.
-
p7zip
A new p7zip fork with additional codecs and improvements (forked from https://sourceforge.net/projects/sevenzip/ AND https://sourceforge.net/projects/p7zip/).
-
slackdump
Save or export your private and public Slack messages, threads, files, and users locally without admin privileges.
Project mention: How SingleFile Transformed My Obsidian Workflow | news.ycombinator.com | 2024-01-26
That's interesting. I have been saving articles as PDF files, which is browser-independent, but useful just for search and reference, a nuisance to quote/copy-and-paste.
If I search only the computer, I don't get results from EBay and Amazon at the top. The idea of keeping the knowledge base separate from the primary notes is a good idea. In my case, that knowledge base is the file system, and the primary notes are whatever I choose.
When I was using Evernote, the inbox was the knowledge base and notebooks were the focus. I just had too many different potential projects going on to manage this well.
Looking to focus.
I'll revisit Firefox and SingleFile.
Explanation of the zip file inside.
https://github.com/gildas-lormeau/SingleFile/blob/master/faq...
Project mention: ArchiveBox: Open-source self-hosted web archiving | news.ycombinator.com | 2024-01-11
For anyone who uses Chrome and wants to view their archived pages in the browser as if they were still online (URL and everything intact), and also full-text search through their browsing history that was archived (like AB plans to add in future, I think, right nikki?) you can check out DownloadNet: https://github.com/dosyago/DownloadNet
You can have multiple archives, and even use a mode where you only archive pages you bookmark rather than everything.
(I used yark to archive it, so just install it, use the command 'yark view ML_Andrew_Ng' on the folder you put everything and it'll bring up a webpage where you can see the videos with the correct thumbnails and names.)
So far my best option seems to be https://github.com/kanishka-linux/reminiscence (which I haven't seen in any list of these types of apps for some reason), but it has received no updates in 5 years (the dev apparently has no free time to work on it in the foreseeable future) and it has a few active bugs, so if I can find something more stable, that would be ideal.
I wish there was an alternative to the Internet Archive with collaborative curation. You share files, and people who tag and sort them into albums can download them. And if it was federated, it could be just as extensive as the Internet Archive by searching files on many instances at the same time. Sadly, the closest things are ArchiveBox and wayback, which won't replace the Internet Archive.
For arbitrary state changes however, it's better to use something like casync. Note that there are a lot of tunables, implicit and explicit; for package indexing I would particularly think about "how is the index sorted" and "what is the desired chunk size".
Project mention: An Ugly Single-Page Website Makes $5k a Month with Affiliate Marketing | news.ycombinator.com | 2024-01-20
Project mention: Show HN: I made a tool to clean and convert any webpage to Markdown | news.ycombinator.com | 2024-04-14
nearly every main distro I am aware of has both available. The reason you still see p7zip is the CLI incompatibilities vs. the newer 7z/7zip executables and the general licensing issues. Most users of "old p7zip" are actually using the actively maintained https://github.com/p7zip-project/p7zip, which is updated and supports Unix permissions, zstd, and so on.
Project mention: Make backup of your all Slack messages, threads, files, and users locally | news.ycombinator.com | 2023-06-11
Archive related posts
- We Need to Rewild the Internet
- Arx: Store files and directory in an archive quickly and with random access
- The internet is slipping out of our reach
- How SingleFile Transformed My Obsidian Workflow
- An Ugly Single-Page Website Makes $5k a Month with Affiliate Marketing
- Portable Web Documents – An Alternative to PDF Based on HTML5 and Web Standards
- Is Google Getting Worse? A Longitudinal Investigation of SEO Spam in Search [pdf]
Index
What are some of the best open-source Archive projects? This list will help you:
# | Project | Stars |
---|---|---|
1 | SingleFile | 13,604 |
2 | KodExplorer | 6,164 |
3 | tubearchivist | 4,012 |
4 | DownloadNet | 3,637 |
5 | LinkAce | 2,418 |
6 | bulk-downloader-for-reddit | 2,203 |
7 | unblob | 2,044 |
8 | yark | 1,839 |
9 | SingleFileZ | 1,759 |
10 | Reminiscence | 1,717 |
11 | wayback | 1,642 |
12 | casync | 1,461 |
13 | goblin | 1,135 |
14 | box | 1,072 |
15 | web-archives | 1,062 |
16 | nyaa | 983 |
17 | webscrapbook | 824 |
18 | p7zip | 735 |
19 | slackdump | 714 |
20 | php-scoper | 671 |
21 | humblebundle-downloader | 507 |
22 | unp | 416 |
23 | spotify-playlist-archive | 381 |