22120 vs pywb

22120

💾 Diskernet - Your preferred backup solution. It's like you're still online! Full text search archive from your browsing and bookmarks. Welcome to the Diskernet: an internet on yer disk. Disconnect with Diskernet, an internet for the post-online apocalypse. Or the airplane WiFi. Or the site goes down. Or ... You get the picture. Get Diskernet. 80s logo. Formerly 22120 (project codename) ;P ;) xx;p [Moved to: https://github.com/i5ik/Diskernet] (by c9fe)

DISCONTINUED


pywb

Core Python Web Archiving Toolkit for replay and recording of web archives (by webrecorder)

Python wayback pywb web-archiving web-archives



Project	22120	pywb
Mentions	13	7
Stars	2,638	1,300
Growth	-	1.8%
Activity	9.7	4.8
Latest Commit	over 2 years ago	10 days ago
Language	JavaScript	JavaScript
License	GNU General Public License v3.0 or later	GNU General Public License v3.0 only

The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
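As a rough illustration of the Growth figure, month-over-month growth can be read as the percentage change between two monthly star counts. The sketch below assumes that reading; the function name and the earlier star count of 1,277 are hypothetical, chosen only so the result lands near the 1.8% shown for pywb. The site's exact calculation is not documented here.

```python
def month_over_month_growth(stars_last_month: int, stars_now: int) -> float:
    """Return the percentage change in stars between two monthly snapshots."""
    if stars_last_month <= 0:
        raise ValueError("need a positive baseline to compute growth")
    return (stars_now - stars_last_month) / stars_last_month * 100


# Assumed snapshot: 1,277 stars a month ago is a made-up figure, not site data.
growth = month_over_month_growth(1277, 1300)
print(f"{growth:.1f}%")
```

A project with no usable baseline (for example, one that has moved or been discontinued) would have no meaningful value, which may be why 22120 shows "-" in that row.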

22120

Posts with mentions or reviews of 22120. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2021-12-23.

Is there a browser addon which locally archives every website I visit?
4 projects | /r/DataHoarder | 23 Dec 2021

Here. An archivist browser controller that caches everything you browse, a library server with full text search to serve your archive.
Show HN: Irchiver, your full-resolution personal web archive
9 projects | news.ycombinator.com | 3 Dec 2021
Ask HN: Full text search engine in JavaScript for English and Chinese?
1 project | news.ycombinator.com | 1 Nov 2021

Following your "hilarious" and disrespectful answer here https://github.com/i5ik/22120/issues/63#issuecomment-7275272..., I would prefer that you remove any reference to SingleFile in the description of your project. I could not open an issue because you blocked me. And please don't accuse people without proof.
22120: self-host the Internet with an Offline Archive. Similar to ArchiveBox, SingleFile and WebMemex. Works well with WorldBrain/Memex to give you full-text search. Why not WARC? Uses Chrome DevTools protocol to intercept all requests, and caches responses against a key of (method, URL)
1 project | /r/AltTech | 21 Oct 2021
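The post title above describes caching responses against a key of (method, URL). A minimal sketch of that keying idea follows. It is not 22120's actual code (the project is written in JavaScript and intercepts requests through the Chrome DevTools protocol); the class and field names are hypothetical and the store is a plain in-memory dictionary.

```python
from typing import Dict, NamedTuple, Optional, Tuple


class CachedResponse(NamedTuple):
    """Hypothetical record of one intercepted response."""
    status: int
    headers: Dict[str, str]
    body: bytes


class ResponseCache:
    """In-memory store of responses keyed by (method, URL)."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[str, str], CachedResponse] = {}

    @staticmethod
    def _key(method: str, url: str) -> Tuple[str, str]:
        # Normalise the method's case so "get" and "GET" share one entry.
        return (method.upper(), url)

    def save(self, method: str, url: str, response: CachedResponse) -> None:
        self._store[self._key(method, url)] = response

    def lookup(self, method: str, url: str) -> Optional[CachedResponse]:
        return self._store.get(self._key(method, url))


cache = ResponseCache()
cache.save(
    "GET",
    "https://example.com/",
    CachedResponse(200, {"content-type": "text/html"}, b"<h1>hi</h1>"),
)
hit = cache.lookup("get", "https://example.com/")    # served from the archive
miss = cache.lookup("POST", "https://example.com/")  # never recorded
```

One consequence of this key in the sketch: two requests with the same method and URL but different bodies share a single entry, so the later response overwrites the earlier one.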
Request: Proxy caching all visited websites text in DB, making history searchable
1 project | /r/selfhosted | 9 Oct 2021

https://github.com/i5ik/22120 is a tool that archives as you browse that you can then view offline later
Is there a way I can cache videos (reddit, 4chan) I watch in browser (Linux)?
1 project | /r/DataHoarder | 20 Aug 2021
So you want to write a GUI framework
13 projects | news.ycombinator.com | 11 Aug 2021

My solution to this (it's been done before), is to use the existing browser engine (not the system webview) installed. So far I only utilize Chrome, but as the way I connect to it is over the Chrome DevTools protocol which is somewhat fluent with the Remote Debugging Protocol[0] that Firefox is doing, this is a reasonable approach.
So far my "tool" to do this is simply a template repository with some conveniences, providing in essence a skeleton for these types of apps. I hope to flesh this out a little more, and expose a much richer API, as well as convert some of my existing popular apps (like 22120[1]) to the "framework".
The benefit of this is Graderjs has a built-in 'app builder' that can create a cross-platform binary (excluding or ignoring the necessity (on MacOS) and near-necessity (on Windows) to sign your executable somehow) that lets you display your UI in JS/HTML/CSS using the already installed browser engine, as well as run code in NodeJS and using the rich APIs[2] of the browser engine itself. I'm really happy with this project and think that, even tho it's small now, it will in time become my most popular and powerful one: even bigger than my remote browser and popular web archiver.
Just give it time! :)
[0]: https://firefox-source-docs.mozilla.org/remote/index.html
[1]: https://github.com/i5ik/22120
[2]: https://chromedevtools.github.io/devtools-protocol/tot/Brows...
The GraderJS: https://github.com/i5ik/graderjs
Ask HN: Why saving webpages on hard disk has not got better?
1 project | news.ycombinator.com | 19 Mar 2021

I use this to backup pages automatically
https://github.com/i5ik/22120
Saving all browsed websites automatically
5 projects | /r/DataHoarder | 21 Jan 2021

Does this potentially help? https://github.com/c9fe/22120
Make Your Own Internet Archive with Archive Box
9 projects | news.ycombinator.com | 19 Jan 2021

From the blog comments, I think this is what you’re after https://github.com/c9fe/22120

pywb

Posts with mentions or reviews of pywb. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-10-07.

Is there any good software for deduping (deduplicating) content in WARC files?
1 project | /r/DataHoarder | 5 Apr 2023

I have thousands of bookmarks on raindrop.io that I've been wanting to archive for a while. However, I've archived ~150 pages so far with Pywb and it ended up being 500MB across two WARCs, even with the dedupe setting specified in my settings file. It dedupes while archiving pages. I want software to get any spots missed and be sure that WARCs are actually deduped.
Is there a way to easily and reliably SSH to my laptop no matter what wifi the laptop is connected to? I have no clue.
3 projects | /r/commandline | 7 Oct 2022

I don't know if the solution would be related or relevant to this, but I would also want to be able to remotely launch and access a web server, Pywb, on Safari on my iPad, also no matter what wifi I'm on. On a Mac, it would be launched with the command wayback and the server would be accessed on the Browser with localhost:8080.
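For readers unfamiliar with how a running pywb server is addressed, here is a small sketch that builds a replay URL. It assumes pywb's usual URL layout of collection name, optional 14-digit timestamp, then the archived URL; the collection name `my-web-archive` and the helper itself are illustrative, not part of pywb.

```python
from typing import Optional


def replay_url(
    collection: str,
    target_url: str,
    timestamp: Optional[str] = None,
    host: str = "localhost",
    port: int = 8080,
) -> str:
    """Build a URL for replaying target_url from a pywb collection."""
    base = f"http://{host}:{port}/{collection}"
    if timestamp:
        # A timestamp such as 20221007000000 asks for the capture nearest that time.
        return f"{base}/{timestamp}/{target_url}"
    return f"{base}/{target_url}"


# "my-web-archive" is a hypothetical collection name.
print(replay_url("my-web-archive", "http://example.com/"))
```

To reach the server from another device such as an iPad, `host` would be the address of the machine running `wayback` rather than `localhost`; whether that address is reachable depends on the network, which is the part the poster is asking about.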
I can't install a Python package, pywb, looks like a problem with brotlipy. What can I do?
1 project | /r/termux | 5 Oct 2022

Check their github site. I would try "git clone https://github.com/webrecorder/pywb"
Purevolume archives?
4 projects | /r/Archiveteam | 16 May 2022

I've been trying to open those large warc files these days. I've tried webrecorder, replayweb, pywb and warcat before but none of these worked well for me.
Ran grab-site now have some warc.gz files etc, the site in question was originally hosted in a mixture of html and javascript, what's the best and easiest way to make this accessible as a user for offline personal use?
1 project | /r/Archiveteam | 2 Apr 2022

pywb, but it requires creating a full copy of the data: https://github.com/webrecorder/pywb/issues/408
How good is ArchiveWeb.page?
2 projects | /r/Archiveteam | 16 Apr 2021

I found it to be good with loading small WARCs quickly, but it can take longer if the WARC is larger. Webarchive player, while it's old and discontinued, I've found works better than Webrecorder Player and replayweb.page. If you want newer software to replay WARCs, try Pywb. I find it to be the best WARC player.
Saving all browsed websites automatically
5 projects | /r/DataHoarder | 21 Jan 2021

I use pywb in proxy recording mode.
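In proxy mode the client sends its traffic through the pywb server instead of visiting rewritten URLs. The sketch below shows only the client side, using Python's standard library. It assumes a pywb instance is already running with proxy recording enabled and listening on localhost:8080; the helper name is hypothetical.

```python
import urllib.request


def build_proxy_opener(
    proxy_host: str = "localhost", proxy_port: int = 8080
) -> urllib.request.OpenerDirector:
    """Return an opener that routes HTTP and HTTPS requests through a proxy."""
    proxy = f"http://{proxy_host}:{proxy_port}"
    handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    return urllib.request.build_opener(handler)


opener = build_proxy_opener()
# With the proxy running, a fetch like the following would pass through it:
# opener.open("http://example.com/").read()
```

The request is left commented out because it fails unless the proxy is actually up. HTTPS traffic through such a proxy typically also requires the client to trust the proxy's certificate authority.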

What are some alternatives?

When comparing 22120 and pywb you can also consider the following projects:

ArchiveBox - 🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...

conifer - Collect and revisit web pages.

asciidoctor-latex - 📐 Add LaTeX features to AsciiDoc & convert AsciiDoc to LaTeX

warcio - Streaming WARC/ARC library for fast web archive IO

SingleFile - Web Extension for saving a faithful copy of a complete web page in a single HTML file

awesome-selfhosted - A list of Free Software network services and web applications which can be hosted on your own servers

notes - A zero dependency shell script that makes it really simple to manage your text notes.

replayweb.page - Serverless replay of web archives directly in the browser

DownloadNet - 💾 DownloadNet - All content you browse online available offline. Search through the full-text of all pages in your browser history. ⭐️ Star to support our work!

SingleFileZ - Web Extension to save a faithful copy of an entire web page in a self-extracting ZIP file

linux-surface - Linux Kernel for Surface Devices

webarchiveplayer - NOTE: This project is no longer being actively developed. Check out Webrecorder Player for the latest player (https://github.com/webrecorder/webrecorderplayer-electron). Legacy: Desktop application for browsing web archives (WARC and ARC)