warcprox
archiveweb.page
Metric | warcprox | archiveweb.page
---|---|---
Mentions | 7 | 7
Stars | 364 | 739
Growth | 1.1% | 3.1%
Activity | 6.4 | 6.3
Latest commit | 7 months ago | about 1 month ago
Language | Python | JavaScript
License | - | GNU Affero General Public License v3.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
warcprox
- Offpunk 2.0
I've looked into archiving all the pages I visit as well, and warcprox [1] has been bookmarked for a while now.
Hard drive storage space being so cheap, in the ~$15/TB range, makes this more feasible even for video archival.
[1] https://github.com/internetarchive/warcprox
- What is warcprox?
- r18 database of metadata
wget also supports WARC options if you don't need JavaScript etc. If you do, there's also warcprox (https://github.com/internetarchive/warcprox), brozzler (https://github.com/internetarchive/brozzler) (which uses warcprox internally), and others.
- [HELP] I'm looking for some self-hosted solution where users can connect to the website, connect to a page, browse it and save it.
- tofuproxy – web proxy, TLS terminator, X.509 TOFU manager, WARC/gemini browser
Wow, it's really rare these days to see a tool that supports WARC.
Despite being an ISO standard [1] and the default archive format of the Internet Archive, and despite a handful of lovingly crafted tools (such as webrecorder [2], warcprox [3], etc.), it never seems to have caught on in a broader context.
Really a shame. I'm deeply convinced that the ability to archive and replay requests is a technique for defending and strengthening user rights.
Links:
[1] https://www.iso.org/standard/44717.html
[2] https://github.com/webrecorder/webrecorder-desktop
[3] https://github.com/internetarchive/warcprox
- Browser Extension for Saving Images As While Browsing
- How to archive the tweets and replies of my own terminated twitter account(s)
However, if you prefer something open-source, you could accomplish the same thing with a tool like https://archiveweb.page/ or https://github.com/internetarchive/warcprox
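One of the comments above notes that wget supports WARC options when JavaScript rendering is not needed. As a rough sketch of what that could look like, the helper below only assembles a wget command line; the function name `build_wget_warc_command` and the example URL are illustrative assumptions, not something taken from the quoted posts. The flags `--warc-file`, `--warc-cdx` and `--page-requisites` are standard wget options.

```python
import shlex


def build_wget_warc_command(url: str, warc_name: str) -> list[str]:
    """Return an argv list asking wget to fetch `url` and record a WARC file.

    Hypothetical helper for illustration; it does not run wget itself.
    """
    return [
        "wget",
        "--page-requisites",         # also fetch images, CSS and scripts the page references
        f"--warc-file={warc_name}",  # base name; wget adds the .warc.gz extension
        "--warc-cdx",                # also write a CDX index next to the WARC
        url,
    ]


if __name__ == "__main__":
    # Print the command so it can be inspected or pasted into a shell.
    cmd = build_wget_warc_command("https://example.com/", "example")
    print(shlex.join(cmd))
```

To actually run it, the list could be passed to `subprocess.run(cmd, check=True)`. Pages that depend on JavaScript would still need a browser-based tool such as those mentioned above.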
archiveweb.page
- Webrecorder: Capture interactive websites and replay them at a later time
- r18 database of metadata
- Ask HN: What is going on at archive.ph?
I use this from time to time to archive web pages for my own use:
https://github.com/webrecorder/archiveweb.page
I also convert web snippets I find useful to Markdown and store them in my Joplin notebook, so that they live on even if the website is gone.
- "scrape" a javascript object from a website?
- GitNoter – An open source alternative to Evernote (Self Hosted)
There's also ArchiveWeb.page, which records in the same WARC format as archive.org:
https://github.com/webrecorder/archiveweb.page
- Archiveweb.page – A High-Fidelity Web Archiving Extension for Chromium Browsers
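Several of the comments above refer to the WARC format. For readers unfamiliar with it, here is a minimal sketch of how a single uncompressed WARC record is laid out: a version line, `Name: value` header lines, a blank line, then a block of `Content-Length` bytes. The function `parse_warc_record` and the sample record are illustrative assumptions; the sample is simplified and omits fields that real records carry (such as `WARC-Record-ID` and `WARC-Date`), and real archives are usually gzip-compressed. For real work a dedicated WARC library would be the safer choice.

```python
def parse_warc_record(data: bytes) -> tuple[str, dict[str, str], bytes]:
    """Split one uncompressed WARC record into (version, headers, block).

    Simplified sketch: ignores header continuation lines and compression.
    """
    # Headers end at the first blank line (CRLF CRLF).
    head, sep, rest = data.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("no blank line after WARC headers")

    lines = head.decode("utf-8").split("\r\n")
    version = lines[0]
    if not version.startswith("WARC/"):
        raise ValueError(f"not a WARC record: {version!r}")

    # Header names are case-insensitive, so normalise them to lower case.
    headers: dict[str, str] = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    # The block is exactly Content-Length bytes long.
    length = int(headers["content-length"])
    return version, headers, rest[:length]


# Simplified, hand-written sample record (not produced by any real tool).
SAMPLE = (
    b"WARC/1.0\r\n"
    b"WARC-Type: resource\r\n"
    b"WARC-Target-URI: https://example.com/hello.txt\r\n"
    b"Content-Type: text/plain\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"
    b"hello"
    b"\r\n\r\n"
)

if __name__ == "__main__":
    version, headers, block = parse_warc_record(SAMPLE)
    print(version, headers["warc-type"], block)
```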
What are some alternatives?
replayweb.page - Serverless replay of web archives directly in the browser
TWINT - An advanced Twitter scraping & OSINT tool written in Python that doesn't use Twitter's API, allowing you to scrape a user's followers, following, Tweets and more while evading most API limitations.
oldweb-today - Browse emulated browsers connected to old web sites in your browser!
brozzler - distributed browser-based web crawler
Surfingkeys - Map your keys for web surfing, expand your browser with javascript and keyboard.
webrecorder-desktop - Webrecorder Desktop App!
ungoogled-chromium-extension-installer - Extension for Ungoogled Chromium that allows easy installation of extensions from Chrome webstore.
auto-save-html - Firefox extension that automatically dumps HTML when browsing a specified site
kdeconnect-chrome-extension - A browser extension to send pages and content from your browser to connected KDE Connect devices.
conifer - Collect and revisit web pages.
extension.js - 🧩 Plug-and-play, zero-config, cross-browser extension development tool.