warcprox
webcrystal
| | warcprox | webcrystal |
|---|---|---|
| Mentions | 7 | 3 |
| Stars | 364 | 24 |
| Growth | 1.1% | - |
| Activity | 6.4 | 10.0 |
| Last commit | 7 months ago | over 1 year ago |
| Language | Python | Python |
| License | - | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
warcprox
- Offpunk 2.0
I've looked into archiving all the pages I visit as well, and warcprox [1] has been bookmarked for a while now.
Hard drive storage being so cheap, in the ~$15/TB range, makes this more feasible, even for video archival.
[1] https://github.com/internetarchive/warcprox
- What is warcprox?
- r18 database of metadata
wget also supports WARC options if you don't need javascript etc. If you do, there's also Warcprox (https://github.com/internetarchive/warcprox), brozzler (https://github.com/internetarchive/brozzler) (which uses warcprox internally), and others.
- [HELP] I'm looking for some self-hosted solution where users can connect to the website, connect to a page, browse it and save it.
- tofuproxy – web proxy, TLS terminator, X.509 TOFU manager, WARC/gemini browser
Wow, it's really rare these days to see a tool that supports WARC.
Despite being an ISO standard [1] and the default archive format of the Internet Archive, and despite a handful of lovingly crafted tools (such as webrecorder [2], warcprox [3], etc.), it never seems to have caught on in a broader context.
Really a shame - I'm deeply convinced that the ability to archive and replay requests is a technique for defending and strengthening user rights.
Links:
[1] https://www.iso.org/standard/44717.html
[2] https://github.com/webrecorder/webrecorder-desktop
[3] https://github.com/internetarchive/warcprox
- Browser Extension for Saving Images As While Browsing
- How to archive the tweets and replies of my own terminated twitter account(s)
However, if you prefer something open-source, you could accomplish the same thing with a tool like https://archiveweb.page/ or https://github.com/internetarchive/warcprox
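The proxy-based approach these posts describe can be sketched in a few lines of Python. This is only an illustration, assuming an archiving proxy such as warcprox is already running locally; the address `http://localhost:8000` and the helper names `build_archiving_opener` and `fetch_via_proxy` are assumptions made for this example, not part of any project's API.

```python
import urllib.request

# Assumption: an archiving proxy (for example warcprox) is already running
# and listening at this address. Adjust it to match your own setup.
PROXY_URL = "http://localhost:8000"


def build_archiving_opener(proxy_url=PROXY_URL):
    """Return a urllib opener that sends HTTP and HTTPS traffic through the proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


def fetch_via_proxy(url, opener=None):
    """Fetch a URL through the proxy so the proxy can record the exchange."""
    opener = opener or build_archiving_opener()
    with opener.open(url, timeout=30) as response:
        return response.read()


if __name__ == "__main__":
    # Requires the proxy to be running; otherwise this raises URLError.
    body = fetch_via_proxy("http://example.com/")
    print(len(body), "bytes fetched through the proxy")
```

Recording HTTPS traffic this way generally also requires the client to trust the proxy's own CA certificate, since the proxy has to intercept the TLS connection. That step is omitted here.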
webcrystal
- SearXNG is a free internet metasearch engine
While it lacks a search feature, last I checked, there's always https://github.com/davidfstr/webcrystal
One .py file. Only one dependency (urllib3). With a little love the concept could become a full transparent proxy.
- Offpunk 2.0
The project page says:
> The offline content is stored in ~/.cache/offpunk/ as plain .gmi/.html files. The structure of the Gemini-space is tentatively recreated. One key element of the design is to avoid any database. The cache can thus be modified by hand, content can be removed, used or added by software other than offpunk.
One ambition I have is to set up
https://github.com/davidfstr/webcrystal
> An archiving HTTP proxy and on-disk archival format for websites.
so that all my regular web browsing is auto archived at some level.
It would sure be neat if the archive formats could be compatible. It would allow for a setup where everything I’ve seen with my eyes is then immediately accessible programmatically or in a terminal. I feel that could open some significant productive advantages, especially in the age of LLMs also in the terminal.
- Auto-scraping web browser?
Webcrystal?
What are some alternatives?
replayweb.page - Serverless replay of web archives directly in the browser
wikiteam - Tools for downloading and preserving wikis. We archive wikis, from Wikipedia to tiniest wikis. As of 2023, WikiTeam has preserved more than 350,000 wikis.
TWINT - An advanced Twitter scraping & OSINT tool written in Python that doesn't use Twitter's API, allowing you to scrape a user's followers, following, Tweets and more while evading most API limitations.
diskimageprocessor - Tool for automated processing of disk images in BitCurator
brozzler - Distributed browser-based web crawler
proxy.py - ↔️ Ngrok Alternative • ⚡ Fast • 🪶 Lightweight • 0️⃣ Dependency • 🔌 Pluggable • 😈 TLS interception • 🔒 DNS-over-HTTPS • 🔥 Poor Man's VPN • ⏪ Reverse & ⏩ Forward • 👮🏿 "Proxy Server" framework • 🌐 "Web Server" framework • ➵ ➶ ➷ ➠ "PubSub" framework • 👷 "Work" acceptor & executor framework
webrecorder-desktop - Webrecorder Desktop App!
tor-proxy - Run any Python service over Tor using tor-proxy
auto-save-html - Firefox extension that automatically dumps HTML when browsing a specified site
http-proxy-list - A lightweight project that, every 10 minutes, scrapes lots of free-proxy sites, validates that the proxies work, and serves a clean proxy list.
conifer - Collect and revisit web pages.
mpiv - A fully reworked fork of Mouseover Popup Image Viewer