Top 4 TypeScript web-archiving Projects
- archivebox-browser-extension
Official ArchiveBox browser extension: automatically/manually preserve your browsing history using ArchiveBox.
- browsertrix-cloud
Browsertrix is the hosted, high-fidelity, browser-based crawling service from Webrecorder designed to make web archiving easier and more accessible for all!
Project mention: Ask HN: How can I back up an old vBulletin forum without admin access? | news.ycombinator.com | 2024-01-29
You can try https://replayweb.page/ as a test for viewing a WARC file. I do think you'll run into problems though with wanting to browse interconnected links in a forum format, but try this as a first step.
One potential option, though definitely a bit more work: once you have all the WARC files downloaded, you can open them in Python using the warctools module, perhaps together with BeautifulSoup, and parse/extract the data embedded in the WARC archives into your own "fresh" HTML webserver.
https://github.com/internetarchive/warctools
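The extraction idea in the comment above could look roughly like the sketch below. It is not the warctools or BeautifulSoup API: to stay dependency-free it swaps in a minimal hand-written WARC record reader and the standard library's `html.parser`. The function and class names are hypothetical, and it assumes an uncompressed WARC stream whose HTTP payloads are neither chunked nor content-encoded.

```python
from html.parser import HTMLParser


def iter_warc_records(stream):
    """Yield (headers, body) for each record in a binary WARC stream."""
    while True:
        line = stream.readline()
        if not line:
            return  # end of stream
        if not line.strip():
            continue  # blank separator lines between records
        if not line.startswith(b"WARC/"):
            raise ValueError(f"unexpected line: {line!r}")
        headers = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break  # an empty line ends the WARC header block
            name, _, value = line.decode("utf-8", "replace").partition(":")
            headers[name.strip().lower()] = value.strip()
        # Content-Length gives the exact size of the record body in bytes.
        body = stream.read(int(headers["content-length"]))
        yield headers, body


class TitleParser(HTMLParser):
    """Collects the text inside <title>...</title>."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_pages(stream):
    """Return [(url, title)] for every HTML response record in the stream."""
    pages = []
    for headers, body in iter_warc_records(stream):
        if headers.get("warc-type") != "response":
            continue
        # A response body is the raw HTTP message: head, blank line, payload.
        http_head, _, payload = body.partition(b"\r\n\r\n")
        if b"text/html" not in http_head.lower():
            continue  # crude content-type check; skips images, CSS, etc.
        parser = TitleParser()
        parser.feed(payload.decode("utf-8", "replace"))
        pages.append((headers.get("warc-target-uri"), parser.title.strip()))
    return pages
```

For a gzip-compressed file, passing the object returned by `gzip.open("forum.warc.gz", "rb")` (a hypothetical filename) to `extract_pages` should work, since Python's `gzip` module reads multi-member files transparently. From the resulting list, the posts could then be re-rendered into static HTML pages.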
https://chromewebstore.google.com/detail/habonpimjphpdnmcfka... (or https://github.com/tjhorner/archivebox-exporter for source)
Pushes your history to ArchiveBox, which does the heavy lifting of storing and processing the content.
Alas, it might not work with Epiphany, because Epiphany has no complete extension support.
But IIRC, it stores its URLs in $XDG_DATA_HOME/epiphany/ephy-history.db, so a bit of SQLite and ArchiveBox might do the trick for you.
Note: I'm running something similar, but find that I'd rather not rely on my history, since I tend to click on a lot of garbage ;) You might want to curate a bit.
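The "bit of SQLite" suggested in the comment above might look something like this sketch. The table name `urls` and the columns `url` and `visit_count` are assumptions about the Epiphany history schema, not confirmed by the comment, so inspect the database first (for example with `.schema` in the `sqlite3` shell). The function names and the `min_visits` filter are hypothetical; the filter is one simple way to curate.

```python
import os
import sqlite3
from pathlib import Path


def default_history_path():
    """Location named in the comment, with the usual XDG fallback."""
    data_home = os.environ.get("XDG_DATA_HOME") or str(Path.home() / ".local" / "share")
    return Path(data_home) / "epiphany" / "ephy-history.db"


def read_history_urls(db_path, min_visits=1):
    """Return sorted http(s) URLs visited at least `min_visits` times."""
    # Open read-only so the browser's database is never modified.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        # ASSUMPTION: a `urls` table with `url` and `visit_count` columns.
        rows = conn.execute(
            "SELECT url FROM urls WHERE visit_count >= ? ORDER BY url",
            (min_visits,),
        ).fetchall()
    finally:
        conn.close()
    # Drop internal pages such as about: or file: entries.
    return [url for (url,) in rows if url.startswith(("http://", "https://"))]


if __name__ == "__main__":
    # Print one URL per line so the output can be piped to another tool.
    for url in read_history_urls(default_history_path(), min_visits=2):
        print(url)
```

Saved as a script (say `export_history.py`, a hypothetical name), the output can be piped into ArchiveBox with `python export_history.py | archivebox add`, since `archivebox add` accepts URLs on standard input.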
Index
What are some of the best open-source web-archiving projects in TypeScript? This list will help you:
# | Project | Stars |
---|---|---|
1 | replayweb.page | 611 |
2 | archivebox-browser-extension | 158 |
3 | browsertrix-cloud | 118 |
4 | Collect | 75 |