TypeScript web-archiving

Open-source TypeScript projects categorized as web-archiving

Top 4 TypeScript web-archiving Projects

  • replayweb.page

    Serverless replay of web archives directly in the browser

  • Project mention: Ask HN: How can I back up an old vBulletin forum without admin access? | news.ycombinator.com | 2024-01-29

    You can try https://replayweb.page/ as a test for viewing a WARC file. I do think you'll run into problems though with wanting to browse interconnected links in a forum format, but try this as a first step.

    One potential option, though definitely a bit more work: once you have all the WARC files downloaded, you can open them in Python using the warctools module, perhaps together with BeautifulSoup, and parse/extract the data embedded in the WARC archives into your own "fresh" HTML webserver.

    https://github.com/internetarchive/warctools
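The extraction idea in the comment above can be sketched roughly as follows. This is a minimal illustration, not the commenter's code: it swaps warctools and BeautifulSoup for the Python standard library so that it runs without extra installs, and it assumes an uncompressed WARC stream whose response bodies are plain (not chunked or compressed) UTF-8 HTML. The names `iter_warc_records`, `TitleParser`, and `extract_pages` are hypothetical.

```python
from __future__ import annotations

from html.parser import HTMLParser
from typing import BinaryIO, Iterator


def iter_warc_records(stream: BinaryIO) -> Iterator[tuple[dict[str, str], bytes]]:
    """Yield (headers, body) for each record in an uncompressed WARC stream."""
    while True:
        line = stream.readline()
        if not line:
            return  # end of stream
        if not line.startswith(b"WARC/"):
            continue  # skip the blank lines that separate records
        headers: dict[str, str] = {}
        while True:
            raw = stream.readline()
            if raw in (b"\r\n", b"\n", b""):
                break  # blank line ends the WARC header block
            name, _, value = raw.decode("utf-8", "replace").partition(":")
            headers[name.strip()] = value.strip()
        # The record body is exactly Content-Length bytes long.
        body = stream.read(int(headers.get("Content-Length", "0")))
        yield headers, body


class TitleParser(HTMLParser):
    """Collect the text inside the first-level <title> element."""

    def __init__(self) -> None:
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def extract_pages(stream: BinaryIO) -> dict[str, str]:
    """Map each archived URL to its page title (assumes plain UTF-8 HTML)."""
    pages: dict[str, str] = {}
    for headers, body in iter_warc_records(stream):
        if headers.get("WARC-Type") != "response":
            continue  # ignore request, metadata, and warcinfo records
        # A response record holds the raw HTTP response: headers, blank line, payload.
        _http_headers, _, payload = body.partition(b"\r\n\r\n")
        parser = TitleParser()
        parser.feed(payload.decode("utf-8", "replace"))
        pages[headers.get("WARC-Target-URI", "")] = parser.title.strip()
    return pages
```

For a gzip-compressed archive, passing the result of `gzip.open(path, "rb")` as the stream should work, since Python's `gzip` module reads multi-member files transparently. Real forum crawls often contain chunked or compressed HTTP payloads, which this sketch does not decode; a dedicated WARC library would handle those cases.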

  • archivebox-browser-extension

    Official ArchiveBox browser extension: automatically/manually preserve your browsing history using ArchiveBox.

  • Project mention: Hyperlink Maximalism (2022) | news.ycombinator.com | 2023-07-25

    https://chromewebstore.google.com/detail/habonpimjphpdnmcfka... (or https://github.com/tjhorner/archivebox-exporter for source)

    Pushes your history to ArchiveBox, which does the heavy lifting storing/processing the content.

    Alas, might not work with Epiphany because there's no complete extension support.

    But IIRC, it stores its URLs in $XDG_DATA_HOME/epiphany/ephy-history.db, so a bit of SQLite and ArchiveBox might do the trick for you.

    Note: I'm running something similar, but find that I'd rather not rely on my history, I tend to click on a lot of garbage ;) You might want to curate a bit.
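The "a bit of sqlite" suggestion above could look something like this sketch. It assumes the history database has a `urls` table with `url` and `visit_count` columns, which should be verified against the actual file (for example with `.schema` in the `sqlite3` shell). The function name `export_history_urls` and the `min_visits` threshold are hypothetical; the threshold is a crude stand-in for the curation the commenter recommends.

```python
import sqlite3
from pathlib import Path


def export_history_urls(db_path: str, out_path: str, min_visits: int = 1) -> list[str]:
    """Write http(s) URLs from a browser history database to a text file.

    Assumes a `urls` table with `url` and `visit_count` columns.
    """
    # Open read-only so the browser's own database is never modified.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT url FROM urls "
            "WHERE visit_count >= ? AND url LIKE 'http%' "
            "ORDER BY visit_count DESC, url",
            (min_visits,),
        ).fetchall()
    finally:
        conn.close()

    urls = [row[0] for row in rows]
    # One URL per line, the shape that URL-list importers usually expect.
    Path(out_path).write_text("".join(u + "\n" for u in urls), encoding="utf-8")
    return urls
```

The resulting file could then be fed to ArchiveBox, which accepts a list of URLs on standard input (for example `archivebox add < urls.txt`). If the browser is running and holds a lock on the database, copying the file first and reading the copy is the safer route.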

  • browsertrix-cloud

    Browsertrix is the hosted, high-fidelity, browser-based crawling service from Webrecorder designed to make web archiving easier and more accessible for all!

  • Collect

    A server to collect & archive websites that also supports video downloads (by xarantolus)

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source web-archiving projects in TypeScript? This list will help you:

#  Project                       Stars
1  replayweb.page                611
2  archivebox-browser-extension  158
3  browsertrix-cloud             118
4  Collect                       75
