web-archive

Open-source projects categorized as web-archive

Top 3 web-archive Open-Source Projects

  • DownloadNet

    💾 DownloadNet - All content you browse online available offline. Search through the full-text of all pages in your browser history. ⭐️ Star to support our work!

  • Project mention: ArchiveBox: Open-source self-hosted web archiving | news.ycombinator.com | 2024-01-11

    For anyone who uses Chrome and wants to view their archived pages in the browser as if they were still online (URL and everything intact), and also full-text search through their browsing history that was archived (like AB plans to add in future, I think, right nikki?) you can check out DownloadNet: https://github.com/dosyago/DownloadNet

    You can have multiple archives, and even use a mode where you only archive pages you bookmark rather than everything.

  • replayweb.page

    Serverless replay of web archives directly in the browser

  • Project mention: Ask HN: How can I back up an old vBulletin forum without admin access? | news.ycombinator.com | 2024-01-29

    You can try https://replayweb.page/ as a test for viewing a WARC file. I do think you'll run into problems though with wanting to browse interconnected links in a forum format, but try this as a first step.

    Another option, though definitely more work: once you have all the WARC files downloaded, you can open them in Python with the warctools module (and perhaps BeautifulSoup) and parse or extract the data embedded in the archives into your own "fresh" HTML web server.

    https://github.com/internetarchive/warctools
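The extraction idea in the comment above can be sketched roughly as follows. This is a minimal illustration, not the warctools API: it uses only the Python standard library, and the function and class names (`iter_warc_records`, `TitleParser`, `extract_pages`) are hypothetical. It assumes an uncompressed WARC stream whose response records hold plain (non-chunked, non-compressed) HTTP bodies, and it pulls out only each page's URL and title as a stand-in for fuller parsing.

```python
from html.parser import HTMLParser


def iter_warc_records(stream):
    """Yield (headers, payload) for each record in an uncompressed WARC stream."""
    while True:
        line = stream.readline()
        if not line:
            return  # end of file
        if not line.strip():
            continue  # blank separator lines between records
        if not line.startswith(b"WARC/"):
            raise ValueError(f"unexpected line at record start: {line!r}")
        headers = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break  # blank line ends the WARC header block
            name, _, value = line.decode("utf-8", "replace").partition(":")
            headers[name.strip().lower()] = value.strip()
        # Content-Length gives the exact size of the record payload in bytes.
        length = int(headers.get("content-length", 0))
        yield headers, stream.read(length)


class TitleParser(HTMLParser):
    """Collect the text found inside <title> elements."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def extract_pages(stream):
    """Return a list of (url, title) pairs, one per HTTP response record."""
    pages = []
    for headers, payload in iter_warc_records(stream):
        if headers.get("warc-type") != "response":
            continue  # skip warcinfo, request, metadata records
        # The payload is a raw HTTP response: headers, a blank line, then the body.
        _, _, body = payload.partition(b"\r\n\r\n")
        parser = TitleParser()
        parser.feed(body.decode("utf-8", "replace"))
        pages.append((headers.get("warc-target-uri", ""), parser.title.strip()))
    return pages
```

To try it, pass a file opened in binary mode, for example `extract_pages(open(path, "rb"))`. Real crawls are often stored as `.warc.gz`; opening those with `gzip.open(path, "rb")` should work with the same function, but chunked or compressed HTTP bodies would need extra decoding that this sketch leaves out.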

  • browsertrix

    Browsertrix is the hosted, high-fidelity, browser-based crawling service from Webrecorder designed to make web archiving easier and more accessible for all!

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

web-archive related posts

  • You're Gonna Need a Bigger Browser

    2 projects | news.ycombinator.com | 4 Nov 2023
  • Google Chrome pushes browser history-based ad targeting

    4 projects | news.ycombinator.com | 6 Sep 2023
  • Webrecorder: Capture interactive websites and replay them at a later time

    6 projects | news.ycombinator.com | 1 Aug 2023
  • Show HN: DiskerNet – Browse the Internet from Your Disk, Now Open Source

    1 project | /r/hypeurls | 19 Jul 2023
  • Show HN: DiskerNet – Browse the Internet from Your Disk, Now Open Source

    3 projects | news.ycombinator.com | 16 Jul 2023
  • phpBB3 forum owner dead. Webhost purging soon. Need to quickly archive a site

    1 project | /r/DataHoarder | 23 May 2023
  • Is there such a thing as a " Master Search Engine " for desktops and websites that can search for any keyword on the site and on the PC?

    2 projects | /r/DataHoarder | 4 Apr 2023

Index

What are some of the best open-source web-archive projects? This list will help you:

#  Project         Stars
1  DownloadNet     3,648
2  replayweb.page  620
3  browsertrix     123
