Looking for something like ArchiveBox but with the recursive functionality of HTTrack

This page summarizes the projects mentioned and recommended in the original post on /r/selfhosted.

  • sosse

    Selenium Open Source Search Engine & crawler

  • You could try Sosse (https://github.com/biolds/sosse): it has the crawling capabilities you're looking for and can filter by file type. It does not offer as many archiving options as ArchiveBox, though (Sosse can only save a screenshot or the HTML). Let me know if you have trouble with the configuration, or if there's a feature you think would be worth adding.

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives, so a higher number indicates a more popular project.

Related posts

  • SeekStorm VS tantivy - a user suggested alternative

    2 projects | 22 Mar 2024
  • Open-source Rust-based RAG

    3 projects | news.ycombinator.com | 10 Mar 2024
  • YaCy, a distributed Web Search Engine, based on a peer-to-peer network

    9 projects | news.ycombinator.com | 5 Mar 2024
  • Open Source Search Engine as an Alternative to Google Built in Spare Time

    1 project | news.ycombinator.com | 9 Feb 2024
  • StractOrg/stract: web search done right

    1 project | news.ycombinator.com | 8 Feb 2024