grab-site

The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns (by ArchiveTeam)

Grab-site Alternatives

Similar projects and alternatives to grab-site

  1. docker-swag

    297 grab-site VS docker-swag

    Nginx web server and reverse proxy with PHP support and a built-in Certbot (Let's Encrypt) client. It also contains fail2ban for intrusion prevention.

  2. InfluxDB

    InfluxDB – Built for High-Performance Time Series Workloads. InfluxDB 3 OSS is now GA. Transform, enrich, and act on time series data directly in the database. Automate critical tasks and eliminate the need to move data externally. Download now.

  3. ArchiveBox

    🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...

  4. hunter-dkim

    Discusses how to verify DKIM signatures in old emails, namely one of the Hunter Biden emails in the news

  5. LinkAce

    LinkAce is a self-hosted archive to collect links of your favorite websites.

  6. linkwarden

    38 grab-site VS linkwarden

    ⚡️⚡️⚡️ Self-hosted collaborative bookmark manager to collect, organize, and preserve webpages, articles, and documents.

  7. replayweb.page

    Serverless replay of web archives directly in the browser

  8. browsertrix-crawler

    Run a high-fidelity browser-based web archiving crawler in a single Docker container

  9. SaaSHub

    SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives

  10. briefkasten

    7 grab-site VS briefkasten

    📮 Self hosted bookmarking app

  11. win32

    8 grab-site VS win32

    Public mirror for win32-pr

  12. urlwatch

    10 grab-site VS urlwatch

    Watch (parts of) webpages and get notified when something changes via e-mail, on your phone or via other means. Highly configurable.

  13. good-karma-kit

    😇 A Docker Compose bundle to run on servers with spare CPU, RAM, disk, and bandwidth to help the world. Includes Tor, ArchiveWarrior, BOINC, and more...

  14. vectordb

    A minimal Python package for storing and retrieving text using chunking, embeddings, and vector search. (by kagisearch)

  15. httrack

    HTTrack Website Copier, copy websites to your computer (Official repository)

  16. awesome-datahoarding

    List of data-hoarding related tools

  17. bitextor

    Bitextor generates translation memories from multilingual websites

  18. wpull

    Wget-compatible web downloader and crawler.

  19. forum-dl

    Scrape posts, threads from forums, news aggregators, mail archives, export to JSONL, mailbox, WARC

  20. Collect

    1 grab-site VS Collect

    A server to collect & archive websites that also supports video downloads (by xarantolus)

  21. wget2

    The successor of GNU Wget. Contributions preferred at https://gitlab.com/gnuwget/wget2. But accepted here as well 😍

NOTE: The number shown for an entry counts its mentions in common posts plus user-suggested alternatives. A higher number therefore suggests a better grab-site alternative or a more similar project.


grab-site reviews and mentions

Posts with mentions or reviews of grab-site. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-10-16.

Stats

Basic grab-site repo stats
34
1,479
4.0
11 months ago

