Looking for some help downloading a few thousand files from archive.org on Ubuntu. wget estimates it will take 2 months... I figured I should ask my fellow data hoarders!

This page summarizes the projects mentioned and recommended in the original post on /r/DataHoarder.

  • iadownloader

    Auto-download files and collections from Internet Archive

  • I've used this [https://github.com/rsvensson/iadownloader] for a similar use case, but you might also consider one of these...

  • archive-downloader

    A downloader for archive.org

  • BaseCase-3

    Discontinued. A Python application that can be used to gather all files of a certain type from any archive.com repository

  • internetarchive-downloader

    Simultaneous, resumable and hash-verified downloads from Internet Archive (archive.org)

  • internetarchive

    A Python and Command-Line Interface to Archive.org

  • GGet

    Multithreaded download accelerator written in Go

  • This sounds pretty easy to do with Go: spawn hundreds of goroutines (lightweight threads with a ridiculously small RAM footprint) and off you go. A Google search yielded this: https://github.com/ashwinGokhale/GGet
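
The suggestions above share one idea: fetch many files at once, verify each against a known hash, and make re-runs cheap. Below is a minimal sketch of that idea in Python, using a standard-library thread pool rather than Go goroutines. It is not how any of the listed projects is implemented. The names `fetch_url`, `download_one`, and `download_all` are hypothetical, and "resumable" here only means that a re-run skips files already on disk with a matching MD5, not that partial files are continued. It assumes you already have a list of file URLs and their MD5 hashes, and it reads each file fully into memory, so large files would need streaming instead.

```python
import hashlib
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def fetch_url(url):
    """Download one URL into memory (assumption: files are small enough for this)."""
    with urllib.request.urlopen(url, timeout=60) as response:
        return response.read()


def download_one(url, dest, expected_md5=None, fetch=fetch_url):
    """Fetch url into dest, skipping files that already exist with a matching MD5."""
    dest = Path(dest)
    if dest.exists() and expected_md5 is not None:
        if hashlib.md5(dest.read_bytes()).hexdigest() == expected_md5:
            return "skipped"
    data = fetch(url)
    # Refuse to write data whose hash does not match the expected value.
    if expected_md5 is not None and hashlib.md5(data).hexdigest() != expected_md5:
        return "hash mismatch"
    dest.parent.mkdir(parents=True, exist_ok=True)
    dest.write_bytes(data)
    return "downloaded"


def download_all(jobs, workers=8, fetch=fetch_url):
    """Run download_one over (url, dest, md5) jobs using a bounded thread pool.

    Results are returned in the same order as the jobs.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(download_one, url, dest, md5, fetch)
                   for url, dest, md5 in jobs]
        return [future.result() for future in futures]


# Usage (placeholders, not real items):
# jobs = [("https://archive.org/download/<identifier>/<filename>",
#          "out/<filename>", "<md5 from the item's metadata>")]
# print(download_all(jobs, workers=8))
```

Keeping `workers` modest is a deliberate choice: the downloads are network-bound, so a handful of threads already overlaps most of the waiting, and hundreds of simultaneous connections to one host may simply get throttled.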


