Using Python for Internet Archive Bulk Upload

This page summarizes the projects mentioned and recommended in the original post on /r/internetarchive.

  • internetarchive

    A Python and Command-Line Interface to Archive.org

  • First, I tried python and the internetarchive scripts only on XP/Vista, with the corresponding versions for those OSes, without success. I moved to Linux instead. While I have a Raspberry Pi (RPi), I tried first on a virtual machine under Windows. I chose Debian (that's what I run on the RPi) but also had a go at FreeBSD. Both have packages (binaries) ready to go, and both worked flawlessly.

    From your post, you have enough skills to set up a virtual machine and install a mainstream Linux distro, which is basically downloading an ISO, mounting it on the VM, and clicking next, next, next, OK, done. You would then boot into the desktop and open the CLI (command-line interface). Installing internetarchive and python is just a matter of copy-pasting a couple of commands. On Debian, the internetarchive package is https://packages.debian.org/stable/utils/internetarchive and I find it easier than grabbing the binaries through cURL, setting up permissions and whatnot. Same for python3. The package manager does its thing (grabs all the files it needs, installs, cleans up, all automated), and when it's done you're back at the prompt, ready to throw commands at it.

    About the $: you asked what this operator means in Python, but I think you mean when it shows up in the documentation. It's just the command prompt, like c:\archives\uploads> would be in Windows cmd, waiting for a command.

    You first need to set up your credentials. Just run ia configure; it'll ask for everything it needs, and then you're ready to upload stuff. Mass uploading different items is basically entering the same command as many times as needed. ia does this for you using a CSV file. This involves a bit of pre-processing, but once it's set up it'll save you a lot of time and waiting.

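The CSV-driven bulk upload described in the comment above can also be done from Python with the internetarchive library, roughly as sketched below. This is a minimal sketch, not the commenter's method: the function names plan_uploads, run_uploads and upload_from_csv are hypothetical, and the CSV layout (an identifier column, a file column, and every other column treated as item metadata) is an assumption. It relies on credentials already saved by ia configure.

```python
import csv
import io


def plan_uploads(csv_text):
    """Turn CSV text into a list of (identifier, file_path, metadata) jobs.

    Assumed layout: an 'identifier' column, a 'file' column, and any
    other columns (title, mediatype, ...) used as item metadata.
    """
    jobs = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        identifier = (row.pop("identifier", "") or "").strip()
        file_path = (row.pop("file", "") or "").strip()
        if not identifier or not file_path:
            continue  # skip rows that cannot be uploaded
        # Keep only metadata cells that actually contain a value.
        metadata = {k: v.strip() for k, v in row.items() if k and v and v.strip()}
        jobs.append((identifier, file_path, metadata))
    return jobs


def run_uploads(jobs, upload_fn):
    """Call upload_fn once per job, one file per call."""
    for identifier, file_path, metadata in jobs:
        upload_fn(identifier, files=[file_path], metadata=metadata)


def upload_from_csv(csv_path):
    """Read a CSV file and upload each row with internetarchive.upload."""
    # Imported here so the planning code can be tried without the library.
    from internetarchive import upload

    with open(csv_path, newline="", encoding="utf-8") as handle:
        run_uploads(plan_uploads(handle.read()), upload)
```

Calling upload_from_csv("uploads.csv") (a hypothetical file name) would then upload one file per row. Passing the upload function into run_uploads keeps the CSV handling testable offline: swap in print or a stub first to check what would be sent before touching archive.org.
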
NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Related posts

  • Rabbit R1 can be run on a Android device

    1 project | news.ycombinator.com | 5 May 2024
  • Flags Are Not Languages

    1 project | news.ycombinator.com | 5 May 2024
  • Download your Learn course content with this free and open-source tool. All you need is a working computer and basic Python knowledge, and you can save a local copy of your Learn courses' content for future reference after the end of the term.

    1 project | /r/uwaterloo | 9 Dec 2023
  • Ask HN: How do you develop and maintain a good note-taking habit?

    1 project | news.ycombinator.com | 5 May 2024
  • What Are HTML Meta Tags And What Is Their Importance?

    2 projects | dev.to | 5 May 2024