Best way to feed Wayback Machine a list of URLs?

This page summarizes the projects mentioned and recommended in the original post on /r/Archiveteam

  • archivenow

    A Tool To Push Web Resources Into Web Archives

  • I crawled a website that I want to make sure is completely captured by the Wayback Machine, but now I need to figure out how to efficiently "feed" all the URLs into Wayback. I found archivenow, but I'm terrible at Python, so I'm not sure of the best way to point the program at the txt file and, preferably, produce another txt/csv file listing each original URL alongside its new archived URL. Any help would be greatly appreciated!
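
    A minimal sketch of one way to do this, using archivenow's command-line interface rather than its Python API, so no Python knowledge is needed. It assumes archivenow was installed with pip; the filenames urls.txt and mapping.csv are placeholders. Save Page Now rate-limits submissions, so the loop pauses between requests.

      # read one URL per line from urls.txt (placeholder filename)
      while read -r url; do
        # archivenow prints the resulting archived URL to stdout
        archived=$(archivenow --ia "$url")
        # append "original,archived" to a CSV mapping file
        printf '%s,%s\n' "$url" "$archived" >> mapping.csv
        sleep 5  # be gentle with Save Page Now's rate limits
      done < urls.txt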

  • ArchiveBox

    🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...

  • You could grab https://archivebox.io/ to crawl the site; it'll automatically submit everything to the IA, and you get a local WARC too. :-)
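
    A rough sketch of that workflow, assuming ArchiveBox is installed (e.g. via pip) and again using urls.txt as a placeholder for the crawled URL list. ArchiveBox can read URLs from stdin; submission to archive.org is governed by its SAVE_ARCHIVE_DOT_ORG setting, which as I understand it is on by default.

      mkdir archive && cd archive   # each ArchiveBox collection lives in its own directory
      archivebox init               # create the collection
      archivebox add < urls.txt     # queue every URL in the list for archiving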

  • wayback-machine-spn-scripts

    Bash scripts which interact with Internet Archive Wayback Machine's Save Page Now

  • I use this https://github.com/overcast07/wayback-machine-spn-scripts
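
    A usage sketch, on the assumption (per the repo's README) that spn.sh accepts either a single URL or a text file containing one URL per line; urls.txt is again a placeholder name.

      git clone https://github.com/overcast07/wayback-machine-spn-scripts
      cd wayback-machine-spn-scripts
      chmod +x spn.sh      # make the script executable
      ./spn.sh urls.txt    # submit every URL in the list to Save Page Now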
