- library-of-alexandria
Library of Alexandria (LoA for short) is a project that aims to collect and archive documents from the internet.
Unfortunately, this is the "new" norm. I created a project called Library of Alexandria for exactly these scenarios. I have a great many citations and source links to this newspaper in my archives, but none of the newspaper issues themselves. Better luck next time, I guess.
Unfortunately, I don't have a "spidering" app because I don't need one. There are already plenty of ways to get PDF documents. For example, here is a file with 240 million links to PDFs and other types of documents: https://github.com/bottomless-archive-project/document-location-database/releases/tag/2021-july-august
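As a rough illustration of how a link list like that could stand in for a spider, here is a minimal Python sketch that picks the PDF links out of a list of URLs. It assumes one URL per line, which may not match the actual layout of the release file; the function names `is_pdf_link` and `pdf_links` and the sample URLs are hypothetical, and this is not code from the Library of Alexandria project.

```python
from urllib.parse import urlparse


def is_pdf_link(url: str) -> bool:
    """Return True when the URL path ends in .pdf (case-insensitive)."""
    # Only the path is checked, so query strings like ?download=1 are ignored.
    path = urlparse(url.strip()).path
    return path.lower().endswith(".pdf")


def pdf_links(lines):
    """Yield PDF links from an iterable of text lines, skipping blank lines."""
    for line in lines:
        url = line.strip()
        if url and is_pdf_link(url):
            yield url


# Hypothetical sample input; a real run would iterate over the lines of the
# downloaded link file instead (assumed format: one URL per line).
sample = [
    "https://example.com/reports/annual.PDF",
    "https://example.com/index.html",
    "",
    "https://example.com/paper.pdf?download=1",
]

for link in pdf_links(sample):
    print(link)
```

Because `pdf_links` is a generator, it can be fed an open file object directly, so a very large link list would not need to be loaded into memory at once.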
Related posts
- What do you do when your PC ran out internal HDD cables?
- Putting 5,998,794 books on IPFS
- r/DataHoarder community is mentioned in this: The Enduring Allure of the Library of Alexandria | On the Media | WNYC Studios
- Anyone here with 50TB,100TB+ of personal storage that isn't mostly movies/TV/porn ??
- Archive for software / comp sci books / ebooks?