You're kind of using the best solution. SingleFile and SingleFileZ are perfect solutions for personal collecting.
I use SingleFileZ with an adapted ISO timestamp as the filename prefix. Each saved file then gets moved to a per-month archive folder and indexed by the filename module of Memax. This way, all my web pages are archived including their content, and each save shows up as an event in my calendar for temporal retrieval.
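For anyone who wants to script the timestamp-prefix and per-month-folder part of this workflow, here is a minimal Python sketch. The exact timestamp format is not given above, so the `YYYY-MM-DDTHH.MM.SS` layout (dots instead of colons, since colons are not allowed in filenames on some systems) is an assumption, as are the function names and the `YYYY-MM` folder naming. The Memax indexing step is not covered.

```python
from datetime import datetime
from pathlib import Path
import shutil


def archive_name(original_name: str, saved_at: datetime) -> str:
    """Prefix a filename with a filename-safe ISO-like timestamp (assumed format)."""
    stamp = saved_at.strftime("%Y-%m-%dT%H.%M.%S")
    return f"{stamp} {original_name}"


def file_into_month_folder(src: Path, archive_root: Path, saved_at: datetime) -> Path:
    """Move src into archive_root/YYYY-MM/ under its timestamp-prefixed name."""
    month_dir = archive_root / saved_at.strftime("%Y-%m")  # assumed folder naming
    month_dir.mkdir(parents=True, exist_ok=True)
    dest = month_dir / archive_name(src.name, saved_at)
    shutil.move(str(src), str(dest))
    return dest
```

Calling `file_into_month_folder(Path("page.html"), Path("archive"), datetime.now())` would move the saved page into a folder for the current month. Because the prefix sorts chronologically, a plain directory listing already gives you a timeline.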
I use https://github.com/Y2Z/monolith. Keep in mind that the resulting HTML can be quite large, so you might want to process it a little if you care about the size. But the great thing about monolith is that it saves the whole page (HTML, CSS, JS and images) in a single file, which is perfect for offline archival even if the website is gone.
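If you want to call monolith from a script, a rough Python wrapper could look like the sketch below. It assumes the `monolith` binary is installed and on your PATH, and it only uses the `-o` output option shown in the project's README; the function names are made up for illustration.

```python
import subprocess
from pathlib import Path


def monolith_command(url: str, output: Path) -> list[str]:
    """Build the argument list for saving one page to a single HTML file."""
    return ["monolith", url, "-o", str(output)]


def save_page(url: str, output: Path) -> None:
    """Run monolith; raises CalledProcessError if it exits with an error."""
    subprocess.run(monolith_command(url, output), check=True)
```

Passing the arguments as a list rather than a shell string avoids quoting problems with URLs that contain `&` or `?`.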