-
ArchiveBox
🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...
-
wikiteam
Tools for downloading and preserving wikis. We archive wikis, from Wikipedia to the tiniest wikis. As of 2023, WikiTeam has preserved more than 350,000 wikis.
Reddit: yes, we are on Reddit, and there is a considerable amount of science and art there. I am already archiving all saved posts using reddit-save, and some low-traffic subreddits in their entirety using an unknown tool.
YouTube: a bunch of science there (Cody et al.). The idea is to first download all my personal playlists using youtube-dl's archive mode and then perhaps start downloading selected entire channels in low-res mode. Many videos have already been deleted from my playlists. YouTube comments should also be saved, but I am not sure how yet.
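As a rough sketch of the playlist plan above: youtube-dl's `--download-archive` option records the IDs of videos it has already fetched, so re-running the same command only picks up new items. The helper below only assembles the command line. The function name, the archive file name, the playlist URL, and the height cap are all hypothetical placeholders, and the format selector is just one way to ask for a low-resolution file.

```python
import subprocess


def build_youtube_dl_command(playlist_url, archive_file="downloaded.txt", max_height=None):
    """Assemble a youtube-dl command line for incremental playlist downloads.

    archive_file and max_height are illustrative defaults, not recommendations.
    """
    cmd = [
        "youtube-dl",
        "--download-archive", archive_file,  # skip videos already listed in this file
        "--ignore-errors",                   # keep going past deleted or private videos
    ]
    if max_height is not None:
        # Prefer a single file at or below the height cap, else fall back to the worst format.
        cmd += ["--format", f"best[height<={max_height}]/worst"]
    cmd.append(playlist_url)
    return cmd


def run_download(cmd):
    """Run the assembled command and return youtube-dl's exit code."""
    return subprocess.run(cmd, check=False).returncode
```

Calling `run_download(build_youtube_dl_command(url, max_height=360))` on a schedule would approximate the "archive mode" workflow. This sketch does not address saving comments.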
Bookmarks: these are at less risk, because archive.org has been doing a good job of keeping sites alive, but still. Bookmarks will be synchronized to a self-hosted WebDAV server using the floccus add-on and then archived using ArchiveBox. There are a few notes to it:
Web site crawls: some personal websites are so packed with info that it is worth saving them in their entirety. But on more complex sites, a naive crawl would produce an ungodly amount of duplicate data due to CGI parameters like pagination and sorting. I have only about 3 websites crawled with wget. This will require more thought and reading.
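One possible way to think about the duplicate-data problem is to normalize each URL before deciding whether to fetch it: drop query parameters that only change presentation, sort the rest, and ignore fragments. The sketch below uses only the Python standard library. The parameter names in `IGNORED_PARAMS` are assumptions for illustration. Which parameters are safe to drop depends on the site, and pagination parameters in particular often lead to distinct content, so they are not dropped by default here.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of presentation-only parameters; adjust per site.
IGNORED_PARAMS = frozenset({"sort", "order", "dir"})


def canonical_url(url, ignored_params=IGNORED_PARAMS):
    """Return a normalized form of url for duplicate detection."""
    parts = urlsplit(url)
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in ignored_params
    )
    # Lowercase the host, keep the path, rebuild the query, drop the fragment.
    return urlunsplit((parts.scheme, parts.netloc.lower(), parts.path, urlencode(kept), ""))


def dedupe_urls(urls, ignored_params=IGNORED_PARAMS):
    """Keep the first URL seen for each canonical form, preserving order."""
    seen = set()
    unique = []
    for url in urls:
        key = canonical_url(url, ignored_params)
        if key not in seen:
            seen.add(key)
            unique.append(url)
    return unique
```

A crawl queue filtered through `dedupe_urls` would fetch each listing once regardless of sort order. This is a pre-filter idea, not something wget does on its own.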
Wikis: looks easy with WikiTeam.