reddit-html-archiver
ArchiveBox
| reddit-html-archiver | ArchiveBox |
---|---|---|
Mentions | 12 | 248 |
Stars | 165 | 19,790 |
Growth | - | 1.4% |
Activity | 1.8 | 9.8 |
Last commit | almost 4 years ago | 8 days ago |
Language | Python | Python |
License | MIT License | MIT |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
reddit-html-archiver
-
/r/planetside will be going private on June 12th, and will not be coming back until Reddit reverses course on API pricing
Other options, like https://github.com/libertysoft3/reddit-html-archiver are not working anymore (I tried it to create a self-hosted /r/planetside backup).
-
This Reddit Community Has Been Archived
Well done, now you should make it sane. No need to reinvent the wheel here. Just rewrite reddit-html-archiver to use the raw json from redarcs rather than the pushshift api.
-
r/okbuddyretard will be "completely wiped from existence" according to one of the mods
I've seen several banned subs archived using https://github.com/libertysoft3/reddit-html-archiver
- What are Your favorite tools to backup reddit data? (Text Posts, Media Content, Comments..)
-
Archiving as much of Soundgasm as possible
https://github.com/libertysoft3/reddit-html-archiver can accomplish step 1 out of the box. Parse for every line including soundgasm and/or other domains you are targeting, and maybe run a dedupe on the list before download to lighten the load on yt-dl, since it wasn't optimized for that last I checked that deep (which is YEEEEARS ago fwiw)
- I’m leaving Reddit. If there’s a mass movement to do something about what’s happening, let me know.
- /r/NoNewNormal has been banned by Reddit. A good reminder that Reddit is run by fascists, and that all the subreddits that petitioned for this are book-burners. Are you a developer? Help us program the alternative. See comments for details.
- Welcome my r/NoNewNormal brethren
- r/NoNewNormal has been banned!
-
Is there a way I can archive the r/lounge subreddit?
You could try using https://github.com/libertysoft3/reddit-html-archiver which is the software we use to power our reddit archiving efforts over at https://the-eye.eu/r/
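The filter-and-dedupe step suggested in the Soundgasm thread above (keep only lines that link to the domains you are targeting, then dedupe the list before handing it to a downloader) could be sketched roughly as follows. This is an illustrative sketch, not part of reddit-html-archiver: the function name `extract_target_urls`, the `TARGET_DOMAINS` tuple, and the `soundgasm.net` hostname are assumptions made for the example.

```python
import re
from urllib.parse import urlparse

# Assumption: the hostname to match; the thread only says "soundgasm".
TARGET_DOMAINS = ("soundgasm.net",)

# Loose URL matcher: stops at whitespace, quotes, and bracket characters.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>()\[\]]+")


def extract_target_urls(lines, domains=TARGET_DOMAINS):
    """Return unique URLs on the target domains, in first-seen order."""
    seen = set()
    urls = []
    for line in lines:
        for url in URL_PATTERN.findall(line):
            url = url.rstrip(".,;")  # drop trailing sentence punctuation
            host = urlparse(url).hostname or ""
            on_target = any(host == d or host.endswith("." + d) for d in domains)
            if on_target and url not in seen:
                seen.add(url)
                urls.append(url)
    return urls
```

The resulting list can be written one URL per line and passed to whatever downloader you use, so that each link is fetched only once.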
ArchiveBox
-
Ask HN: What Underrated Open Source Project Deserves More Recognition?
Two projects I greatly appreciate, allowing me to easily archive my bandcamp and GOG purchases (after the initial setup anyways):
https://github.com/easlice/bandcamp-downloader
https://github.com/Kalanyr/gogrepoc
And I recently learned about archivebox, which I think is going to be a fast favorite and finally let me clear out my mess of tabs/bookmarks: https://github.com/ArchiveBox/ArchiveBox
- YaCy, a distributed Web Search Engine, based on a peer-to-peer network
-
Vice website is shutting down
If you really want to save the content for yourself, use something like https://archivebox.io/
I've been running a local instance for a few years now and download/save tech articles all the time. I can search and find them as needed.
-
An Introduction to the WARC File
API is coming soon (relatively, it's still a one-man project)! Stay tuned https://github.com/ArchiveBox/ArchiveBox/issues/496
I have an event-sourcing refactor in progress now to allow us to pluginize functionality like the API (similar to Home Assistant with a plugin app store), it will take a month or two. Next up is the REST API using the new plugin system.
-
Ask HN: How can I back up an old vBulletin forum without admin access?
I guess your best chance is to use something like https://archivebox.io/.
-
ArchiveBox – open-source self-hosted web archiving
Yeah this is a cool project but it was discussed 2 days ago.
As mentioned by the maintainer there, they even maintain a list of alternatives, very classy:
https://github.com/ArchiveBox/ArchiveBox/wiki/Web-Archiving-...
- ArchiveBox: Open-source self-hosted web archiving
- Linkhut: A Social Bookmarking Site
- Show HN: Rem: Remember Everything (open source)
- Bookmark manager with a focus on organization?
What are some alternatives?
redscarepod-archive
Wallabag - wallabag is a self hostable application for saving web pages: Save and classify articles. Read them later. Freely.
saidit - The reddit open source fork powering SaidIt
paimon-moe - Your best Genshin Impact companion! Help you plan what to farm with ascension calculator and database. Also track your progress with todo and wish counter.
redditPostArchiver - Easily archive important Reddit post threads onto your computer
SingleFile - Web Extension for saving a faithful copy of a complete web page in a single HTML file
eternity - bypass Reddit's 1000-item listing limits by externally storing your Reddit items (saved, created, upvoted, downvoted, hidden) in your own database
ArchivesSpace - The ArchivesSpace archives management tool
ripme - Downloads albums in bulk
grab-site - The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns
gwaripper - Tool for conveniently downloading audios from r/gonewildaudio and similar subreddits
Archivematica - Free and open-source digital preservation system designed to maintain standards-based, long-term access to collections of digital objects.