wayback-machine-downloader
go-readability
Metric | wayback-machine-downloader | go-readability
---|---|---
Mentions | 48 | 4
Stars | 5,034 | 643
Growth | - | 4.2%
Activity | 0.0 | 5.1
Last commit | 2 months ago | 17 days ago
Language | Ruby | HTML
License | GNU General Public License v3.0 or later | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
wayback-machine-downloader
-
Ask HN: Cool Useful GitHub Repos?
I just found this https://github.com/hartator/wayback-machine-downloader
Anyone have anything similarly interesting/cool/niche-useful?
-
ArchiveTeam is saving Blogger from Google deletion
Send ArchiveTeam the link on IRC or here and we can save it to archive.org, then later you can use wayback-machine-downloader to grab it from archive.org.
-
My TikTok was Hacked & Deleted and I GOT IT BACK!
This is where it gets tricky: you need to download the code from the Wayback Machine, and he was able to do that by following these steps: https://github.com/hartator/wayback-machine-downloader
-
Is there a way to quick download twitter images on the wayback machine?
Not sure if it will work for twitter, but I have used wayback-machine-downloader to batch download stuff.
-
Forgot to backup my WordPress files before I swapped webhosting provider, am I screwed?
In addition to archive.org, there is a GitHub repo for fetching website data. You can give it a try too. Here is the repo link
-
Can I please get help downloading and saving a website for offline use?
-
Hey guys, looks like we have a potential hacker on our hands. All of our company's files were deleted from our FTP. :( Is there any way we can get a cache of our website and restore everything? Any help or advice would be greatly appreciated. Thanks in advance!
Edit: Good news! I found a solution that saved me. I was able to download the full website (including images, JS, and CSS files) using this tool: https://github.com/hartator/wayback-machine-downloader
-
Hey guys, so a potential hacker managed to delete all of our company's files from our FTP. Yikes! Is there a way to retrieve a cache of our website and restore it? Any advice or tips would be greatly appreciated. Thanks in advance!
Edit: Thank you to everyone who suggested the Wayback Machine Downloader! It saved the day and allowed me to download the full website, including images, JS, and CSS files.
-
Have a lengthy flight: how to seamlessly mirror couple websites
I've used https://github.com/hartator/wayback-machine-downloader but it sometimes messes up CSS badly
-
what Do YOU Recommend?
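Several of the posts above describe batch-downloading archived pages. As a rough sketch of how captures can be enumerated by hand, the Python below builds a query URL for the Wayback Machine CDX API and a URL for a single raw capture. It is not taken from wayback-machine-downloader and makes no claim about how that tool works internally; the function names `cdx_query_url` and `snapshot_url` are hypothetical, and the particular filters chosen are assumptions.

```python
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"
SNAPSHOT_PREFIX = "https://web.archive.org/web"


def cdx_query_url(site, from_ts=None, to_ts=None):
    """Build a CDX query URL listing captures whose URL starts with `site`.

    Hypothetical helper: the filter choices below are assumptions.
    """
    params = [
        ("url", site),
        ("matchType", "prefix"),        # every capture under this prefix
        ("output", "json"),
        ("fl", "timestamp,original"),   # only the fields needed to fetch a capture
        ("filter", "statuscode:200"),   # skip redirects and errors
        ("collapse", "digest"),         # drop adjacent captures with identical content
    ]
    if from_ts:
        params.append(("from", from_ts))
    if to_ts:
        params.append(("to", to_ts))
    return CDX_ENDPOINT + "?" + urlencode(params)


def snapshot_url(timestamp, original):
    """URL of one capture; the `id_` suffix requests the content as archived,
    without the Wayback Machine's link rewriting or toolbar."""
    return f"{SNAPSHOT_PREFIX}/{timestamp}id_/{original}"
```

Fetching the query URL should return a JSON array whose first row holds the field names, followed by one `[timestamp, original]` row per capture; each row can then be passed to `snapshot_url`. Requesting unrewritten content is one way to avoid broken asset links, though it will not repair a stylesheet that was never archived.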
go-readability
-
Ask HN: Full-text browser history search forever?
I've had a lot of success by running HTML pages through Mozilla's readability[0] tool (actually the Go port of it[1]) before indexing them.
-
Which library/project do you wish was ported to golang?
https://github.com/go-shiori/go-readability https://github.com/mauidude/go-readability
-
Show HN: Forlater.email – an email-based bookmarking service
I'm using https://github.com/go-shiori/go-readability -- a Go re-implementation of Mozilla's readability-js library. It does a pretty good job.
-
Show HN: Hackernews_tui – A Terminal UI to Browse Hacker News Discussions
Two projects that do this with nearly identical output:
- https://github.com/eafer/rdrview
- https://github.com/go-shiori/go-readability
Pipe the filtered HTML output into your favorite textual web browser for an ideal reading experience.
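The comments above describe stripping a page down to its readable content before indexing it or piping it to a text browser. The Python sketch below shows only the general idea with the standard library: it drops the contents of a few tags that usually hold clutter and keeps the remaining text. It is a crude stand-in, not the readability algorithm and not the go-readability API; the `TextExtractor` class, the `extract_text` function, and the list of skipped tags are all assumptions made for illustration.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping tags that usually hold clutter.

    Hypothetical helper: a real readability implementation scores
    candidate nodes instead of using a fixed skip list.
    """

    SKIP = {"script", "style", "nav", "header", "footer", "aside"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0   # > 0 while inside any skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(html):
    """Return the page's remaining text as one space-joined string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

The returned string can be handed to a full-text indexer in place of the raw HTML. Because chunks are joined with spaces, text split by inline tags in the middle of a word will gain a space, which is usually acceptable for search but not for display.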
What are some alternatives?
savepagenow - A simple Python wrapper and command-line interface for archive.org’s "Save Page Now" capturing service
Readability4J - A Kotlin port of Mozilla’s Readability. It extracts a website’s relevant content and removes all clutter from it.
warrick - Recover lost websites from the Web Infrastructure
rdrview - Firefox Reader View as a command line tool
neocities - Neocities.org - the web site. The entire thing. Yep, we're completely open source.
hnrss - Custom, realtime RSS feeds for Hacker News
Hexo - A fast, simple & powerful blog framework, powered by Node.js.
readability - A standalone version of the readability lib
wayback-machine-spn-scripts - Bash scripts which interact with Internet Archive Wayback Machine's Save Page Now
nb - CLI and local web plain text note‑taking, bookmarking, and archiving with linking, tagging, filtering, search, Git versioning & syncing, Pandoc conversion, + more, in a single portable script.
gba-remote-play - 📡 Stream Raspberry Pi games to a GBA via Link Cable.
awesome-hackernews - A curated list of FOSS tools to improve the Hacker News experience.