wget-lua
ArchiveBox
Metric | wget-lua | ArchiveBox
---|---|---
Mentions | 2 | 248
Stars | 81 | 19,790
Growth | - | 3.4%
Activity | 6.1 | 9.8
Last commit | 3 months ago | 4 days ago
Language | C | Python
License | GNU General Public License v3.0 only | MIT
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
wget-lua
-
Alternative to HTTrack (website copier) as of 2023?
You're using it wrong; RTFM. wget is still the standard. It's also extensible beyond the base feature set: take, for example, wget-lua, ArchiveTeam's well-maintained go-to for nearly all of the group's scraping projects.
-
Kiwix - Access Wikipedia (And More) With no Internet
There are updates changed names but still use more frequent updates than the dumps to get started. I know there is Kiwix and XOWA. You could probably build it up to current and use wget-at to scrape Wikipedia solo. If you want it in HTML, it'll probably only be 100TB, give or take. I'm wondering if any of the groups are still active on IRC. I saw mentions of a few, but I lost my place in all the mobile Chrome tabs.
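As a rough illustration of the plain-wget approach the posts above recommend, here is a minimal Python sketch that assembles a site-mirroring command. The helper name `build_mirror_command`, the example URL, and the WARC file name are hypothetical; the flags are standard GNU wget options, and the choice of flags is an assumption about what a typical mirror needs, not a recommendation from the commenters. Nothing specific to wget-lua (such as its Lua scripting hooks) is shown.

```python
import shlex


def build_mirror_command(url, warc_name=None):
    """Return a wget argument list for mirroring `url` (hypothetical helper)."""
    cmd = [
        "wget",
        "--mirror",            # recursive download with timestamping
        "--convert-links",     # rewrite links so the copy can be browsed offline
        "--adjust-extension",  # save HTML/CSS with matching file extensions
        "--page-requisites",   # also fetch images, CSS, and scripts
        "--no-parent",         # never ascend above the starting directory
        "--wait=1",            # pause one second between requests
    ]
    if warc_name:
        # Additionally record the crawl in a WARC file (assumed optional here).
        cmd.append(f"--warc-file={warc_name}")
    cmd.append(url)
    return cmd


if __name__ == "__main__":
    # Print the command rather than running it, so the sketch has no side effects.
    print(shlex.join(build_mirror_command("https://example.com/", warc_name="example")))
```

The list can be passed to `subprocess.run` once the target site and crawl rate have been checked; building it as a list rather than a string avoids shell-quoting problems with unusual URLs.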
ArchiveBox
-
Ask HN: What Underrated Open Source Project Deserves More Recognition?
Two projects I greatly appreciate, allowing me to easily archive my bandcamp and GOG purchases (after the initial setup anyways):
https://github.com/easlice/bandcamp-downloader
https://github.com/Kalanyr/gogrepoc
And I recently learned about archivebox, which I think is going to be a fast favorite and finally let me clear out my mess of tabs/bookmarks: https://github.com/ArchiveBox/ArchiveBox
- YaCy, a distributed Web Search Engine, based on a peer-to-peer network
-
Vice website is shutting down
If you really want to save the content for yourself, use something like https://archivebox.io/
I've been running a local instance for a few years now and download/save tech articles all the time. I can search and find them as needed.
-
An Introduction to the WARC File
API is coming soon (relatively, it's still a one-man project)! Stay tuned https://github.com/ArchiveBox/ArchiveBox/issues/496
I have an event-sourcing refactor in progress now to allow us to pluginize functionality like the API (similar to Home Assistant with a plugin app store); it will take a month or two. Next up is the REST API using the new plugin system.
-
Ask HN: How can I back up an old vBulletin forum without admin access?
I guess your best chance is to use something like https://archivebox.io/.
-
ArchiveBox – open-source self-hosted web archiving
Yeah this is a cool project but it was discussed 2 days ago.
As mentioned by the maintainer there, they even maintain a list of alternatives, very classy:
https://github.com/ArchiveBox/ArchiveBox/wiki/Web-Archiving-...
- ArchiveBox: Open-source self-hosted web archiving
- Linkhut: A Social Bookmarking Site
- Show HN: Rem: Remember Everything (open source)
- Bookmark manager with a focus on organization?