distributed-wikipedia-mirror
internetarchive-downloader
| | distributed-wikipedia-mirror | internetarchive-downloader |
|---|---|---|
| Mentions | 11 | 7 |
| Stars | 603 | 121 |
| Growth | 1.5% | - |
| Activity | 3.6 | 3.6 |
| Last commit | 3 months ago | 4 months ago |
| Language | TypeScript | Python |
| License | - | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
distributed-wikipedia-mirror
- Distributed Wikipedia Mirror Project: Putting Wikipedia Snapshots on IPFS
- Is it possible (and does it make sense) to self-host OpenStreetMap, Wikipedia and a complete search engine?
  You might like this repo. This tech was/is used in Turkey since they banned access to Wikipedia. Being read-only is a feature, because nobody should be able to manipulate the contents of this distributed copy.
- Uhhh wtf is this? 'Distributed Wikipedia Mirror Project' built on GME blockchain???
  Link to the github
- Wikiless: A free open source alternative Wikipedia front-end focused on privacy
- An idea about permanent hosting SCIHub on IPFS
  So I thought there is a very suitable way to enhance the availability of SCIHub: store SCIHub papers on the IPFS network through Crust, and develop a SCIHub-IPFS-Mirror to facilitate user access (similar to the project [distributed-wikipedia-mirror](https://github.com/ipfs/distributed-wikipedia-mirror)).
- What are the odds of the Internet Archive getting shut in the next 5 years and what will we do after it is shut?
  Follow the cohost steps: https://github.com/ipfs/distributed-wikipedia-mirror
- Internet in a Box
  For my Wikipedia cache I use IPFS companion and https://en.wikipedia-on-ipfs.org/wiki/. All the devices that use this approach on a local network can share data. And to make sure unused Wikipedia pages aren't garbage collected, https://github.com/ipfs/distributed-wikipedia-mirror#cohost-...
- Tantivy v0.15 released! Now backed by Quickwit Inc.!
  Well spotted. Like IPFS, there's a comment about that here: https://github.com/tantivy-search/tantivy/pull/1067#issuecomment-853139923 that points to the distributed wikipedia mirror project https://github.com/ipfs/distributed-wikipedia-mirror/issues/76
internetarchive-downloader
- Does anyone know how to download the images from borrow-only Internet Archive books?
- Is there a way to download all files in the URLs list for an archived site?
  This tool works well for what you're asking for: https://github.com/john-corcoran/internetarchive-downloader
- Looking for some help in downloading a few thousand files from archive.org on ubuntu. wget is estimated to take 2 months... I figured I should ask the fellow data-hoarders!
- How to view more than 25 results in an archive collection?
  Another option to get all items in a collection, which I used in a script I put together for Internet Archive downloads, is the Internet Archive Python Library. Official documentation on the relevant function is at https://archive.org/services/docs/api/internetarchive/quickstart.html#searching and an example of using it in code is around line 839 of https://github.com/john-corcoran/internetarchive-downloader/blob/61395ae4fbc826d9578678ed3299ada45d5ec3fd/ia_downloader.py
- Pause Downloading of Collection From the Internet Archive?
  Using the ‘-r’ flag with my Python script will allow resuming in-progress files, and if you run the script with the same command line arguments each time, you can pick up a collection where you left off. It’s at https://github.com/john-corcoran/internetarchive-downloader
- Extracting all links from a webpage without html?
  You may want to try this Python script I’ve finished recently for Internet Archive downloads: https://github.com/john-corcoran/internetarchive-downloader. Collections should work fine if you pass them with the prefix ‘collection:’, e.g. ‘collection:nasa’. If you want to give it a try, let me know if you have any questions!
- What are the odds of the Internet Archive getting shut in the next 5 years and what will we do after it is shut?
  I’ve made a Python script for this at https://github.com/john-corcoran/internetarchive-downloader which may assist?
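
Several of the mentions above describe listing every item in a collection through the Internet Archive Python Library, using a query with the `collection:` prefix. A minimal sketch of that idea follows. It is not taken from `ia_downloader.py`; the helper name `collection_identifiers` and its injectable `search` parameter are hypothetical and exist only for illustration. By default it falls back to `search_items` from the `internetarchive` package, which needs to be installed and needs network access.

```python
from typing import Callable, Iterable, Iterator, Optional


def collection_identifiers(
    collection: str,
    search: Optional[Callable[[str], Iterable[dict]]] = None,
) -> Iterator[str]:
    """Yield the identifier of each item found in an Internet Archive collection.

    `search` is a hypothetical injection point: any callable that takes a
    query string and returns result dicts with an "identifier" key.
    """
    if search is None:
        # Imported lazily so the helper can be tried out with a stand-in
        # search function when the package or network is unavailable.
        from internetarchive import search_items
        search = search_items

    # The "collection:" prefix restricts the search to a single collection,
    # e.g. "collection:nasa".
    for result in search(f"collection:{collection}"):
        yield result["identifier"]
```

With the package installed, iterating over `collection_identifiers("nasa")` should walk through the whole collection rather than a single page of results, since the library handles paging of the search results itself.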
What are some alternatives?
ipfs - Peer-to-peer hypermedia protocol
archive-downloader - A downloader for archive.org
tantivy - Tantivy is a full-text search engine library inspired by Apache Lucene and written in Rust [Moved to: https://github.com/quickwit-oss/tantivy]
BaseCase-3 - This is a Python Application that can be used to gather all files of a certain type from any archive.com repository
tantivy-wasm
GGet - Multithreaded download accelerator written in Go
iiab - Internet-in-a-Box - Build your own LIBRARY OF ALEXANDRIA with a Raspberry Pi !
pup - Parsing HTML at the command line
search-benchmark-game - Search engine benchmark (Tantivy, Lucene, PISA, ...)
internetarchive - A Python and Command-Line Interface to Archive.org
ipfs-backup - Backup encrypted files on ipfs