PlainTextWikipedia Alternatives
Similar projects and alternatives to PlainTextWikipedia
NOTE:
The number of mentions on this list indicates mentions on common posts plus user suggested alternatives.
Hence, a higher number means a better PlainTextWikipedia alternative or higher similarity.
PlainTextWikipedia reviews and mentions
Posts with mentions or reviews of PlainTextWikipedia.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2023-04-11.
-
How to download all Wikipedia articles in plaintext (no links, images, talk, revisions, SQL, XML, etc.).
You'd have to convert the dump yourself. I found this project, but it was last updated two years ago, so who knows if it still works. They uploaded a dump from 2020 if that is still useful for you. (note, while plaintext, the output is still encapsulated in JSON) Here's another project that converted the dump to plaintext, but the last one was from 2014. You can probably find more by Googling "Wikipedia plaintext dump".
-
What the fuck
Funnily enough, the "Simplified English Wikipedia" dump file is about 1GB, as stated here: https://github.com/daveshap/PlainTextWikipedia
-
Update: Indexing Wikipedia offline with SOLR
Great news everyone! You can now index Wikipedia offline with a powerful indexing engine! This is not meant to be a replacement for KIWIX or anything like that; it is more for programmatic use. Say, for instance, you wanted to write your own search engine. Here's the repo: https://github.com/daveshap/PlainTextWikipedia
-
Help with indexing offline wikipedia with SOLR
Here's my base project: https://github.com/daveshap/PlainTextWikipedia
- PlainTextWikipedia: Convert Wikipedia database dumps into plain text JSON files
- Updated: I've saved all of Wikipedia into a SQLITE database!
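Several of the posts above describe converting a Wikipedia XML dump into plain-text JSON. The following is a minimal sketch of that idea in Python, not the PlainTextWikipedia project's actual code. The function names (local_name, strip_markup, iter_articles, to_json_lines) are hypothetical, the sample document is made up, and the regex cleanup is deliberately simplistic: real dumps contain nested templates, tables, references, and redirects that need a proper wikitext parser.

```python
import io
import json
import re
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the XML namespace, e.g. '{http://...}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def strip_markup(wikitext):
    """Very rough wikitext cleanup (assumption: only simple markup occurs)."""
    text = re.sub(r"\{\{[^{}]*\}\}", "", wikitext)                 # drop non-nested templates
    text = re.sub(r"\[\[(?:[^|\]]*\|)?([^\]]*)\]\]", r"\1", text)  # keep link labels only
    text = re.sub(r"'{2,}", "", text)                              # drop bold/italic quote runs
    return text.strip()


def iter_articles(stream):
    """Yield {'title', 'text'} dicts from a MediaWiki XML export stream."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        title, text = "", ""
        for child in elem.iter():
            name = local_name(child.tag)
            if name == "title":
                title = child.text or ""
            elif name == "text":
                text = child.text or ""
        yield {"title": title, "text": strip_markup(text)}
        elem.clear()  # release the finished page; full dumps are very large


def to_json_lines(stream):
    """Yield one JSON string per article (plain text wrapped in JSON)."""
    for article in iter_articles(stream):
        yield json.dumps(article, ensure_ascii=False)


if __name__ == "__main__":
    # Tiny made-up sample standing in for a real dump file.
    sample = (
        b'<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">'
        b"<page><title>Example</title><revision>"
        b"<text>'''Example''' is a [[word|term]] in [[English]].{{stub}}</text>"
        b"</revision></page></mediawiki>"
    )
    for line in to_json_lines(io.BytesIO(sample)):
        print(line)
```

Streaming with iterparse and clearing each page after use keeps memory roughly constant, which matters because a full dump does not fit comfortably in RAM. For a real dump, the stream would typically be an opened (and decompressed) dump file rather than an in-memory buffer.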
Stats
Basic PlainTextWikipedia repo stats
6
261
1.2
almost 3 years ago
daveshap/PlainTextWikipedia is an open source project licensed under MIT License which is an OSI approved license.
The primary programming language of PlainTextWikipedia is Python.