wikibase-cli
relatedhow
| wikibase-cli | relatedhow |
---|---|---|
Mentions | 1 | 2 |
Stars | 218 | 3 |
Growth | - | - |
Activity | 7.9 | 5.7 |
Latest commit | 29 days ago | 6 months ago |
Language | JavaScript | Python |
License | MIT License | BSD 3-clause "New" or "Revised" License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
wikibase-cli
-
Data-Mining Wikipedia for Fun and Profit
I learned SPARQL recently, and would agree it's complicated to get info out of Wikidata.
However, having read the article, they didn't have an easy time scraping Wikipedia either.
So I'd probably still recommend people look into Wikidata and SPARQL if they want to do this kind of thing.
There are a few tools that generate queries for you, and some CLI tools as well:
https://github.com/maxlath/wikibase-cli#readme
It makes Wikipedia better too, in a virtuous cycle: some infoboxes, like the ones he scraped, are being converted to be populated automatically from Wikidata.
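For readers who have not seen what such a query looks like, here is a minimal sketch in Python. It assumes the public Wikidata Query Service endpoint at `https://query.wikidata.org/sparql` and the standard SPARQL JSON results layout. The helper names `build_query_url` and `parse_bindings` and the example query are illustrative only; they are not taken from wikibase-cli or from the comment above.

```python
from urllib.parse import urlencode

# Assumption: the public Wikidata Query Service endpoint.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Illustrative query: ten items together with their parent taxon (property P171).
PARENT_TAXON_QUERY = """
SELECT ?taxon ?parent WHERE {
  ?taxon wdt:P171 ?parent .
}
LIMIT 10
"""


def build_query_url(sparql):
    """Return a GET URL that asks the endpoint for JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": sparql, "format": "json"})


def parse_bindings(payload):
    """Flatten a SPARQL JSON result into a list of {variable: value} dicts."""
    return [
        {name: cell["value"] for name, cell in row.items()}
        for row in payload["results"]["bindings"]
    ]
```

The URL from `build_query_url(PARENT_TAXON_QUERY)` can be fetched with any HTTP client, and the decoded JSON response passed to `parse_bindings`. Nothing here sends a request on its own.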
relatedhow
-
Tree of Life Explorer
Also check out my hobby project: https://relatedhow.kodare.com/
It's not as fancy looking, but it's a lot more complete.
-
Data-Mining Wikipedia for Fun and Profit
I am doubtful. I tried for a long time to use it to get data for my taxonomic graph project (https://relatedhow.kodare.com/) and SPARQL was just not usable at all. The biggest problem was the 60s time limit. Totally not workable for what I wanted. I also had issues with seemingly inconsistent results, but it was hard to tell.
I ended up loading the full nightly DB dump and stream-filtering it straight from the zip instead. Faster, and it actually worked.
The code to do that is at https://github.com/boxed/relatedhow
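The streaming approach described above can be sketched roughly as follows. This is not the code from the relatedhow repository. It is a minimal illustration that assumes a gzip-compressed Wikidata JSON dump in which each entity sits on its own line inside one large JSON array, and it uses property P171 (parent taxon) as the filter. The function names are hypothetical.

```python
import gzip
import json


def iter_entities(path):
    """Yield one entity dict at a time without loading the whole dump."""
    with gzip.open(path, "rt", encoding="utf-8") as lines:
        for line in lines:
            # Each entity line ends with a comma; the first and last lines
            # are the array brackets.
            line = line.strip().rstrip(",")
            if line in ("", "[", "]"):
                continue
            yield json.loads(line)


def filter_dump(path, property_id="P171"):
    """Yield the IDs of entities that carry a claim for property_id."""
    for entity in iter_entities(path):
        if property_id in entity.get("claims", {}):
            yield entity["id"]
```

Because both functions are generators, memory use stays roughly constant regardless of dump size, and there is no query time limit to work around.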
What are some alternatives?
EasierRDF - Making RDF easy enough for most developers
pkg - Package your Node.js project into an executable
qlever - Very fast SPARQL Engine, which can handle very large knowledge graphs like the complete Wikidata, offers context-sensitive autocompletion for SPARQL queries, and allows combination with text search. It's faster than engines like Blazegraph or Virtuoso, especially for queries involving large result sets.
atomically - Read and write files atomically and reliably.