wtf_wikipedia VS wikipedia_ql

Compare wtf_wikipedia and wikipedia_ql and see how they differ.

wtf_wikipedia

a pretty-committed wikipedia markup parser (by spencermountain)

wikipedia_ql

Query language for efficient data extraction from Wikipedia (by zverok)
Metric         wtf_wikipedia    wikipedia_ql
Mentions       1                3
Stars          743              357
Growth         -                -
Activity       8.0              0.0
Last commit    13 days ago      about 2 years ago
Language       JavaScript       Python
License        MIT License      MIT License
Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

wtf_wikipedia

Posts with mentions or reviews of wtf_wikipedia. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-03-25.
  • Experimental library for scraping websites using OpenAI's GPT API
    7 projects | news.ycombinator.com | 25 Mar 2023
    This may finally be a solution for scraping wikipedia and turning it into structured data. (Or do we even need structured data in the post-AI age?)

    Mediawiki is notorious for being hard to parse:

    * https://github.com/spencermountain/wtf_wikipedia#ok-first- - why it's hard

    * https://techblog.wikimedia.org/2022/04/26/what-it-takes-to-p... - an entire article about parsing page TITLES

    * https://osr.cs.fau.de/wp-content/uploads/2017/09/wikitext-pa... - a paper published about a wikitext parser

wikipedia_ql

Posts with mentions or reviews of wikipedia_ql. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-03-25.

What are some alternatives?

When comparing wtf_wikipedia and wikipedia_ql you can also consider the following projects:

sdow - Six Degrees of Wikipedia

scrapeghost - 👻 Experimental library for scraping websites using OpenAI's GPT API.

anon - tweet about anonymous Wikipedia edits from particular IP address ranges

pastevents - A structured, searchable archive of Wikipedia's "Current Events" portal

duckling - Language, engine, and tooling for expressing, testing, and evaluating composable language rules on input strings.

weheartit - A fast, reliable API wrapper for weheartit.com [Moved to: https://github.com/aswinnnn/weheartpy]

autoscraper - A Smart, Automatic, Fast and Lightweight Web Scraper for Python

react-relay - Relay is a JavaScript framework for building data-driven React applications.

artwork - GraphQL Foundation artwork

reality - Comprehensive data proxy to knowledge about real world