Python Wikipedia

Open-source Python projects categorized as Wikipedia

Top 23 Python Wikipedia Projects

  • mwparserfromhell

    A Python parser for MediaWiki wikicode

  • Project mention: Processing Wikipedia Dumps With Python | /r/programming | 2023-05-18

    There's also https://github.com/earwig/mwparserfromhell, if you don't want to roll your own.

  • wikiteam

    Tools for downloading and preserving wikis. We archive wikis, from Wikipedia to the tiniest wikis. As of 2023, WikiTeam has preserved more than 350,000 wikis.

  • Project mention: Miraheze to Shut Down | news.ycombinator.com | 2023-06-18

    WikiTeam is working on the archival, with the usual XML dumps and image dumps. You can follow updates and see how to help:

    https://github.com/WikiTeam/wikiteam/issues/465#issuecomment...

    https://wiki.archiveteam.org/index.php/Miraheze

    Already before the announcement we had XML dumps for thousands of Miraheze wikis.

  • pywikibot

    A Python library that interfaces with the MediaWiki API. This is a mirror from gerrit.wikimedia.org. Do not submit any patches here. See https://www.mediawiki.org/wiki/Developer_account for contributing.

  • wik

    wik is used to get information about anything from the shell using Wikipedia.

  • Wikipedia-API

    Python wrapper for Wikipedia

  • wikipedia_ql

    Query language for efficient data extraction from Wikipedia

  • WordDumb

    A calibre plugin that generates Kindle Word Wise and X-Ray files for KFX, AZW3, MOBI and EPUB eBooks.

  • Project mention: Create Kindle X-ray with calibre? | /r/Calibre | 2023-07-08

    Manual here: https://xxyzz.github.io/WordDumb/

  • mwclient

    Python client library to interface with the MediaWiki API

  • isbntools

    Python app/framework for 'all things ISBN' including metadata, descriptions, covers...

  • codex

    CoDEx: A set of knowledge graph Completion Datasets Extracted from Wikidata and Wikipedia (by tsafavi)

  • Mediawiker

    A plugin for the Sublime Text editor that lets you use it as a wiki editor on MediaWiki-based sites such as Wikipedia and many others.

  • japanese-words-to-vectors

    Word2vec (word to vectors) approach for Japanese language using Gensim and Mecab.

  • danker

    Compute PageRank on >3 billion Wikipedia links on off-the-shelf hardware.

  • wistalk

    Wistalk: Analyze a Wikipedia user's activity

  • Wikipedia-Article-Scraper

    A complete Python text analytics package that allows users to search for a Wikipedia article, scrape it, conduct basic text analytics and integrate it into a data pipeline without writing excessive code.

  • wikifunctions

    Python functions for retrieving data from the MediaWiki/Wikipedia API

  • wiki_dump

    A library that assists in traversing and downloading from Wikimedia Data Dumps and their mirrors.

  • NLP-Model-for-Corpus-Similarity

    An NLP algorithm I developed to determine the similarity or relation between two documents/Wikipedia articles. Inspired by the cosine similarity algorithm and built from WordNet.

  • witokit

    A Python toolkit to generate a tokenized dump of Wikipedia for NLP

  • taxopedia

    Taxonomic trees (cladograms) from Wikipedia-scraped data.

  • movie-blog-automation

    Movie plot Blog Automation Project

  • Project mention: Blog Automation using Python & Blogger | /r/PythonProjects | 2023-06-22

    Here is the GitHub repo: https://github.com/pj8912/wiki-blog-automation. Clone it and follow the instructions to automate the process of creating your own movie plots website, and have fun! 😉

  • MediaWiki-Tools

    Tools for getting data from MediaWiki websites

  • pastevents

    A structured, searchable archive of Wikipedia's "Current Events" portal

  • Project mention: 68k.news: Basic HTML Google News for Vintage Computers | news.ycombinator.com | 2023-06-16

    I share the frustration with the major online news portals, and have in fact built my own portal powered by Wikipedia[1].

    But eventually I realized that my biggest gripe with news today isn't the presentation but the content. And I'm not talking about biases or sensationalism – I'm talking about the news items themselves.

    Much of what passes as news today is stuff like "15 people die when a copper mine collapses in Chile". I'm trying to get a big picture view of the world, and I don't believe that such stories are at all conducive to that endeavor. News as we know it is just an endless stream of random events, apparently selected according to a handful of crude criteria, the most important one being dead people. I've been a keen follower of global news for many years, and I don't feel that I'm understanding anything.

    Where are the truly novel approaches to painting a picture of what the world is today? Where are the quantitative news portals, the event pattern search engines, the automatically derived trends? I'm still looking.

    [1] https://pastevents.org

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source Wikipedia projects in Python? This list will help you:

Rank Project Stars
1 mwparserfromhell 698
2 wikiteam 686
3 pywikibot 612
4 wik 607
5 Wikipedia-API 532
6 wikipedia_ql 357
7 WordDumb 332
8 mwclient 305
9 isbntools 202
10 codex 136
11 Mediawiker 134
12 japanese-words-to-vectors 83
13 danker 53
14 wistalk 24
15 Wikipedia-Article-Scraper 17
16 wikifunctions 13
17 wiki_dump 9
18 NLP-Model-for-Corpus-Similarity 9
19 witokit 9
20 taxopedia 7
21 movie-blog-automation 6
22 MediaWiki-Tools 4
23 pastevents 3
