witokit

A Python toolkit to generate a tokenized dump of Wikipedia for NLP (by akb89)

Witokit Alternatives

Similar projects and alternatives to witokit based on common topics and language

  • wit

    WIT (Wikipedia-based Image Text) Dataset is a large multimodal multilingual dataset comprising 37M+ image-text sets with 11M+ unique images across 100+ languages. (by google-research-datasets)

  • wiki_dump

    A library that assists in traversing and downloading from Wikimedia Data Dumps and their mirrors.

  • wikiteam

    Tools for downloading and preserving wikis. We archive wikis, from Wikipedia to the tiniest wikis. As of 2023, WikiTeam has preserved more than 350,000 wikis.

  • wp2git

    Downloads and imports Wikipedia page histories to a git repository

  • trankit

    Trankit is a Light-Weight Transformer-based Python Toolkit for Multilingual Natural Language Processing

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. A higher number therefore indicates a better witokit alternative or a more similar project.

witokit reviews and mentions

Posts with mentions or reviews of witokit. We have used some of these posts to build our list of alternatives and similar projects.

Stats

Basic witokit repo stats
1
9
2.6
over 3 years ago

akb89/witokit is an open source project licensed under the MIT License, which is an OSI-approved license.

The primary programming language of witokit is Python.
