bitextor
trankit
Metric | bitextor | trankit |
---|---|---|
Mentions | 2 | 1 |
Stars | 278 | 707 |
Growth | 1.1% | - |
Activity | 5.9 | 5.7 |
Last commit | 8 months ago | 13 days ago |
Language | Python | Python |
License | GNU General Public License v3.0 only | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
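As a rough illustration of the two metrics described above, the sketch below computes month-over-month star growth and a recency-weighted commit count. The site's exact formulas are not given here, so the exponential half-life weighting, the function names, and the 275-star figure from a month earlier are all assumptions made for the example.

```python
from typing import Iterable


def month_over_month_growth(stars_now: int, stars_month_ago: int) -> float:
    """Percentage change in stars relative to the count one month earlier."""
    if stars_month_ago <= 0:
        raise ValueError("stars_month_ago must be positive")
    return (stars_now - stars_month_ago) / stars_month_ago * 100.0


def recency_weighted_commits(ages_in_days: Iterable[float],
                             half_life_days: float = 30.0) -> float:
    """Sum of per-commit weights where each weight halves every half_life_days.

    This is an assumed weighting scheme, used only to show how recent
    commits can count for more than older ones.
    """
    return sum(0.5 ** (age / half_life_days) for age in ages_in_days)


# Hypothetical input: 275 stars a month ago, 278 stars now.
growth = round(month_over_month_growth(278, 275), 1)

# Three commits: made today, 30 days ago, and 60 days ago.
score = recency_weighted_commits([0, 30, 60])
```

Under this weighting, a commit made today counts fully, one made a half-life ago counts half, and so on, so a project with the same number of commits but more recent ones scores higher.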
Trankit v1.0.0 - An open-source Transformer-based Multilingual NLP Toolkit for 56 languages is out.
Trankit is written in Python and can be easily installed via pip. Our code and pretrained models are publicly available at: https://github.com/nlp-uoregon/trankit
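A minimal sketch of what using it could look like after `pip install trankit`. It assumes the `Pipeline` class and `tokenize` method from the Trankit package, and that the result is a dictionary with a `sentences` list whose entries each hold a `tokens` list of items with a `text` field. The helper names `token_texts` and `tokenize_with_trankit` are made up for this example.

```python
from typing import Any, Dict, List


def token_texts(doc: Dict[str, Any]) -> List[List[str]]:
    """Collect token strings per sentence from a Trankit-style result dict.

    Assumes the layout doc["sentences"][i]["tokens"][j]["text"].
    """
    return [
        [tok["text"] for tok in sent.get("tokens", [])]
        for sent in doc.get("sentences", [])
    ]


def tokenize_with_trankit(text: str, lang: str = "english") -> List[List[str]]:
    """Tokenize text with Trankit and return token strings per sentence."""
    # Imported lazily so this module still loads when trankit is not installed.
    from trankit import Pipeline

    # Creating the pipeline fetches the pretrained models for `lang` if needed.
    pipeline = Pipeline(lang)
    return token_texts(pipeline.tokenize(text))
```

Calling `tokenize_with_trankit("Hello! This is Trankit.")` would need the package and its pretrained English models to be available locally, so the first call can take a while.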
What are some alternatives?
ArchiveBox - 🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...
spaCy - 💫 Industrial-strength Natural Language Processing (NLP) in Python
Hebrew-Tokenizer - A very simple python tokenizer for Hebrew text.
Stanza - Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages
grab-site - The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns
transformers - 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
nematus - Open-Source Neural Machine Translation in Tensorflow
argilla - Argilla is a collaboration platform for AI engineers and domain experts that require high-quality outputs, full data ownership, and overall efficiency.
sentence-splitter - Text to sentence splitter using heuristic algorithm by Philipp Koehn and Josh Schroeder.
wiktextract - Wiktionary dump file parser and multilingual data extractor
OpenNMT-py - Open Source Neural Machine Translation and (Large) Language Models in PyTorch
flair - A very simple framework for state-of-the-art Natural Language Processing (NLP)