Lemmatization-lists Alternatives
Similar projects and alternatives to lemmatization-lists based on common topics and language
-
trankit
Trankit is a Light-Weight Transformer-based Python Toolkit for Multilingual Natural Language Processing
-
awesome-sentiment-analysis
Repository with everything necessary for sentiment analysis and related areas
-
Awesome-pytorch-list
A comprehensive list of PyTorch-related content on GitHub, such as different models, implementations, helper libraries, tutorials, etc.
-
CogCompNLP
CogComp's Natural Language Processing Libraries and Demos: Modules include lemmatizer, ner, pos, prep-srl, quantifier, question type, relation-extraction, similarity, temporal normalizer, tokenizer, transliteration, verb-sense, and more.
lemmatization-lists reviews and mentions
-
Ambiguous spellings
Maintaining such a data set is a massive undertaking, so it's mostly taken from https://github.com/michmech/lemmatization-lists. At the top of the file you'll see some additional entries I've added to deal with personal pronouns and numbers.
-
Is there a text list of words and their variations?
Another one to add to your list: https://github.com/michmech/lemmatization-lists
-
Trying to build a lemmatizer from scratch
One approach might be to take a lemmatization list, like the lemma-token lists at https://github.com/michmech/lemmatization-lists/, and compile it into a Finite State Transducer. The Helsinki FST package, for instance, has an hfst-strings2fst command to compile pairs of strings into a transducer. You might need to do some reformatting of the input first.
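The reformatting step mentioned above could look roughly like the following Python sketch. It assumes each line of a list file is a tab-separated pair with the lemma first and the inflected form second; check the actual files before relying on that order. It also assumes the transducer compiler accepts one `input:output` pair per line, with literal colons escaped by a backslash. That escaping rule is a guess, so consult the HFST documentation. All function names here are hypothetical helpers, not part of any library.

```python
from collections import defaultdict


def parse_pairs(lines):
    """Yield (lemma, form) tuples from tab-separated lines.

    Assumption: the lemma is the first column and the inflected
    form is the second. Lines without a tab are skipped.
    """
    for raw in lines:
        # Drop a possible byte-order mark and the line ending.
        line = raw.lstrip("\ufeff").rstrip("\r\n")
        if "\t" not in line:
            continue
        lemma, form = line.split("\t", 1)
        yield lemma.strip(), form.strip()


def build_lookup(pairs):
    """Map each lowercased form to the set of lemmas listed for it."""
    lookup = defaultdict(set)
    for lemma, form in pairs:
        lookup[form.lower()].add(lemma)
    return lookup


def escape(text):
    """Backslash-escape backslashes and colons.

    Assumption: this is how the target tool expects a literal
    colon inside a string pair to be written.
    """
    return text.replace("\\", "\\\\").replace(":", "\\:")


def to_transducer_input(pairs):
    """Return 'form:lemma' lines, i.e. surface form in, lemma out."""
    return [f"{escape(form)}:{escape(lemma)}" for lemma, form in pairs]


if __name__ == "__main__":
    sample = ["go\twent\n", "go\tgoes\n", "be\twas\n"]
    pairs = list(parse_pairs(sample))
    print(build_lookup(pairs)["went"])
    print("\n".join(to_transducer_input(pairs)))
```

The dictionary from `build_lookup` is already a usable baseline lemmatizer for forms that appear in the list. The lines from `to_transducer_input` are the kind of input one might write to a file and pass to a tool such as hfst-strings2fst; the exact command-line options are not shown here because they should be taken from the tool's own help output.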
Stats
michmech/lemmatization-lists is an open source project licensed under the ODC Open Database License v1.0, which is not an OSI-approved license.