pytextrank
spacy-models
Metric | pytextrank | spacy-models
---|---|---
Mentions | 2 | 3
Stars | 2,165 | 1,685
Growth | 0.6% | 2.2%
Activity | 4.7 | 6.3
Last commit | 6 months ago | 4 months ago
Language | Python | Python
License | MIT License | -
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
pytextrank
spacy-models
- spacy Can't find model 'en_core_web_sm' on windows 10 and Python 3.5.3 :: Anaconda custom (64-bit)
- word similarity vs. sentence similarity
  Well, the medium model uses GloVe (Common Crawl) word vectors. There are only 685K keys, so depending on the corpus you are working with, it's possible that many of the words you are interested in have no corresponding vector and end up as zero vectors. spaCy Doc and Span vectors are simply averages of the word vectors. So the better results for phrases may simply be because a phrase is more likely to contain at least one in-vocabulary word, and so is less likely to end up as a zero vector.
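The averaging argument in that post can be illustrated with a toy sketch. This is not spaCy's implementation: the three-dimensional vectors, the `TOY_VECTORS` table, and the helper names are all made up for illustration, and returning 0.0 for a zero vector is a convention chosen here.

```python
import math

# Toy 3-dimensional "word vectors"; the values are invented for illustration.
TOY_VECTORS = {
    "cat": [1.0, 0.0, 0.0],
    "dog": [0.8, 0.6, 0.0],
    "sat": [0.0, 1.0, 0.0],
}
DIM = 3


def word_vector(word):
    """Return the vector for a word, or a zero vector if it is out of vocabulary."""
    return TOY_VECTORS.get(word.lower(), [0.0] * DIM)


def span_vector(words):
    """Average the word vectors of a phrase, zero vectors included."""
    if not words:
        return [0.0] * DIM
    vecs = [word_vector(w) for w in words]
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def cosine(a, b):
    """Cosine similarity; defined here as 0.0 when either vector is all zeros."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


# A lone OOV word carries no signal at all...
oov_alone = cosine(word_vector("zyzzyva"), word_vector("cat"))
# ...while a phrase with one known word still gets a usable (if diluted) vector.
phrase = cosine(span_vector(["cat", "zyzzyva"]), word_vector("cat"))
```

In this sketch the OOV word only shortens the averaged vector; it does not change its direction, which is why the phrase still compares sensibly while the single OOV word cannot.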
- SpaCy VS Transformers for NER
  spaCy vs. transformers isn't really a good comparison. You can plug a variety of components into spaCy's NLP pipelines, including Hugging Face's transformer models. spaCy 3, in particular, ships pre-built pipelines based on Hugging Face transformers, such as en_core_web_trf.
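As a rough sketch of what that post describes, the snippet below runs NER through whichever spaCy pipeline package is installed, preferring the transformer-based `en_core_web_trf` and falling back to `en_core_web_sm`. The helper names `pick_pipeline` and `extract_entities` are hypothetical; the sketch assumes spaCy is installed and that pipeline packages are importable under their model names, and the transformer pipeline typically needs extra dependencies.

```python
import importlib.util


def pick_pipeline(candidates=("en_core_web_trf", "en_core_web_sm")):
    """Return the first candidate package that is importable, or None."""
    for name in candidates:
        if importlib.util.find_spec(name) is not None:
            return name
    return None


def extract_entities(text, candidates=("en_core_web_trf", "en_core_web_sm")):
    """Run NER with the first installed pipeline and return (text, label) pairs."""
    import spacy  # imported lazily so pick_pipeline works without spaCy

    name = pick_pipeline(candidates)
    if name is None:
        raise RuntimeError(
            "No spaCy pipeline package found; "
            "try: python -m spacy download en_core_web_sm"
        )
    nlp = spacy.load(name)
    return [(ent.text, ent.label_) for ent in nlp(text).ents]
```

The calling code is the same for either pipeline, which is the point of the post: the transformer is a component inside spaCy rather than an alternative to it.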
What are some alternatives?
turing - ✨ 🧬 Turing ES - Enterprise Search, Chatbot using Search Engine and Many NLP Vendors.
flair - A very simple framework for state-of-the-art Natural Language Processing (NLP)
Text-Summarization-using-NLP - Text summarization using NLP: fetches a BBC News article and summarizes its text; also supports summarizing custom articles
rasa - 💬 Open source machine learning framework to automate text- and voice-based conversations: NLU, dialogue management, connect to Slack, Facebook, and more - Create chatbots and voice assistants
retext-readability - plugin to check readability
Dragonfire - the open-source virtual assistant for Ubuntu-based Linux distributions
pke - Python Keyphrase Extraction module
MAX-Toxic-Comment-Classifier - Detect 6 types of toxicity in user comments.
textstat - 📝 Python package to calculate readability statistics of a text object - paragraphs, sentences, articles.
healthsea - Healthsea is a spaCy pipeline for analyzing user reviews of supplementary products for their effects on health.
bert2bert-summarization - Abstractive summarization using Bert2Bert framework.
thinc-apple-ops - 🍏 Make Thinc faster on macOS by calling into Apple's native Accelerate library