pke vs yake
Metric | pke | yake
---|---|---
Mentions | 3 | 5
Stars | 1,556 | 1,656
Growth | - | 0.8%
Activity | 3.1 | 3.0
Latest commit | over 1 year ago | 11 months ago
Language | Python | Python
License | GNU General Public License v3.0 only | GNU General Public License v3.0 or later
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
pke
- Question on easing comprehension
- [P] Building model to extract keywords from legal documents
  Look into rake, pke, phrasemachine, pyate, keybert.
- Best approach for automatic key word extraction
  There are lots of off-the-shelf tools for this. Look into:
  - https://github.com/boudinfl/pke
  - https://github.com/kevinlu1248/pyate
  - https://github.com/zelandiya/RAKE-tutorial
  - https://github.com/slanglab/phrasemachine
  - https://github.com/MaartenGr/KeyBERT/
yake
- Show HN: Whisper.cpp and YAKE to Analyse Voice Reflections [iOS]
- Simplest keyword extractor
  Personally I prefer using YAKE.
- What method should be used to tag specific texts, when the dataset is too small for training a model?
- Is there any YAKE (yet another keyword extractor) implementation in R? Unsupervised Approach for Automatic Keyword Extraction using Text Statistical Features.
- Alternate approaches to TF-IDF?
  You can find usage examples at https://github.com/LIAAD/yake, which also has a reference section with publications that explain in more detail how it works. From what I remember, each keyphrase candidate is assigned an aggregated score based on various features: position in the text, casing, frequency, surrounding text frequency...
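
The comment above describes scoring each candidate by aggregating several statistical features. As a rough illustration of that idea only, here is a minimal pure-Python sketch. It is not YAKE's actual formula or API: the function names `toy_keyword_scores` and `top_keywords`, the tiny stopword list, and the way the features are combined are all assumptions made for this example. In this sketch a lower score means a more relevant word.

```python
import re
from collections import Counter, defaultdict

# Hypothetical, deliberately tiny stopword list for the example.
STOPWORDS = {"the", "a", "an", "of", "and", "is", "in", "to", "for",
             "on", "it", "this", "that", "with", "are", "as", "by", "was"}


def toy_keyword_scores(text):
    """Score single-word candidates; lower means more relevant (toy convention)."""
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s.strip()]
    freq = Counter()                # how often each word occurs
    first_sentence = {}             # index of the sentence where it first occurs
    cased = Counter()               # capitalised occurrences that are not sentence-initial
    neighbours = defaultdict(set)   # distinct words seen directly left or right

    for idx, sentence in enumerate(sentences):
        tokens = re.findall(r"[A-Za-z][A-Za-z'-]*", sentence)
        for pos, token in enumerate(tokens):
            word = token.lower()
            if word in STOPWORDS or len(word) < 3:
                continue
            freq[word] += 1
            first_sentence.setdefault(word, idx)
            if pos > 0 and token[0].isupper():
                cased[word] += 1
            for j in (pos - 1, pos + 1):
                if 0 <= j < len(tokens):
                    neighbours[word].add(tokens[j].lower())

    scores = {}
    for word, count in freq.items():
        position = 1.0 + first_sentence[word]    # earlier first use gives a smaller value
        casing = 1.0 + cased[word] / count       # more capitalised uses give a larger value
        # Distinct neighbours per occurrence: a word that keeps appearing in
        # the same context gets a smaller value than one with varied context.
        dispersion = 1.0 + len(neighbours[word]) / (2 * count)
        # Assumed aggregation: penalise late position and varied context,
        # reward casing and frequency.
        scores[word] = position * dispersion / (casing * count)
    return scores


def top_keywords(text, n=5):
    """Return the n lowest-scoring (most relevant) words, ties broken alphabetically."""
    scores = toy_keyword_scores(text)
    return sorted(scores, key=lambda w: (scores[w], w))[:n]
```

Calling `top_keywords(document_text, n=5)` on a short document returns its five best-scoring words under these assumptions. For real use, the YAKE repository linked above documents the actual features and scoring.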
What are some alternatives?
KeyBERT - Minimal keyword extraction with BERT
rake-nltk - Python implementation of the Rapid Automatic Keyword Extraction algorithm using NLTK.
pytextrank - Python implementation of TextRank algorithms ("textgraphs") for phrase extraction
flashtext - Extract Keywords from sentence or Replace keywords in sentences.
textstat - Python package to calculate readability statistics of a text object - paragraphs, sentences, articles.
simple_keyword_clusterer - A simple machine learning package to cluster keywords in higher-level groups.
pyate - PYthon Automated Term Extraction
scattertext - Beautiful visualizations of how language differs among document types.
retext-readability - plugin to check readability
faiss - A library for efficient similarity search and clustering of dense vectors.