japanese-words-to-vectors vs magnitude
Metric | japanese-words-to-vectors | magnitude
---|---|---
Mentions | 1 | 5
Stars | 83 | 1,612
Growth | - | 0.1%
Activity | 10.0 | 0.0
Last commit | over 2 years ago | 10 months ago
Language | Python | Python
License | - | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
japanese-words-to-vectors
-
Abstract-Concreteness Value Lexical Data for Japanese
I'm looking for data on how concrete or abstract different lexical items are in Japanese, similar to this data for English. I'm not very well versed in computational linguistics, so even though I've found this word-to-vector model that can create vectors for Japanese words, I'm not sure how to extrapolate abstractness values from the resulting vectors, or whether that's even possible without using a predefined abstract-concrete vector like the one shown here.
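One common way to approach the question above is to project each word vector onto an axis built from a few hand-picked seed words. This does rely on a predefined abstract-concrete direction, so it is only a partial answer. The sketch below is a minimal illustration under stated assumptions: the toy 3-dimensional vectors, the seed words, and the helper names `unit`, `concreteness_axis`, and `concreteness_score` are all hypothetical, and real vectors would come from the trained model instead.

```python
import numpy as np

def unit(v):
    """Scale a vector to unit length."""
    return v / np.linalg.norm(v)

def concreteness_axis(vectors, concrete_seeds, abstract_seeds):
    """Unit direction pointing from the abstract seed centroid to the concrete one."""
    concrete = np.mean([unit(vectors[w]) for w in concrete_seeds], axis=0)
    abstract = np.mean([unit(vectors[w]) for w in abstract_seeds], axis=0)
    return unit(concrete - abstract)

def concreteness_score(vectors, word, axis):
    """Cosine between the word vector and the axis, in [-1, 1]."""
    return float(np.dot(unit(vectors[word]), axis))

# Toy vectors invented for illustration only (assumption, not model output).
toy_vectors = {
    "石": np.array([0.9, 0.1, 0.0]),    # stone
    "机": np.array([0.8, 0.2, 0.1]),    # desk
    "自由": np.array([0.1, 0.9, 0.0]),  # freedom
    "正義": np.array([0.0, 0.8, 0.2]),  # justice
    "犬": np.array([0.7, 0.3, 0.1]),    # dog
    "希望": np.array([0.2, 0.7, 0.1]),  # hope
}

# Seed lists are hypothetical; a real study would use many more words.
axis = concreteness_axis(toy_vectors, ["石", "机"], ["自由", "正義"])
scores = {w: concreteness_score(toy_vectors, w, axis) for w in ["犬", "希望"]}
```

A positive score means the word leans toward the concrete seeds, a negative score toward the abstract ones. Whether such scores track human concreteness ratings for Japanese would need to be checked against rated data.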
magnitude
-
Text Classification Library for a Quick Baseline
(3) FastText now supports multiple languages [2].
[1] https://github.com/plasticityai/magnitude#pre-converted-magn...
-
Pgvector – vector similarity search for Postgres
Check out Magnitude, we built it to solve that problem: https://github.com/plasticityai/magnitude
It's still loaded from a file, but heavily uses memory-mapping and caching to be speedy and not overload your RAM immediately. And in production scenarios, multiple worker processes can share that memory due to the memory mapping.
Disclaimer: I'm the author.
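The memory-mapping idea described above can be illustrated generically with `numpy.memmap`. This is a sketch of the general technique only, not Magnitude's actual file format or implementation; the vocabulary, vector size, file name, and `lookup` helper are assumptions made up for the example.

```python
import os
import tempfile
import numpy as np

VOCAB = ["cat", "dog", "car"]  # hypothetical vocabulary
DIM = 4                        # hypothetical vector size

# Write a small float32 matrix to disk once, one row per word.
path = os.path.join(tempfile.mkdtemp(), "vectors.f32")
matrix = np.arange(len(VOCAB) * DIM, dtype=np.float32).reshape(len(VOCAB), DIM)
matrix.tofile(path)

# Open the file as a read-only memory map. Pages are loaded lazily on access,
# and processes mapping the same file can share the OS page cache.
mapped = np.memmap(path, dtype=np.float32, mode="r", shape=(len(VOCAB), DIM))
index = {word: row for row, word in enumerate(VOCAB)}

def lookup(word):
    """Copy a single row out of the mapped file into regular memory."""
    return np.array(mapped[index[word]])

dog = lookup("dog")
```

Only the rows that are actually looked up need to be paged in, which is why a large vector file can be opened without reading it fully into RAM first.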
-
Build an Embeddings index from a data source
General language models from pymagnitude
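As a rough illustration of how an embeddings index can be built on top of word vectors, the sketch below averages the word vectors of each document and ranks documents by cosine similarity. It is not txtai's or pymagnitude's implementation; the toy vectors and the `embed`, `build_index`, and `search` helpers are hypothetical names chosen for the example.

```python
import numpy as np

# Toy word vectors invented for illustration only (assumption).
WORD_VECTORS = {
    "vector": np.array([1.0, 0.0, 0.0]),
    "search": np.array([0.8, 0.2, 0.0]),
    "postgres": np.array([0.0, 1.0, 0.0]),
    "database": np.array([0.1, 0.9, 0.0]),
    "tutorial": np.array([0.0, 0.0, 1.0]),
}

def embed(text):
    """Average the vectors of known words and normalize to unit length."""
    known = [WORD_VECTORS[t] for t in text.lower().split() if t in WORD_VECTORS]
    if not known:
        return np.zeros(3)
    mean = np.mean(known, axis=0)
    return mean / np.linalg.norm(mean)

def build_index(documents):
    """Stack one embedding per document into a matrix."""
    return np.vstack([embed(doc) for doc in documents])

def search(index, query, topn=1):
    """Return (document position, cosine score) pairs, best first."""
    scores = index @ embed(query)
    order = np.argsort(-scores)[:topn]
    return [(int(i), float(scores[i])) for i in order]

documents = ["vector search", "postgres database", "tutorial"]
index = build_index(documents)
results = search(index, "database", topn=1)
```

Here `results` holds the position of the best-matching document and its score. A production index would typically replace the brute-force matrix product with an approximate nearest-neighbor structure.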
-
Tutorial series on txtai
Backed by the pymagnitude library. Pre-trained word vectors can be installed from the referenced link.
What are some alternatives?
Korpora - Korean corpus repository
flashtext - Extract Keywords from sentence or Replace keywords in sentences.
open-discourse - Open Discourse is the first fully comprehensive corpus of the plenary proceedings of the federal German Parliament (Bundestag).
faiss - A library for efficient similarity search and clustering of dense vectors.
pgvector - Open-source vector similarity search for Postgres
finalfusion-rust - finalfusion embeddings in Rust
Milvus - A cloud-native vector database, storage for next generation AI applications
txtai - 💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows
Resume-Matcher - Resume Matcher is an open source, free tool to improve your resume. It works by using language models to compare and rank resumes with job descriptions.
pretty-print-confusion-matrix - Confusion Matrix in Python: plot a pretty confusion matrix (like Matlab) in python using seaborn and matplotlib
Romanian-Word-Embeddings - Romanian Word Embeddings. Here you can find pre-trained corpora of word embeddings. Current methods: CBOW, Skip-Gram, Fast-Text (from Gensim library). The .vec and .model files are available for download (all in one archive).
sentence-transformers - Multilingual Sentence & Image Embeddings with BERT