| | lm-scorer | lingvo |
|---|---|---|
| Mentions | 4 | 1 |
| Stars | 294 | 2,778 |
| Growth | - | -0.1% |
| Activity | 0.0 | 8.5 |
| Last commit | about 2 years ago | 22 days ago |
| Language | Python | Python |
| License | MIT License | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
lm-scorer
- How to obtain probability for entire sequence (Huggingface transformers)
- MLM vs CLM for actual language modeling
I've tried this once and found the CLM score to be a better indicator than BERT log prob for my use-case. For CLM, I had used lm-scorer.
- "simonepri/lm-scorer: Language Model based sentences scoring library" ("This package provides a simple programming interface to score sentences using different ML language models.")
- Whole sentence rather than word frequency nltk?
As in, how generally would a sentence make sense in the totality of English? You could look into language models that give probability of a sentence. You can try a library called lm-scorer.
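The posts above are about scoring a whole sentence with a causal language model. By the chain rule, the probability of a sequence is the product of each token's probability given the tokens before it, so the sentence log-probability is the sum of the per-token log-probabilities. The following is a minimal sketch of that arithmetic only: the token probabilities are made-up numbers standing in for a real model's output, and `sentence_scores` is a hypothetical helper, not part of lm-scorer's API.

```python
import math
from typing import Dict, Sequence


def sentence_scores(token_probs: Sequence[float]) -> Dict[str, float]:
    """Combine per-token conditional probabilities into sentence-level scores.

    token_probs[i] is assumed to be P(token_i | token_0 .. token_{i-1}),
    as a causal language model would produce.
    """
    if not token_probs:
        raise ValueError("need at least one token probability")

    # Sum logs instead of multiplying probabilities to avoid underflow.
    log_probs = [math.log(p) for p in token_probs]
    total = sum(log_probs)
    mean = total / len(log_probs)

    return {
        "log_prob": total,                 # log P(sentence)
        "prob": math.exp(total),           # P(sentence), product of token probs
        "mean_log_prob": mean,             # length-normalized log-probability
        "geometric_mean": math.exp(mean),  # length-normalized probability
    }


# Hypothetical per-token probabilities for a three-token sentence.
scores = sentence_scores([0.5, 0.25, 0.5])
print(scores["prob"])  # 0.5 * 0.25 * 0.5
```

Because the raw product shrinks with every added token, the length-normalized values (mean log-probability or geometric mean) are usually the ones to compare across sentences of different lengths.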
lingvo
- Voice assistant that can be taught how to swear (Part 1)
To calculate the Word Error Rate, I took a Python script from the tensorflow/lingvo project and rewrote it in JS. In essence, it is a simple solution to the edit distance problem, with an additional error count for each of the three types: deletion, insertion, and replacement. In the end, I did not use the most intelligent method of comparing texts, yet it was sufficient to later add parameters to queries to Speech-to-Text.
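The approach described in that post, word-level edit distance with a separate count for each error type, can be sketched as follows. This is an illustrative implementation written for this page, not the lingvo script itself; `word_error_rate` and its return keys are hypothetical names, and words are split on whitespace with no normalization.

```python
from typing import Dict, Union


def word_error_rate(reference: str, hypothesis: str) -> Dict[str, Union[int, float]]:
    """Word-level edit distance with per-type error counts."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # dp[i][j] = (errors, substitutions, deletions, insertions)
    # for aligning ref[:i] with hyp[:j].
    dp = [[(0, 0, 0, 0)] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(1, len(ref) + 1):
        dp[i][0] = (i, 0, i, 0)  # every reference word deleted
    for j in range(1, len(hyp) + 1):
        dp[0][j] = (j, 0, 0, j)  # every hypothesis word inserted

    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            if ref[i - 1] == hyp[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]  # match, no new error
                continue
            e, s, d, n = dp[i - 1][j - 1]
            substitution = (e + 1, s + 1, d, n)
            e, s, d, n = dp[i - 1][j]
            deletion = (e + 1, s, d + 1, n)
            e, s, d, n = dp[i][j - 1]
            insertion = (e + 1, s, d, n + 1)
            # Tuples compare by total error count first.
            dp[i][j] = min(substitution, deletion, insertion)

    errors, subs, dels, ins = dp[len(ref)][len(hyp)]
    return {
        "substitutions": subs,
        "deletions": dels,
        "insertions": ins,
        "wer": errors / len(ref),
    }


result = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(result)
```

For the example pair, the alignment has one replacement ("sat" to "sit") and one deletion (the second "the") across six reference words, giving a WER of 2/6.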
What are some alternatives?
transformers - 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
TTS-Voice-Wizard - Speech to Text to Speech. Song now playing. Sends text as OSC messages to VRChat to display on avatar. (STTTS) (Speech to TTS) (VRC STT System) (VTuber TTS)
penney - Penney's Game
seq2seq - A general-purpose encoder-decoder framework for Tensorflow
Sentence-Adder-Anki-Addon - Add sentences to Anki editor window in one click
allosaurus - Allosaurus is a pretrained universal phone recognizer for more than 2000 languages
Tyche - A library for probabilistic reasoning and belief modelling in Python.
awesome-speech-recognition-speech-synthesis-papers - Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)
ModuleFormer - ModuleFormer is a MoE-based architecture that includes two different types of experts: stick-breaking attention heads and feedforward experts. We released a collection of ModuleFormer-based Language Models (MoLM) ranging in scale from 4 billion to 8 billion parameters.
Mava - 🦁 A research-friendly codebase for fast experimentation of multi-agent reinforcement learning in JAX
deepspeech-playbook - A crash course for training speech recognition models using DeepSpeech.
pocketsphinx-python - Python interface to CMU Sphinxbase and Pocketsphinx libraries