upsc3ne vs whatlang-rs
| | upsc3ne | whatlang-rs |
|---|---|---|
| | 2 | 7 |
| Stars | 1 | 954 |
| Growth | - | - |
| Activity | 0.6 | 5.1 |
| | about 1 year ago | 2 months ago |
| Language | Rust | Rust |
| License | GNU General Public License v3.0 only | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
upsc3ne
- [R] What are some Text Similarity methods?
At the risk of self-advertising, you can find an implementation of Fuzzy String Matching method "token set ratio" here in my repo for obscenity detection in Rust: https://github.com/Chubek/upsc3ne
- **MAZADAYASNA PROGRAMMER ALERT** I am learning Rust so...
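For readers unfamiliar with the "token set ratio" mentioned in the post above, here is a minimal sketch of the general idea. It is not taken from the upsc3ne repository: the function names are hypothetical, and the base similarity is assumed to be a normalised Levenshtein distance (other implementations use different base ratios). Both inputs are split into sets of lowercase tokens, and the sorted intersection is compared against the intersection plus each side's leftover tokens.

```rust
use std::collections::BTreeSet;

/// Levenshtein edit distance between two strings, computed over chars.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] holds the distance between the processed prefix of `a` and b[..j].
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut curr = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let cost = if ca == cb { 0 } else { 1 };
            let val = (prev[j] + cost).min(prev[j + 1] + 1).min(curr[j] + 1);
            curr.push(val);
        }
        prev = curr;
    }
    prev[b.len()]
}

/// Similarity score in 0..=100.
/// Assumption: defined here as 1 - distance / length of the longer string.
fn ratio(a: &str, b: &str) -> u32 {
    let max_len = a.chars().count().max(b.chars().count());
    if max_len == 0 {
        return 100;
    }
    let dist = levenshtein(a, b);
    (100.0 * (1.0 - dist as f64 / max_len as f64)).round() as u32
}

/// Token set ratio: ignores word order and duplicated words.
fn token_set_ratio(a: &str, b: &str) -> u32 {
    let lower_a = a.to_lowercase();
    let lower_b = b.to_lowercase();
    // BTreeSet keeps tokens unique and sorted.
    let set_a: BTreeSet<&str> = lower_a.split_whitespace().collect();
    let set_b: BTreeSet<&str> = lower_b.split_whitespace().collect();

    let common: Vec<&str> = set_a.intersection(&set_b).copied().collect();
    let only_a: Vec<&str> = set_a.difference(&set_b).copied().collect();
    let only_b: Vec<&str> = set_b.difference(&set_a).copied().collect();

    let base = common.join(" ");
    let with_a = format!("{} {}", base, only_a.join(" ")).trim().to_string();
    let with_b = format!("{} {}", base, only_b.join(" ")).trim().to_string();

    // The best of the three pairwise comparisons is the final score.
    ratio(&base, &with_a)
        .max(ratio(&base, &with_b))
        .max(ratio(&with_a, &with_b))
}
```

With this definition, two strings that contain the same words in a different order, or with repeats, score 100, and so does a string whose tokens are a subset of the other's.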
whatlang-rs
- Lingua 1.5.0 - The most accurate natural language detection library for Rust, now with support for detecting multiple languages in mixed-language text
How does it compare to whatlang?
- Python Binding for WhatLang (Detect languages) - Blazing Fast ⚡
WhatLang is a Python library for detecting the language of a text. It is based on the WhatLang Rust library.
- To people with real Rusty jobs: How did you land it? What exactly do you do at your job? How proficient are you? What skills besides Rust? How long did it take?
I started working on the whatlang project (https://github.com/greyblake/whatlang-rs). In 2017 I started going to Rust interviews. At that moment there were only 3 companies in Berlin offering Rust jobs (as far as I know): Parity, Mozilla, 1aim. I had interviews with all of them and did not pass. I had a classical Ruby/web background, and at that moment Rust was seen as an alternative to C++, so many would expect me to know C++ well (which was not really the case). I continued working on my open source projects and writing blog posts from time to time. 2020 was very different. It was like Rust turned from an underdog into the mainstream. I felt like Rust job openings tripled. Headhunters started writing to me on LinkedIn, wow! I got contacted by a big crypto exchange, because they wanted to use my library for technical analysis. Sounds like a dream! Eventually, I found a job at Impero.com, thanks to this subreddit. They posted a job description and I sent them my CV. Soon I got hired. It's a remote job, but at that moment it did not make a difference, because of the pandemic.
- Whatlang 0.15.0 released (lightweight lib for language recognition)
CHANGELOG: https://github.com/greyblake/whatlang-rs/blob/master/CHANGELOG.md
- Whatlang: A Natural language detection library for Rust
- Whatlang strikes back
Regarding Chinese / Japanese, if I got it correctly Japanese may include Katakana, Hiragana and Mandarin, while Chinese includes only Mandarin characters (again I can be wrong here).
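The script-based reasoning in the post above can be sketched as a small heuristic. This is an illustration only, not whatlang's actual algorithm: the type and function names are hypothetical, and the characters the post calls "Mandarin" are assumed here to be the CJK Unified Ideographs block (U+4E00 to U+9FFF), with Hiragana at U+3040 to U+309F and Katakana at U+30A0 to U+30FF.

```rust
/// Coarse guess for text written in CJK scripts (hypothetical type).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum Guess {
    Japanese,
    Chinese,
    Unknown,
}

/// True for characters in the Hiragana block.
fn is_hiragana(c: char) -> bool {
    ('\u{3040}'..='\u{309F}').contains(&c)
}

/// True for characters in the Katakana block.
fn is_katakana(c: char) -> bool {
    ('\u{30A0}'..='\u{30FF}').contains(&c)
}

/// True for characters in the CJK Unified Ideographs block
/// (assumed to correspond to what the post calls "Mandarin").
fn is_han(c: char) -> bool {
    ('\u{4E00}'..='\u{9FFF}').contains(&c)
}

/// Any kana means Japanese; ideographs without kana mean Chinese.
fn guess_cjk(text: &str) -> Guess {
    let has_kana = text.chars().any(|c| is_hiragana(c) || is_katakana(c));
    let has_han = text.chars().any(is_han);
    if has_kana {
        Guess::Japanese
    } else if has_han {
        Guess::Chinese
    } else {
        Guess::Unknown
    }
}
```

A known weakness of this rule is that a short Japanese string written only in ideographs, with no kana, would be guessed as Chinese, so a real detector would need additional signals.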
What are some alternatives?
rust-bert - Rust native ready-to-use NLP pipelines and transformer-based models (BERT, DistilBERT, GPT2,...)
regex - An implementation of regular expressions for Rust. This implementation uses finite automata and guarantees linear time matching on all inputs.
cargo-select - Cargo subcommand to easily run targets/examples
Fluent - Rust implementation of Project Fluent
OuevreDeOctet - Ouevre de Octet (French for "Artwork of Byte") is a CLM transformer model entirely written in Rust; it can be used for everything, but we use it for our artistic endeavors!
textwrap - An efficient and powerful Rust library for word wrapping text.
flx-rs - Rewrite emacs-flx in Rust for dynamic modules [maintainer=@jcs090218]
lingua-rs - The most accurate natural language detection library for Rust, suitable for short text and mixed-language text
nlprule - A fast, low-resource Natural Language Processing and Text Correction library written in Rust.
suffix - Fast suffix arrays for Rust (with Unicode support).
ngrams - (Read-only) Generate n-grams
UNIC - UNIC: Unicode and Internationalization Crates for Rust