whatlang-rs
whatlang-accuracy-benchmark
Metric | whatlang-rs | whatlang-accuracy-benchmark
---|---|---
Mentions | 7 | 1
Stars | 947 | 0
Growth | - | -
Activity | 5.1 | 0.0
Last commit | about 1 month ago | 11 months ago
Language | Rust | Rust
License | MIT License | -
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
whatlang-rs
-
Lingua 1.5.0 - The most accurate natural language detection library for Rust, now with support for detecting multiple languages in mixed-language text
How does it compare to whatlang?
-
Python Binding for WhatLang (Detect languages) - Blazing Fast ⚡
WhatLang is a Python library for detecting the language of a text. It is based on the WhatLang Rust library.
-
To people with real Rusty jobs: How did you land it? What exactly do you do at your job? How proficient are you? What skills besides Rust? How long did it take?
I started working on the whatlang project (https://github.com/greyblake/whatlang-rs). In 2017 I started going to Rust interviews. At that time there were only 3 companies in Berlin offering Rust jobs (as far as I know): Parity, Mozilla, 1aim. I interviewed with all of them and did not pass. I had a classical Ruby/web background, and at that time Rust was seen as an alternative to C++, so many expected me to know C++ well (which was not really the case). I continued working on my open source projects and writing blog posts from time to time. The year 2020 was very different. It was like Rust had turned from underdog to mainstream. It felt like Rust job openings had tripled. Head hunters started writing to me on LinkedIn, wow! I got contacted by a big crypto exchange, because they wanted to use my library for technical analysis. Sounds like a dream! Eventually, I found a job at Impero.com, thanks to this subreddit. They posted a job description and I sent them my CV. Soon I got hired. It's a remote job, but at that time it did not make a difference, because of the pandemic.
-
Whatlang 0.15.0 released (lightweight lib for language recognition)
CHANGELOG: https://github.com/greyblake/whatlang-rs/blob/master/CHANGELOG.md
-
Whatlang strikes back
I am happy to announce a release of a new version (0.12.0) of whatlang.
Regarding Chinese / Japanese: if I understand correctly, Japanese text may include Katakana, Hiragana and Han characters (Kanji), while Chinese text uses only Han characters (again, I may be wrong here).
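The script-based distinction described in that comment can be illustrated with a small standard-library sketch. This is not whatlang's actual algorithm; the names `CjkGuess` and `guess_cjk` are hypothetical, and the check covers only the basic Hiragana, Katakana and CJK Unified Ideographs blocks. The assumption is that any kana character points to Japanese, while text made up only of Han ideographs (the characters shared by Chinese and Japanese) is guessed to be Chinese.

```rust
/// Coarse guess for text written in East Asian scripts (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum CjkGuess {
    Japanese,
    Chinese,
    Unknown,
}

/// Hiragana block: U+3040..=U+309F.
fn is_hiragana(c: char) -> bool {
    ('\u{3040}'..='\u{309F}').contains(&c)
}

/// Katakana block: U+30A0..=U+30FF.
fn is_katakana(c: char) -> bool {
    ('\u{30A0}'..='\u{30FF}').contains(&c)
}

/// CJK Unified Ideographs (basic block only): U+4E00..=U+9FFF.
fn is_han(c: char) -> bool {
    ('\u{4E00}'..='\u{9FFF}').contains(&c)
}

/// Assumption: kana implies Japanese; Han ideographs without kana imply Chinese.
pub fn guess_cjk(text: &str) -> CjkGuess {
    let has_kana = text.chars().any(|c| is_hiragana(c) || is_katakana(c));
    let has_han = text.chars().any(is_han);

    if has_kana {
        CjkGuess::Japanese
    } else if has_han {
        CjkGuess::Chinese
    } else {
        CjkGuess::Unknown
    }
}
```

A heuristic like this misclassifies Japanese text written entirely in Kanji (for example a short name or title), which is one reason a real detector would likely combine script checks with statistical information.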
whatlang-accuracy-benchmark
-
Whatlang strikes back
I did not dare to compete with you in accuracy, so my main focus was simply to improve on my own older version of whatlang. So I had to set up some benchmarks: https://github.com/whatlang/whatlang-accuracy-benchmark/blob/master/reports/2021-04-18-v0-12-0.md
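As a rough illustration of comparing two detector versions on accuracy, the sketch below scores a detector as the fraction of labelled samples it classifies correctly. It does not reflect how the whatlang-accuracy-benchmark repository actually works; the `accuracy` function, the two toy detectors and the language labels are all hypothetical and exist only for the example.

```rust
/// Fraction of labelled samples for which `detect` returns the expected label.
/// Each sample is a `(text, expected_label)` pair. Returns 0.0 for an empty set.
pub fn accuracy<F>(samples: &[(&str, &str)], detect: F) -> f64
where
    F: Fn(&str) -> Option<&'static str>,
{
    if samples.is_empty() {
        return 0.0;
    }
    let correct = samples
        .iter()
        .filter(|&&(text, expected)| detect(text) == Some(expected))
        .count();
    correct as f64 / samples.len() as f64
}

/// Toy stand-in for an "old version": always answers English.
pub fn old_detector(_text: &str) -> Option<&'static str> {
    Some("eng")
}

/// Toy stand-in for a "new version": answers Russian when it sees a
/// character from the basic Cyrillic block, English otherwise.
pub fn new_detector(text: &str) -> Option<&'static str> {
    let has_cyrillic = text
        .chars()
        .any(|c| ('\u{0400}'..='\u{04FF}').contains(&c));
    if has_cyrillic {
        Some("rus")
    } else {
        Some("eng")
    }
}
```

Running both detectors over the same labelled sample set and comparing the two scores gives a simple regression check between versions, independent of how any other library performs.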
What are some alternatives?
regex - An implementation of regular expressions for Rust. This implementation uses finite automata and guarantees linear time matching on all inputs.
Fluent - Rust implementation of Project Fluent
textwrap - An efficient and powerful Rust library for word wrapping text.
lingua-rs - The most accurate natural language detection library for Rust, suitable for short text and mixed-language text
suffix - Fast suffix arrays for Rust (with Unicode support).
ngrams - (Read-only) Generate n-grams
cpc - Text calculator with support for units and conversion
UNIC - UNIC: Unicode and Internationalization Crates for Rust
code - Source code for the book Rust in Action
sonic - 🦔 Fast, lightweight & schema-less search backend. An alternative to Elasticsearch that runs on a few MBs of RAM.
tabwriter - Elastic tabstops for Rust.