UNIC
textwrap
| | UNIC | textwrap |
|---|---|---|
| Mentions | 4 | 2 |
| Stars | 234 | 426 |
| Growth | 1.7% | - |
| Activity | 0.0 | 6.3 |
| Last commit | 8 months ago | 10 days ago |
| Language | Rust | Rust |
| License | GNU General Public License v3.0 or later | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
UNIC
-
I'm 15 ETH Away from Making the Unicode Character Database (UCD) Available on Rinkeby Testnet
For reference, here is an equivalent library in Rust: https://github.com/open-i18n/rust-unic/
-
icu vs rust_icu
There is also rust-unic which provides both normalization and access to the character database. I have also used this because of their text segmentation support, and I would probably recommend rust-unic in general. I hope to see more progress on that front.
-
Ć Programming Language
I try to be mindful of making my software as accessible as possible, but the following
> creating a lookup table for all the unicode material out there might've been considered impractical or performance-hitting for the developers.
just doesn't ring true to me in any way for current software. I understand that people can be using older software, which is why I strive to restrict myself to ASCII as much as possible for the widest possible support for my users, but my software also supports Unicode identifiers, up to and including a whole Unicode table to talk about confusables[1]. And not all TTS software "ignores" characters, which is why people advise against using 𝑓𝑎𝑛𝑐𝑦 Unicode: it doesn't get read as text, but instead each character is described individually. (This is also something that TTS software should support for their users' sake, but I digress.)
[1]: this is thanks to the crate unic-ucd containing this information: https://github.com/open-i18n/rust-unic
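To make the point about styled Unicode concrete, here is a minimal std-only sketch. It assumes the styled word above is written with Mathematical Italic small letters (U+1D44E to U+1D467), and `fold_math_italic` is a hypothetical helper made up for illustration, not part of UNIC or any other crate. It shows that such text is not ASCII at all, and that a small range lookup is enough to map it back to plain letters.

```rust
/// Hypothetical helper (not a UNIC API): map Mathematical Italic small letters
/// (U+1D44E..=U+1D467) back to ASCII `a`..=`z`. Other characters pass through.
fn fold_math_italic(c: char) -> char {
    const START: u32 = 0x1D44E; // MATHEMATICAL ITALIC SMALL A
    const END: u32 = 0x1D467; // MATHEMATICAL ITALIC SMALL Z
    let cp = c as u32;
    if (START..=END).contains(&cp) {
        // Offsets 0..=25 always land on ASCII letters, so the fallback is never hit.
        char::from_u32(cp - START + 'a' as u32).unwrap_or(c)
    } else {
        c
    }
}

fn main() {
    // Assumption: the styled word spelled with Mathematical Italic code points.
    let fancy = "\u{1D453}\u{1D44E}\u{1D45B}\u{1D450}\u{1D466}";
    let folded: String = fancy.chars().map(fold_math_italic).collect();

    println!(
        "{} code points, {} UTF-8 bytes, ASCII: {}",
        fancy.chars().count(),
        fancy.len(),
        fancy.is_ascii()
    );
    println!("folded: {}", folded);
}
```

A real confusables table covers far more than one alphabet range, which is why a crate with generated Unicode data is the practical choice; the sketch only illustrates the shape of the lookup.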
-
Unicode sorting is hard & why browsers added special emoji matching to regexp
Regarding https://github.com/open-i18n/rust-unic, could it be that the project was superseded by https://github.com/unicode-org/icu4x ?
textwrap
-
What are the biggest problems that you're facing right now in this stage of your programming journey?
Projects are things you do for fun to learn. A few years ago I got curious about Rust and so I started reading about it. My learning project was text wrapping: https://github.com/mgeisler/textwrap
-
Textwrap 0.14.0 released with support for wrapping text without word separators
The demo shows how Textwrap can be used to wrap both proportional and fixed-width text. In the demo, the text is rendered on an HTML canvas element, but it could just as well go to a PDF file, a GUI, or similar. The ability to wrap text outside of the terminal was added in version 0.13.0.
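For readers unfamiliar with what wrapping fixed-width text involves, the following is a minimal greedy sketch using only the standard library. It is not Textwrap's API or its algorithm: `wrap_greedy` is a hypothetical helper, and it counts width in `char`s, which only approximates display width for non-ASCII text.

```rust
/// Hypothetical greedy word wrapper for fixed-width text (not Textwrap's API).
/// Width is measured in `char`s; a word longer than `width` gets its own line.
fn wrap_greedy(text: &str, width: usize) -> Vec<String> {
    let mut lines = Vec::new();
    let mut current = String::new();
    let mut current_len = 0; // length of `current` in chars

    for word in text.split_whitespace() {
        let word_len = word.chars().count();
        // Start a new line if the word plus a separating space does not fit.
        if !current.is_empty() && current_len + 1 + word_len > width {
            lines.push(std::mem::take(&mut current));
            current_len = 0;
        }
        if !current.is_empty() {
            current.push(' ');
            current_len += 1;
        }
        current.push_str(word);
        current_len += word_len;
    }
    if !current.is_empty() {
        lines.push(current);
    }
    lines
}

fn main() {
    let text = "Textwrap can wrap both proportional and fixed-width text.";
    for line in wrap_greedy(text, 20) {
        println!("{}", line);
    }
}
```

Wrapping proportional text, as in the demo, would need a per-word width measurement from the font instead of a character count, but the line-filling loop would look much the same.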
What are some alternatives?
Fluent - Rust implementation of Project Fluent
whatlang-rs - Natural language detection library for Rust. Try demo online: https://whatlang.org/
regex - An implementation of regular expressions for Rust. This implementation uses finite automata and guarantees linear time matching on all inputs.
suffix - Fast suffix arrays for Rust (with Unicode support).
ngrams - (Read-only) Generate n-grams
cpc - Text calculator with support for units and conversion
datamatrix-fu - Data Matrix barcodes in the Fusion programming language
fut - Fusion programming language. Transpiling to C, C++, C#, D, Java, JavaScript, Python, Swift, TypeScript and OpenCL C.