Metric | UNIC | cldr
---|---|---
Mentions | 4 | 5
Stars | 234 | 843
Growth | 1.3% | 2.3%
Activity | 0.0 | 9.8
Last commit | 6 days ago | 6 days ago
Language | Rust | Java
License | GNU General Public License v3.0 or later | GNU General Public License v3.0 or later
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
UNIC
- I'm 15 ETH Away from Making the Unicode Character Database (UCD) Available on Rinkeby Testnet
For reference, here is an equivalent library in Rust: https://github.com/open-i18n/rust-unic/
- icu vs rust_icu
There is also rust-unic which provides both normalization and access to the character database. I have also used this because of their text segmentation support, and I would probably recommend rust-unic in general. I hope to see more progress on that front.
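To make "normalization" and "access to the character database" concrete, here is a minimal sketch. It uses Python's standard `unicodedata` module instead of rust-unic, so it illustrates the concepts only and says nothing about the rust-unic API. The helper names `same_text` and `describe` are made up for this example.

```python
import unicodedata

composed = "\u00e9"      # "é" as a single code point
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

def same_text(a: str, b: str) -> bool:
    """Compare two strings after bringing both into NFC form."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

def describe(ch: str) -> tuple:
    """Return (name, general category) as recorded in the character database."""
    return unicodedata.name(ch), unicodedata.category(ch)

# The raw strings differ even though they render the same way.
print(composed == decomposed)        # False
print(same_text(composed, decomposed))  # True
print(describe(composed))
print(describe("\u0301"))
```

Text segmentation (grapheme clusters, words) is not covered by `unicodedata`; that part of the recommendation needs a dedicated library.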
- Δ Programming Language
I try to be mindful of making my software as accessible as possible, but the following
> creating a lookup table for all the unicode material out there might've been considered impractical or performance-hitting for the developers.
just doesn't ring true to me in any way for current software. I understand that people can be using older software, which is why I strive to restrict myself to ASCII as much as possible for the widest possible support for my users, but my software also supports unicode identifiers, up to and including a whole unicode table to talk about confusables[1]. And not all TTS software "ignores" characters, which is why people advise against using 𝑓𝑎𝑛𝑐𝑦 unicode because it doesn't get read as text but instead each character is described individually. (This is also something that TTS software should support for their users' sake, but I digress.)
[1]: this is thanks to the crate unic-ucd containing this information: https://github.com/open-i18n/rust-unic
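The point about styled letters being described one by one can be seen directly in the character data. The sketch below is an illustration using Python's standard `unicodedata` module, not the commenter's software. It takes the word "fancy" written in Mathematical Italic letters and shows the per-character names a screen reader might fall back to, plus the result of NFKC compatibility normalization. The function names are hypothetical. NFKC folding is not the same thing as a confusables check, which relies on separate Unicode security data.

```python
import unicodedata

# "fancy" spelled with Mathematical Italic letters (U+1D44E..U+1D467 range).
fancy = "\U0001D453\U0001D44E\U0001D45B\U0001D450\U0001D466"

def spoken_names(text: str) -> list:
    """Return the formal Unicode name of each character in the text."""
    return [unicodedata.name(ch) for ch in text]

def fold_styled(text: str) -> str:
    """Apply NFKC, which maps many styled letters to their plain forms."""
    return unicodedata.normalize("NFKC", text)

for name in spoken_names(fancy):
    print(name)              # first line: MATHEMATICAL ITALIC SMALL F
print(fold_styled(fancy))    # fancy
```

A tool that wants to accept Unicode identifiers while staying readable could run this kind of folding before display, or at least warn when an identifier is not ASCII.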
- Unicode sorting is hard & why browsers added special emoji matching to regexp
Regarding https://github.com/open-i18n/rust-unic, could it be that the project was superseded by https://github.com/unicode-org/icu4x?
cldr
- Gathering Timezone Information in GoLang
Creating this mapping is a manual process; the link contains the reference data needed to build it.
- Latest intl and icu versions cause "breaking change" with Canadian currency display
- What they don't tell you when you translate your app
One problem I stumbled upon frequently is codebases that did not support localized formats, but just assumed a certain format to use, for example through concatenation.
Most programming languages have built-in capabilities for formatting numbers, currencies, etc. for a specific locale. There are also great resources [1] out there that provide all kinds of formats and localized names for countries, currencies, etc.
[1] Unicode CLDR: https://github.com/unicode-org/cldr
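A small sketch of why concatenation breaks down. The rules table below is a hand-written, two-locale illustration and the function name `format_currency` is made up for this example; a real application would use its language's locale-aware formatter or an i18n library backed by CLDR data instead of hard-coding anything like this.

```python
from decimal import Decimal

# Illustrative subset of locale conventions (assumption: hard-coded here only
# for demonstration; real data should come from CLDR via a library).
LOCALE_RULES = {
    "en-US": {"group": ",", "decimal": ".", "pattern": "{symbol}{number}"},
    "de-DE": {"group": ".", "decimal": ",", "pattern": "{number}\u00a0{symbol}"},
}

def format_currency(amount: Decimal, symbol: str, locale_id: str) -> str:
    """Format an amount using the separators and symbol position of a locale."""
    rules = LOCALE_RULES[locale_id]
    # Format with fixed separators first, then swap in the locale's own.
    whole, frac = f"{amount:,.2f}".split(".")
    number = whole.replace(",", rules["group"]) + rules["decimal"] + frac
    return rules["pattern"].format(symbol=symbol, number=number)

amount = Decimal("1234.5")
print("$" + str(amount))                        # naive concatenation: $1234.5
print(format_currency(amount, "$", "en-US"))    # $1,234.50
print(format_currency(amount, "€", "de-DE"))    # 1.234,50 € (non-breaking space)
```

Even this toy version has to vary three things per locale (grouping separator, decimal separator, symbol position), which is exactly what a fixed concatenation cannot do.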
- Are there lists of Unicode characters (and combinations) which a specific language might use?
Small addition: If you need the characters in machine-readable form, the source is the CLDR project. For Portuguese, the XML file is here on GitHub: https://github.com/unicode-org/cldr/blob/master/common/main/pt.xml
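As a rough sketch of what "machine-readable" means here: CLDR locale files list a language's characters in `exemplarCharacters` elements. The snippet below parses an abbreviated, illustrative stand-in for such a file; the sample content is an assumption and is not a copy of the real pt.xml. The function name `main_exemplars` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Abbreviated, illustrative sample in the shape of a CLDR locale file.
# The real common/main/pt.xml is much larger and its contents may differ.
SAMPLE_LDML = """<ldml>
  <identity><language type="pt"/></identity>
  <characters>
    <exemplarCharacters>[a á à â ã b c ç d e é ê]</exemplarCharacters>
    <exemplarCharacters type="auxiliary">[ª ă å ä]</exemplarCharacters>
  </characters>
</ldml>"""

def main_exemplars(ldml_text: str) -> list:
    """Return the main (untyped) exemplar characters as a list of strings."""
    root = ET.fromstring(ldml_text)
    for node in root.findall("./characters/exemplarCharacters"):
        if node.get("type") is None:  # the element without a type is the main set
            # Handles only the simple space-separated form. Full UnicodeSet
            # syntax also allows ranges, escapes and {multi-character} items.
            return node.text.strip("[]").split()
    return []

print(main_exemplars(SAMPLE_LDML))
```

For production use, a proper UnicodeSet parser from an ICU binding is the safer choice, since real files use the richer syntax.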
- The Ultimate EU Passport - Made by me :)
This information is false. en-150 in CLDR does not use this Euro English variant. It's just world English (en-001) with 3 adjustments: 24 hour time, currency symbol after the number and European time zone codes. Source. That's it.
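The "world English plus a few adjustments" relationship is a case of locale inheritance: a locale stores only what differs from its parent. The sketch below illustrates that lookup under simplified assumptions. The settings dictionary and its keys are hypothetical and only mirror the adjustments named in the comment; the root locale and real CLDR data structures are left out.

```python
# Explicit parent link taken from the comment above (en-150 builds on en-001).
# Other locales fall back by dropping their last subtag.
EXPLICIT_PARENTS = {"en-150": "en-001"}

# Hypothetical, simplified settings: each locale stores only its overrides.
LOCALE_DATA = {
    "en": {"hour_cycle": "12h", "currency_position": "before"},
    "en-001": {},
    "en-150": {"hour_cycle": "24h", "currency_position": "after"},
}

def fallback_chain(locale_id: str) -> list:
    """List the locale followed by each ancestor it inherits from."""
    chain = []
    current = locale_id
    while current:
        chain.append(current)
        if current in EXPLICIT_PARENTS:
            current = EXPLICIT_PARENTS[current]
        elif "-" in current:
            current = current.rsplit("-", 1)[0]
        else:
            current = None
    return chain

def lookup(locale_id: str, key: str) -> str:
    """Return the first value found while walking up the fallback chain."""
    for candidate in fallback_chain(locale_id):
        value = LOCALE_DATA.get(candidate, {}).get(key)
        if value is not None:
            return value
    raise KeyError(key)

print(fallback_chain("en-150"))        # ['en-150', 'en-001', 'en']
print(lookup("en-150", "hour_cycle"))  # 24h
print(lookup("en-001", "hour_cycle"))  # 12h
```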
What are some alternatives?
Fluent - Rust implementation of Project Fluent
icu4x - Solving i18n for client-side and resource-constrained environments.
regex - An implementation of regular expressions for Rust. This implementation uses finite automata and guarantees linear time matching on all inputs.
VTerminalPaletteEditor - A standalone GUI application for creating and editing VTerminal palettes.
textwrap - An efficient and powerful Rust library for word wrapping text.
ppl-i18n - Translations for PewPew Live.
whatlang-rs - Natural language detection library for Rust. Try demo online: https://whatlang.org/
VTerminal - A new Look-and-Feel (LaF) for Java, which allows for a grid-based display of Unicode characters with custom fore/background colors, font sizes, and pseudo-shaders. Originally designed for developing Roguelike/lite games.
cpc - Text calculator with support for units and conversion
datamatrix-fu - Data Matrix barcodes in the Fusion programming language
go-timezone - Gathering Timezone Information in GoLang