grapheme-splitter
proposal-intl-segmenter
 | grapheme-splitter | proposal-intl-segmenter
---|---|---
Mentions | 4 | 5
Stars | 894 | 145
Growth | - | 0.7%
Activity | 0.0 | 0.0
Last commit | about 3 years ago | over 2 years ago
Language | JavaScript | HTML
License | MIT License | -
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
grapheme-splitter
-
Create a Satisfying Wavy Text Animation With Framer Motion
Do note that if you're using a language other than English, you might want to check out Grapheme Splitter to divide strings into individual user-perceived characters, as opposed to computer-perceived characters. Since our text is in English, it'd just add unnecessary complication and an extra step to our project, so I'm not adding it in :)
-
String encodings
Splitting by grapheme clusters (or the characters the user actually sees): JS doesn't support this natively, so you'll need a library like grapheme-splitter. There's a Stage-4 proposal in the works, though: Intl.Segmenter:
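
A minimal sketch of what grapheme splitting with Intl.Segmenter can look like, assuming a runtime that ships the API. The helper name `splitGraphemes` and the "en" locale are illustrative choices, not taken from the quoted post.

```javascript
// Split a string into user-perceived characters (grapheme clusters).
// `splitGraphemes` is a hypothetical helper; "en" is an arbitrary default locale.
function splitGraphemes(text, locale = "en") {
  const segmenter = new Intl.Segmenter(locale, { granularity: "grapheme" });
  // segment() returns an iterable of { segment, index, input } objects.
  return Array.from(segmenter.segment(text), (part) => part.segment);
}

const flag = "\u{1F1EF}\u{1F1F5}"; // flag emoji: two regional-indicator code points
const accented = "e\u0301";        // "e" followed by a combining acute accent
const sampleText = "a" + flag + accented;

console.log(splitGraphemes(sampleText).length); // 3 grapheme clusters
console.log(Array.from(sampleText).length);     // 5 code points
```

Where the API is missing, a library such as grapheme-splitter would fill the same role.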
-
The complete guide to working with strings in modern JavaScript
Exactly, and emoji are outside the BMP, so it's not exactly an edge case, but the norm where two code units (UTF-16 double-bytes) are used to make one code point (Unicode character).
And it gets even worse, when you consider that for many purposes you're not even interested in code points but in graphemes -- e.g. a single visible emoji might actually be a combination of 5 code points, represented by 8 UTF-16 code units, taking up 16 bytes.
If you want to split a string by graphemes, you can either use the main dedicated library for it [3], or the relatively new API Intl.Segmenter [4] which is in Chrome and Safari, but still hasn't made it to Firefox [5].
[1] https://blog.jonnew.com/posts/poo-dot-length-equals-two
[2] https://www.contentful.com/blog/2016/12/06/unicode-javascrip...
[3] https://github.com/orling/grapheme-splitter
[4] https://github.com/tc39/proposal-intl-segmenter
[5] https://bugzilla.mozilla.org/show_bug.cgi?id=1423593
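
The counts in the comment above can be checked with a short sketch. The family emoji below (man, woman, girl joined by zero-width joiners) is an example chosen here, not one named in the post, and the variable names are illustrative.

```javascript
// Family emoji built from escapes: man + ZWJ + woman + ZWJ + girl.
const family = "\u{1F468}\u200D\u{1F469}\u200D\u{1F467}";

// String.prototype.length counts UTF-16 code units.
const utf16Units = family.length;
// Iterating a string yields code points, not code units.
const codePoints = Array.from(family).length;
// TextEncoder always encodes to UTF-8.
const utf8Bytes = new TextEncoder().encode(family).length;
// Intl.Segmenter groups code points into grapheme clusters.
const segmenter = new Intl.Segmenter("en", { granularity: "grapheme" });
const graphemes = Array.from(segmenter.segment(family)).length;

console.log({ utf16Units, utf16Bytes: utf16Units * 2, codePoints, utf8Bytes, graphemes });
// { utf16Units: 8, utf16Bytes: 16, codePoints: 5, utf8Bytes: 18, graphemes: 1 }
```

The same emoji takes 18 bytes in UTF-8, so the 16-byte figure only holds for UTF-16.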
-
LSMatrix
You’ll need a dependency called GraphemeSplitter, which is required if your font contains multi-byte characters, for example Hindi. The package is included in the above repo, but it was created by orling.
proposal-intl-segmenter
-
String encodings
Splitting by grapheme clusters (or the characters the user actually sees): JS doesn't support this natively, so you'll need a library like grapheme-splitter. There's a Stage-4 proposal in the works, though: Intl.Segmenter:
-
Updates from the 86th meeting of TC39
Intl.Segmenter: Unicode segmentation in JavaScript slides.
-
Is there no .reverse() method for a string like there is for an array?
But even that's not bulletproof. The best method is to divide the string into grapheme clusters before reversing, which is where Intl.Segmenter comes in.
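
A rough sketch of the grapheme-aware reversal described above, assuming Intl.Segmenter is available. The function names and the sample string are illustrative, not from the original answer.

```javascript
// Naive reversal: splits by code point, so combining marks drift
// onto the wrong base character.
function reverseCodePoints(text) {
  return Array.from(text).reverse().join("");
}

// Grapheme-aware reversal: keeps each cluster intact while reversing.
function reverseGraphemes(text) {
  const segmenter = new Intl.Segmenter("en", { granularity: "grapheme" });
  const clusters = Array.from(segmenter.segment(text), (part) => part.segment);
  return clusters.reverse().join("");
}

// "manana" with a combining tilde (U+0303) on the first "n".
const word = "man\u0303ana";

console.log(reverseCodePoints(word) === "ana\u0303nam"); // true: tilde lands on an "a"
console.log(reverseGraphemes(word) === "anan\u0303am");  // true: tilde stays on its "n"
```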
-
The complete guide to working with strings in modern JavaScript
Exactly, and emoji are outside the BMP, so it's not exactly an edge case, but the norm where two code units (UTF-16 double-bytes) are used to make one code point (Unicode character).
And it gets even worse, when you consider that for many purposes you're not even interested in code points but in graphemes -- e.g. a single visible emoji might actually be a combination of 5 code points, represented by 8 UTF-8 code units, taking up 16 bytes.
If you want to split a string by graphemes, you can either use the main dedicated library for it [3], or the relatively new API Intl.Segmenter [4] which is in Chrome and Safari, but still hasn't made it to Firefox [5].
[1] https://blog.jonnew.com/posts/poo-dot-length-equals-two
[2] https://www.contentful.com/blog/2016/12/06/unicode-javascrip...
[3] https://github.com/orling/grapheme-splitter
[4] https://github.com/tc39/proposal-intl-segmenter
[5] https://bugzilla.mozilla.org/show_bug.cgi?id=1423593
-
Emoji under the hood
Also potentially (but not in practice so far) locale-specific. See the FAQ on JavaScript's implementation: https://github.com/tc39/proposal-intl-segmenter#why-should-we-pass-a-locale-and-options-bag-for-grapheme-boundaries-isnt-there-just-one-way-to-do-it
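
For illustration, this is roughly how the locale and options bag mentioned in that FAQ are passed. The "hi" locale and the sample word are assumptions made for this sketch. The resolved locale depends on the runtime's locale data, so no specific value is asserted for it.

```javascript
// The constructor takes a locale (or list of locales) and an options bag.
const hindiSegmenter = new Intl.Segmenter("hi", { granularity: "grapheme" });

// resolvedOptions() reports what the runtime actually settled on.
const resolved = hindiSegmenter.resolvedOptions();
console.log(resolved.locale, resolved.granularity);

// Segmentation never drops text: the segments always join back to the input.
const greeting = "\u0928\u092E\u0938\u094D\u0924\u0947"; // a Devanagari sample word
const pieces = Array.from(hindiSegmenter.segment(greeting), (part) => part.segment);
console.log(pieces.join("") === greeting); // true
```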
What are some alternatives?
scriptable - Scriptable scripts for iOS
compressed-emoji-shortcodes - A Quest to Find a Highly Compressed Emoji :shortcode: Lookup Function
React - The library for web and native user interfaces.
.NET Runtime - .NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
zapatos - Zero-abstraction Postgres for TypeScript: a non-ORM database library
proposal-error-cause - TC39 proposal for accumulating errors
proposal-call-this - A proposal for a simple call-this operator in JavaScript.
framer/motion - Open source, production-ready animation and gesture library for React
proposal-source-phase-imports - Proposal to enable importing modules at the source phase
proposal-destructuring-private - A proposal to integrate private fields and destructuring