Janet-utf8 Alternatives

Similar projects and alternatives to janet-utf8

hn-search

Mentions: 1,638 | Stars: 525 | Activity: 2.9 | Language: TypeScript

Hacker News Search

xi-editor

Mentions: 42 | Stars: 19,815 | Activity: 2.6 | Language: Rust

A modern editor with a backend written in Rust.

text

Mentions: 10 | Stars: 298 | Activity: 6.8 | Language: C++

A spicy text library for C++ that has the explicit goal of enabling the entire ecosystem to share in proper forward progress towards a bright Unicode future. (by soasis)

grapheme-splitter-lite

Mentions: 1 | Stars: 6 | Activity: 10.0 | Language: Kotlin

A light-weight Java library that breaks strings into user-perceived characters a.k.a. Grapheme Clusters for common cases.

tonsky.me

Mentions: 1 | Stars: 11 | Activity: 9.0 | Language: Clojure

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number indicates a better janet-utf8 alternative or higher similarity.


janet-utf8 reviews and mentions

Posts with mentions or reviews of janet-utf8. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-10-02.

The Absolute Minimum Every Software Developer Must Know About Unicode in 2023
7 projects | news.ycombinator.com | 2 Oct 2023

Regarding UTF-8 encoding:
“And a couple of important consequences:
- You CAN’T determine the length of the string by counting bytes.
- You CAN’T randomly jump into the middle of the string and start reading.
- You CAN’T get a substring by cutting at arbitrary byte offsets. You might cut off part of the character.”
One of the things I had to get used to when learning the programming language Janet is that strings are just plain byte sequences, unaware of any encoding. So when I call `length` on a string of one character that is represented by 2 bytes in UTF-8 (e.g. `ä`), the function returns 2 instead of 1. Similar issues occur when trying to take a substring, as mentioned by the author.
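The byte-versus-character mismatch described above can be reproduced in any language that exposes the raw UTF-8 bytes. The following is a minimal sketch in Python rather than Janet, so it only illustrates the idea and says nothing about Janet's own API. The variable names are made up for this example.

```python
# Illustrative only: Python bytes stand in for a byte-oriented string.
text = "\u00e4"              # the single character "ä" (U+00E4)
raw = text.encode("utf-8")   # its UTF-8 encoding: two bytes, 0xC3 0xA4

byte_length = len(raw)       # counts bytes, like a length function on raw bytes
char_length = len(text)      # counts Unicode code points

# Cutting at an arbitrary byte offset can split a character in two.
truncated = raw[:1]
try:
    truncated.decode("utf-8")
    cut_is_valid = True
except UnicodeDecodeError:
    cut_is_valid = False

print(byte_length, char_length, cut_is_valid)
```

Here the byte count and the code point count differ for the same one-character string, and the one-byte prefix is no longer valid UTF-8 on its own.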
As much as I love the approach Janet took here (it feels clean and simple and works well with their built-in PEGs), it is a bit annoying to work with outside of the ASCII range. Fortunately, there are libraries that can deal with this issue (e.g. https://github.com/andrewchambers/janet-utf8), but I wish they would support conversion to/from UTF-8 out of the box, since I generally like Janet very much.
One interesting thing I learned from the article is that the first byte of a character can always be identified by its prefix. I always wondered how you would recognize and separate a Unicode character in a Janet string, since it may be 1-4 bytes long, but I guess this is the answer.
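The prefix rule mentioned in the comment can be sketched as follows. This is a rough illustration in Python, not the approach taken by janet-utf8 or by Janet itself. The function names `sequence_length` and `split_characters` are hypothetical, and the code assumes well-formed input: it does not check continuation bytes, overlong encodings, or truncated sequences. It splits on code points, not on grapheme clusters.

```python
from typing import List


def sequence_length(lead: int) -> int:
    """Return how many bytes the UTF-8 sequence starting with `lead` occupies."""
    if lead & 0b1000_0000 == 0b0000_0000:   # 0xxxxxxx: ASCII, 1 byte
        return 1
    if lead & 0b1110_0000 == 0b1100_0000:   # 110xxxxx: 2-byte sequence
        return 2
    if lead & 0b1111_0000 == 0b1110_0000:   # 1110xxxx: 3-byte sequence
        return 3
    if lead & 0b1111_1000 == 0b1111_0000:   # 11110xxx: 4-byte sequence
        return 4
    # 10xxxxxx is a continuation byte; anything else is not a valid lead byte.
    raise ValueError(f"not a UTF-8 lead byte: {lead:#04x}")


def split_characters(data: bytes) -> List[bytes]:
    """Split well-formed UTF-8 bytes into one chunk per code point."""
    chunks = []
    i = 0
    while i < len(data):
        n = sequence_length(data[i])
        chunks.append(data[i:i + n])
        i += n
    return chunks


sample = "a\u00e4\u20ac\U0001F600".encode("utf-8")  # 1-, 2-, 3-, and 4-byte characters
print(split_characters(sample))
```

Because continuation bytes always start with the bits `10`, a reader that lands in the middle of a character can also step backwards until it finds a byte that is not a continuation byte.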

Stats

Basic janet-utf8 repo stats

Mentions

Stars

Activity: 10.0

Last Commit: over 2 years ago

andrewchambers/janet-utf8 is an open source project licensed under the MIT License, which is an OSI-approved license.

The primary programming language of janet-utf8 is C.
