unicode-transforms vs with-utf8

Project	unicode-transforms	with-utf8
Mentions	1	4
Stars	47	52
Growth	-	-
Activity	2.5	6.3
Latest Commit	6 months ago	6 days ago
Language	Haskell	Haskell
License	BSD 3-clause "New" or "Revised" License	Mozilla Public License 2.0

The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

unicode-transforms

Posts with mentions or reviews of unicode-transforms. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2021-04-17.

[ANN] unicode-collation 0.1
3 projects | /r/haskell | 17 Apr 2021

Thanks! Here's a puzzle. Profiling shows that about a third of the time in my code is spent in normalize from unicode-transforms. (Normalization is a required step in the algorithm but can be omitted if you know that the input is already in NFD form.) And when I add a benchmark that omits normalization, I see run time cut by a third. But text-icu's run time in my benchmark doesn't seem to be affected much by whether I set the normalization option. I am not sure how to square that with the benchmarks here that seem to show unicode-transforms outperforming text-icu in normalization. text-icu's documentation says that "an incremental check is performed to see whether the input data is in FCD form. If the data is not in FCD form, incremental NFD normalization is performed." I'm not sure exactly what this means, but it may mean that text-icu avoids normalizing the whole string, but just normalizes enough to do the comparison, and sometimes avoids normalization altogether if it can quickly determine that the string is already normalized. I don't see a way to do this currently with unicode-transforms.
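
The idea of skipping normalization when the input is known to be in NFD already can be sketched in a few lines. This is only a rough illustration, not what text-icu does: it uses a much coarser test than an FCD check, relying on the fact that pure-ASCII text is always in NFD. The helper name nfdFast is hypothetical; normalize and NFD are taken from unicode-transforms' Data.Text.Normalize module, and the text package is assumed to be available.

```haskell
import Data.Char (isAscii)
import qualified Data.Text as T
import Data.Text.Normalize (NormalizationMode (NFD), normalize)

-- | Hypothetical helper: normalize to NFD, but skip the work for
-- pure-ASCII input. ASCII characters have no canonical decomposition
-- and are not combining marks, so all-ASCII text is already in NFD.
nfdFast :: T.Text -> T.Text
nfdFast t
  | T.all isAscii t = t                -- fast path: nothing to do
  | otherwise       = normalize NFD t  -- slow path: full normalization
```

Whether such a fast path pays off depends on how much of the input is ASCII; for text dominated by accented or non-Latin characters, the extra scan is pure overhead.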

with-utf8

Posts with mentions or reviews of with-utf8. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-09-22.

Best resources to learn haskell?
4 projects | /r/haskell | 22 Sep 2022

Invalid byte sequence sounds like a locale issue (see this blog post).

What Is IO Monad?
3 projects | news.ycombinator.com | 9 Mar 2022

The real fix will happen when Data.Text moves from UTF-16 to UTF-8:
https://discourse.haskell.org/t/hf-tech-proposal-1-utf-8-enc...
Fortunately this proposal has been accepted, but I don't know the timeline for its implementation in GHC.
Until then, working with UTF-8 is kind of convoluted:
https://serokell.io/blog/haskell-with-utf8
Also: libraries need to be fixed to accept Data.Text instead of String. IsString helps (it's a typeclass that contains all string types) but only if APIs take it instead of defaulting to String. Adding random string conversions to cope with legacy APIs is very annoying.
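
A small sketch of the IsString point, using only base: with the OverloadedStrings extension, a string literal takes whatever IsString type the API asks for, so no explicit conversion is needed at the call site. The Name type and greet function below are hypothetical stand-ins; in real code the non-String type would typically be Data.Text, which also has an IsString instance.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.String (IsString (..))

-- Hypothetical stand-in for a dedicated text type such as Data.Text.
newtype Name = Name String
  deriving (Eq, Show)

-- Lets string literals be used wherever a Name is expected.
instance IsString Name where
  fromString = Name

-- An API that takes the dedicated type instead of defaulting to String.
greet :: Name -> String
greet (Name n) = "Hello, " ++ n

main :: IO ()
main = putStrLn (greet "Ada")  -- the literal is built via fromString
```

This only covers literals. A String value computed at run time still has to be converted explicitly, which is the annoyance the post describes.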

Where can I look for help?
1 project | /r/agda | 2 Jan 2022

Agda is written in Haskell, and the quoted error message is a common problem with applications written in Haskell. You probably have an environment variable like LC_CTYPE or LC_ALL which is set to an unusual value. I'd try setting LANG, LC_CTYPE and LC_ALL to en_US.utf8.
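
From the program author's side, the same class of error can be avoided by not depending on the locale at all. The following is a minimal sketch using only modules from base; the helper name forceUtf8 is hypothetical, and libraries such as with-utf8 aim to handle this setup more carefully than these few lines do.

```haskell
import GHC.IO.Encoding (getLocaleEncoding, setLocaleEncoding)
import System.IO (hSetEncoding, stderr, stdin, stdout, utf8)

-- | Hypothetical helper: use UTF-8 regardless of LANG / LC_* settings.
forceUtf8 :: IO ()
forceUtf8 = do
  -- Default encoding for handles opened from now on.
  setLocaleEncoding utf8
  -- The standard handles are already open, so set them explicitly.
  mapM_ (`hSetEncoding` utf8) [stdin, stdout, stderr]

main :: IO ()
main = do
  forceUtf8
  enc <- getLocaleEncoding
  putStrLn ("locale encoding: " ++ show enc)
  putStrLn "λ → ✓"  -- non-ASCII output no longer depends on the locale
```

Overriding the encoding of a terminal that genuinely is not UTF-8 will produce garbled output there, so treating this as unconditional is a design choice rather than a universal fix.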

Using VS Code with Haskell
2 projects | /r/haskellquestions | 15 Feb 2021

without this I get a build error (see here)

What are some alternatives?

When comparing unicode-transforms and with-utf8 you can also consider the following projects:

refined - Refinement types with static checking

haskell-language-server - Official haskell ide support via language server (LSP). Successor of ghcide & haskell-ide-engine.

hashable - A class for types that can be converted to a hash value

text - Haskell library for space- and time-efficient operations over Unicode text.

jump - Jump start your Haskell development

text-short - Memory-efficient representation of Unicode text strings

code-builder - Packages for defining APIs, running them, generating client code and documentation.

unicode-data - Access unicode character database

critbit - A Haskell implementation of crit-bit trees.

safeio - Haskell Library for safe (atomic) IO

hnix - A Haskell re-implementation of the Nix expression language

binary-io - Read and write values of types that implement Binary from and to Handles
