Programming: How would I turn utf 32 characters/symbols into something I can compare a string to?

unidecode

1 14 0.0 Java

Transliteration from Unicode to US-ASCII and ISO 8859-2.

Sounds like you're looking for what is known as "unidecode": it takes Unicode text and converts it to US-ASCII while preserving as much of the original as possible. Since you've mentioned Java, here's a library that I tried out for one tiny project some time ago. It should work well for accented letters, and the examples should give you an overview of what it does.
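
To give a rough idea of the accented-letter case, here is a minimal Python sketch that uses only the standard library's `unicodedata` module. It is not the Java library's API, just a stand-in technique (NFKD decomposition followed by dropping combining marks), and the function name `strip_accents` is made up for illustration.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Approximate ASCII folding for accented letters (illustrative helper)."""
    # NFKD splits a character such as "é" into the base letter "e"
    # followed by a separate combining accent mark.
    decomposed = unicodedata.normalize("NFKD", text)
    # unicodedata.combining() returns a non-zero class for combining marks,
    # so keeping only the zero-class characters drops the accents.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    # "caf\u00e9" is "café" written with an escape to keep this file ASCII-only.
    print(strip_accents("caf\u00e9") == "cafe")
```

Letters that have no decomposition, such as "ø" or "ß", pass through this sketch unchanged. A transliteration library usually covers those with its own lookup tables, which is the main reason to reach for one.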

homoglyph

4 491 0.0 JavaScript

A big list of homoglyphs and some code to detect them

I also found this library, which you could probably adapt. Honestly, though, I would consider it more pragmatic right now to write a very short Python program that takes any UTF-8 encoded text file as input and writes the normalized variant as output, and then use that output file for further processing.
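
A minimal sketch of such a program, assuming "normalized" means Unicode NFKC (the post does not say which form, so that choice is an assumption). The function names `normalize_text` and `normalize_file` and the `encoding` parameter are hypothetical, chosen only for this example.

```python
import unicodedata


def normalize_text(text: str) -> str:
    """Fold compatibility characters to their plain equivalents (NFKC, assumed)."""
    return unicodedata.normalize("NFKC", text)


def normalize_file(src_path: str, dst_path: str, encoding: str = "utf-8") -> None:
    """Read src_path, normalize it line by line, and write UTF-8 to dst_path."""
    with open(src_path, encoding=encoding) as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            # A newline never combines with neighbouring characters,
            # so normalizing one line at a time is safe.
            dst.write(normalize_text(line))


if __name__ == "__main__":
    # Mathematical bold "Hello", written with escapes to keep this file ASCII-only.
    fancy = "\U0001D407\U0001D41E\U0001D425\U0001D425\U0001D428"
    print(normalize_text(fancy) == "Hello")
```

Calling `normalize_file("input.txt", "normalized.txt")` (both paths are placeholders) would produce the file to use for further processing. If the source file really is UTF-32, passing `encoding="utf-32"` should be enough to read it. Note that NFKC only folds compatibility characters such as mathematical bold or fullwidth letters. It does not map look-alikes from other scripts, for example Cyrillic "а" to Latin "a", so those still need a homoglyph table like the one in the library above.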

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Related posts

Hope whoever has to debug this is wearing a helmet cause they'll definitely smash their head against the wall
1 project | /r/programminghorror | 12 Feb 2022

Trojan Source: Invisible Vulnerabilities
2 projects | news.ycombinator.com | 31 Oct 2021

This page summarizes the projects mentioned and recommended in the original post on /r/AskProgramming
homoglyphs
Post date: 2 Jul 2021
