unicode-transforms vs critbit
| | unicode-transforms | critbit |
|---|---|---|
| Mentions | 1 | 3 |
| Stars | 47 | 330 |
| Growth | - | - |
| Activity | 2.5 | 0.0 |
| Last commit | 6 months ago | over 2 years ago |
| Language | Haskell | C |
| License | BSD 3-clause "New" or "Revised" License | - |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
unicode-transforms
- [ANN] unicode-collation 0.1
Thanks! Here's a puzzle. Profiling shows that about a third of the time in my code is spent in normalize from unicode-transforms. (Normalization is a required step in the algorithm but can be omitted if you know that the input is already in NFD form.) And when I add a benchmark that omits normalization, I see run time cut by a third. But text-icu's run time in my benchmark doesn't seem to be affected much by whether I set the normalization option. I am not sure how to square that with the benchmarks here that seem to show unicode-transforms outperforming text-icu in normalization.

text-icu's documentation says that "an incremental check is performed to see whether the input data is in FCD form. If the data is not in FCD form, incremental NFD normalization is performed." I'm not sure exactly what this means, but it may mean that text-icu does not normalize the whole string: it normalizes just enough to do the comparison, and sometimes skips normalization altogether if it can quickly determine that the string is already normalized. I don't see a way to do this currently with unicode-transforms.
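The idea of skipping normalization when a cheap check shows the input is already normalized can be illustrated with a much cruder test than ICU's FCD check. The sketch below, in C, relies only on the fact that pure-ASCII text is unchanged by NFD; the function name `is_ascii_only` is hypothetical and is not part of unicode-transforms or text-icu.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true when every byte of the UTF-8 buffer is ASCII (< 0x80).
 * ASCII characters have no canonical decomposition and are never
 * reordered, so an ASCII-only string is already in NFD form.
 * A false result only means "unknown": the string may or may not be
 * normalized, and a full NFD pass is still required. */
bool is_ascii_only(const unsigned char *s, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (s[i] >= 0x80) {
            return false;   /* first non-ASCII byte: give up on the fast path */
        }
    }
    return true;
}
```

A caller would run this check first and invoke the full normalizer only when it returns false. Whether that pays off depends on how much of the input is ASCII, which is an assumption about the workload rather than something the comment above states.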
critbit
- Ask HN: What are some 'cool' but obscure data structures you know about?
> Good use-case: routing. Say you have a list of 1 million IPs that are [deny listed].
Apparently, bloom filters make for lousy IP membership checks; see https://blog.cloudflare.com/when-bloom-filters-dont-bloom/
CritBit Trie [0] and possibly Allotment Routing Table (ART) [1] are better suited for IPs.
[0] https://github.com/agl/critbit
[1] https://web.archive.org/web/20210720162224/https://www.harig...
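To make the crit-bit suggestion concrete, here is a minimal sketch of a crit-bit tree over 32-bit IPv4 addresses, supporting insertion and exact membership tests. It is written from the general description of the data structure and does not use the API of the linked agl/critbit library; every name in it (`cb_node`, `cb_insert`, `cb_contains`, `ipv4`) is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* A node is either a leaf holding one address, or an inner node that
 * tests a single "critical" bit (31 = most significant). */
typedef struct cb_node {
    bool is_leaf;
    uint32_t key;               /* valid only for leaves */
    int bit;                    /* valid only for inner nodes */
    struct cb_node *child[2];   /* valid only for inner nodes */
} cb_node;

static uint32_t ipv4(unsigned a, unsigned b, unsigned c, unsigned d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) | ((uint32_t)c << 8) | (uint32_t)d;
}

static int cb_dir(uint32_t key, int bit) { return (int)((key >> bit) & 1u); }

/* Exact membership: follow the tested bits down to a leaf, then compare. */
bool cb_contains(const cb_node *root, uint32_t key)
{
    if (!root) return false;
    while (!root->is_leaf) root = root->child[cb_dir(key, root->bit)];
    return root->key == key;
}

/* Returns 1 if inserted, 0 if already present, -1 on allocation failure. */
int cb_insert(cb_node **root, uint32_t key)
{
    cb_node *leaf = calloc(1, sizeof *leaf);
    if (!leaf) return -1;
    leaf->is_leaf = true;
    leaf->key = key;
    if (!*root) { *root = leaf; return 1; }

    /* 1. Find the existing leaf that the new key would land on. */
    cb_node *n = *root;
    while (!n->is_leaf) n = n->child[cb_dir(key, n->bit)];
    uint32_t diff = n->key ^ key;
    if (diff == 0) { free(leaf); return 0; }

    /* 2. The critical bit is the most significant differing bit. */
    int bit = 31;
    while (!((diff >> bit) & 1u)) bit--;

    /* 3. Walk down again to the first link whose node tests a less
     *    significant bit (or is a leaf), and splice a new inner node there. */
    cb_node **link = root;
    while (!(*link)->is_leaf && (*link)->bit > bit)
        link = &(*link)->child[cb_dir(key, (*link)->bit)];

    cb_node *inner = calloc(1, sizeof *inner);
    if (!inner) { free(leaf); return -1; }
    inner->bit = bit;
    inner->child[cb_dir(key, bit)] = leaf;
    inner->child[1 - cb_dir(key, bit)] = *link;
    *link = inner;
    return 1;
}
```

Unlike a bloom filter, a lookup here touches at most 32 inner nodes and never reports a false positive. The sketch omits deletion and freeing the tree, which a real deny list would need.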
- Rethink-app: DNS over HTTPS, firewall, and connection tracker for Android
developer here
I'd imagine the app should work over IPv6-only networks thanks to 464xlat. I may be wrong, because I've never tested it on an IPv6-only network.
The reason for IPv6 is twofold:
1. The firewall today simply stores classless IP address rules as strings in a SQLite table fronted by an LFU cache backed by a typical hash map. With IPv6, I'd imagine, this won't scale. So, we need a more economical in-memory data structure (like a crit-bit trie [0] or an ART [1]).
2. Apparently LwIP has problems with Happy Eyeballs. I personally never saw it, but a couple of users reported that it became an unrecoverable error once connectivity was lost, and the firewall had to be restarted. We're in the process of replacing LwIP with gvisor/netstack now [2], just to get IPv6 support back on track.
[0] https://github.com/agl/critbit
[1] http://www.hariguchi.org/art/art.pdf
[2] https://github.com/celzero/firestack/issues/3
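As an illustration of point 1, classless (CIDR) rules can be kept in a bitwise trie instead of as strings in a table. The sketch below is a plain, uncompressed binary trie over 128-bit IPv6 addresses with longest-prefix matching; it is not the app's code, and all names (`rule_node`, `rules_add`, `rules_match`) are hypothetical. A crit-bit trie or ART would compress the single-child chains that this simple version allocates node by node.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* One trie node per prefix bit; has_rule marks the end of a stored prefix. */
typedef struct rule_node {
    struct rule_node *child[2];
    bool has_rule;
} rule_node;

/* Bit i of a 16-byte address, counting from the most significant bit. */
static int addr_bit(const uint8_t addr[16], int i)
{
    return (addr[i / 8] >> (7 - i % 8)) & 1;
}

/* Adds the rule addr/prefix_len. Returns false on a bad length or
 * allocation failure. The caller owns a zero-initialised root node. */
bool rules_add(rule_node *root, const uint8_t addr[16], int prefix_len)
{
    if (prefix_len < 0 || prefix_len > 128) return false;
    rule_node *n = root;
    for (int i = 0; i < prefix_len; i++) {
        int b = addr_bit(addr, i);
        if (!n->child[b]) {
            n->child[b] = calloc(1, sizeof *n->child[b]);
            if (!n->child[b]) return false;
        }
        n = n->child[b];
    }
    n->has_rule = true;
    return true;
}

/* Returns the length of the longest stored prefix covering addr,
 * or -1 when no rule matches. */
int rules_match(const rule_node *root, const uint8_t addr[16])
{
    int best = -1;
    const rule_node *n = root;
    for (int i = 0; n; i++) {
        if (n->has_rule) best = i;   /* a rule of length i covers addr */
        if (i == 128) break;         /* all address bits consumed */
        n = n->child[addr_bit(addr, i)];
    }
    return best;
}
```

For example, after adding 2001:db8::/32, a lookup of 2001:db8::1 returns 32 and a lookup of 2001:db9:: returns -1. IPv4 rules could share the same structure if they are stored as IPv4-mapped IPv6 addresses, though that is a design choice the post does not mention.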
- Critbit Trees in C(WEB)
What are some alternatives?
with-utf8 - Get your IO right on the first try
flatbuffers - An implementation of the flatbuffers protocol in Haskell.
refined - Refinement types with static checking
tables - Deprecated because of
hashable - A class for types that can be converted to a hash value
rethink-app - DNS over HTTPS / DNS over Tor / DNSCrypt client, WireGuard proxifier, firewall, and connection tracker for Android.
jump - Jump start your Haskell development
semantic-source - Parsing, analyzing, and comparing source code across many languages
hnix - A Haskell re-implementation of the Nix expression language
nextstep-plist - Parser and printer for NextStep style plist files
critbit - A Haskell implementation of crit-bit trees.
data-treify - Reify a recursive data structure into an explicit graph.