libu8ident
go
| | libu8ident | go |
|---|---|---|
| Mentions | 9 | 2,079 |
| Stars | 17 | 120,063 |
| Growth | - | 1.0% |
| Activity | 1.8 | 10.0 |
| Last commit | 11 months ago | 2 days ago |
| Language | C | Go |
| License | Apache License 2.0 | BSD 3-clause "New" or "Revised" License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
libu8ident
- Roaring bitmaps are compressed bitmaps, can be 100x faster
-
International domain names: where does HTTPS://meßagefactory.ca lead you?
In programming languages it's much worse. Identifiers can be unidentifiable, and everybody has a different opinion of what "identifiable" means. Even the standard on identifiers, UTS #39 (TR39), is buggy and has too many interpretations, leading to a complete disaster. https://github.com/rurban/libu8ident/blob/master/doc/c11.md
In punycode domain names it's quite simple still.
With other names, it's even worse. No-one cares. Linkers do not, username and filesystem drivers do not. The Apple HFS+ did care a bit one day, until someone in the higher ranks decided that no-one needs unicode security anymore and switched the new APFS to unsafe again.
-
Using Unicode in a compiler
No, it's definitely not safe to use unrestricted Unicode in a compiler. See https://github.com/rurban/libu8ident/ for identifier rules, and http://www.unicode.org/reports/tr55/ for much worse problems.
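One of the identifier problems mentioned above is mixing look-alike letters from different scripts. The following is a minimal sketch of a mixed-script check, assuming only three scripts (Latin, Greek, Cyrillic) matter. The names `ScriptsIn` and `IsMixedScript` are hypothetical, and this is neither libu8ident's API nor a full TR39 implementation.

```go
package main

import (
	"fmt"
	"sort"
	"unicode"
)

// scripts lists the few scripts this sketch knows about (an assumption
// for illustration). A real checker would cover all scripts and the
// TR39 restriction levels.
var scripts = map[string]*unicode.RangeTable{
	"Latin":    unicode.Latin,
	"Greek":    unicode.Greek,
	"Cyrillic": unicode.Cyrillic,
}

// ScriptsIn returns the sorted names of the known scripts that occur in ident.
// Characters outside the known scripts (digits, underscore, ...) are ignored.
func ScriptsIn(ident string) []string {
	seen := map[string]bool{}
	for _, r := range ident {
		for name, table := range scripts {
			if unicode.Is(table, r) {
				seen[name] = true
			}
		}
	}
	names := make([]string, 0, len(seen))
	for name := range seen {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

// IsMixedScript reports whether ident uses two or more of the known scripts.
func IsMixedScript(ident string) bool {
	return len(ScriptsIn(ident)) > 1
}

func main() {
	// The second identifier replaces Latin "c" with Cyrillic U+0441.
	for _, ident := range []string{"scope", "s\u0441ope"} {
		fmt.Printf("%q scripts=%v mixed=%v\n", ident, ScriptsIn(ident), IsMixedScript(ident))
	}
}
```

A compiler or linter could run a check like this on every identifier and reject or warn on mixed-script names, although legitimate mixed-script identifiers exist, so a real policy needs more nuance than this sketch.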
- Ask HN: What interesting problems are you working on? (2022 Edition)
- Unicode Utilities: Confusables
-
How can you be fooled by the U+202E trick?
That's why Unicode published the security guidelines and mechanisms to avoid such attacks. In 2004 already.
The problem is that nobody cared. Browsers invented punycode instead of following TR39, email ditto. But ok, at least something. Java did it, cperl did, Rust did it.
Everybody else is vulnerable. Esp. most other programming languages, filesystems and login systems. https://github.com/rurban/libu8ident/blob/master/doc/c11.md
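As a rough illustration of the U+202E trick, here is a sketch in Go that reports bidirectional control characters in a string, using the `Bidi_Control` property table from the standard `unicode` package. The `Finding` type and `FindBidiControls` function are hypothetical names, and the sample file name is made up. Flagging these characters is only one small piece of what the Unicode security guidelines describe.

```go
package main

import (
	"fmt"
	"unicode"
)

// Finding records one bidi control character and where it occurs.
type Finding struct {
	Offset int  // byte offset within the string
	Rune   rune // the control character itself
}

// FindBidiControls returns every character in s that has the
// Unicode Bidi_Control property (U+202E, U+2066, and so on).
func FindBidiControls(s string) []Finding {
	var out []Finding
	for i, r := range s {
		if unicode.Is(unicode.Bidi_Control, r) {
			out = append(out, Finding{Offset: i, Rune: r})
		}
	}
	return out
}

func main() {
	// U+202E (RIGHT-TO-LEFT OVERRIDE) makes the text after it render
	// reversed in many UIs, so the name can appear to end in ".pdf".
	name := "invoice\u202Efdp.exe"
	for _, f := range FindBidiControls(name) {
		fmt.Printf("U+%04X at byte offset %d\n", f.Rune, f.Offset)
	}
}
```

A login system or file browser could refuse such names outright or display them with the control characters made visible.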
- Prevent Trojan Source attacks with GCC 12
-
Unicode Normalization Forms: When ö = ö
I'm maintaining such a library.
coreutils, diff, grep, patch, sed and friends all cannot find Unicode strings, they have no string support. They can only mimic filesystems, finding binary garbage. Strings are something different than pure ASCII or binary garbage. Strings have an encoding and are Unicode.
Filesystems are even worse because they need to treat filenames as identifiers, but do not. Nobody cares about TR31, TR39, TR36 and so on.
Here is an overview of the sad state of Unicode unsafeties in programming languages: https://github.com/rurban/libu8ident/blob/master/c11.md
- Why does Windows 10 run faster than Fedora?
go
-
Arena-Based Parsers
The description indicates it is not production ready, and is archived at the same time.
If you pull out all the stops in each respective language, C# will always end up winning at parsing text as it offers C structs, pointers, zero-cost interop, Rust-style struct generics, a cross-platform SIMD API and simply has a better compiler. You can win back some performance in Go by writing hot parts in Go's ASM dialect at much greater effort for a specific platform.
For example, Go has to resort to this https://github.com/golang/go/blob/4ed358b57efdad9ed710be7f4f... in order to efficiently scan memory, while in C# you write the following once and it compiles to all supported ISAs with their respective SIMD instructions for a given vector width: https://github.com/dotnet/runtime/blob/56e67a7aacb8a644cc6b8... (there is a lot of code because C# covers a much wider range of scenarios and does not accept sacrificing performance in odd lengths and edge cases, which Go does).
Another example is computing CRC32: you have to write ASM for Go https://github.com/golang/go/blob/4ed358b57efdad9ed710be7f4f..., in C# you simply write a standard vectorized routine once https://github.com/dotnet/runtime/blob/56e67a7aacb8a644cc6b8... (its codegen is competitive with hand-intrinsified C++ code).
There is a lot more of this. Performance and low-level primitives to achieve it have been an area of focus of .NET for a long time, so it is disheartening to see one tenth of effort in Go to receive so much spotlight.
-
Go: the future encoding/json/v2 module
A discussion about including this package in Go as encoding/json/v2 was started on the Go GitHub project on 2023-10-05. Please provide your feedback there.
-
Evolving the Go Standard Library with math/rand/v2
I like the Principles section. Very measured and practical approach to releasing new stdlib packages. https://go.dev/blog/randv2#principles
At the end of the post they mention that an encoding/json/v2 package is in the works: https://github.com/golang/go/discussions/63397
-
Microsoft Maintains Go Fork for FIPS 140-2 Support
There used to be the Go FIPS branch:
https://github.com/golang/go/tree/dev.boringcrypto/misc/bori...
But it looks dead.
And https://github.com/golang-fips/go looks dead as well.
-
Borgo is a statically typed language that compiles to Go
I'm not sure what exactly you mean by acknowledgement, but here are some counterexamples:
- A proposal for sum types by a Go team member: https://github.com/golang/go/issues/57644
- The community proposal with some comments from the Go team: https://github.com/golang/go/issues/19412
Here are some excerpts from the latest Go survey [1]:
- "The top responses in the closed-form were learning how to write Go effectively (15%) and the verbosity of error handling (13%)."
- "The most common response mentioned Go’s type system, and often asked specifically for enums, option types, or sum types in Go."
I think the problem is not the lack of will on the part of the Go team, but rather that these issues are not easy to fix in a way that fits the language and doesn't cause too many issues with backwards compatibility.
[1]: https://go.dev/blog/survey2024-h1-results
-
AWS Serverless Diversity: Multi-Language Strategies for Optimal Solutions
Now, I’m not going to use C++ again; I left that chapter years ago, and it’s not going to happen. C++ isn’t memory safe or easy to use and would require extended time for developers to adapt. Rust is the new kid on the block, but I’ve heard mixed opinions about its developer experience, and there aren’t many libraries around it yet. LLRD is too new for my taste, but **Go** caught my attention.
-
How to use Retrieval Augmented Generation (RAG) for Go applications
Generative AI development has been democratised, thanks to powerful Machine Learning models (specifically Large Language Models such as Claude, Meta's Llama 2, etc.) being exposed by managed platforms/services as API calls. This frees developers from the infrastructure concerns and lets them focus on the core business problems. This also means that developers are free to use the programming language best suited for their solution. Python has typically been the go-to language when it comes to AI/ML solutions, but there is more flexibility in this area. In this post you will see how to leverage the Go programming language to use Vector Databases and techniques such as Retrieval Augmented Generation (RAG) with langchaingo. If you are a Go developer who wants to learn how to build generative AI applications, you are in the right place!
-
From Homemade HTTP Router to New ServeMux
net/http: add methods and path variables to ServeMux patterns Discussion about ServeMux enhancements
-
Building a Playful File Locker with GoFr
Make sure you have Go installed: https://go.dev/
- Fastest way to get IPv4 address from string
What are some alternatives?
Confusables - Simple library for matching a string to another string that is same but has letters that only *look* the same as original string
v - Simple, fast, safe, compiled language for developing maintainable software. Compiles itself in <1s with zero library dependencies. Supports automatic C => V translation. https://vlang.io
featurebase - A crazy fast analytical database, built on bitmaps. Perfect for ML applications. Learn more at: http://docs.featurebase.com/. Start a Docker instance: https://hub.docker.com/r/featurebasedb/featurebase
TinyGo - Go compiler for small places. Microcontrollers, WebAssembly (WASM/WASI), and command-line tools. Based on LLVM.
libredwg - Official mirror of libredwg. With CI hooks and nightly releases. PR's ok
zig - General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
safeclib - safec libc extension with all C11 Annex K functions
Nim - Nim is a statically typed compiled systems programming language. It combines successful concepts from mature languages like Python, Ada and Modula. Its design focuses on efficiency, expressiveness, and elegance (in that order of priority).
nbperf - Improved NetBSD's Perfect Hash Generation Tool v3
Angular - Deliver web apps with confidence 🚀
reals - A lightweight python3 library for arithmetic with real numbers.
golang-developer-roadmap - Roadmap to becoming a Go developer in 2020