bstr
ocaml
Metric | bstr | ocaml
---|---|---
Mentions | 10 | 119
Stars | 744 | 5,162
Growth | - | 0.7%
Activity | 6.7 | 9.9
Last commit | 2 months ago | 5 days ago
Language | Rust | OCaml
License | GNU General Public License v3.0 or later | GNU General Public License v3.0 or later
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
bstr
-
We're building a browser when it's supposed to be impossible
Libraries for a lot of this stuff exist (albeit in many cases not very mature yet):
- https://github.com/pop-os/cosmic-text does text layout (which Taffy explicitly considers out of scope)
- https://github.com/AccessKit/accesskit does accessibility
- https://github.com/servo/rust-cssparser does value-agnostic CSS parsing (it will parse the general syntax but leaves value parsing up to the user, meaning you can easily add support for whatever properties you want). Libraries like https://github.com/parcel-bundler/lightningcss implement parsing for the standard CSS properties.
- There are crates like https://github.com/BurntSushi/bstr and https://docs.rs/wtf8/latest/wtf8/ for working with non-unicode text
We are planning to add a C API to Taffy, but tbh I feel like C is not very good for this kind of modularised approach. You really want to be able to expose complex APIs with enforced type safety and this isn't possible with C.
-
Chunking strings in Elixir: how difficult can it be?
As the author of bstr, and also of the regex implementation that bstr uses to implement word breaking, I can say it is linear time.
NSFL: https://github.com/BurntSushi/bstr/blob/86947727666d7b21c97e...
-
A byte string library for Rust
OsStr uses WTF-8 on Windows, and just represents the raw underlying bytes on Unix.
Byte strings can be WTF-8. They can be anything. The problem is that there is no real way to (easily) get the underlying WTF-8 bytes of an OsStr on Windows. So there's no free conversion to and from byte strings.
I wrote more about this in the bstr docs (and don't miss the link to os_str_bytes): https://docs.rs/bstr/latest/bstr/#file-paths-and-os-strings
I'd be happy to answer more questions if you have them. :-) https://github.com/BurntSushi/bstr/discussions
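To make the Unix half of that comment concrete, here is a minimal sketch using only the standard library. It assumes a Unix target, where `std::os::unix::ffi::OsStrExt` exposes the raw bytes of an `OsStr` without copying; the helper names (`os_str_to_bytes`, `bytes_to_os_str`, `is_valid_utf8`) are made up for illustration and are not part of bstr or os_str_bytes. On Windows there is no equivalent free conversion, which is the point made above.

```rust
// Unix-only sketch: this extension trait does not exist on Windows.
use std::ffi::OsStr;
use std::os::unix::ffi::OsStrExt;

/// Borrow the raw bytes behind an OS string (no copy, no validation).
/// Hypothetical helper name, for illustration only.
fn os_str_to_bytes(s: &OsStr) -> &[u8] {
    s.as_bytes()
}

/// View arbitrary bytes as an OS string (no copy, no validation).
/// Hypothetical helper name, for illustration only.
fn bytes_to_os_str(bytes: &[u8]) -> &OsStr {
    OsStr::from_bytes(bytes)
}

/// Report whether the OS string could also be viewed as a Rust `str`.
fn is_valid_utf8(s: &OsStr) -> bool {
    s.to_str().is_some()
}
```

Passing something like `b"foo\xFFbar"` through `bytes_to_os_str` and back returns the same bytes, even though the value is not valid UTF-8 and so cannot be a `str`.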
-
Where is the `str` struct/primitive defined ? I am learning Rust, so don't shoot please :).
Check out bstr, which does this exact thing for its BString and BStr types.
-
Tips when porting C++ programs to Rust
Currently slated for next Monday: https://github.com/BurntSushi/bstr/issues/40
- bstr 1.0 request for comments
-
Let's Stop Ascribing Meaning to Code Points (2017)
This is just an FYI. I don't mean to say much to your overall point, although, as someone else who has spent a lot of time doing Unicode-y things, I do tend to agree with you. I had a very similar discussion a bit ago.[1]
Putting that aside, at least with respect to grapheme segmentation, it might be a little simpler than you think. But maybe only a little. The unicode-segmentation crate also does word segmentation, which is quite a bit more complicated than grapheme segmentation. For example, you can write a regex to parse graphemes without too much fuss[2]. (Compare that with the word segmentation regex, much to my chagrin.[3]) Once you build the regex, actually using it is basically as simple as running the regex.[4]
Sadly, not all regex engines will be able to parse that regex due to its use of somewhat obscure Unicode properties. But the Rust regex crate can. :-)
And of course, this somewhat shifts code size to heap size. So there's that too. But bottom line is, if you have a nice regex engine available to you, you can whip up a grapheme segmenter pretty quickly. And some regex engines even have grapheme segmentation built in via \X.
[1]: https://github.com/BurntSushi/aho-corasick/issues/72
[2]: https://github.com/BurntSushi/bstr/blob/e38e7a7ca986f9499b30...
[3]: https://github.com/BurntSushi/bstr/blob/e38e7a7ca986f9499b30...
[4]: https://github.com/BurntSushi/bstr/blob/e38e7a7ca986f9499b30...
-
os_str_bytes now has string types!
This is a great idea. I realize the find implementation is not ideal and have considered bringing in an optional dependency to improve performance. I remembered bstr using two-way search, so I was wondering if depending on the full crate for searching would be worthwhile, but I see that changed. Thanks for the tip!
-
What you don't like about Rust?
Fun little nit-pick that does not detract from your overall point: you can actually count graphemes with a regex and that's exactly what bstr does. :-)
ocaml
-
Autoconf makes me think we stopped evolving too soon
> OCaml’s configure script is also “normal”
If that’s this OCaml, it has a configure.ac file in the root directory, which looks suspicious for an Autotools-free package: https://github.com/ocaml/ocaml
-
The Return of the Frame Pointers
You probably already know, but with OCaml 5 the only way to get flamegraphs working is to either:
* use framepointers [1]
* use LBR (but LBR has a limited depth, and may not work on all CPUs, I'm assuming due to bugs in perf)
* implement some deep changes in how perf works to handle the 2 stacks in OCaml (I don't even know if this would be possible), or write/adapt some eBPF code to do it
OCaml 5 has a separate stack for OCaml code and C code, and although GDB can link them based on DWARF info, perf DWARF call-graphs cannot (https://github.com/ocaml/ocaml/issues/12563#issuecomment-193...)
If you need more evidence to keep it enabled in future releases, you can use OCaml 5 as an example (unfortunately there aren't many OCaml applications, so that may not carry too much weight on its own).
[1]: I hadn't actually realised that Fedora 39 had already enabled FP by default, nice! (I still do most of my day-to-day profiling on an ~CentOS 7 system with 'perf record --call-graph dwarf'. I was aware that there was a discussion to enable FP by default, but hadn't noticed it had actually been done already.)
-
Top Paying Programming Technologies 2024
11. OCaml - $91,026
-
OCaml: a Rust developer's first impressions
> It partially helps since it forces you to have types where they matter most: exported functions
But the problem the OP has is not knowing the types when reading the source (in the .ml file).
> How would it feel to use list if only https://github.com/ocaml/ocaml/blob/trunk/stdlib/list.ml was available,
If the signature were in the source file (which you can do in OCaml too), there would be no problem - which is what all the other (for some definition of "other") languages except C and C++ (even Fortran) do.
No, really, I can't see a single advantage of separate .mli files at all. The real problem is that the documentation is often worse too, as the .mli is autogenerated and documented afterwards - and now changes made later in the sources need to be documented in the .mli too, so anything that doesn't change the type often gets lost. The same happens in C and C++ with header files.
-
Bringing more sweetness to ruby with sorbet types 🍦
If you have been in the Ruby community for the past couple of years, it's possible that you're not a super fan of types or that this concept never crossed your mind, and that's totally cool. I myself love the dynamic and meta-programming nature of Ruby, and honestly, at the time of writing, we aren't on the level of OCaml for type checking and inference, but still, there are a couple of nice things that types with sorbet bring to the table:
-
What is gained and lost with 63-bit integers? (2014)
Looks like there have been proposals to eliminate use of 3 operand lea in OCaml code (not accepted sadly):
https://github.com/ocaml/ocaml/pull/8531
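For context, here is a rough sketch of the tagged representation usually described for OCaml's 63-bit integers: a value n is stored in a 64-bit word as 2n + 1, so adding two tagged words needs a correction of -1. That `a + b - 1` shape (base plus index plus displacement) is the kind of computation a three-component lea can express in one instruction. The code is an illustration in Rust with hypothetical function names (`tag`, `untag`, `tagged_add`), not the compiler's own code, and it says nothing about what the linked pull request actually changes.

```rust
/// Encode a 63-bit integer as a tagged 64-bit word: shift left, set the low bit.
/// The top bit of `n` is lost, which is where the "63-bit" limit comes from.
fn tag(n: i64) -> i64 {
    (n << 1) | 1
}

/// Decode a tagged word with an arithmetic shift right (sign is preserved).
fn untag(w: i64) -> i64 {
    w >> 1
}

/// Add two tagged words without untagging them first:
/// (2a + 1) + (2b + 1) - 1 = 2(a + b) + 1, which is the tagged form of a + b.
/// Wrapping arithmetic models overflow wrapping around within 63 bits.
fn tagged_add(a: i64, b: i64) -> i64 {
    a.wrapping_add(b).wrapping_sub(1)
}
```

In this model, adding 1 to the largest representable value, 2^62 - 1, wraps around to -2^62.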
-
Notes about the ongoing Perl logo discussion
An amazing example is the OCaml language logo / mascot. It might be useful to talk with them to learn what the process behind this work was. The camel head in the header of the About page on perl.org is also a pretty good example of simplification, but it's not a logo, just a friendly illustration, as the O'Reilly camel is. Another notable logo featuring this animal is that of the well-known tobacco company, but don't get me started on that (“good” logo, though, if we look at the effectiveness of their marketing).
-
What can Category Theory do?
Haskell and Agda are probably the most obvious examples. OCaml too, but it is much older, so its type system is not as categorical. There is also Idris, which is not as well-known but is very cool.
- Playing Atari Games in OCaml
-
Bloat
That does sound problematic, but without the code it is hard to tell what the issue is. Typically, compiling a 6kLoc file like https://github.com/ocaml/ocaml/blob/trunk/typing/typecore.ml takes 0.8 s on my machine.
What are some alternatives?
miniserve - 🌟 For when you really just want to serve some files over HTTP right now!
Alpaca-API - The Alpaca API is a developer interface for trading operations and market data reception through the Alpaca platform.
tonic - A native gRPC client & server implementation with async/await support.
VisualFSharp - The F# compiler, F# core library, F# language service, and F# tooling integration for Visual Studio
rust-memchr - Optimized string search routines for Rust.
dune - A composable build system for OCaml.
cargo-geiger - Detects usage of unsafe Rust in a Rust crate and its dependencies.
TradeAlgo - Stock trading algorithm written in Python for TD Ameritrade.
rust - Empowering everyone to build reliable and efficient software.
melange - A mixture of tooling combined to produce JavaScript from OCaml & Reason
rust-semverver - Automatic checking for semantic versioning in library crates
rust - Rust for the xtensa architecture. Built in targets for the ESP32 and ESP8266