msgspec
Cap'n Proto
Metric | msgspec | Cap'n Proto |
---|---|---|
Mentions | 31 | 66 |
Stars | 1,868 | 11,180 |
Growth | - | 0.8% |
Activity | 8.6 | 9.2 |
Last commit | about 1 month ago | 7 days ago |
Language | Python | C++ |
License | BSD 3-clause "New" or "Revised" License | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
msgspec
- Htmx, Rust and Shuttle: A New Rapid Prototyping Stack
-
Litestar 2.0
Full support for validation and serialisation of attrs classes and msgspec Structs. Where previously only Pydantic models and types were supported, you can now mix and match any of these three libraries. In addition to this, adding support for another modelling library has been greatly simplified with the new plugin architecture.
-
FastAPI 0.100.0: Release Notes
> Maybe it was very slow before
That is at least partly the case. I maintain msgspec[1], another Python JSON validation library. Pydantic V1 was ~100x slower at encoding/decoding/validating JSON than msgspec, which was more a testament to Pydantic's performance issues than msgspec's speed. Pydantic V2 is definitely faster than V1, but it's still ~10x slower than msgspec, and up to 2x slower than other pure-python implementations like mashumaro.
Recent benchmark here: https://gist.github.com/jcrist/d62f450594164d284fbea957fd48b...
[1]: https://github.com/jcrist/msgspec
-
Pydantic 2.0
While it's definitely much faster than pydantic V1 (which is a huge accomplishment!), it's still not exactly what I'd call "fast".
I maintain msgspec (https://github.com/jcrist/msgspec), a serialization/validation library which provides similar functionality to pydantic. Recent benchmarks of pydantic V2 against msgspec show msgspec is still 15-30x faster at JSON encoding, and 6-15x faster at JSON decoding/validating.
Benchmark (and conversation with Samuel) here: https://gist.github.com/jcrist/d62f450594164d284fbea957fd48b...
This is not to diminish the work of the pydantic team! For many users pydantic will be more than fast enough, and is definitely a more feature-filled tool. It's a good library, and people will be happy using it! But pydantic is not the only tool in this space, and rubbing some rust on it doesn't necessarily make it "fast".
-
Need help developing a high performance Redis ORM for Python
https://github.com/jcrist/msgspec so I am using this instead of Pydantic.
-
Blog post: Writing Python like it’s Rust
Another thing: why pyserde rather than stuff like msgspec? https://github.com/jcrist/msgspec
- Show HN: Msgspec, a fast serialization/validation library for Python
-
[Guide] A Tour Through the Python Framework Galaxy: Discovering the Stars
Try msgspec | Maat | turbo for fast serialization and validation
-
Pydantic V2 rewritten in Rust is 5-50x faster than Pydantic V1
Congratulations to the team, Pydantic is an amazing library.
If you find JSON serialization/deserialization a bottleneck, another interesting library (with far fewer features) for Python is msgspec: https://github.com/jcrist/msgspec
-
Starlite updates March '22 | 2.0 is coming
This feature is yet to be released, but it will allow you to seamlessly use data modelled with, for example, Pydantic, SQLAlchemy, msgspec or dataclasses in your route handlers, without the need for an intermediary model; the conversion will be handled by the specific DTO "backend" implementation. This new paradigm also makes it trivial to add support for any such modelling library, by simply implementing an appropriate backend.
Cap'n Proto
-
Mysterious Moving Pointers
Yeah I pretty much only use my own alternate container implementations (from KJ[0]), which avoid these footguns, but the result is everyone complains our project is written in Kenton-Language rather than C++ and there's no Stack Overflow for it and we can't hire engineers who know how to write it... oops.
[0] https://github.com/capnproto/capnproto/blob/v2/kjdoc/tour.md
-
Show HN: Comprehensive inter-process communication (IPC) toolkit in modern C++
- may massively reduce the latency involved.
Those sharing Cap'n Proto-encoded data may have particular interest. Cap'n Proto (https://capnproto.org) is fantastic at its core task - in-place serialization with zero-copy - and we wanted to make the IPC (inter-process communication) involving capnp-serialized messages be zero-copy, end-to-end.
That said, we paid equal attention to other varieties of payload; it's not limited to capnp-encoded messages. For example there is painless (<-- I hope!) zero-copy transmission of arbitrary combinations of STL-compliant native C++ data structures.
To help determine whether Flow-IPC is relevant to you we wrote an intro blog post. It works through an example, summarizes the available features, and has some performance results. https://www.linode.com/blog/open-source/flow-ipc-introductio...
Of course there's nothing wrong with going straight to the GitHub link and getting into the README and docs.
Currently Flow-IPC is for Linux. (macOS/ARM64 and Windows support could follow soon, depending on demand/contributions.)
-
Condvars and atomics do not mix
FWIW, my C++ toolkit library, KJ, does the same thing.[0]
But presumably you could still write a condition predicate which looks at things which aren't actually part of the mutex-wrapped structure? Or is the Rust type system able to enforce that the callback can only consider the mutex-wrapped value and values that are constant over the lifetime of the wait? (You need the latter e.g. if you are waiting for the mutex-wrapped value to compare equal to some local variable...)
[0] https://github.com/capnproto/capnproto/blob/e6ad6f919aeb381b...
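
To make the pattern under discussion concrete, here is a minimal Python sketch of waiting on a predicate over a lock-protected value. It is not KJ's or Rust's API: `Guarded`, `update`, and `when` are hypothetical names chosen for illustration, built on the standard library's `threading.Condition`. The predicate compares the protected value against a local variable (`target`) that stays constant for the duration of the wait, which is the case the comment singles out.

```python
import threading


class Guarded:
    """Hypothetical helper: a value reachable only while its lock is held."""

    def __init__(self, value):
        self._value = value
        self._cond = threading.Condition()

    def update(self, fn):
        """Replace the value with fn(value) under the lock and wake all waiters."""
        with self._cond:
            self._value = fn(self._value)
            self._cond.notify_all()

    def when(self, predicate, timeout=None):
        """Block until predicate(value) is true, then return the value."""
        with self._cond:
            # wait_for re-checks the predicate after every wakeup, which
            # also covers spurious wakeups.
            if not self._cond.wait_for(lambda: predicate(self._value), timeout):
                raise TimeoutError("predicate did not become true in time")
            return self._value


counter = Guarded(0)
target = 3  # a local that stays constant for the whole wait


def worker():
    # Increment the guarded counter exactly `target` times.
    for _ in range(target):
        counter.update(lambda v: v + 1)


t = threading.Thread(target=worker)
t.start()
result = counter.when(lambda v: v == target, timeout=5.0)
t.join()
print(result)
```

Nothing in this sketch stops the predicate from reading unrelated mutable state, which is the hazard the comment asks about; Python offers no static check for it.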
- Cap'n'Proto: infinitely faster than Protobuf
-
I don’t understand zero copy
The second one is to encode data in such a way that you can read it and operate on it directly from the buffer. You write data in a layout that is the same as, or easily transformed into, the types in memory. To do that you usually need to encode with a known schema, use only Sized types so that field locations can be computed efficiently as offsets in the buffer, and represent pointers as offsets into the encoded buffer. See the Cap'n Proto protocol, for instance: https://capnproto.org/
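
As a rough illustration of that idea, the Python sketch below reads fields straight out of a buffer at fixed offsets and stores a "pointer" to variable-length data as an offset into the same buffer. The layout, `encode`, and `RecordView` are made up for this example and are not Cap'n Proto's actual wire format or API.

```python
import struct

# Hypothetical fixed layout (NOT Cap'n Proto's real encoding):
#   offset 0: uint32 record id
#   offset 4: uint32 offset of the name bytes, from the start of the buffer
#   offset 8: uint32 length of the name bytes
HEADER = struct.Struct("<III")


def encode(record_id: int, name: str) -> bytes:
    """Write the fixed-size header, then the variable-size name after it."""
    raw = name.encode("utf-8")
    return HEADER.pack(record_id, HEADER.size, len(raw)) + raw


class RecordView:
    """Reads fields on demand from the buffer; nothing is parsed up front."""

    def __init__(self, buf):
        self._buf = memoryview(buf)

    @property
    def id(self) -> int:
        # Field location is a constant offset known from the schema.
        return struct.unpack_from("<I", self._buf, 0)[0]

    @property
    def name(self) -> memoryview:
        # Follow the stored offset; slicing a memoryview does not copy bytes.
        offset, length = struct.unpack_from("<II", self._buf, 4)
        return self._buf[offset:offset + length]


buf = encode(7, "kenton")
view = RecordView(buf)
print(view.id, bytes(view.name).decode("utf-8"))
```

Only the fields that are actually accessed get decoded, and the name is exposed as a view into the original buffer until the caller chooses to copy it.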
-
OpenTF Renames Itself to OpenTofu
Worked well for Cap'n Proto (the cerealization protocol)! https://capnproto.org/
-
A Critique of the Cap'n Proto Schema Language
With all due respect, you read completely wrong.
* The very first use case for which Cap'n Proto was designed was to be the protocol that Sandstorm.io used to talk between sandbox and supervisor -- an explicitly adversarial security scenario.
* The documentation explicitly calls out how implementations should manage resource exhaustion problems like deep recursion depth (stack overflow risk).
* The implementation has been fuzz-tested multiple ways, including as part of Google's oss-fuzz.
* When there are security bugs, I issue advisories like this:
https://github.com/capnproto/capnproto/tree/v2/security-advi...
* The primary aim of the entire project is to be a Capability-Based Security RPC protocol.
- Cap'n Proto: serialization/RPC system – core tools and C++ library
-
Sandstorm: Open-source platform for self-hosting web app
I like how they use capability-based security [0] and the Cap'n Proto protocol [1]. This is another technology that has been slow to get broad adoption, but has many things going for it when compared to e.g. Protocol Buffers (Cap'n Proto was created by the primary author of Protobuf v2, Kenton Varda).
[0] https://sandstorm.io/how-it-works#capabilities
[1] https://capnproto.org
-
Flatty - flat message buffers with direct mapping to Rust types without packing/unpacking
Related but not Rust-specific: FlatBuffers, Cap'n Proto.
What are some alternatives?
pydantic - Data validation using Python type hints
gRPC - The C based gRPC (C++, Python, Ruby, Objective-C, PHP, C#)
orjson - Fast, correct Python JSON library supporting dataclasses, datetimes, and numpy
Protobuf - Protocol Buffers - Google's data interchange format
fastapi - FastAPI framework, high performance, easy to learn, fast to code, ready for production
FlatBuffers - FlatBuffers: Memory Efficient Serialization Library
mashumaro - Fast and well tested serialization library
ZeroMQ - ZeroMQ core engine in C++, implements ZMTP/3.1
MessagePack - MessagePack serializer implementation for Java / msgpack.org[Java]
Apache Thrift - Apache Thrift
marshmallow - A lightweight library for converting complex objects to and from simple Python datatypes.