k8s-openapi vs datafusion-ballista

| | k8s-openapi | datafusion-ballista |
|---|---|---|
| Mentions | 7 | 12 |
| Stars | 367 | 1,308 |
| Growth | - | 6.0% |
| Activity | 8.3 | 8.2 |
| Latest commit | 6 days ago | 8 days ago |
| Language | Rust | Rust |
| License | Apache License 2.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
k8s-openapi
-
WinBtrfs – an open-source btrfs driver for Windows
It's called sans-io in Python land, which is where I heard it first.
https://sans-io.readthedocs.io/
I did it for one of my projects back in 2018 https://github.com/Arnavion/k8s-openapi/commit/9a4fbb718b119...
-
The bane of my existence: Supporting both async and sync code in Rust
Another option is to implement your API in a sans-io form. Since k8s-openapi was mentioned (albeit for a different reason), I'll point out that its API gave you a request value that you could send using whatever sync or async HTTP client you wanted. It also gave you a corresponding function to parse the response, which you would call with the response bytes however you got them from your client.
https://github.com/Arnavion/k8s-openapi/blob/v0.19.0/README....
(Past tense because I removed all the API features from k8s-openapi after that release, for unrelated reasons.)
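The sans-io shape described above can be sketched roughly as follows. This is illustrative only, not the actual (since-removed) k8s-openapi API: the library constructs a plain request description and parses raw response bytes, and the caller performs the I/O with any sync or async HTTP client in between.

```rust
/// An HTTP request description with no I/O attached.
pub struct Request {
    pub method: &'static str,
    pub path: String,
    pub body: Vec<u8>,
}

/// Step 1: the library builds the request value.
pub fn read_pod(namespace: &str, name: &str) -> Request {
    Request {
        method: "GET",
        path: format!("/api/v1/namespaces/{namespace}/pods/{name}"),
        body: Vec::new(),
    }
}

/// Step 2: the library parses whatever bytes the caller's client received.
/// (A real implementation would deserialize into a typed Pod value.)
pub fn parse_read_pod_response(status: u16, body: &[u8]) -> Result<String, String> {
    if status == 200 {
        String::from_utf8(body.to_vec()).map_err(|e| e.to_string())
    } else {
        Err(format!("HTTP {status}"))
    }
}
```

Because neither function performs I/O, the same pair works unchanged under `reqwest::blocking`, async `hyper`, or anything else that can send bytes and hand the response back.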
-
Welcome to Comprehensive Rust
Macro expansion is slow, but only noticeably in the specific situation of a) third-party proc macros, b) a debug build, and c) a few thousand invocations of said proc macros. This is because debug builds compile proc macros in debug mode too, so while the macro itself compiles quickly (because it's a debug build), it ends up running slowly (because it's a debug build).
I know this from observing it on a mostly auto-generated crate that had a couple of thousand types, each with a serde `#[derive]` on it. [1]
This doesn't affect most users: first-party macros like `#[derive(Debug)]` are part of rustc and thus optimized regardless of the profile, and even with third-party macros it is unlikely that they have thousands of invocations. Even if it *is* a problem, users can opt in to compiling just the proc macros in release mode. [2]
[1]: https://github.com/Arnavion/k8s-openapi/issues/4
[2]: https://github.com/rust-lang/cargo/issues/5622
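The release-mode opt-in mentioned above is a Cargo profile override. A minimal sketch of what that looks like in `Cargo.toml` (the `opt-level` value is just an example):

```toml
# Compile build dependencies and proc macros with optimizations,
# even when the rest of the workspace is built in debug mode.
[profile.dev.build-override]
opt-level = 3
```

This trades a one-time slower compile of the proc-macro crate for much faster macro *execution* across thousands of invocations.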
-
OpenAPI Generator allows generation of API client libraries from OpenAPI Specs
>OpenAPI Generator allows generation of API client libraries from OpenAPI Specs
It does, but the generated code can be very shitty for some combinations of spec and output language. I maintain Rust bindings for the Kubernetes API server's API, and I chose to write my own code generator instead. The README at https://github.com/Arnavion/k8s-openapi has more details.
-
Any good toy Rust project for k8s application?
k8s_openapi - https://github.com/Arnavion/k8s-openapi
-
Approaches for Chaining Access to Deeply Nested Optional Structs
For example: I have a routine that checks the value of (from k8s-openapi): Ingress -> IngressStatus -> LoadBalancerStatus -> Vec[0] -> String
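That chain can be written with the `?` operator on `Option`. The structs below are simplified stand-ins mirroring the shape described (the real k8s-openapi types have more fields); only the nesting matters here:

```rust
// Simplified stand-ins for the k8s-openapi types named above.
struct Ingress { status: Option<IngressStatus> }
struct IngressStatus { load_balancer: Option<LoadBalancerStatus> }
struct LoadBalancerStatus { ingress: Option<Vec<LoadBalancerIngress>> }
struct LoadBalancerIngress { hostname: Option<String> }

/// Walk Ingress -> IngressStatus -> LoadBalancerStatus -> Vec[0] -> String,
/// returning None if any link in the chain is absent.
fn first_hostname(ing: &Ingress) -> Option<&str> {
    ing.status
        .as_ref()?
        .load_balancer
        .as_ref()?
        .ingress
        .as_ref()?
        .first()?
        .hostname
        .as_deref()
}
```

Each `as_ref()?` borrows the inner value or short-circuits to `None`, so the routine needs no nested `match` or `if let` ladder.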
-
Writing a Kubernetes CRD Controller in Rust
As the maintainer of the Rust bindings that the library used in the article (kube) is backed by, I can confirm that Kubernetes' openapi spec requires a lot of Kubernetes-specific handling to generate a good client, handling that generic openapi generators do not provide.
See https://github.com/Arnavion/k8s-openapi/blob/master/README.m... for a full description.
I can also confirm that I keep it up-to-date with Kubernetes releases and have been doing so for the ~3 years that it's been around. Not just the minor releases every few months, but even the point releases; these days the latter usually only involve updating the test cases rather than code changes, and they're done within a few hours of the upstream release.
datafusion-ballista
-
Polars
Not super on topic because this is all immature and not integrated with one another yet, but there is a scaled-out rust data-frames-on-arrow implementation called ballista that could maybe? form the backend of a polars scale out approach: https://github.com/apache/arrow-ballista
-
Rust vs. Go in 2023
> Is Rust's compile-time GC about something other than performance somehow?
AFAIK, memory safety and language features such as RAII are also available in C++, for instance. On the reasons for slow compilation, take a look at https://www.reddit.com/r/rust/comments/xna9mb/why_are_rust_p...
Not having a GC is also about not having a runtime, as you mention (e.g. nice for creating Python extensions and for embedded systems programming), and about more deterministic runtime performance. On that point, if I'm not mistaken that was the reason Discourse switched to Rust, and also, e.g.: "the choice of Rust as the main execution language avoids the overhead of GC pauses and results in deterministic processing times" https://github.com/apache/arrow-ballista/blob/main/README.md
-
Ballista (Rust) vs Apache Spark. A Tale of Woe.
-
Evolution and Trends of Data Engineering 2022/23
Ballista (Arrow-Rust) is largely inspired by Apache Spark, but there are some interesting differences.
-
Data Engineering with Rust
https://github.com/jorgecarleitao/arrow2
https://github.com/apache/arrow-datafusion
https://github.com/apache/arrow-ballista
https://github.com/pola-rs/polars
https://github.com/duckdb/duckdb
-
Any job processing framework like Spark but in Rust?
-
Is Apache Arrow DataFusion and Ballista the future of big data engineering/science?
Source: https://github.com/apache/arrow-ballista
-
Pure Python Distributed SQL Engine
Can you explain how this might differ from something like https://github.com/apache/arrow-ballista
I've seen several variants of "next-gen" spark, but nowhere have I really seen the different tradeoffs/advantages/disadvantages between them.
-
Scala or Rust? Which one will rule in future?
-
Welcome to Comprehensive Rust
Rust has amazing integration with Python through PyO3 [1], so see it as a safe alternative for high-performance calculations. The ecosystem itself is starting to come together, with exciting projects like Polars [2] (a Pandas alternative), nalgebra [3], Datafusion [4], and Ballista [5].
[1] https://github.com/PyO3/pyo3
[2] https://github.com/pola-rs/polars/
[3] https://docs.rs/nalgebra/latest/nalgebra/
[4] https://github.com/apache/arrow-datafusion
[5] https://github.com/apache/arrow-ballista
What are some alternatives?
kube - Rust Kubernetes client and controller runtime
duckdb - DuckDB is an in-process SQL OLAP Database Management System
fusionauth-openapi - FusionAuth OpenAPI client
lance - Modern columnar data format for ML and LLMs implemented in Rust. Convert from parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, and Pyarrow, with more integrations coming.
go - The Go programming language
seafowl - Analytical database for data-driven Web applications 🪶
spectrum - OpenAPI Spec SDK and Converter for OpenAPI 3.0 and 2.0 Specs to Postman 2.0 Collections. Example RingCentral spec included.
connector-x - Fastest library to load data from DB to DataFrames in Rust and Python
smithy - Smithy is a protocol-agnostic interface definition language and set of tools for generating clients, servers, and documentation for any programming language.
opteryx - 🦖 A SQL-on-everything Query Engine you can execute over multiple databases and file formats. Query your data, where it lives.
tokio - A runtime for writing reliable asynchronous applications with Rust. Provides I/O, networking, scheduling, timers, ...
sqlglot - Python SQL Parser and Transpiler