delta-rs
sops
Metric | delta-rs | sops
---|---|---
Mentions | 28 | 150
Stars | 1,820 | 15,114
Growth | 6.1% | 2.4%
Activity | 9.7 | 9.0
Latest commit | 1 day ago | about 22 hours ago
Language | Rust | Go
License | Apache License 2.0 | Mozilla Public License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
delta-rs
- Delta-rs – a Rust-based implementation of deltalake
-
Delta Lake vs. Parquet: A Comparison
I work at Databricks, but am pretty much just an OSS nerd, mainly focusing on Delta Rust recently: https://github.com/delta-io/delta-rs
I did some keyword research and wrote this post because lots of folks are searching for Delta Lake vs Parquet. I'm just trying to share a fair summary of the tradeoffs with folks who are doing this search. It's a popular post, and that's why I figured I would share it here.
-
Working with Rust
Seeing a lot of great libraries coming out with Python bindings in the data world, e.g. delta-rs and Polars. I see it growing in this space as a C++ alternative.
-
Ideas/Suggestions around setting up a data pipeline from scratch
If I’m not misunderstanding, you could both decode the gRPC protobuf AND write to delta lake in Rust. Tonic, Delta-rs.
-
Delta-rs with upserts
https://github.com/delta-io/delta-rs/issues/850 … looks like it’s on the roadmap!
-
Read and filter delta files on Azure from a .net application
During the Build conference, Microsoft talked a lot about OneLake and said that the Delta file format will be the standard. Is it only me who finds it strange that their marketing team talks so much about the Delta format when they do not even provide a library to work with it from .NET? It would be easy for them to maintain bindings to https://github.com/delta-io/delta-rs and also provide a reader that supports V-Order: https://learn.microsoft.com/en-us/fabric/data-engineering/delta-optimization-and-v-order?tabs=sparksql
-
Polars query engine 0.29.0 released
I know someone will be adding this on the python side in the coming weeks. On the rust side you can use delta-rs with polars. Though you would be compiling both arrow2 and arrow-rs, so that's quite heavy.
-
Delta Lake without Databricks?
You don’t need DBX to use Delta Lake. You can use S3 as the backend and just use the Python Delta Lake library. It works great! https://github.com/delta-io/delta-rs
-
Seeking Recommendations for a Master Data Management Tool
Maybe if I get some free time soon I can formalize this into a working example. I've been wanting an excuse to try a similar concept in delta-rs and polars/duckdb vs databricks/spark vs iceberg/polars.
-
Opportunity to contribute to a popular Rust data project (delta-rs)
delta-rs is a native Rust library for Delta Lake. It's a better way to store data than Parquet files and is a fundamentally important library for the Rust data ecosystem. It's tightly integrated with Polars and DataFusion, and there is a lot of interesting Rust work to be done.
sops
-
Pico.sh – Hacker Labs
My script just sets up a default .sops.yaml for https://github.com/getsops/sops
You can further edit .sops.yaml (e.g. have multiple of them) and decide how you split secrets in your directory tree to further customize who can decrypt the secrets.
It works pretty well for prod/dev splits, etc
-
Encrypting your secrets with Mozilla SOPS using two AWS KMS Keys
Mozilla SOPS (Secrets OPerationS) is an open-source command-line tool for managing and storing secrets. It uses secure encryption methods to encrypt secrets at rest and decrypt them at runtime. SOPS supports a variety of key management systems, including AWS KMS, GCP KMS, Azure Key Vault, and PGP. It's particularly useful in a DevOps context where sensitive data like API keys, passwords, or certificates need to be securely managed and seamlessly integrated into application workflows.
-
An opinionated template for deploying a single k3s cluster with Ansible backed by Flux, SOPS, GitHub Actions, Renovate, Cilium, Cloudflare and more!
Encrypted secrets thanks to SOPS and Age
-
Tracking SQLite Database Changes in Git
We do the exact same thing to keep track of some credentials: we use sops[1] and AWS KMS to separate credentials by sensitivity, then use the git differ to view the diffs between the encrypted secrets.
Definitely not best practice security-wise, but it works well.
[1] https://github.com/getsops/sops
-
The Twelve-Factor App
For anyone new to SOPS like I was - https://github.com/getsops/sops
- Storing and managing private keys
-
Show HN: Shello – Wrangle Environment Variables
I've found this is largely solved by strictly separating plain config and secrets, and then having secrets pull from GCP secret manager / vault / whatever.
You can then commit all the config (including the secret identifiers) and it all just works so long as you're authenticated with your secret storage system.
We do this for the live configuration as well in line with Gitops and find it to work well.
If you don't want to use a cloud secret manager you can also use something like https://github.com/getsops/sops to commit the encrypted secrets safely
-
Check your secrets into Git [video]
Basically, the simpler the better: just encrypt your secrets and check them in to version control.
We use SOPS[0] for this, and have found it to be pretty nice.
[0]: https://github.com/getsops/sops
-
How to secure secrets of docker-compose stacks with git?
The answer is that secrets shouldn't be stored in the git repo at all, but somewhere safe like a password manager or Mozilla's SOPS which people seem to love.
-
Is it safe to commit a Terraform file to GitHub?
Unfortunately, the SOPS project is in some sort of limbo state, and there has been quite a long period of limited maintenance and an unclear position from Mozilla. Despite the project being accepted into the CNCF, it's still unclear what will happen with it going forward.
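One of the posts above describes using .sops.yaml to split secrets across a directory tree, for example for prod/dev splits. The sketch below models the idea in plain Python: rules are checked in order and the first rule whose path regex matches decides which keys apply. It is an illustration only, not sops' own code. The `recipients` field, the placeholder key names, and the `pick_rule` helper are hypothetical; a real .sops.yaml lists its rules under `creation_rules` and names keys per backend (age, KMS, PGP and so on).

```python
import re
from typing import Optional

# Hypothetical rules, ordered from most to least specific.
RULES = [
    {"path_regex": r"secrets/prod/.*\.yaml$", "recipients": ["PROD_KEY_PLACEHOLDER"]},
    {"path_regex": r"secrets/dev/.*\.yaml$", "recipients": ["DEV_KEY_PLACEHOLDER"]},
    # A rule without a path regex acts as a catch-all.
    {"recipients": ["DEFAULT_KEY_PLACEHOLDER"]},
]


def pick_rule(path: str, rules: list) -> Optional[dict]:
    """Return the first rule that applies to the given file path."""
    for rule in rules:
        pattern = rule.get("path_regex")
        if pattern is None or re.search(pattern, path):
            return rule
    return None


for path in ("secrets/prod/db.yaml", "secrets/dev/db.yaml", "notes/misc.yaml"):
    print(path, pick_rule(path, RULES)["recipients"])
```

Because the first match wins, the catch-all rule has to come last; otherwise it would shadow the prod and dev rules.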
What are some alternatives?
delta - An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
sealed-secrets - A Kubernetes controller and tool for one-way encrypted Secrets
roapi - Create full-fledged APIs for slowly moving datasets without writing a single line of code.
Vault - A tool for secrets management, encryption as a service, and privileged access management
materialize - The data warehouse for operational workloads.
age - A simple, modern and secure encryption tool (and Go library) with small explicit keys, no config options, and UNIX-style composability.
ballista - Distributed compute platform implemented in Rust, and powered by Apache Arrow.
git-crypt - Transparent file encryption in git
kafka-delta-ingest - A highly efficient daemon for streaming data from Kafka into Delta Lake
terraform-provider-sops - A Terraform provider for reading Mozilla sops files
delta-oss
vault-secrets-operator - Create Kubernetes secrets from Vault for a secure GitOps based workflow.