lakeFS
kubebuilder
| lakeFS | kubebuilder |
---|---|---|
Mentions | 48 | 45 |
Stars | 4,058 | 7,384 |
Growth | 2.3% | 1.8% |
Activity | 9.8 | 9.2 |
Latest commit | 7 days ago | 7 days ago |
Language | Go | Go |
License | Apache License 2.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
lakeFS
-
A Step-by-Step Guide to Implementing Data Version Control
    # Download the LakeFS binary
    wget https://github.com/treeverse/lakeFS/releases/latest/download/lakefs
    # Make the binary executable
    chmod +x lakefs
    # Initialize LakeFS with S3 as the storage backend
    ./lakefs init --backend s3 --s3-gateway-endpoint --s3-region --s3-force-path-style --s3-access-key --s3-secret-key
-
Jujutsu: A Git-compatible DVCS that is both simple and powerful
Might want to look at purpose built tools for that such as lakeFS (https://github.com/treeverse/lakeFS/)
* Disclaimer: I'm one of the creators/maintainers of the project.
-
Data diffs: Algorithms for explaining what changed in a dataset (2022)
Might want to check out lakeFS: https://github.com/treeverse/lakeFS
(full disclosure: I'm one of the creators)
-
Transactions in Spark / Delta lake?
Take a look at https://github.com/treeverse/lakeFS
- LakeFS – Version Control for Big Data
- DuckDB <3 LakeFS
- We built an open-source project (3.1K stars on GitHub) for data version control
-
How are you incrementally testing your data pipelines as you develop them?
I mean, if you're ready to adopt a new framework into your ecosystem, this is one of the major use cases for LakeFS.
- Git-for-Data
- LakeFS: Git-like versioning for object stores
kubebuilder
-
SpinKube: Orchestrating light, fast and efficient WebAssembly (Wasm) workloads in Kubernetes (k8s)
The Spin operator uses the Kubebuilder framework and contains a Spin App Custom Resource Definition (CRD) and controller. It watches Spin App Custom Resources and realizes the desired state in the K8s cluster. Aside from the immediate benefits gained by running Wasm workloads in k8s, additional optimizations such as Horizontal Pod Autoscaling (HPA) and Kubernetes Event-driven Autoscaling (KEDA) can be achieved in a pinch.
-
Building a Kubernetes Operator with the Operator Framework
kubebuilder: brew install kubebuilder
-
Annotations in Kubernetes Operator Design
The operator that I've been working on is designed to manage the full lifecycle of a QuestDB database instance, including version and hardware upgrades, config changes, backups, and (eventually) recovery from node failure. I used the Operator SDK and kubebuilder frameworks to provide scaffolding and API support.
-
Kubebuilder Tips and Tricks
Recently, I've been spending a lot of time writing a Kubernetes operator using the Go operator-sdk, which is built on top of the Kubebuilder framework. This is a list of a few tips and tricks that I've compiled over the past few months working with these frameworks.
-
We moved our Cloud operations to a Kubernetes Operator
Since we built our operator using the Kubebuilder framework, most standard monitoring tasks were handled for us out of the box. Our operator automatically exposes a rich set of Prometheus metrics that measure reconciliation performance, the number of k8s API calls, workqueue statistics, and memory-related metrics. We were able to ingest these metrics into pre-built dashboards by leveraging the grafana/v1-alpha plugin, which scaffolds two Grafana dashboards to monitor Operator resource usage and performance. All we had to do was add these to our existing Grafana manifests and we were good to go!
-
Has anyone ever tried to learn how k8s works?
I wrote a CSI driver and some operators. I admire K8s because you can find a solution to almost any problem in the source code: API versioning, load balancing, request throttling, optimistic concurrency, security, and much, much more. I recommend https://book.kubebuilder.io/. It is similar to Operator SDK, but without the OpenShift-specific stuff. It gradually introduces you to many k8s concepts, and follows design patterns that k8s uses internally.
- What Is A Kubernetes Operator?
-
If you write a Kubernetes Operator: Events vs Conditions?
Do you mean this: https://book.kubebuilder.io/ ?
-
Kubernetes Operators
https://book.kubebuilder.io/ all you need to know
-
Writing a Kubernetes Operator
A better way to write an operator these days is to use kubebuilder [1].
My complaint is that I have seen orgs write operators for random stuff, often reinventing the wheel. A lot of operators in orgs are the result of resume-driven development. Having said that, an operator often comes in handy for complex orchestration.
[1]https://github.com/kubernetes-sigs/kubebuilder
What are some alternatives?
dvc - 🦉 ML Experiments and Data Management with Git
helm-operator - Successor: https://github.com/fluxcd/helm-controller — The Flux Helm Operator, once upon a time a solution for declarative Helming.
delta - An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
client-go - Go client for Kubernetes.
git-lfs - Git extension for versioning large files
operator-sdk - SDK for building Kubernetes applications. Provides high level APIs, useful abstractions, and project scaffolding.
Ory Kratos - Next-gen identity server replacing your Auth0, Okta, Firebase with hardened security and PassKeys, SMS, OIDC, Social Sign In, MFA, FIDO, TOTP and OTP, WebAuthn, passwordless and much more. Golang, headless, API-first. Available as a worry-free SaaS with the fairest pricing on the market!
crossplane - The Cloud Native Control Plane
MLflow - Open source platform for the machine learning lifecycle
kubegres - Kubegres is a Kubernetes operator that lets you deploy one or many clusters of PostgreSQL instances and manage database replication, failover and backup.
duf - Disk Usage/Free Utility - a better 'df' alternative
python - Official Python client library for kubernetes