enhancements
spark-operator
enhancements | spark-operator | |
---|---|---|
63 | 9 | |
3,457 | 2,809 | |
0.7% | 1.2% | |
9.8 | 9.3 | |
8 days ago | 8 days ago | |
Go | Go | |
Apache License 2.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
enhancements
-
A skeptic's first contact with Kubernetes
The motivation is more the latter, but it's not at all clear the proposed removal of the embedded kustomize will proceed, given the compatibility implications. See discussion at https://github.com/kubernetes/enhancements/issues/4706#issue... and following.
-
Debugging Distroless Images with kubectl and cdebug
(I do see there are some proposed enhancements related to profiles that might help here)
-
Design Docs at Google
Thanks for these links!
I picked out one at random just to check if my skeptical reaction is fair: https://github.com/kubernetes/enhancements/tree/master/keps/...
- OK, this is actually a really good and useful doc!
- However, it's not an up-front design doc, it has clearly been written after the bulk of the work has been done, to explain and justify rolling out a big change. (See the "implementation history" timeline: https://github.com/kubernetes/enhancements/tree/master/keps/...)
- It looks like the template wasn't very useful; most of the required sections are marked "N/A", and there are comments like The best test for work like this is, more or less, "did it work?"
-
IBM to buy HashiCorp in $6.4B deal
> was always told early on that although they supported vault on kubernetes via a helm chart, they did not recommend using it on anything but EC2 instances (because of "security" which never really made sense their reasoning).
The reasoning is basically that there are some security and isolation guarantees you don't get in Kubernetes that you do get on bare metal or (to a somewhat lesser extent) in VMs.
In particular for Kubernetes, Vault wants to run as a non-root user and set the IPC_LOCK capability when it starts to prevent its memory from being swapped to disk. While in Docker you can directly enable this by adding capabilities when you launch the container, Kubernetes has an issue because of the way it handles non-root container users specified in a pod manifest, detailed in a (long-dormant) KEP: https://github.com/kubernetes/enhancements/blob/master/keps/... (tl;dr: Kubernetes runs the container process as root, with the specified capabilities added, but then switches it to the non-root UID, which causes the explicitly-added capabilities to be dropped).
You can work around this by rebuilding the container and setting the capability directly on the binary, but the upstream build of the binary and the one in the container image don't come with that set (because the user should set it at runtime if running the container image directly, and the systemd unit sets it via systemd if running as a systemd service, so there's no need to do that except for working around Kubernetes' ambient-capability issue).
> It always surprised me how these conversations went. "Well we don't really recommend kubernetes so we won't support (feature)."
-
Exploring cgroups v2 and MemoryQoS With EKS and Bottlerocket
0 is not the request we've defined. And that makes sense. Memory QoS has been in alpha since Kubernetes 1.22 (August 2021) and according to the KEP data was still in alpha as of 1.27.
-
Jenkins Agents On Kubernetes
Note: There's actually a Structured Authentication Config established via KEP-3331. It's in v1.28 as a feature flag gated option and removes the limitation of only having one OIDC provider. I may look into doing an article on it, but for now I'll deal with the issue in a manner that should work even with a bit older versions versions of Kubernetes.
-
Isint release cycle becoming a bit crazy with monthly releases and deprecations ?
Kubernetes supports a skew policy of n+2 between API server and kubelet. This means if your CP and DP are both on 1.20, you could upgrade your control plane twice (1.20 -> 1.21 -> 1.22) before you need to upgrade your data plane. And when it comes time to upgrade your data plane you can jump from 1.20 to 1.22 to minimize update churn. In the future, this skew will be opened to n+3 https://github.com/kubernetes/enhancements/tree/master/keps/sig-architecture/3935-oldest-node-newest-control-plane
-
Kubernetes SidecarContainers feature is merged
The KEP (Kubernetes Enhancement Proposal) is linked to in the PR [1]. From the summary:
> Sidecar containers are a new type of containers that start among the Init containers, run through the lifecycle of the Pod and don’t block pod termination. Kubelet makes a best effort to keep them alive and running while other containers are running.
[1] https://github.com/kubernetes/enhancements/tree/master/keps/...
-
What's there in K8s 1.27
This is where the new feature of mutable scheduling directives for jobs comes into play. This feature enables the updating of a job's scheduling directives before it begins. Essentially, it allows custom queue controllers to influence pod placement without needing to directly handle the assignment of pods to nodes themselves. To learn more about this check out the Kubernetes Enhancement Proposal 2926.
-
Dependencies between Services
What your asking is a (vanilla) Kubernetes non-goal, others have mentioned fluxcd and other add ons that provide primitives for dependency aware deployments. The problem space is so large, that it's unreasonable to to address these concerns in Kubernetes itself, instead, make it extensible... Look at this KEP for example: https://github.com/kubernetes/enhancements/issues/753 Sidecar containers have existed, and been named as such since WAY before that KEP's inception, defining what these things should and shouldn't do is largely arbitrary. Aka: your use-case is niche, if you don't like the behavior, use flux or argo, or write something yourself.
spark-operator
-
Data on Kubernetes: Part 4 - Argo Workflows: Simplify parallel jobs : Container-native workflow engine for Kubernetes 🔮
In this section, we'll dive into creating and deploying a data processing platform on Amazon Elastic Kubernetes Service Amazon EKS. The solution includes essential Kubernetes add-ons: Argo Workflows, Argo Events, Spark Operator for managing Spark jobs, Fluent Bit for logging, and Prometheus for metrics.
-
Dependency issue with Pyspark running on Kubernetes using spark-on-k8s-operator
I have spent days now trying to figure out a dependency issue I'm experiencing with (Py)Spark running on Kubernetes. I'm using the spark-on-k8s-operator and Spark's Google Cloud connector.
-
Experience setting up Spark and Hudi on Kubernetes
We're using https://github.com/bitnami/charts/tree/main/bitnami/spark, but I have heard good things about https://github.com/GoogleCloudPlatform/spark-on-k8s-operator as well. Hudi should not need any long running deployments as per the docs https://hudi.apache.org/docs/0.5.1/deployment/#deploying
-
[Spark-k8s] — Getting started # Part 1
The SparkOperator must be installed before we can use Spark on Kubernetes. Google created this operator, which is available on Github. In a nutshell, the operator is in charge of monitoring the cluster for specific events related to the spark job, as known as kind: SparkApplication
-
So many sparks on K8s... Could anyone give me a few explanations please ?
Spark-Operator (as I understand, Spark-Operator integrates well with K8s while vanilla Spark integration with K8s seems extremely complex to maintain)
-
"Running Apache Spark on EKS Fargate"
Spark on K8s Operator is a project from Google that allows submitting spark applications on Kubernetes cluster using CustomResource Definition SparkApplication. It uses mutating admission webhook to modify the pod spec and add the features not officially supported by spark-submit.
-
My Journey With Spark On Kubernetes... In Python (1/3)
In this section, you use Helm to deploy the Kubernetes Operator for Apache Spark from the incubator Chart repository. Helm is a package manager you can use to configure and deploy Kubernetes apps.
-
My Journey With Spark On Kubernetes... In Python (2/3)
Additional details of how SparkApplications are run can be found in the design documentation.
-
Gopher Gold #14 - Wed Oct 07 2020
GoogleCloudPlatform/spark-on-k8s-operator (Go): Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
What are some alternatives?
kubeconform - A FAST Kubernetes manifests validator, with support for Custom Resources!
volcano - A Cloud Native Batch System (Project under CNCF)
kubernetes-json-schema - Schemas for every version of every object in every version of Kubernetes
trojan-go - Go实现的Trojan代理,支持多路复用/路由功能/CDN中转/Shadowsocks混淆插件,多平台,无依赖。A Trojan proxy written in Go. An unidentifiable mechanism that helps you bypass GFW. https://p4gefau1t.github.io/trojan-go/
klipper-lb - Embedded service load balancer in Klipper
helm-operator - Successor: https://github.com/fluxcd/helm-controller — The Flux Helm Operator, once upon a time a solution for declarative Helming.
pixie - Instant Kubernetes-Native Application Observability
kubebuilder - Kubebuilder - SDK for building Kubernetes APIs using CRDs
connaisseur - An admission controller that integrates Container Image Signature Verification into a Kubernetes cluster
helm - The Kubernetes Package Manager [Moved to: https://github.com/helm/helm]
conftest - Write tests against structured configuration data using the Open Policy Agent Rego query language
flink-on-k8s-operator - Kubernetes operator for managing the lifecycle of Apache Flink and Beam applications.