enhancements vs spark-operator

enhancements

Enhancements tracking repo for Kubernetes (by kubernetes)

spark-operator

Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes. (by kubeflow)

Kubernetes kubernetes-operator apache-spark kubernetes-crd kubernetes-controller Spark google-cloud-dataproc

enhancements	Project	spark-operator
58	Mentions	8
3,257	Stars	2,609
1.6%	Growth	2.3%
9.7	Activity	8.2
1 day ago	Latest Commit	about 11 hours ago
Go	Language	Go
Apache License 2.0	License	Apache License 2.0

Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

enhancements

Posts with mentions or reviews of enhancements. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-02-19.

IBM to buy HashiCorp in $6.4B deal
1 project | news.ycombinator.com | 25 Apr 2024

> was always told early on that although they supported vault on kubernetes via a helm chart, they did not recommend using it on anything but EC2 instances (because of "security", though their reasoning never really made sense).
The reasoning is basically that there are some security and isolation guarantees you don't get in Kubernetes that you do get on bare metal or (to a somewhat lesser extent) in VMs.
In particular for Kubernetes, Vault wants to run as a non-root user and set the IPC_LOCK capability when it starts to prevent its memory from being swapped to disk. While in Docker you can directly enable this by adding capabilities when you launch the container, Kubernetes has an issue because of the way it handles non-root container users specified in a pod manifest, detailed in a (long-dormant) KEP: https://github.com/kubernetes/enhancements/blob/master/keps/... (tl;dr: Kubernetes runs the container process as root, with the specified capabilities added, but then switches it to the non-root UID, which causes the explicitly-added capabilities to be dropped).
You can work around this by rebuilding the container and setting the capability directly on the binary, but the upstream build of the binary and the one in the container image don't come with that set (because the user should set it at runtime if running the container image directly, and the systemd unit sets it via systemd if running as a systemd service, so there's no need to do that except for working around Kubernetes' ambient-capability issue).
> It always surprised me how these conversations went. "Well we don't really recommend kubernetes so we won't support (feature)."
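To make the capability discussion in this comment concrete, here is a minimal sketch of the kind of Pod manifest it describes: a container that runs as a non-root UID and asks for IPC_LOCK to be added. The pod name, image, and UID are hypothetical placeholders, and the manifest is built as a plain Python dict rather than with any Kubernetes client library. According to the comment, the added capability would not end up effective for the non-root process, which is why the workaround of setting it on the binary exists.

```python
import json


def vault_like_pod(name: str, image: str, uid: int = 100) -> dict:
    """Build a Pod manifest that runs as a non-root UID and requests IPC_LOCK.

    The name, image, and uid arguments are illustrative assumptions,
    not values taken from the Vault Helm chart.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                {
                    "name": name,
                    "image": image,
                    "securityContext": {
                        # Run the container process as a non-root user.
                        "runAsNonRoot": True,
                        "runAsUser": uid,
                        # Ask the runtime to add IPC_LOCK so the process
                        # can lock its memory and keep it out of swap.
                        "capabilities": {"add": ["IPC_LOCK"]},
                    },
                }
            ]
        },
    }


pod = vault_like_pod("vault-example", "example.registry/vault:latest")
print(json.dumps(pod, indent=2))
```

The printed JSON can be saved to a file and inspected or applied with kubectl, which accepts JSON as well as YAML manifests.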
Exploring cgroups v2 and MemoryQoS With EKS and Bottlerocket
7 projects | dev.to | 19 Feb 2024

0 is not the request we've defined, and that makes sense: Memory QoS has been in alpha since Kubernetes 1.22 (August 2021) and, according to the KEP data, was still in alpha as of 1.27.
Jenkins Agents On Kubernetes
7 projects | dev.to | 4 Sep 2023

Note: There's actually a Structured Authentication Config established via KEP-3331. It's in v1.28 as a feature-gated option and removes the limitation of only having one OIDC provider. I may look into doing an article on it, but for now I'll deal with the issue in a manner that should work even with somewhat older versions of Kubernetes.
Isn't the release cycle becoming a bit crazy with monthly releases and deprecations?
2 projects | /r/kubernetes | 11 Jul 2023

Kubernetes supports a skew policy of n+2 between the API server and the kubelet. This means that if your control plane (CP) and data plane (DP) are both on 1.20, you can upgrade your control plane twice (1.20 -> 1.21 -> 1.22) before you need to upgrade your data plane. And when it comes time to upgrade your data plane, you can jump from 1.20 to 1.22 to minimize update churn. In the future, this skew will be widened to n+3: https://github.com/kubernetes/enhancements/tree/master/keps/sig-architecture/3935-oldest-node-newest-control-plane
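The skew rule in this comment can be sketched as a small check. This is an illustration only: the function names and the `max_skew` parameter are hypothetical, and the sketch assumes the kubelet may lag the API server by up to `max_skew` minor versions but may never be newer than it.

```python
def parse_minor(version: str) -> tuple[int, int]:
    """Split a version such as 'v1.22.3' or '1.22' into (major, minor)."""
    major, minor = version.lstrip("v").split(".")[:2]
    return int(major), int(minor)


def kubelet_within_skew(api_server: str, kubelet: str, max_skew: int = 2) -> bool:
    """Return True if the kubelet version is allowed next to the API server.

    Assumption: the kubelet may be at most max_skew minor versions older
    than the API server, and never newer.
    """
    api_major, api_minor = parse_minor(api_server)
    kubelet_major, kubelet_minor = parse_minor(kubelet)
    if api_major != kubelet_major:
        return False
    return 0 <= api_minor - kubelet_minor <= max_skew


# Control plane upgraded twice while nodes stay on 1.20: still allowed.
print(kubelet_within_skew("1.22", "1.20"))
# A third control plane upgrade exceeds a skew of two.
print(kubelet_within_skew("1.23", "1.20"))
# With the wider skew proposed in the linked KEP, the same pair passes.
print(kubelet_within_skew("1.23", "1.20", max_skew=3))
```

Keeping `max_skew` as a parameter makes it easy to compare the policy described in the comment with the wider one proposed in the KEP.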
Kubernetes SidecarContainers feature is merged
7 projects | news.ycombinator.com | 10 Jul 2023

The KEP (Kubernetes Enhancement Proposal) is linked to in the PR [1]. From the summary:
> Sidecar containers are a new type of containers that start among the Init containers, run through the lifecycle of the Pod and don’t block pod termination. Kubelet makes a best effort to keep them alive and running while other containers are running.
[1] https://github.com/kubernetes/enhancements/tree/master/keps/...
What's there in K8s 1.27
1 project | dev.to | 4 Jun 2023

This is where the new feature of mutable scheduling directives for jobs comes into play. This feature enables the updating of a job's scheduling directives before it begins. Essentially, it allows custom queue controllers to influence pod placement without needing to directly handle the assignment of pods to nodes themselves. To learn more about this check out the Kubernetes Enhancement Proposal 2926.
Dependencies between Services
1 project | /r/kubernetes | 6 Apr 2023

What you're asking is a (vanilla) Kubernetes non-goal; others have mentioned fluxcd and other add-ons that provide primitives for dependency-aware deployments. The problem space is so large that it's unreasonable to address these concerns in Kubernetes itself; instead, make it extensible... Look at this KEP for example: https://github.com/kubernetes/enhancements/issues/753 Sidecar containers have existed, and been named as such, since WAY before that KEP's inception; defining what these things should and shouldn't do is largely arbitrary. Aka: your use-case is niche; if you don't like the behavior, use flux or argo, or write something yourself.
When you learn the Sidecar Container KEP got dropped from the Kubernetes release. Again.
2 projects | /r/kubernetes | 6 Apr 2023
Kubernetes 1.27 will be out next week! - Learn what's new and what's deprecated - Group volume snapshots - Pod resource updates - kubectl subcommands … And more!
2 projects | /r/kubernetes | 4 Apr 2023

If you're interested in more detail, I recommend checking out the KEP. I love how they document the decision making, and all these edge cases :).
How can I force assign an IP to my Load Balancer ingress in “status.loadBalancer”?
1 project | /r/kubernetes | 4 Apr 2023

See https://kubernetes.io/docs/reference/kubectl/conventions/#subresources and https://github.com/kubernetes/enhancements/issues/2590

spark-operator

Posts with mentions or reviews of spark-operator. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-04-15.

Dependency issue with Pyspark running on Kubernetes using spark-on-k8s-operator
1 project | /r/codehunter | 31 May 2023

I have spent days now trying to figure out a dependency issue I'm experiencing with (Py)Spark running on Kubernetes. I'm using the spark-on-k8s-operator and Spark's Google Cloud connector.
Experience setting up Spark and Hudi on Kubernetes
2 projects | /r/dataengineering | 15 Apr 2023

We're using https://github.com/bitnami/charts/tree/main/bitnami/spark, but I have heard good things about https://github.com/GoogleCloudPlatform/spark-on-k8s-operator as well. Hudi should not need any long running deployments as per the docs https://hudi.apache.org/docs/0.5.1/deployment/#deploying
[Spark-k8s] — Getting started # Part 1
3 projects | dev.to | 19 Jul 2022

The SparkOperator must be installed before we can use Spark on Kubernetes. Google created this operator, which is available on GitHub. In a nutshell, the operator is in charge of monitoring the cluster for specific events related to the Spark job, also known as kind: SparkApplication
So many sparks on K8s... Could anyone give me a few explanations please?
1 project | /r/apachespark | 27 Oct 2021

Spark-Operator (as I understand, Spark-Operator integrates well with K8s while vanilla Spark integration with K8s seems extremely complex to maintain)
"Running Apache Spark on EKS Fargate"
1 project | dev.to | 14 Aug 2021

Spark on K8s Operator is a project from Google that allows submitting Spark applications to a Kubernetes cluster using the SparkApplication CustomResourceDefinition. It uses a mutating admission webhook to modify the pod spec and add features not officially supported by spark-submit.
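As a rough illustration of what submitting through the custom resource looks like, the sketch below builds a minimal SparkApplication manifest as a plain Python dict and prints it as JSON. The application name, image, file path, Spark version, and resource sizes are hypothetical placeholders; the field layout assumes the operator's `sparkoperator.k8s.io/v1beta2` API version, so check it against the version you have installed.

```python
import json


def spark_application(name: str, image: str, main_file: str, executors: int = 2) -> dict:
    """Build a minimal SparkApplication manifest for a PySpark job.

    All concrete values passed in or hard-coded here are illustrative
    assumptions, not defaults of the operator.
    """
    return {
        "apiVersion": "sparkoperator.k8s.io/v1beta2",
        "kind": "SparkApplication",
        "metadata": {"name": name, "namespace": "default"},
        "spec": {
            "type": "Python",
            "mode": "cluster",
            "image": image,
            "mainApplicationFile": main_file,
            "sparkVersion": "3.5.0",
            # Driver and executor pods are created by the operator
            # from these sections.
            "driver": {"cores": 1, "memory": "512m", "serviceAccount": "spark"},
            "executor": {"instances": executors, "cores": 1, "memory": "512m"},
        },
    }


manifest = spark_application(
    "pi-example",
    "example.registry/spark-py:latest",
    "local:///opt/spark/examples/src/main/python/pi.py",
)
print(json.dumps(manifest, indent=2))
```

Once the operator and its CRDs are installed, a manifest like this would be applied with kubectl in place of calling spark-submit directly.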
My Journey With Spark On Kubernetes... In Python (1/3)
4 projects | dev.to | 12 Apr 2021

In this section, you use Helm to deploy the Kubernetes Operator for Apache Spark from the incubator Chart repository. Helm is a package manager you can use to configure and deploy Kubernetes apps.
My Journey With Spark On Kubernetes... In Python (2/3)
2 projects | dev.to | 12 Apr 2021

Additional details of how SparkApplications are run can be found in the design documentation.
Gopher Gold #14 - Wed Oct 07 2020
22 projects | dev.to | 7 Oct 2020

GoogleCloudPlatform/spark-on-k8s-operator (Go): Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.

What are some alternatives?

When comparing enhancements and spark-operator you can also consider the following projects:

kubeconform - A FAST Kubernetes manifests validator, with support for Custom Resources!

volcano - A Cloud Native Batch System (Project under CNCF)

kubernetes-json-schema - Schemas for every version of every object in every version of Kubernetes

trojan-go - A Trojan proxy written in Go, supporting multiplexing, routing, CDN relaying, and the Shadowsocks obfuscation plugin; multi-platform, with no dependencies. An unidentifiable mechanism that helps you bypass GFW. https://p4gefau1t.github.io/trojan-go/

klipper-lb - Embedded service load balancer in Klipper

helm-operator - Successor: https://github.com/fluxcd/helm-controller — The Flux Helm Operator, once upon a time a solution for declarative Helming.

Hey - HTTP load generator, ApacheBench (ab) replacement

kubebuilder - Kubebuilder - SDK for building Kubernetes APIs using CRDs

connaisseur - An admission controller that integrates Container Image Signature Verification into a Kubernetes cluster

helm - The Kubernetes Package Manager [Moved to: https://github.com/helm/helm]

kubeval - Validate your Kubernetes configuration files, supports multiple Kubernetes versions

flink-on-k8s-operator - Kubernetes operator for managing the lifecycle of Apache Flink and Beam applications.
