-
external-dns
Configure external DNS servers (AWS Route53, Google CloudDNS and others) for Kubernetes Ingresses and Services
Actually I'm using it on bare metal and it works. The initial setup wasn't very hard, but I think it could be more intuitive. Overall, the documentation for self-hosting Kubernetes is sometimes a bit incomplete.
Yes, I need to add A records with IPs for each domain, but that's a one-time setup. I did it manually, but you can automate it with [1] (it depends on which DNS provider you use, but you can extend it to support yours, or there may be another existing solution); a rough sketch follows below.
I'm not sure that one server in front of the cluster is more reliable than using all cluster nodes for load balancing. I'd guess that with automated solutions like [1], a cluster node could be removed from DNS automatically if it goes down.
My setup is not that big, so I don't have a real need for load balancing, but it seems possible with existing solutions.
[1] https://github.com/kubernetes-sigs/external-dns
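For example, here is a minimal sketch of how [1] can take over record management, assuming the standard external-dns hostname annotation (the Service name, domain, selector, and ports are made up, and on bare metal this presumes something like MetalLB or NodePort handling the actual traffic):

```python
from kubernetes import client, config

config.load_kube_config()

# Hypothetical Service; the external-dns hostname annotation is the part that matters.
service = client.V1Service(
    metadata=client.V1ObjectMeta(
        name="my-app",
        annotations={"external-dns.alpha.kubernetes.io/hostname": "app.example.com"},
    ),
    spec=client.V1ServiceSpec(
        type="LoadBalancer",
        selector={"app": "my-app"},
        ports=[client.V1ServicePort(port=80, target_port=8080)],
    ),
)
client.CoreV1Api().create_namespaced_service(namespace="default", body=service)
```

external-dns watches Services and Ingresses and creates or removes the matching records at the configured DNS provider, which is how records can follow the cluster automatically instead of being edited by hand.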
-
> we found the start times with Kubernetes too slow
Just curious, could you elaborate here? I work with k8s on Docker, and we're also going to be spinning up ephemeral containers (and doing most of the other things you mention) with Jupyter notebooks. We're all in on k8s, but since you might be ahead of me, I'm just wondering what hurdles you have faced.
Our big problem was that fetching containers took too long, since we have kitchen-sink containers that are around 10 GB (!) each. They seem to spin up pretty fast if the image is already pulled, though. I've worked on a service that lives in the k8s cluster to pull images and make sure they are fresh (https://github.com/lsst-sqre/cachemachine), but I'm curious whether you are talking about that or the networking?
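In case it is useful, the generic pre-pull pattern (not how cachemachine itself is implemented; the image name below is made up) is just a DaemonSet that references the big image so every node pulls it ahead of time and keeps it warm:

```python
from kubernetes import client, config

config.load_kube_config()

IMAGE = "registry.example.com/lab/kitchen-sink:latest"  # placeholder 10 GB image

daemonset = client.V1DaemonSet(
    metadata=client.V1ObjectMeta(name="prepull-kitchen-sink"),
    spec=client.V1DaemonSetSpec(
        selector=client.V1LabelSelector(match_labels={"app": "prepull-kitchen-sink"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "prepull-kitchen-sink"}),
            spec=client.V1PodSpec(
                # The init container forces the kubelet to pull the big image;
                # the tiny pause container keeps the pod (and thus the image) resident.
                init_containers=[
                    client.V1Container(name="prepull", image=IMAGE, command=["true"]),
                ],
                containers=[
                    client.V1Container(name="pause", image="registry.k8s.io/pause:3.9"),
                ],
            ),
        ),
    ),
)
client.AppsV1Api().create_namespaced_daemon_set(namespace="default", body=daemonset)
```

That moves the image pull out of the pod startup path, at the cost of disk on every node; tools like kube-fledged (mentioned elsewhere in the thread) wrap the same idea in a CRD.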
-
nixery
Container registry which transparently builds images using the Nix package manager. Canonical repository is https://cs.tvl.fyi/depot/-/tree/tools/nixery
Wow, this is excellent! At a previous job we had been using k8s + knative to spin up containers on demand, and we were likewise unhappy with the delays. Spawner looks great.
One question: have you had to do any custom container builds on demand, and if so, have you had to deal with large containers (e.g. a Python base image with a few larger packages installed from PyPI)? We would run up against extremely long image build times with tools like kaniko, and caching typically had only a limited benefit.
I was experimenting with using Nix to maybe solve some of these problems, but I never got far enough to run a speed test, and then I left the job before finishing. Still, it seems to me that an approach like Nixery's (https://nixery.dev), generating cacheable layers with completely repeatable builds and nothing extraneous, would help (a quick illustration of how Nixery is used follows below).
Maybe that's not a problem you had to solve, but if it is, I'd love your thoughts.
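For reference, here is roughly what using Nixery looks like (a sketch with the Docker SDK for Python; the package list is just an example): the path segments after the registry host are Nix package names, and Nixery builds the composed image, with cacheable layers, on first request.

```python
import docker

client = docker.from_env()

# "shell" is Nixery's meta-package that adds a basic shell environment;
# the remaining segments are nixpkgs attribute names.
image = client.images.pull("nixery.dev/shell/git/python3")
print(image.tags)

# Run something from the composed image to confirm the packages are present.
out = client.containers.run("nixery.dev/shell/git/python3", ["git", "--version"], remove=True)
print(out.decode())
```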
-
nydus
Nydus - the Dragonfly image service, providing fast, secure and easy access to container images.
If you're pulling big images you could try kube-fledged (the simplest option: a CRD that works like a pre-puller for your images), or if you have a big cluster you can try a p2p distributor like kraken or dragonfly2.
There's also a project called Nydus that lets big containers start up much faster. IIRC, it starts the container before the whole image has been pulled and then fetches data lazily from the registry as needed.
https://github.com/senthilrch/kube-fledged
https://github.com/dragonflyoss/Dragonfly2
https://github.com/uber/kraken
https://nydus.dev/
-
kube-fledged
A kubernetes operator for creating and managing a cache of container images directly on the cluster worker nodes, so application pods start almost instantly
-
Dragonfly2
Discontinued. Dragonfly is an open source P2P-based file distribution and image acceleration system. It is hosted by the Cloud Native Computing Foundation (CNCF) as an Incubating Level Project. [Moved to: https://github.com/dragonflyoss/dragonfly]
-
> we'd like the scheduler to be aware of which node has which image
The Kubernetes scheduler should be aware of which node has which image; that is why the Node object has a [status.images](https://kubernetes.io/docs/reference/generated/kubernetes-ap...) field.
It turned out to be somewhat tricky, because it increased the size of the Node object, and colocating node heartbeats on the same object meant that a bigger object was changing relatively often. But that was addressed by moving [heartbeats to a different object](https://github.com/kubernetes/enhancements/issues/589).
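You can inspect that field yourself; a small sketch with the Python client that prints each node's cached images (the scheduler's ImageLocality scoring draws on the same status.images data):

```python
from kubernetes import client, config

config.load_kube_config()

for node in client.CoreV1Api().list_node().items:
    print(node.metadata.name)
    for image in node.status.images or []:
        # Each entry carries the image's names/tags and its size in bytes.
        names = ", ".join(image.names or [])
        print(f"  {image.size_bytes / 1e9:.1f} GB  {names}")
```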
-
No, and I wouldn't, since I absolutely love it. I've put our entire build pipeline and everything into a single cluster at the moment, and I've found it incredibly straightforward and easy to build our CI/CD pipelines with it.
Do I recommend Kubernetes to other people/companies, though? Absolutely not! The learning curve is incredibly steep, and it really does take investment to understand how it works.
But to anyone who is looking to use Kubernetes, I highly recommend https://helm.sh, since it makes templating deployments significantly easier.
-
Crossplane [1] is a great way to create and manage resources across cloud providers and MSPs via Kubernetes objects.
[1] https://crossplane.io
-
swarmsible
Ansible based Tooling and production grade example Docker Stacks. Updated with new learnings from running Docker Swarm in production
Story of one of the projects I am involved in:
We came from Ansible-managed deployments of vanilla Docker, with nginx as a single-node ingress and another load balancer on top of that.
That worked fine, but HA for containers that are only allowed to exist once in the stack was one thing that caused us headaches.
Then we had a workshop for Rancher RKE. It looked promising at the start, but operating it became a headache because we didn't have enough people on the project team to maintain it. Expiring certificates were an issue, and the fact that you effectively had to babysit the cluster was a turn-off.
We killed the switch to Kubernetes.
In the meantime we had been toying around with Docker Swarm for smaller-scale deployments and in-house infrastructure. We didn't find anything not to like and are currently moving in that direction.
How we do things in Swarm:
1. Monitoring using an updated Swarmprom stack (https://github.com/neuroforgede/swarmsible/tree/master/envir...)
-
4. Container autoscaling: we haven't needed it yet for our internal installations or our customer deployments on bare metal, but we would go for a solution based on Prometheus metrics, similar to https://github.com/UnclePhil/ascaler
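Sketching the idea (this is not how ascaler works; the Prometheus address, query, threshold, and service name are all placeholders): poll Prometheus, derive a replica count, and scale the Swarm service.

```python
import subprocess
import time

import requests

PROMETHEUS = "http://prometheus:9090"  # placeholder
QUERY = 'avg(rate(container_cpu_usage_seconds_total{container_label_com_docker_swarm_service_name="my_app"}[1m]))'
SERVICE = "my_app"  # placeholder Swarm service name
MIN_REPLICAS, MAX_REPLICAS = 1, 10

while True:
    # Prometheus HTTP API instant query.
    resp = requests.get(f"{PROMETHEUS}/api/v1/query", params={"query": QUERY}, timeout=10)
    result = resp.json()["data"]["result"]
    cpu = float(result[0]["value"][1]) if result else 0.0

    # Naive policy: one replica per 0.5 cores of average usage, clamped to bounds.
    replicas = max(MIN_REPLICAS, min(MAX_REPLICAS, int(cpu / 0.5) + 1))

    # Scale the Swarm service via the Docker CLI.
    subprocess.run(["docker", "service", "scale", f"{SERVICE}={replicas}"], check=True)
    time.sleep(30)
```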
-
6. Volumes: Hetzner Cloud Plugin, see https://github.com/costela/docker-volume-hetzner
Reasons that we would dabble in k8s again:
1. A lot of projects are k8s only (see OpenFaaS for example)
-
terraform-cdk
Define infrastructure resources using programming constructs and provision them using HashiCorp Terraform
The CDK for Terraform went GA today (https://www.terraform.io/cdktf and https://www.hashicorp.com/blog/cdk-for-terraform-now-general...). It's a framework that extends the capabilities of CDK so that you can use the whole Terraform ecosystem of providers and modules.
Under the hood, the `cdktf synth` command ultimately generates Terraform configuration that can be executed like any other Terraform config. It's definitely not a case of Terraform trying to be like CDK. Each has its strengths; choose whichever makes the most sense for your workflow.
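To make the workflow concrete, here is a tiny sketch in Python (cdktf also supports TypeScript, Java, C#, and Go; the stack and output names are made up) of the app that `cdktf synth` executes:

```python
#!/usr/bin/env python
from constructs import Construct
from cdktf import App, TerraformStack, TerraformOutput


class HelloStack(TerraformStack):
    def __init__(self, scope: Construct, ns: str):
        super().__init__(scope, ns)
        # Real stacks would add providers and resources here (bindings generated
        # with `cdktf provider add <name>`); a plain output keeps this self-contained.
        TerraformOutput(self, "greeting", value="hello from cdktf")


app = App()
HelloStack(app, "hello")
app.synth()  # `cdktf synth` runs this and writes Terraform JSON into cdktf.out/
```

From there `cdktf deploy`, or a plain `terraform apply` against the synthesized output, behaves like any other Terraform config.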
-
Config Connector [1] is also an option in this space for GCP; it supports many GCP resources, and so far our experience with it has been largely positive.
[1] https://cloud.google.com/config-connector/docs/overview
-
-
Our CI/CD pipeline is GitHub Actions, which runs SSH commands on our servers to execute deployment scripts: https://github.com/bugout-dev/spire/blob/main/deploy/deploy....
We use systemd to manage services.
We use Ansible to set up servers.
Our infrastructure spans AWS, Google Cloud, and servers in a datacenter.