There are also official Helm charts available for ARC: https://github.com/actions/actions-runner-controller
I am in the process of setting it up on a cheap Hetzner box. If it works, it would be a great deal! You can get a 64 GB RAM box for 35 EUR/mo in the server auctions, with unlimited traffic. I don't mention CPU or GPU, as typically those aren't a bottleneck for my projects.
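For reference, the install roughly follows the quickstart in the ARC docs: one chart for the controller, one for a runner scale set. A sketch (chart paths and namespace names are taken from the docs at time of writing and may change between releases, so check the repo; the repo URL and token are placeholders):

```shell
# Install the ARC controller into its own namespace.
helm install arc \
  --namespace arc-systems \
  --create-namespace \
  oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set-controller

# Install a runner scale set pointing at your repo or org.
helm install arc-runner-set \
  --namespace arc-runners \
  --create-namespace \
  --set githubConfigUrl="https://github.com/your-org/your-repo" \
  --set githubConfigSecret.github_token="<YOUR_PAT>" \
  oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set
```

After that, jobs with `runs-on: arc-runner-set` get scheduled onto pods in the cluster.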
Plus, I can configure cache sharing via a host-mounted dir. E.g. the pnpm cache can all live in one place and be locally available to pods via a mounted dir. Same for the Docker image cache. This would speed up CI runs and also reduce network traffic by a huge margin.
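In Kubernetes terms that's just a hostPath volume on the runner pod; a rough sketch (the paths and volume name here are my own placeholders, and hostPath only shares across pods on the same node, which is fine for a single box):

```yaml
# Illustrative fragment of a runner pod spec: one pnpm store on the host,
# mounted into every runner pod. Point pnpm at it with
# `pnpm config set store-dir /pnpm-store` in the runner image or job setup.
spec:
  containers:
    - name: runner
      image: ghcr.io/actions/actions-runner:latest
      volumeMounts:
        - name: pnpm-store
          mountPath: /pnpm-store
  volumes:
    - name: pnpm-store
      hostPath:
        path: /srv/ci-cache/pnpm-store
        type: DirectoryOrCreate
```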
GitHub Actions effectively has no local caching. There's an action for caching, but it stores cache artifacts in blob storage, which then get fetched over the network, gzip'ed and gunzip'ed on every run. In my experience this has never been a win for medium to large npm projects: they have thousands of small .js files in node_modules, which take a long time to compress and decompress. The npm edge cache servers are already so optimized and fast that it's almost always quicker to install from npm directly. I even tested this on AWS, where the cache was stored in S3 in the same region as CodeBuild (the CI), and direct installs from npm were still faster by about 30%.
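Concretely, the pattern being criticized is the standard actions/cache setup, something like this (the cache key scheme here is the usual convention, not anything specific):

```yaml
# Typical actions/cache usage: the cached directory is round-tripped
# through remote blob storage (download + gunzip before the job,
# gzip + upload after) on every run.
- uses: actions/cache@v4
  with:
    path: ~/.npm
    key: npm-${{ runner.os }}-${{ hashFiles('**/package-lock.json') }}
- run: npm ci
```

Whether those two extra network/compression round trips beat a plain `npm ci` against the registry is exactly the trade-off measured above.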
So other than adding more hardware resources, local caching is the only way to significantly speed up GH Actions, in my experience, and for that you need your own runner.
We maintain a little action called Action-Debugger that lets you SSH into a running GitHub Actions workflow to help debug pesky issues. It has a few additional features when you use it with our runners, but works very well by itself.
https://docs.warpbuild.com/tools/action-debugger
https://github.com/WarpBuilds/action-debugger
I had a similar experience with ARC (actions-runner-controller).
One of the machines in the fleet failed to sync its clock via NTP. Once a job X got scheduled to it, the runner pod failed authentication due to the incorrect clock, and then the whole ARC system started to misbehave: job X was stuck without runners until another workflow job Y was created, and then X got run but Y became stuck. There were other weird behaviors like this too, so I eventually rebuilt everything based on VMs and stopped using ARC.
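Skew like that is cheap to detect before it breaks auth. A small SNTP probe (my own sketch, nothing to do with ARC itself; in practice `timedatectl` or `chronyc tracking` on the node does the same job) reports the offset between the local clock and an NTP server:

```python
import socket
import struct
import time

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_DELTA = 2_208_988_800

def parse_transmit_time(packet: bytes) -> float:
    """Extract the server's transmit timestamp (Unix seconds, with the
    32-bit fractional part) from a 48-byte SNTP reply."""
    secs, frac = struct.unpack("!II", packet[40:48])
    return secs - NTP_DELTA + frac / 2**32

def ntp_offset(server: str = "pool.ntp.org", timeout: float = 2.0) -> float:
    """Return roughly (server clock - local clock) in seconds.
    Ignores network round-trip time, which is fine for spotting
    multi-second skew of the kind that breaks token auth."""
    request = b"\x1b" + 47 * b"\x00"  # LI=0, VN=3, Mode=3 (client)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(request, (server, 123))
        reply, _ = sock.recvfrom(48)
    return parse_transmit_time(reply) - time.time()
```

Running `ntp_offset()` on a healthy node should print something close to zero; a node like the one described above would show a large offset.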
Using VMs also allowed me to support the use of the official runner images [0], which is good for compatibility.
I feel more people would benefit from managed "self-hosted" runners, so I started DimeRun [1] to provide cheaper GHA runners for people who don't have the time/willingness to troubleshoot low-level infra issues.
[0]: https://github.com/actions/runner-images