workerpool-go
bb-remote-execution
Metric | workerpool-go | bb-remote-execution
---|---|---
Mentions | 2 | 3
Stars | 56 | 104
Growth | - | 2.9%
Activity | 4.3 | 8.2
Last commit | 9 months ago | about 1 month ago
Language | Go | Go
License | Apache License 2.0 | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
-
Write Your Own Task Queue
Though it obviously depends on the case at hand, I sort of agree with this.
For a distributed build cluster that I maintain (Buildbarn, https://github.com/buildbarn/bb-remote-execution/), I also had to implement a scheduler process that would queue compilation/test actions, so that they can be executed on workers later on.
Initially I looked into using some conventional queueing system, but eventually settled on implementing my own as part of the scheduler process. So far I'm really happy with this choice, as it has allowed me to implement the following features, and more:
- In-flight deduplication of identical compilation actions. If identical actions are scheduled with different priorities, the highest priority is used.
- Multi-level scheduling fairness between groups, users in a group, builds run by the same user, etc. The fairness cooperates well with priorities.
- Automatic removal of queued actions that are no longer associated with any running build.
- Stickiness, where workers prefer picking up actions that are similar to the one they ran previously, for reducing network utilisation.
- Facilities for draining workers.
Though I'm not saying it would have been impossible to achieve this with an off-the-shelf task queue, I'm not convinced it would have been easy. Adding a new feature right now only means I need to care about its actual semantics, as opposed to trying to figure out how to map it onto the feature set of the queueing system of choice.
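To make the first bullet above concrete, here is a minimal Go sketch of deduplicating queued actions while keeping the highest requested priority. It is not Buildbarn's actual scheduler code: the `Action`, `Queue`, `Enqueue`, and `Dequeue` names are hypothetical, actions are assumed to be identified by a digest string, and a larger number is assumed to mean a more urgent priority. For brevity it only deduplicates actions that are still queued (not ones already running on a worker) and is not safe for concurrent use.

```go
package main

import (
	"container/heap"
	"fmt"
)

// Action is a hypothetical queued build action, identified by a digest.
type Action struct {
	Digest   string
	Priority int // assumption: larger value means more urgent
	Waiters  int // number of requests sharing this action
	index    int // position in the heap, maintained by Swap/Push
}

// actionHeap orders actions so the highest priority is popped first.
type actionHeap []*Action

func (h actionHeap) Len() int           { return len(h) }
func (h actionHeap) Less(i, j int) bool { return h[i].Priority > h[j].Priority }
func (h actionHeap) Swap(i, j int) {
	h[i], h[j] = h[j], h[i]
	h[i].index = i
	h[j].index = j
}
func (h *actionHeap) Push(x any) {
	a := x.(*Action)
	a.index = len(*h)
	*h = append(*h, a)
}
func (h *actionHeap) Pop() any {
	old := *h
	n := len(old)
	a := old[n-1]
	old[n-1] = nil
	*h = old[:n-1]
	return a
}

// Queue deduplicates queued actions by digest.
type Queue struct {
	byDigest map[string]*Action
	pending  actionHeap
}

func NewQueue() *Queue { return &Queue{byDigest: map[string]*Action{}} }

// Enqueue reports true if a new entry was created, or false if the request
// was merged into an identical queued action. On a merge, the entry keeps
// the higher of the two priorities.
func (q *Queue) Enqueue(digest string, priority int) bool {
	if a, ok := q.byDigest[digest]; ok {
		a.Waiters++
		if priority > a.Priority {
			a.Priority = priority
			heap.Fix(&q.pending, a.index)
		}
		return false
	}
	a := &Action{Digest: digest, Priority: priority, Waiters: 1}
	q.byDigest[digest] = a
	heap.Push(&q.pending, a)
	return true
}

// Dequeue hands the most urgent queued action to a worker, if any.
func (q *Queue) Dequeue() (*Action, bool) {
	if q.pending.Len() == 0 {
		return nil, false
	}
	a := heap.Pop(&q.pending).(*Action)
	delete(q.byDigest, a.Digest)
	return a, true
}

func main() {
	q := NewQueue()
	q.Enqueue("compile-foo", 1)
	q.Enqueue("compile-bar", 5)
	q.Enqueue("compile-foo", 9) // merged into the first entry, priority raised
	for {
		a, ok := q.Dequeue()
		if !ok {
			break
		}
		fmt.Println(a.Digest, a.Priority, a.Waiters)
	}
}
```

Keeping the digest map and the heap inside one process is what makes the priority bump a single `heap.Fix` call; with an external queue, the same merge would have to be expressed in whatever primitives that queue exposes.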
-
LiteFS a FUSE-based file system for replicating SQLite
I was going to raise that point exactly.
As someone who spends an awful lot of time using FUSE, my recommendation is to only use it in cases where the software that interacts with the file system isn't easily changeable. For example, for Buildbarn, which I maintain (https://github.com/buildbarn/bb-remote-execution), I need to use it. It's infeasible to change arbitrary compilers and tests to all interact with a network-distributed build cache. Designing the FUSE file system was a pretty heavy investment though, as you really need to be POSIXly correct to make it all work. The quality of FUSE implementations also varies between OSes and their versions. macFUSE, for example, is quite different from Linux FUSE.
Given that SQLite already has all of the hooks in place, I would strongly recommend using those. In addition to increasing portability, it also makes it easier to set up/run. As an example, it's pretty hard to mount a FUSE file system inside of a container running on Kubernetes without risking locking up the underlying host. Doing the same thing with the SQLite VFS hooks is likely easy and also doesn't require your container to run with superuser privileges.
-
Disorderfs: FUSE-based filesystem that introduces non-determinism into metadata
Buildbarn, a build cluster implementation for Bazel that I maintain, can also run build actions (compilation steps, unit tests) in a FUSE file system. Though the primary motivator for this is that it reduces the time to construct a build action's file system to nearly instant, it has the advantage that I can also do things similar to disorderfs. Shuffling directory listings is actually something that I also added. Pretty useful!
https://github.com/buildbarn/bb-remote-execution/blob/eb1150...
What are some alternatives?
miniqueue - A simple, single binary, message queue. Supports HTTP/2 and Redis Protocol.
litefs - FUSE-based file system for replicating SQLite databases across a cluster of machines
verneuil - Verneuil is a VFS extension for SQLite that asynchronously replicates databases to S3-compatible blob stores.
BeanstalkD - Beanstalk is a simple, fast work queue.
asciiflow - ASCIIFlow