| | remote-apis | outrun |
|---|---|---|
| Mentions | 5 | 13 |
| Stars | 300 | 3,109 |
| Growth | 0.0% | - |
| Activity | 5.8 | 0.0 |
| Last commit | 15 days ago | over 1 year ago |
| Language | Starlark | Python |
| License | Apache License 2.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
remote-apis
-
Mozilla sccache: cache with cloud storage
In the case of the Remote Execution/Cache API used by Bazel, among others[1], it's a bit more detailed. There's an "ActionCache" and an actual content-addressed cache that just stores blobs ("ContentAddressableStorage"). When you run a `gcc -O2 foo.c -o foo.o` command (locally or remotely; it doesn't matter), you upload an "Action" into the action cache, which basically says "This command was run. As a result it had this stderr, stdout, and exit code, and these input files were read and these output files were written." The input and output files are then referenced by the hash of their contents.
Most importantly, you can look up an action in the ActionCache without actually running it. So when another person comes along and runs the same build command, they ask "Has this Action, with these inputs, been run before?" and the server can answer "Yes, and the output is a file identified by hash XYZ", where XYZ is the hash of foo.o.
So realistically you always have some mix of "input content hashing" and "output content hashing" (the second being the definition of "content addressable").
[1] https://github.com/bazelbuild/remote-apis/blob/main/build/ba...
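The ActionCache/CAS split described above can be sketched in a few lines of Python. This is a toy in-memory stand-in for the two services, not the real gRPC API; the function and variable names are my own, and the "execution" step is faked:

```python
import hashlib
import json

def digest(data: bytes) -> str:
    """Content digest, as in the CAS: blobs are addressed by their hash."""
    return hashlib.sha256(data).hexdigest()

# Hypothetical in-memory stand-ins for the two REAPI services.
cas: dict[str, bytes] = {}          # ContentAddressableStorage: digest -> blob
action_cache: dict[str, dict] = {}  # ActionCache: action digest -> result

def run_or_lookup(command: list[str], inputs: dict[str, bytes]) -> dict:
    # The action key covers the command line *and* the digests of every
    # input, so changing either produces a different cache key.
    key = digest(json.dumps({
        "args": command,
        "inputs": {path: digest(blob) for path, blob in sorted(inputs.items())},
    }, sort_keys=True).encode())
    if key in action_cache:
        return action_cache[key]    # cache hit: no execution needed
    # Cache miss: "execute" the command and record outputs by content digest.
    output = b"fake object code for " + " ".join(command).encode()
    cas[digest(output)] = output
    result = {"exit_code": 0, "outputs": {"foo.o": digest(output)}}
    action_cache[key] = result
    return result
```

The second caller with identical command and inputs gets the cached result back and can then fetch `foo.o` from the CAS by digest, without running the compiler.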
-
Distcc: A fast, free distributed C/C++ compiler
Not only is it distributed like distcc, Bazel also provides sandboxing to ensure that environmental factors do not affect build/test results. This might not mean much for smaller use cases, but at scale, with different compiler toolchains targeting different OSes and CPU architectures, the sandbox helps a ton in keeping your cache accurate.
On top of that, the APIs Bazel uses to communicate with the remote execution environment are standardized and adopted by other build tools, with multiple server implementations to match. Looking at https://github.com/bazelbuild/remote-apis/#clients, you can see big players are involved: Meta, Twitter, the Chromium project, Bloomberg, while there is commercial support for some server implementations.
Finally, beyond C/C++, Bazel also supports remote compilation and remote test execution for Go, Java, Rust, JS/TS, etc., which matters a lot for many enterprise users.
Disclaimer: I work for https://www.buildbuddy.io/ which provides one of the remote execution server implementations, and I am a contributor to Bazel.
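To illustrate what that standardization buys you: pointing Bazel at a REAPI-compatible cache or executor is just a couple of flags in `.bazelrc`. The endpoints below are placeholders, not a real service:

```
# .bazelrc -- hypothetical endpoints; substitute your own REAPI service.
build --remote_cache=grpcs://cache.example.com
build --remote_executor=grpcs://remote.example.com
build --remote_instance_name=my-project
```

Swapping one REAPI server implementation for another means changing these endpoints, not the build tool.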
-
When to Use Bazel?
Regardless of whether you should use Bazel or not, my hope is that any future build system attempts to adopt Bazel's remote execution protocol (or at least a protocol that is similar in spirit):
https://github.com/bazelbuild/remote-apis
In my opinion the protocol is fairly well designed.
-
Programming Breakthroughs We Need
> The thing I really would like to see is a smarter CI system. Caching of build outputs, so you don't have to rebuild the world from scratch every time. Distributed execution of tests and compilation, so you are not bottle-necked by one machine.
This is already achievable nowadays using Bazel (https://bazel.build) as a build system. It uses a gRPC-based protocol for offloading/caching the actual build on a build cluster (https://github.com/bazelbuild/remote-apis). I am the author of one of the open source build cluster implementations (Buildbarn).
-
Distributed Cloud Builds for Everyone
Very nice! I really like the ease-of-use of this, as well as the scale-to-zero costs. That's a tricky thing to achieve. Seems like it could become a standard path to ease the migration from local to remote builds.
If the author is interested in standardizing this, I'd suggest implementing the REAPI protocol (https://github.com/bazelbuild/remote-apis). It should be amenable to implementation on a Lambda-esque back-end, and it's already standard among most tools doing Remote Execution (including Bazel! Bazel+llama could be fun). Equally, it's totally usable by a distcc-esque distribution tool; recc[1] is one example. That's also what Android is doing while it finishes migrating to Bazel ([2], sadly not yet oss'd).
The main interesting challenge I expect this project to hit is worker-local caching: for compilation actions it's not too bad if you assume the compiler is built into the container environment, but if you branch out into hermetic toolchains or data-heavy action types (like linking), fetching all the bytes anew to an ephemeral worker each time may prove prohibitive. On the other hand, that might be a nice transition point to persistent workers: use a Lambda-backed solution for the scale-to-zero case, and switch execution stacks under the hood to something based on reused VMs once you hit sufficient scale that persistent executors start to win out.
(Disclaimer: I TL'd the creation of this API, and Google's implementation of the same.)
[1] https://gitlab.com/BuildGrid/recc
[2] https://opensource.googleblog.com/2020/11/welcome-android-op...
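To make the worker-local caching point concrete, here is a minimal sketch of a digest-keyed blob cache on a worker. The class and its API are hypothetical (not part of REAPI); the key property is that content-addressed blobs never go stale, so a local hit is always safe to reuse, and the cost under discussion is the remote fetches on misses:

```python
class WorkerBlobCache:
    """Hypothetical worker-local, digest-keyed blob cache.

    Because blobs are content-addressed, a cached entry is immutable:
    a digest hit can be reused across actions without revalidation.
    On an ephemeral (Lambda-esque) worker this cache starts empty every
    time, which is exactly the cost the comment above worries about.
    """

    def __init__(self, fetch_remote):
        self._local: dict[str, bytes] = {}
        self._fetch_remote = fetch_remote  # e.g. a CAS read, by digest
        self.remote_fetches = 0            # counts the expensive path

    def get(self, digest: str) -> bytes:
        blob = self._local.get(digest)
        if blob is None:
            self.remote_fetches += 1
            blob = self._fetch_remote(digest)
            self._local[digest] = blob
        return blob
```

A persistent worker keeps `_local` warm between actions, so `remote_fetches` stays low; an ephemeral worker pays the full fetch cost for every toolchain and input blob on each invocation.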
outrun
-
Distcc: A fast, free distributed C/C++ compiler
While its purpose is different, it can be used to do distributed compiling, so I'll leave it here.
https://github.com/Overv/outrun
Since I was just going down this rabbit hole recently, I kind of wonder if it's possible to base the filesystem on something more like the BitTorrent protocol, so that things like the libraries/compilers/headers used during compilation don't all need to come from the main PC. It probably wouldn't be useful until you reached a stupid number of computers and started hitting the limits of the Ethernet wire, but for something silly that can run on a Pi cluster it would be a fun project.
-
Programing laptop
Your mention of compile-heavy workloads reminded me of a project called Outrun; it offloads work to another machine. All it seems to require is Python, FUSE 3, and SSH.
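For context, Outrun wraps an ordinary command line; assuming a reachable SSH host (the host name below is a placeholder), an invocation looks roughly like this:

```
# Run gcc on another machine as if it were local; outrun exposes the
# local filesystem to the remote side via a FUSE mount over SSH.
outrun user@build-server gcc -O2 foo.c -o foo.o
```

The remote machine supplies the CPU, while the compiler, headers, and sources are read from the calling machine on demand.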
-
The u-root CPU command
Awesome! This write-up is satisfyingly detailed. Prior work in this space includes Plan 9, of course, as well as the Python project Outrun, which has its own RPC-based FUSE FS: https://github.com/Overv/outrun
Other approaches to deployment in particular include the functional package managers Nix and Guix, which can create lightweight application images and could probably be cobbled together into some sort of remote environment replication, even across architectures. As I read on, I thought less about how this compares with Guix in regard to application/environment packaging and more about how these things could be glued together in interesting ways, because I think the intro leads in through slightly off-label examples, if that makes sense. Application packaging isn't what this addresses at the end of the day, but it's no less fascinating for it.
- GitHub - Overv/outrun: Execute a local command using the processing power of another Linux machine.
- Way to run commands using other linux system's compute power
- Outrun - Execute a local command using the processing power of another Linux machine.
What are some alternatives?
dylint - Run Rust lints from dynamic libraries
rffmpeg - rffmpeg: remote SSH FFmpeg wrapper tool
bazel-gba-example - Bazel GBA (Game Boy Advance) Example
OpenAFS - Fork of OpenAFS from git.openafs.org for visualization
llama
icecream - Distributed compiler with a central scheduler to share build load
MyDef - Programming in the next paradigm -- your way
bazel-buildfarm - Bazel remote caching and execution service
cargo-mutants - :zombie: Inject bugs and see if your tests catch them!
embedded-postgres-binaries - Lightweight bundles of PostgreSQL binaries with reduced size intended for testing purposes.