cdc-file-transfer
got
| | cdc-file-transfer | got |
|---|---|---|
| Mentions | 24 | 12 |
| Stars | 2,851 | 125 |
| Growth | 0.7% | 0.8% |
| Activity | 10.0 | 0.0 |
| Last commit | 3 months ago | 28 days ago |
| Language | C++ | Go |
| License | Apache License 2.0 | GNU General Public License v3.0 only |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
cdc-file-transfer
-
Born from the ashes of Stadia, this repository contains tools for synching and streaming files from Windows to Linux.
The README has a pretty good explanation of why it's better than rsync, and the animations help show exactly what the difference is.
- Google made a tool like rsync which is 3x faster
- CDC File Transfer
Slightly OT, but I like the schematic gifs used in the Readme.md (pretty amazing doc overall!) like this one [0]. Does anyone have suggestions what tools they might have used (or might be used in general) to create those?
[0] https://github.com/google/cdc-file-transfer/blob/main/docs/l...
The documentation in the code itself is pretty great as well:
https://github.com/google/cdc-file-transfer/blob/main/fastcd...
The same question was asked here: https://github.com/google/cdc-file-transfer/issues/56
We also ran the experiment with the native Linux rsync, i.e. syncing Linux to Linux, to rule out issues with Cygwin. Linux rsync performed on average 35% worse than Cygwin rsync, which can be attributed to CPU differences.
Windows to Windows is being worked on, see https://github.com/google/cdc-file-transfer/compare/main...s....
Linux to Linux is also an option if there is demand, but currently it's Windows to Linux only.
got
- Show HN: A version control system based on rsync
I've not heard the term "probabilistic tree" and I'm having difficulty pulling up references. I suspect it's implemented by subpackage ptree[0]. Do you have resources on what makes probabilistic trees different from hash tables?
[0] https://github.com/gotvc/got/tree/master/pkg/gotkv/ptree
Sure, Git stores data in a trie. Each file is one blob identified by hash, and directories (called trees in Git) are blobs where each line is a directory entry with a name and the hash of a file or another tree. This means that modifying an object /down/a/long/path/like/this.txt has to create copies of all the trees on the way up. The technical term for this is "write amplification", and in Git it is affected by path length among other things.
Got stores paths in a probabilistic tree (GotKV[0]). The number of nodes before you get to data will scale logarithmically with the size of the entire filesystem, not the depth of a specific object.
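The write amplification described for Git-style trees can be illustrated with a small sketch. This is a toy model, not Git's actual object format: `put_blob` and `write_path` are hypothetical helpers, trees are serialized as plain "name hash" lines, and names are assumed to contain no spaces.

```python
import hashlib

def put_blob(store, data: bytes) -> str:
    """Store bytes under their SHA-256 hex digest and return the digest."""
    digest = hashlib.sha256(data).hexdigest()
    store[digest] = data
    return digest

def write_path(store, root, parts, content: bytes) -> str:
    """Return a new root tree hash with `content` stored at path `parts`.

    Every tree on the path is re-serialized and re-hashed, so the number
    of new objects grows with the depth of the path.
    """
    entries = {}
    if root is not None:
        for line in store[root].decode().splitlines():
            name, child = line.split(" ", 1)
            entries[name] = child
    head, rest = parts[0], parts[1:]
    if rest:
        entries[head] = write_path(store, entries.get(head), rest, content)
    else:
        entries[head] = put_blob(store, content)
    body = "\n".join(f"{n} {h}" for n, h in sorted(entries.items()))
    return put_blob(store, body.encode())

store = {}
path = "down/a/long/path/like/this.txt".split("/")
root1 = write_path(store, None, path, b"v1")
before = len(store)
root2 = write_path(store, root1, path, b"v2")
new_objects = len(store) - before
print(new_objects)  # 7: one new blob plus six rewritten trees
```

In this model a two-byte change to one file produces one new object per path component, which is the depth dependence the comment contrasts with a tree whose height depends on total size instead.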
Then there is the issue of large files. A file in Git is always 1 blob. Syncing a large file is not easy because if you are interrupted and have to restart, you have lost all your progress. You can't verify the hash of a blob until you have the whole thing. Got has a maximum blob size, so you'll only be buffering <2MB at a time before you can verify that the blob is correct. If a transfer is interrupted, the most you'll have to repeat is one blob's worth, plus any tree nodes above that blob.
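A rough sketch of why a bounded blob size makes transfers resumable follows. It is not Got's implementation: the 4-byte `MAX_BLOB`, the fixed-size split, and the `fetch` helper are assumptions chosen to keep the demo small, and the `limit` argument merely simulates an interrupted transfer.

```python
import hashlib

MAX_BLOB = 4  # bytes; deliberately tiny for the demo (the comment cites <2MB)

def split_blobs(data: bytes, max_size: int = MAX_BLOB):
    """Split data into blobs of at most max_size bytes."""
    return [data[i:i + max_size] for i in range(0, len(data), max_size)]

def blob_id(blob: bytes) -> str:
    return hashlib.sha256(blob).hexdigest()

def fetch(remote, wanted, local, limit=None):
    """Copy up to `limit` missing blobs, verifying each hash before keeping it.

    Returns the number of blobs fetched in this call.
    """
    fetched = 0
    for h in wanted:
        if h in local:
            continue  # already verified in an earlier attempt
        if limit is not None and fetched >= limit:
            break  # simulate the connection dropping
        blob = remote[h]
        if blob_id(blob) != h:
            raise ValueError("corrupt blob " + h)
        local[h] = blob
        fetched += 1
    return fetched

data = b"abcdefghijklmnopqrstuvwxyz"
blobs = split_blobs(data)
wanted = [blob_id(b) for b in blobs]
remote = dict(zip(wanted, blobs))

local = {}
first = fetch(remote, wanted, local, limit=3)   # interrupted after 3 blobs
second = fetch(remote, wanted, local)           # resume: only the rest is fetched
restored = b"".join(local[h] for h in wanted)
```

Because each blob is verified as soon as it arrives, the second call skips everything the first call already stored.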
Compared to rsync, Got uses variable size chunks and a faster content defined chunking algorithm, recently featured here on HN[1]. I haven't thought about whether variable or fixed chunks are better for file transfer, but for version control, the higher chance of convergence is important. It means you have better deduplication.
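To make the convergence point concrete, here is a minimal sketch of content-defined chunking compared with fixed-size chunking. It uses a simplified gear-style rolling hash, not the actual FastCDC algorithm (which adds normalized chunking and tuned masks); the function names, the size parameters, and the random test data are all assumptions for illustration.

```python
import hashlib
import random

MASK64 = (1 << 64) - 1
_rng = random.Random(0)
GEAR = [_rng.getrandbits(64) for _ in range(256)]  # one random value per byte value

def cdc_chunks(data: bytes, min_size=64, max_size=1024, bits=6):
    """Cut where the top `bits` bits of a rolling hash are zero.

    The hash shifts left one bit per byte, so it only depends on the
    last 64 bytes; boundaries therefore follow content, not offsets.
    """
    chunks, start, h = [], 0, 0
    for i, b in enumerate(data):
        h = ((h << 1) + GEAR[b]) & MASK64
        length = i - start + 1
        if (length >= min_size and h >> (64 - bits) == 0) or length >= max_size:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])
    return chunks

def fixed_chunks(data: bytes, size=128):
    return [data[i:i + size] for i in range(0, len(data), size)]

def shared(a, b):
    """Count distinct chunks (by SHA-256) present in both lists."""
    ids = lambda cs: {hashlib.sha256(c).digest() for c in cs}
    return len(ids(a) & ids(b))

data_rng = random.Random(1)
original = bytes(data_rng.getrandbits(8) for _ in range(20000))
edited = b"X" + original  # insert one byte at the front

shared_cdc = shared(cdc_chunks(original), cdc_chunks(edited))
shared_fixed = shared(fixed_chunks(original), fixed_chunks(edited))
print(shared_cdc, shared_fixed)
```

With fixed-size chunks the one-byte insertion shifts every later boundary, so the two versions share few or no chunks. With content-defined boundaries the chunkers resynchronize after the edit, so most chunks are shared and deduplicate.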
This is very similar to one of my projects, "Got".
The algorithms it uses are superior to rsync and git in a few ways. It comes up short on features, especially for software development compared to Git. The motivation is more for personal file storage.
I notice you're using Go and AGPL licensed, so you could borrow any of Got's libraries without issue. (Got is GPL licensed.) Definitely reach out in a GitHub issue.
- CDC File Transfer
FastCDC is the same chunking algorithm used in Got.
- SourceHut terms of service updates, cryptocurrency projects to be removed
Thanks for sharing RocketGit. This is the first time I've heard of it, and yes, it does look like a cool copyleft solution to self-hosted Git.
Another interesting option is Brendan Caroll's got[0], which allows sharing of repositories over INET256[1]. I'm sure there are other P2P approaches to Git, but this one just piqued my interest. Unfortunately it has a naming conflict with OpenBSD's Game of Trees[2].
[0] https://github.com/gotvc/got
- Show HN: Encrypted Git hosting should be easy
I work on a project which solves a similar use case.
Got also does E2EE, but it can additionally encrypt branch names from remote servers.
- What Comes After Git
I've been working on a project, "Got", which deals with the LFS problem mentioned in the post.
Got isn't really trying to do software version control better than Git. It's trying to make general purpose file versioning practical, with a workflow similar to Git's.
- Show HN: Let's build an end-to-end encrypted data store
In the same space is the key-value store underlying Got: GotKV. https://github.com/gotvc/got/tree/master/pkg/gotkv
It stores encrypted blobs in any content-addressed store, and provides a copy-on-write key-value store API.
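As a rough illustration of a copy-on-write key-value API on top of a content-addressed store, consider the sketch below. It is not GotKV's API or data structure: `CAStore`, `new_root`, `put`, and `get` are hypothetical names, the layout is a flat two-level bucket scheme rather than a tree, and encryption is omitted entirely.

```python
import hashlib
import json

class CAStore:
    """Content-addressed store: blobs are keyed by their SHA-256 digest."""
    def __init__(self):
        self.blobs = {}

    def post(self, data: bytes) -> str:
        h = hashlib.sha256(data).hexdigest()
        self.blobs[h] = data
        return h

    def get(self, h: str) -> bytes:
        data = self.blobs[h]
        if hashlib.sha256(data).hexdigest() != h:
            raise ValueError("blob does not match its address")
        return data

NUM_BUCKETS = 4  # arbitrary; a real design would use a growing tree

def _bucket_of(key: str) -> int:
    return hashlib.sha256(key.encode()).digest()[0] % NUM_BUCKETS

def new_root(store: CAStore) -> str:
    empty = store.post(json.dumps({}, sort_keys=True).encode())
    return store.post(json.dumps([empty] * NUM_BUCKETS).encode())

def put(store: CAStore, root: str, key: str, value: str) -> str:
    """Return a new root; blobs reachable from the old root are never modified."""
    buckets = json.loads(store.get(root))
    i = _bucket_of(key)
    entries = json.loads(store.get(buckets[i]))
    entries[key] = value
    buckets[i] = store.post(json.dumps(entries, sort_keys=True).encode())
    return store.post(json.dumps(buckets).encode())

def get(store: CAStore, root: str, key: str):
    buckets = json.loads(store.get(root))
    return json.loads(store.get(buckets[_bucket_of(key)])).get(key)

store = CAStore()
r0 = new_root(store)
r1 = put(store, r0, "a", "1")
r2 = put(store, r1, "a", "2")
```

Each root hash names an immutable snapshot: reading through `r1` still returns the old value after `r2` is created, and unchanged buckets are shared between versions.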
- Show HN: Got is like Git, but with an 'o'
What are some alternatives?
d2 - D2 is a modern diagram scripting language that turns text to diagrams.
bita - Differential file synchronization over http
imsy - simple incremental pull of immutable large files
Killed by Google - Part guillotine, part graveyard for Google's doomed apps, services, and hardware.
git-remote-aws - encrypted git hosting should be easy
forge - Work with Git forges from the comfort of Magit
typecrypt - Typescript public key encryption library using webcrypto (designed for social networks)
Zenko - Zenko is the open source multi-cloud data controller: own and keep control of your data on any cloud.
backup - immutable backups so simple that unborkable
git-branchless - High-velocity, monorepo-scale workflow for Git
difftastic - a structural diff that understands syntax 🟥🟩