Do we think of Git commits as diffs, snapshots, and/or histories?

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • Git

    Git Source Code Mirror - This is a publish-only repository but pull requests can be turned into patches to the mailing list via GitGitGadget (https://gitgitgadget.github.io/). Please follow Documentation/SubmittingPatches procedure for any of your improvements.

  • I understand all that.

    I'm saying, if you write a survey and one of the possible answers is "diff", but you don't clearly define what you mean by "diff", then don't be surprised if respondents use any reasonable definition that makes sense to them. Ask an ambiguous question, get a mishmash of answers.

    The thing that Git uses for packfiles is called a "delta" by Git, but it's also reasonable to call it a "diff". After all, Git's delta algorithm is "greatly inspired by parts of LibXDiff from Davide Libenzi"[1]. Not LibXDelta but LibXDiff.

    Yes, how Git stores blobs (using deltas) is orthogonal to how Git uses blobs. But while that orthogonality is useful for reasoning about Git, it's not wrong to think of a commit as the totality of what Git does, including that optimization. (Some people, when learning Git, stumble over the way it's described as storing full copies and think it's wasteful. To wrap their heads around Git, they have to understand that the optimization exists, which makes sense because Git probably wouldn't be practical without it.)

    The reason I'm bringing all this up is that if you're trying to explain Git, which is what the original article is about, it's very important to keep in mind that someone who is learning Git needs to know what you mean when you say "diff". Most people who already know Git would gravitate toward the definition of "diff" that you're assuming (the thing that Git computes on the fly and never stores), but people who already know Git aren't the target audience when you're teaching Git.

    ---

    [1] https://github.com/git/git/blob/master/diff-delta.c
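
The two senses of "diff" discussed in the comment above can be put side by side in a small sketch. This is a toy model, not Git's implementation: `ToyStore`, `make_delta`, `apply_delta`, and `show_diff` are hypothetical names, and the delta here is a list of copy/insert operations built with Python's `difflib` rather than Git's binary pack format. Only `blob_id` mirrors something Git really does (hashing a `blob <size>\0` header plus the content with SHA-1).

```python
import difflib
import hashlib


def blob_id(content: bytes) -> str:
    """Git-style blob ID: SHA-1 over a 'blob <size>\\0' header plus the content."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def make_delta(base: bytes, target: bytes) -> list:
    """Stored 'diff' (a delta): copy/insert operations that rebuild target from base."""
    ops = []
    matcher = difflib.SequenceMatcher(None, base, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))           # reuse bytes from the base
        elif tag in ("replace", "insert"):
            ops.append(("insert", target[j1:j2]))  # literal new bytes
        # "delete" needs no operation: those base bytes are simply not copied
    return ops


def apply_delta(base: bytes, ops: list) -> bytes:
    """Rebuild the full content from a base and a list of copy/insert operations."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += base[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)


class ToyStore:
    """Hypothetical object store: callers always see full snapshots."""

    def __init__(self):
        self._full = {}   # object id -> full content
        self._delta = {}  # object id -> (base object id, delta operations)

    def put(self, content: bytes, base_id: str = None) -> str:
        oid = blob_id(content)  # the ID depends only on the content
        if base_id is None:
            self._full[oid] = content
        else:
            self._delta[oid] = (base_id, make_delta(self.get(base_id), content))
        return oid

    def get(self, oid: str) -> bytes:
        if oid in self._full:
            return self._full[oid]
        base_id, ops = self._delta[oid]
        return apply_delta(self.get(base_id), ops)


def show_diff(old: bytes, new: bytes) -> str:
    """On-the-fly 'diff': computed from two snapshots when asked for, never stored."""
    lines = difflib.unified_diff(
        old.decode().splitlines(keepends=True),
        new.decode().splitlines(keepends=True),
        fromfile="a/file.txt",
        tofile="b/file.txt",
    )
    return "".join(lines)


store = ToyStore()
v1 = b"line one\nline two\n"
v2 = b"line one\nline 2\nline three\n"
id1 = store.put(v1)
id2 = store.put(v2, base_id=id1)  # stored as a delta against v1
assert store.get(id2) == v2       # but read back as a full snapshot
print(show_diff(store.get(id1), store.get(id2)))
```

In this sketch the delta lives only inside `ToyStore` and never changes an object's ID or what `get` returns, which is the orthogonality the comment describes. The unified diff is derived from two full snapshots each time it is requested.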
