git internals

updated Nov 30, 2025

  • Where exactly is the git repository?

    😎

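    The git repository is the hidden .git folder, normally at the top level of the project. The files you edit alongside it are the working directory, not the repository itself.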

  • What is the data model for a directory in git?

    😎

    a tree object

  • What does git ls-tree do?

    😎

    It lists the contents of a specified tree object.

    Each entry, as git ls-tree prints it, has four parts:

    • Mode: e.g. 100644. The leading 100 marks a regular file; the 644 looks like the familiar UNIX permission bits, but Git only tracks whether a file is executable (100644 vs. 100755). Flipping the executable bit counts as a change; other permission changes are not recorded.
    • Type: the kind of object the entry references (blob for a file, tree for a subdirectory). The stored entry does not hold this field; git ls-tree derives it from the mode.
    • Hash: the object’s unique identifier (e.g. e69d…).
    • Filename.

    The key insight: the filename lives in the tree, not in the blob. This is why two files with identical contents are stored only once — both tree entries point at the same blob. Renaming a file is therefore cheap: Git writes a new tree entry but reuses the existing blob, so the raw data is never duplicated.
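
    A minimal Python sketch of that point, assuming the default SHA-1 object format. The helper names object_id and tree_body are made up for illustration, and the sorting is simplified (real Git orders subdirectory names slightly differently). It serializes two trees, before and after a rename, and shows that both keep pointing at the same blob:

```python
import hashlib


def object_id(kind: bytes, body: bytes) -> str:
    """Hash an object the way Git does: '<type> <length>' + NUL + body."""
    header = kind + b" " + str(len(body)).encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def tree_body(entries: list[tuple[str, str, str]]) -> bytes:
    """Serialize (mode, filename, hex id) entries as '<mode> <name>' + NUL + 20 raw id bytes.

    Simplification: entries are sorted by filename bytes, which is only
    correct when the tree contains files and no subdirectories.
    """
    out = b""
    for mode, name, oid in sorted(entries, key=lambda e: e[1].encode("utf-8")):
        out += mode.encode("ascii") + b" " + name.encode("utf-8") + b"\0" + bytes.fromhex(oid)
    return out


# One blob, referenced under two different filenames.
blob = object_id(b"blob", b"same contents\n")
before = tree_body([("100644", "a.txt", blob), ("100644", "b.txt", blob)])

# "Rename" b.txt to c.txt: the tree changes, the blob id does not.
after = tree_body([("100644", "a.txt", blob), ("100644", "c.txt", blob)])

print("blob id:    ", blob)
print("tree before:", object_id(b"tree", before))
print("tree after: ", object_id(b"tree", after))
```

    The filename appears only inside the tree bytes, so the rename yields a different tree id while the blob id is unchanged.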

  • What does git commit do?

    😎

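    It saves a snapshot of the staged content (the index) by creating a commit object that points to a tree object in the git repository.

    A commit records a moment in time: who made the snapshot and when, a message, the parent commit(s), and the root tree that represents the staged files at that point. Changes in the working directory that were never staged are not part of the snapshot.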

    A commit also distinguishes two people, which often confuses newcomers:

    • Author — the person who actually wrote the code.
    • Committer — the person who saved the change to the repository.

    Git was created for Linux kernel development, where someone commonly authors a change, sends a patch to a maintainer, and the maintainer commits it. Tracking the two independently supports that workflow.
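
    As a rough sketch, a commit object is a short text record. The Python below assembles one, assuming the default SHA-1 object format; the function names, the two people, and the timestamps are hypothetical, and the tree used is the well-known id of an empty tree:

```python
import hashlib

# Id of a tree with no entries; stands in for a real snapshot here.
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"


def commit_body(tree: str, parents: list[str], author: str,
                committer: str, message: str) -> bytes:
    """Build the text of a commit: tree, parents, author, committer, blank line, message."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {author}", f"committer {committer}"]
    return ("\n".join(lines) + "\n\n" + message + "\n").encode("utf-8")


def commit_id(body: bytes) -> str:
    """Hash the commit like any other object: 'commit <length>' + NUL + body."""
    header = b"commit " + str(len(body)).encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


# Hypothetical patch workflow: Alice wrote the change, Mark committed it later.
body = commit_body(
    EMPTY_TREE,
    [],  # no parents, so this would be a root commit
    "Alice Author <alice@example.com> 1700000000 +0000",
    "Mark Maintainer <mark@example.com> 1700003600 +0000",
    "Apply patch from Alice",
)
print(body.decode("utf-8"))
print(commit_id(body))
```

    The author and committer lines each carry their own name, email, and timestamp, which is how one commit can credit two different people.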

  • What does git add do?

    😎

    It stores the file’s contents as an object in the objects directory of the git repository, named by the hash of a small header plus the contents, and records the file’s path and that hash in the index (the staging area).

    Concretely, git add does the following:

    1. Reads the file’s contents and treats them as a blob (“binary large object” — just raw data, with no name).
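    2. Prepends a small header (the object type and content length) and runs the result through SHA-1, Git’s default hashing algorithm, which always yields a 40-character hex string (e.g. e69d…).
    3. Compresses the header and contents with zlib and stores the result under .git/objects/, using the first two hex characters as a directory name and the remaining 38 characters as the filename.
    4. Records the file’s path, mode, and hash in the index, so the next commit knows what to include.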

    So git add of new content writes an object into your local objects directory; if an object with the same hash already exists, it is reused. (Adding an empty file, for example, produces a blob with 0 bytes of content whose hash is computed the same way.)
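
    The hashing and storage steps above can be sketched in Python. This is an illustration, not Git's actual code: it assumes the default SHA-1 object format, the function names are made up, and it only computes the id and path rather than writing into a real repository.

```python
import hashlib
import zlib
from pathlib import PurePosixPath


def blob_object(content: bytes) -> tuple[str, bytes]:
    """Return (object id, zlib-compressed bytes) for a blob with this content."""
    # Header is the object type, a space, the content length, and a NUL byte.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    store = header + content
    oid = hashlib.sha1(store).hexdigest()  # 40 hex characters
    return oid, zlib.compress(store)


def loose_object_path(oid: str, git_dir: str = ".git") -> PurePosixPath:
    """First two hex characters name the directory, the rest name the file."""
    return PurePosixPath(git_dir) / "objects" / oid[:2] / oid[2:]


# An empty file: zero bytes of content, but the header is still hashed.
oid, compressed = blob_object(b"")
print(oid)                      # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(loose_object_path(oid))   # .git/objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(len(compressed) > 0)      # True: the stored file is not empty
```

    Inside a repository, git hash-object can be used to check the computed id against what Git itself produces for the same file.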

  • Why doesn’t git push push all commits from other branches?

git FAQ

  • what’s the technical reason?