How Git Works Internally

What Lies Inside the .git Folder?

The .git folder: the whole repo lives here

When you run git init, you create a .git directory. Everything Git needs to track and reconstruct your project is inside that hidden folder:

  • .git/objects/ - object storage (blobs, trees, commits, tags). Objects are stored compressed and named by their hash.

  • .git/refs/ - named references: refs/heads/* (branches), refs/tags/* (tags), refs/remotes/*.

  • .git/HEAD - a pointer to the current branch (for example ref: refs/heads/main) or directly to a commit hash in detached HEAD state.

  • .git/index - the staging area (also called the index): a binary file that maps paths to blob hashes and stores stat info used for fast diffs.

  • .git/packed-refs, .git/config, .git/logs/ and other housekeeping files.

  • .git/objects/pack/ - packfiles: compressed, delta-compressed collections of objects used to save space and speed operations.

Think of .git as a small, efficient database that stores your project’s history and metadata — not a copy of your working files.

The three core object types

Git stores content as objects. The most important ones:

  • Blob - raw file contents. A blob is the file data (no filename metadata). A blob is created when you stage a file.

  • Tree - a snapshot of a directory. It maps filenames (and metadata like mode) to blob hashes or to other tree hashes (subdirectories).

  • Commit - a snapshot of the project at a point in time. A commit contains:

    • a reference to a tree (the root snapshot),

    • zero or more parent commit hashes (none for the first commit, one for normal commits, multiple for merges),

    • author/committer metadata and the commit message.

Git has one more object type, the annotated tag, but blobs, trees, and commits are the mental model you need.

Content-addressable storage & hashes

Every object is identified by a cryptographic hash of its contents (historically SHA-1, with SHA-256 supported as an alternative object format in newer Git versions). The hash is computed over a header giving the object type and size, a NUL byte, and then the content, e.g., blob 12\0hello world\n. The result is an ID like 3b18e5.... Because the ID depends only on content:

  • The same file content stored in different repositories yields the same blob hash — Git can deduplicate automatically.

  • Any corruption or accidental change in an object means its content no longer matches its ID, which git fsck can detect.

  • Commits are safe and verifiable because they reference tree hashes, which reference blob hashes, forming a hash chain you can verify.

This chaining is what makes Git extremely robust.
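
The hashing rule described above can be reproduced outside Git. The following is a minimal sketch using only Python's standard library; the function name git_object_id is made up for illustration, and it assumes a SHA-1 repository.

```python
import hashlib

def git_object_id(obj_type: str, content: bytes) -> str:
    """Return the SHA-1 ID Git would assign to an object of this type and content."""
    # Header: "<type> <size in bytes>" followed by a NUL byte, then the raw content.
    header = f"{obj_type} {len(content)}\0".encode("ascii")
    return hashlib.sha1(header + content).hexdigest()

blob_id = git_object_id("blob", b"hello world\n")
print(blob_id)  # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
```

Running git hash-object on a file containing the same 12 bytes should print the same ID, because the ID depends only on the type, size, and content.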

What happens on git add

High-level flow when you run git add file.txt:

  1. Git reads the file contents from your working directory.

  2. Git creates a blob object from the file contents and writes it to .git/objects/ (zlib compressed). You can replicate this step with plumbing:

     git hash-object -w file.txt   # writes blob and prints its hash
    
  3. Git updates the index (.git/index) to record that file.txt now corresponds to that blob hash and stores stat info (mtime, size, mode) so it can cheaply tell later whether the file changed.

Important notes:

  • git add stages content, not filenames alone — the index maps filenames → blob-hash.

  • If the same file content already exists (same hash), Git reuses the blob; no duplicate content is stored.
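
To make the blob-writing step concrete, here is a rough sketch of how a loose object could be written by hand. It is not Git's implementation: write_loose_blob is a hypothetical helper, it assumes SHA-1, it writes into a temporary directory rather than a real .git/objects/, and it does not touch the index.

```python
import hashlib
import tempfile
import zlib
from pathlib import Path

def write_loose_blob(objects_dir: Path, content: bytes) -> str:
    """Store content as a loose blob under objects_dir and return its ID."""
    store = b"blob " + str(len(content)).encode("ascii") + b"\0" + content
    oid = hashlib.sha1(store).hexdigest()
    # Loose objects live at objects/<first 2 hex chars>/<remaining 38 chars>.
    path = objects_dir / oid[:2] / oid[2:]
    if not path.exists():  # same content means same ID, so nothing new is stored
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(zlib.compress(store))
    return oid

with tempfile.TemporaryDirectory() as tmp:
    objects = Path(tmp) / "objects"
    first = write_loose_blob(objects, b"hello world\n")
    second = write_loose_blob(objects, b"hello world\n")
    stored = [p for p in objects.rglob("*") if p.is_file()]
    print(first == second, len(stored))  # True 1
```

Writing the same content twice produces one file, which is the deduplication described in the notes above.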

What happens on git commit

When you run git commit -m "message" (assuming staged changes exist), Git:

  1. Writes a tree object that represents the project state in the index. The tree references blobs (and subtrees) for every tracked file/directory. You can reproduce this with:

     git write-tree   # prints tree-hash
    
  2. Creates a commit object that includes:

    • tree: <tree-hash>

    • parent: <parent-hash> (if any)

    • author/committer metadata

    • commit message
     You can create a commit manually (plumbing) with:

     echo "My commit" | git commit-tree <tree-hash> -p <parent-hash>

     This prints the new commit hash.

  3. Updates the branch ref that HEAD points to (for example refs/heads/main) to point to the new commit hash.

    • This is what moves the branch forward.

    • HEAD itself either points to that ref (ref: refs/heads/main) or contains a hash when detached.

  4. Appends reflog entries under .git/logs/ to track ref changes (enabled by default in repositories with a working tree).

So commits are lightweight pointers to trees (snapshots), and trees point to blobs (file contents).
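
The tree and commit objects from the steps above have simple byte layouts, which the sketch below builds by hand. The helper names, the file name hello.txt, and the identity string are placeholders chosen for illustration; the sketch assumes SHA-1 and sorts entries by plain name, which is a simplification of Git's ordering rule for directories.

```python
import hashlib

def object_id(obj_type: str, body: bytes) -> str:
    """Hash "<type> <size>\\0<body>" the way Git names objects (SHA-1)."""
    header = f"{obj_type} {len(body)}\0".encode("ascii")
    return hashlib.sha1(header + body).hexdigest()

def tree_body(entries: list[tuple[str, str, str]]) -> bytes:
    """Build a tree body from (mode, name, hex ID) entries."""
    out = b""
    # Simplification: Git sorts directory names as if they ended with "/".
    for mode, name, oid in sorted(entries, key=lambda e: e[1]):
        # Each entry is "<mode> <name>\0" followed by the raw 20-byte ID.
        out += f"{mode} {name}\0".encode() + bytes.fromhex(oid)
    return out

def commit_body(tree: str, parents: list[str], ident: str, message: str) -> bytes:
    """Build the text of a commit object: headers, blank line, message."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {ident}", f"committer {ident}", "", message]
    return ("\n".join(lines) + "\n").encode()

blob = object_id("blob", b"hello world\n")
tree = object_id("tree", tree_body([("100644", "hello.txt", blob)]))
ident = "A U Thor <author@example.com> 1700000000 +0000"  # placeholder identity
first = object_id("commit", commit_body(tree, [], ident, "Initial commit"))
second = object_id("commit", commit_body(tree, [first], ident, "Second commit"))
print(commit_body(tree, [first], ident, "Second commit").decode())
```

Because the commit text embeds the tree ID and the parent ID, changing any file, or any earlier commit, changes every ID downstream of it.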

Visualizing the commit → tree → blob relationship

commit C (hash: ccccc)
  ├─ tree T (hash: ttttt)
  │   ├─ file a.txt -> blob A (hash: aaaaa)
  │   ├─ file b.txt -> blob B (hash: bbbbb)
  │   └─ dir/ -> tree T2 (hash: t2t2t)
  │        └─ file c.txt -> blob D (hash: ddddd)
  └─ parent -> commit P (hash: ppppp)

Each arrow is a reference by hash. To inspect any object:

git cat-file -p <hash>   # pretty-print the object
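
As a rough picture of what inspecting an object involves, the sketch below decompresses a loose object and splits a tree body into its entries. The function names are hypothetical, the tree is built by hand so the example runs without a repository, and 20-byte (SHA-1) IDs are assumed.

```python
import zlib

def parse_object(compressed: bytes) -> tuple[str, bytes]:
    """Decompress a loose object and return (type, body)."""
    raw = zlib.decompress(compressed)
    header, _, body = raw.partition(b"\0")  # header ends at the first NUL
    obj_type, size = header.decode("ascii").split(" ")
    if int(size) != len(body):
        raise ValueError("size in header does not match body length")
    return obj_type, body

def tree_entries(body: bytes) -> list[tuple[str, str, str]]:
    """Split a tree body into (mode, name, hex ID) entries."""
    entries = []
    pos = 0
    while pos < len(body):
        nul = body.index(b"\0", pos)
        mode, name = body[pos:nul].decode().split(" ", 1)
        oid = body[nul + 1:nul + 21].hex()  # raw 20-byte SHA-1 follows the NUL
        entries.append((mode, name, oid))
        pos = nul + 21
    return entries

# Build a one-entry tree by hand so there is something to parse.
empty_blob = "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391"
entry = b"100644 a.txt\0" + bytes.fromhex(empty_blob)
loose = zlib.compress(b"tree " + str(len(entry)).encode("ascii") + b"\0" + entry)

obj_type, body = parse_object(loose)
print(obj_type, tree_entries(body))
```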

Branches, HEAD, and refs

  • A branch is just a file under .git/refs/heads/ that contains a commit hash (or an equivalent entry in .git/packed-refs). Example: .git/refs/heads/main might contain d34db33f....

  • HEAD points to either a branch ref (ref: refs/heads/main) or directly to a commit (detached HEAD).

  • Moving a branch is simply updating the file that stores the hash. This is why creating, moving, and deleting branches is cheap in Git.
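
The lookup from HEAD to a commit can be sketched in a few lines. This is an illustration rather than Git's actual resolution logic: resolve_head is a made-up name, the repository layout is faked in a temporary directory, the hashes are dummy values, and the sketch ignores cases such as chained symbolic refs.

```python
import tempfile
from pathlib import Path
from typing import Optional, Tuple

def resolve_head(git_dir: Path) -> Tuple[Optional[str], Optional[str]]:
    """Return (ref name or None if detached, commit hash or None if unborn)."""
    head = (git_dir / "HEAD").read_text().strip()
    if not head.startswith("ref: "):
        return None, head  # detached HEAD: the file holds a hash directly
    ref = head[len("ref: "):]
    loose = git_dir / ref
    if loose.is_file():
        return ref, loose.read_text().strip()
    packed = git_dir / "packed-refs"
    if packed.is_file():
        for line in packed.read_text().splitlines():
            if line.startswith(("#", "^")):  # header and peeled-tag lines
                continue
            oid, _, name = line.partition(" ")
            if name == ref:
                return ref, oid
    return ref, None  # branch exists in HEAD but has no commits yet

with tempfile.TemporaryDirectory() as tmp:
    git_dir = Path(tmp)
    (git_dir / "refs" / "heads").mkdir(parents=True)
    (git_dir / "HEAD").write_text("ref: refs/heads/main\n")
    branch_file = git_dir / "refs" / "heads" / "main"

    branch_file.write_text("a" * 40 + "\n")  # dummy commit hash
    before = resolve_head(git_dir)
    branch_file.write_text("b" * 40 + "\n")  # "moving the branch" is one small write
    after = resolve_head(git_dir)

    (git_dir / "HEAD").write_text("c" * 40 + "\n")
    detached = resolve_head(git_dir)

print(before, after, detached, sep="\n")
```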

Packs and garbage collection

Storing every object as a separate file becomes inefficient for big repos. Git periodically compresses many objects into packfiles under .git/objects/pack/ (files like pack-*.pack and pack-*.idx). git gc (garbage collect) packs loose objects and removes unreferenced objects older than a grace period (two weeks by default).
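
Parsing a full packfile is beyond a short example, but its fixed 12-byte header is easy to read: the signature PACK, a version number, and an object count, both as 4-byte big-endian integers. The sketch below uses a hand-built header rather than a real pack, and read_pack_header is a hypothetical helper name.

```python
import struct

def read_pack_header(data: bytes) -> tuple[int, int]:
    """Return (version, object count) from the first 12 bytes of a packfile."""
    signature, version, count = struct.unpack(">4sII", data[:12])
    if signature != b"PACK":
        raise ValueError("not a packfile")
    return version, count

# A fabricated header claiming version 2 and 3 objects.
header = b"PACK" + struct.pack(">II", 2, 3)
print(read_pack_header(header))  # (2, 3)
```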

How Git ensures integrity & consistency

  • Objects are content-addressed: their ID is derived from content, so changing content changes the hash.

  • Commits reference trees, trees reference blobs - this creates an immutable web of hashes. If any object is corrupted or tampered with, the hash chain breaks.

  • git fsck verifies object connectivity and integrity.

  • Reflogs and multiple copies (remotes) provide redundancy and recovery points.
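
The per-object part of this integrity check amounts to recomputing the hash and comparing it with the object's name. Below is a minimal sketch of that idea, assuming SHA-1 and loose objects; verify_loose_object is an illustrative name, and git fsck itself performs many more checks, including connectivity.

```python
import hashlib
import zlib

def verify_loose_object(expected_id: str, compressed: bytes) -> bool:
    """Return True if the stored bytes still hash to the ID they are filed under."""
    try:
        raw = zlib.decompress(compressed)
    except zlib.error:
        return False  # damaged compressed data
    return hashlib.sha1(raw).hexdigest() == expected_id

raw = b"blob 12\0hello world\n"
oid = hashlib.sha1(raw).hexdigest()
good = zlib.compress(raw)
tampered = zlib.compress(raw.replace(b"hello", b"HELLO"))

print(verify_loose_object(oid, good), verify_loose_object(oid, tampered))  # True False
```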