Architecture
How Heddle fits together
A contributor's map of the codebase. The OSS workspace at HeddleCo/heddle is a 17-crate Apache-2.0 Rust workspace published to
crates.io as heddle-*. A closed workspace adds
the hosted backend on top. This SvelteKit app sits alongside
both. This page names the abstractions you'll encounter, the
patterns that recur, and the invariants the rest of the code
is built on. For a per-crate walkthrough see the crate tour; for
the CLI's behaviour contract see operating principles.
Repository as central coordinator
The Repository type in crates/repo is the primary interface most operations use. It coordinates
between object storage, refs, the oplog, the worktree, and
config — single seam, predictable lifetime, owns the locks.
A new verb that doesn't go through Repository is almost certainly missing an integration check.
The same type supports two execution models: standard repos
where heddle_dir == root/.heddle, and agent
checkouts where heddle_dir is a pointer to a
shared object store while the checkout has its own
isolated HEAD. The store stays content-addressed and
deduplicated; the HEAD divergence is what gives each agent
its own working state.
Content-addressed, immutable objects
Every blob, tree, and state object is hashed with BLAKE3 and
addressed by its hash. Two captures that produce identical
content produce the same state ID — Heddle never stores the
same content twice. The address is the integrity check: any
tampering produces a different ID, which heddle
fsck can detect by re-hashing.
State objects, once stored, are never mutated or deleted. Threads (mutable named pointers) can be moved to any state, including non-ancestors, but the underlying state objects are write-once. The key guarantee follows: no data is ever lost, even after a rebase or force push. History displaced by such a move becomes unreachable from the thread but remains addressable by its state ID.
See the conceptual page on captures and states for the user-facing framing of the same model.
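A minimal sketch of the model described above: content is stored under its hash, a second store of the same bytes is a no-op, and threads are the only mutable part. Everything here is illustrative rather than Heddle's actual API. The names `Store`, `StateId`, `put`, and `move_thread` are hypothetical, and std's `DefaultHasher` stands in for BLAKE3 only so the example has no dependencies.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::Hasher;

/// Stand-in for a BLAKE3 digest (assumption: a real store would use the
/// full BLAKE3 hash; a 64-bit std hash keeps this sketch dependency-free).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct StateId(u64);

fn address_of(content: &[u8]) -> StateId {
    let mut h = DefaultHasher::new();
    h.write(content);
    StateId(h.finish())
}

#[derive(Default)]
pub struct Store {
    objects: HashMap<StateId, Vec<u8>>, // write-once: never updated or removed
    threads: HashMap<String, StateId>,  // mutable named pointers
}

impl Store {
    /// Store content under its hash. Identical content maps to the same ID,
    /// so storing it again leaves the store unchanged.
    pub fn put(&mut self, content: &[u8]) -> StateId {
        let id = address_of(content);
        self.objects.entry(id).or_insert_with(|| content.to_vec());
        id
    }

    pub fn get(&self, id: StateId) -> Option<&[u8]> {
        self.objects.get(&id).map(|v| v.as_slice())
    }

    /// Point a thread at any stored state, ancestor or not.
    /// Returns false if the target state is not in the store.
    pub fn move_thread(&mut self, name: &str, to: StateId) -> bool {
        if !self.objects.contains_key(&to) {
            return false;
        }
        self.threads.insert(name.to_string(), to);
        true
    }

    pub fn thread(&self, name: &str) -> Option<StateId> {
        self.threads.get(name).copied()
    }

    /// fsck-style check: re-hash the stored bytes and compare with the address.
    pub fn verify(&self, id: StateId) -> bool {
        self.get(id).map_or(false, |bytes| address_of(bytes) == id)
    }

    pub fn object_count(&self) -> usize {
        self.objects.len()
    }
}
```

Moving a thread from one state to another leaves the first state in `objects`, which is the "unreachable but still addressable" property in miniature.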
Trait-based storage abstraction
Object access goes through the ObjectStore trait, not a concrete filesystem type. The default
implementation is a filesystem-backed store with packfile
support (delta + zstd compression via heddle gc
--aggressive). The trait also has an in-memory
implementation for tests, and future backends (S3,
database) drop in behind the same interface.
This shape is why hosted-mode swapping S3 for the local store is contained at one seam, not threaded through every callsite.
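To make the seam concrete, here is a rough sketch of what a trait-based store can look like. The method names and signatures are assumptions chosen for illustration; the real `ObjectStore` trait may differ. `MemStore` and `copy_object` are likewise hypothetical.

```rust
use std::collections::HashMap;
use std::io;

/// Hypothetical shape of the storage seam; not the real trait definition.
pub trait ObjectStore {
    fn put(&mut self, id: [u8; 32], bytes: Vec<u8>) -> io::Result<()>;
    fn get(&self, id: &[u8; 32]) -> io::Result<Option<Vec<u8>>>;

    /// Default method: backends can override it with a cheaper existence check.
    fn contains(&self, id: &[u8; 32]) -> io::Result<bool> {
        Ok(self.get(id)?.is_some())
    }
}

/// In-memory backend of the kind a test suite might use.
#[derive(Default)]
pub struct MemStore {
    objects: HashMap<[u8; 32], Vec<u8>>,
}

impl ObjectStore for MemStore {
    fn put(&mut self, id: [u8; 32], bytes: Vec<u8>) -> io::Result<()> {
        // Write-once: an existing object is left untouched.
        self.objects.entry(id).or_insert(bytes);
        Ok(())
    }

    fn get(&self, id: &[u8; 32]) -> io::Result<Option<Vec<u8>>> {
        Ok(self.objects.get(id).cloned())
    }
}

/// A caller written against the trait only. It works the same whether the
/// backends are in-memory, filesystem, or something remote.
pub fn copy_object(
    src: &dyn ObjectStore,
    dst: &mut dyn ObjectStore,
    id: [u8; 32],
) -> io::Result<bool> {
    match src.get(&id)? {
        Some(bytes) => {
            dst.put(id, bytes)?;
            Ok(true)
        }
        None => Ok(false),
    }
}
```

Because `copy_object` never names a concrete type, swapping the backend is a change at the construction site, not at the call site.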
Agent checkout pattern
An agent checkout becomes isolated via a .heddle file that points to the shared object store, while the
checkout has its own HEAD. Two agents working on the same
repo see the same objects (zero duplication on disk) but
each has its own working state to capture against. The .heddle/agents/ directory in the shared store
holds lightweight TOML session records linked to threads —
that's how the orchestration surface (heddle agent
reserve, etc.) knows which agent is doing what.
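The two layouts can be told apart by what sits at `root/.heddle`. The sketch below assumes, purely for illustration, that the pointer file holds the shared store's path as plain text; the real file format is not described on this page. `Checkout` and `resolve_checkout` are hypothetical names.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

#[derive(Debug, PartialEq)]
pub enum Checkout {
    /// `.heddle` is a directory: heddle_dir == root/.heddle.
    Standard { heddle_dir: PathBuf },
    /// `.heddle` is a file pointing at a shared object store.
    Agent { shared_store: PathBuf },
}

pub fn resolve_checkout(root: &Path) -> io::Result<Checkout> {
    let dot = root.join(".heddle");
    if fs::metadata(&dot)?.is_dir() {
        return Ok(Checkout::Standard { heddle_dir: dot });
    }
    // Assumption: the pointer file contains a single path, possibly with a
    // trailing newline.
    let raw = fs::read_to_string(&dot)?;
    let target = raw.trim();
    if target.is_empty() {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "empty .heddle pointer file",
        ));
    }
    Ok(Checkout::Agent {
        shared_store: PathBuf::from(target),
    })
}
```

Two agent checkouts resolving to the same `shared_store` is what gives zero duplication on disk; where each checkout keeps its own HEAD is left out because this page does not specify it.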
Append-only oplog with scoped undo
Every action that changes repository state writes an OpRecord to the oplog in crates/oplog. The oplog is append-only and
chained — each entry has a checksum that depends on the
previous one. heddle fsck validates the chain;
repair rebuilds materialized views from oplog replay.
Undo and redo are scoped: oplog entries carry the checkout's HEAD-path scope, and undo/redo selects batches only from that scope. Two agents in two checkouts can safely undo independent work without trampling each other. See the conceptual page on the oplog.
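The chaining and scoping ideas fit in a few lines. This is a sketch under stated assumptions, not the code in crates/oplog: the `OpRecord` fields, the `Oplog` type, and the method names are hypothetical, and std's `DefaultHasher` stands in for whatever checksum the real oplog uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

#[derive(Debug, Clone)]
pub struct OpRecord {
    pub scope: String,  // HEAD-path scope of the checkout that wrote the entry
    pub action: String, // what changed (free-form here)
    pub checksum: u64,  // depends on the previous entry's checksum
}

/// Checksum of an entry, chained to the checksum before it.
fn chain(prev: u64, scope: &str, action: &str) -> u64 {
    let mut h = DefaultHasher::new();
    h.write_u64(prev);
    h.write(scope.as_bytes());
    h.write_u8(0); // separator so ("ab", "c") and ("a", "bc") differ
    h.write(action.as_bytes());
    h.finish()
}

#[derive(Default)]
pub struct Oplog {
    pub entries: Vec<OpRecord>,
}

impl Oplog {
    /// Append-only: new entries go on the end, existing ones are not rewritten.
    pub fn append(&mut self, scope: &str, action: &str) {
        let prev = self.entries.last().map_or(0, |e| e.checksum);
        self.entries.push(OpRecord {
            scope: scope.to_string(),
            action: action.to_string(),
            checksum: chain(prev, scope, action),
        });
    }

    /// fsck-style validation: index of the first entry whose checksum does
    /// not match a recomputation, or None if the chain is intact.
    pub fn first_broken(&self) -> Option<usize> {
        let mut prev = 0;
        for (i, e) in self.entries.iter().enumerate() {
            if chain(prev, &e.scope, &e.action) != e.checksum {
                return Some(i);
            }
            prev = e.checksum;
        }
        None
    }

    /// Scoped undo selection: the most recent entry written from `scope`,
    /// ignoring entries from every other checkout.
    pub fn undo_candidate(&self, scope: &str) -> Option<&OpRecord> {
        self.entries.iter().rev().find(|e| e.scope == scope)
    }
}
```

With interleaved entries from two checkouts, `undo_candidate` for one scope never returns the other's work, which is the property that lets two agents undo independently.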
Packfiles and delta compression
Loose objects can be packed via heddle gc into
packfiles with varint-encoded sizes and Git-style delta
compression. The pack builder uses a sliding window
(default W=10) to search for good delta bases. The FsStore reads from packfiles before falling
back to loose objects, so packed and loose forms are
transparent to callers.
A typical repo sees 50–70% space savings after heddle gc --aggressive; the cost is a one-time
rewrite plus the loss of trivially-mounted loose objects.
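Two of the building blocks named above, sketched under assumptions. The varint here is the common LEB128-style encoding (7 bits per byte, high bit means "more follows"); Heddle's exact wire format is not stated on this page. The window search uses shared-prefix length as a crude similarity score, whereas a real pack builder would compare actual delta sizes. All function names are hypothetical.

```rust
/// Encode `n` as a little-endian base-128 varint.
pub fn encode_varint(mut n: u64, out: &mut Vec<u8>) {
    loop {
        let byte = (n & 0x7f) as u8;
        n >>= 7;
        if n == 0 {
            out.push(byte); // last byte: continuation bit clear
            return;
        }
        out.push(byte | 0x80); // more bytes follow
    }
}

/// Decode a varint from the front of `input`.
/// Returns the value and the number of bytes consumed.
pub fn decode_varint(input: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    for (i, &b) in input.iter().enumerate() {
        let shift = 7 * i as u32;
        if shift >= 64 {
            return None; // too long to fit in a u64
        }
        value |= u64::from(b & 0x7f) << shift;
        if b & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None // ran out of input before the final byte
}

fn common_prefix(a: &[u8], b: &[u8]) -> usize {
    a.iter().zip(b).take_while(|(x, y)| x == y).count()
}

/// Sliding-window base selection: for object `i`, look only at the `window`
/// objects before it and pick the most similar one, if any is similar at all.
pub fn pick_base(objects: &[&[u8]], i: usize, window: usize) -> Option<usize> {
    let start = i.saturating_sub(window);
    (start..i)
        .map(|j| (j, common_prefix(objects[j], objects[i])))
        .filter(|&(_, score)| score > 0)
        .max_by_key(|&(_, score)| score)
        .map(|(j, _)| j)
}
```

The window bounds the search cost per object to W comparisons, which is why a wider window trades pack time for a better chance of finding a close base.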
Hosted: data plane + control plane
The hosted server lives in a separate closed workspace and isn't part of this docs surface. It splits into two layers. The data plane handles object access, refs, and oplog — same interfaces as the local Repository, backed by S3 + Postgres instead of the filesystem. The control plane handles namespaces, repositories, grants, and admin APIs; Postgres is the durable metadata source.
A self-hosted user never touches the control plane. A hosted user gets it implicitly through the platform. The boundary between data and control is sharp on purpose — each layer can scale and evolve independently.
Threads, actors, sessions
Three concepts you'll see together in the code:
- Thread: the human-facing work context. A named record (task/biscuit-authz) that collects captures, retries, merges. See task threads.
- Actor: the active worker (human or agent) writing to the thread. Identity is read from HEDDLE_AGENT_* / HEDDLE_PRINCIPAL_* env vars at capture time.
- Session: the execution / provenance record. HEDDLE_SESSION_ID links captures that happened during the same agent run; the segment is a provider/model epoch within a session.
The design point: Heddle follows harnesses ambiently
instead of making users run tools through Heddle. Set the
env vars once; every capture inherits them. See heddle capture for the full env-var stack and override flags.
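As an illustration of ambient identity, the sketch below gathers provenance from an environment snapshot at capture time. It groups variables by the two prefixes named above rather than guessing individual variable names, which this page does not list. `Provenance` and `provenance_from` are hypothetical names, not Heddle's API.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Default, PartialEq)]
pub struct Provenance {
    /// Values of HEDDLE_AGENT_* variables, keyed by the part after the prefix.
    pub agent: BTreeMap<String, String>,
    /// Values of HEDDLE_PRINCIPAL_* variables, keyed the same way.
    pub principal: BTreeMap<String, String>,
    /// HEDDLE_SESSION_ID, if set.
    pub session_id: Option<String>,
}

/// Build provenance from any iterator of (name, value) pairs, so the same
/// function works on the real environment and on fixtures in tests.
pub fn provenance_from<I>(vars: I) -> Provenance
where
    I: IntoIterator<Item = (String, String)>,
{
    let mut p = Provenance::default();
    for (key, value) in vars {
        if key == "HEDDLE_SESSION_ID" {
            p.session_id = Some(value);
        } else if let Some(rest) = key.strip_prefix("HEDDLE_AGENT_") {
            p.agent.insert(rest.to_string(), value);
        } else if let Some(rest) = key.strip_prefix("HEDDLE_PRINCIPAL_") {
            p.principal.insert(rest.to_string(), value);
        }
        // Everything else in the environment is ignored.
    }
    p
}
```

At a real capture the call would be `provenance_from(std::env::vars())`; a harness that exports the variables once gets them attached to every capture without wrapping its tools.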
Formal specifications (Quint)
Core state machines — merge resolution, ref locking, agent
lifecycle, worktree, repository ops — are formally specified
in specs/quint/ using Quint. Rust property
tests in the same areas mirror the specs; CI runs both.
The pattern is "spec first when the invariants matter,"
not blanket coverage.
The web app
The SvelteKit app in web/ is not a browser
IDE. It's an emerging product surface for repository
inspection, namespace operations, change visibility, and
eventually compare/review. Some routes are fully API-backed
today; others are foundation surfaces with partial or
mock-backed UI. Status chips on the marketing pages —
SHIPPED / FOUNDATION / PLANNED — are the same vocabulary
used inside the codebase to talk about web-route maturity.
Next
- Crate tour — one paragraph per crate so you know where to look for what.
- Operating principles — the contract the CLI ships to.
- The repo on GitHub — clone it, build it, run the tests.