
Enabling Evolutionary Database Development: Database branching with Lakebase, the conclusion

Part 3: Jen's Team at Scale

by Pramod Sadalage and Kevin Hartman

The methodology described in Evolutionary Database Design and operationalized in Refactoring Databases: Evolutionary Database Design has been clear for twenty years. The seven practices, the catalog of 70+ named refactorings, the transition mechanics – all of it documented, peer-reviewed, taught.

That methodology reached CI/CD in 2010 with Continuous Delivery (Chapter 12: Managing Data). Migrations became first-class artifacts in the deployment pipeline. The discipline of database-changes-as-code reached the broader CI/CD movement. What CD didn't solve was per-pipeline isolation: pipelines could run migrations, but they still needed a target database, and that target was shared. Practice #4 – Everybody gets their own database instance – has stayed aspirational on most teams because true per-developer production-shaped databases cost time, money, and DBA cycles. The compensating layer that emerged to work around the gap (mock objects, shared staging environments, in-memory database substitutes, DBA ticket queues) became foundational methodology by default, not by design.

In 2026, copy-on-write database branching arrives in Databricks Lakebase. A one-second, zero-storage-at-creation branch of a terabyte-scale production database is now an O(1) operation. The constraint that kept Practice #4 aspirational has lifted.

This series describes what changes when the constraint lifts: not the methodology – that holds – but the practices that emerge for the first time, the team-scale governance that becomes automatic, the role evolution for the DBA, and the new substrate that agents share with their human counterparts.

Meet Jen

Jen is the developer character from Evolutionary Database Design. In that essay she implemented a database refactoring – splitting an inventory_code field into location_code, batch_number, and serial_number – as a routine user story, illustrating that DBAs and developers can collaborate, schemas can evolve in small increments, and migrations carry the change forward safely.

The series picks up with Jen twenty years later. The methodology she follows is the same one she followed in 2003. What's new is the substrate underneath her workflow: copy-on-write database branching, which makes the practices she has been reading about operationally real at production scale. Across the three parts of this series she is the same Jen at three scopes – her day (Part 1), her new playbook (Part 2), and her team (Part 3).

Part 3: Jen's Team at Scale

Part 1 walked Jen through one feature. Part 2 named the eleven-practice playbook her work follows in 2026. Part 3 takes the same playbook to a team of fifty developers, with agents creating branches alongside humans, and asks: what becomes structural at this scale?

Three things become load-bearing.

First, the tier topology, the long-running branches that represent each environment in the promotion path. At one developer, you had a feature branch and production. At fifty, you have a structured hierarchy with stable lanes and ephemeral lanes layered on top.

Second, the permission model, the framework that says who can do what to which branch. At one developer, you could trust convention. At fifty, with agents in the mix, the framework has to be designed once and enforced automatically.

Third, the role of the DBA. At one developer, the DBA was Jen's design partner on the PR. At fifty, the DBA is the platform engineer who designed the framework Jen and her teammates are operating inside.

This post covers each of those, then turns to the agents. Agents on the same capability is Practice #11. Agents are like junior developers: they produce code that runs, tests that pass, migrations that apply, and, without guidance, unmaintainable systems. Tests are how the team keeps them honest. The TDD playbook that comes next is how the team makes the tests come first.

Tiers as long-running branches, not separate instances

In the world before branching, an environment was an instance: a dedicated Postgres deployment for staging, another for UAT, another for performance testing, each provisioned, patched, masked, and synced separately. The compensating layer Part 2 named lived here too. Drift between environments was structural.

At team scale, the tier model collapses into long-running branches off the same Lakebase parent.

A branch is one of two things: a tier (long-living, a parent in the promotion hierarchy) or a feature (ephemeral, descends from a tier and gets cleaned up). A tier has a parent. The parent-of chain is the promotion hierarchy.

Fig 1: A simple layout of Main line and its branches

Fig 1 shows a simple hierarchy: main is production, and feature branches are taken whenever needed. This setup is generally useful for early prototyping or early-stage work with a very small team. Mature teams with more developers, more environments, or both need a layout like the one in Fig 2.

Fig 2: A layout with main line consisting of latest schema and reference data and all its various branches

Some enterprises need a release candidate (RC) that stays under development for some time and is promoted to production after successful testing. Fig 3 shows a layout that allows release candidates to be developed and later promoted to production, after which the release candidate branch can be cleaned up.

Fig 3: A layout using release candidate for development and testing

The names of the branches are arbitrary; what matters is the convention for how the parent-of links are set up. A policy that refuses transitions contradicting the parent chain can be implemented to prevent, for example, a feature branch merging directly into a tier that is not its parent.
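A minimal sketch of such a policy, assuming a hypothetical in-memory registry of branches: the `Hierarchy` class, the branch names, and `check_promotion` are illustrative only and are not part of Lakebase or the dev kit. It shows the one rule that matters here, which is that a branch may be promoted only into its declared parent.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Branch:
    name: str
    kind: str               # "tier" (long-living) or "feature" (ephemeral)
    parent: Optional[str]   # None only for the root tier


class PromotionError(Exception):
    """Raised when a requested promotion contradicts the parent-of chain."""


class Hierarchy:
    """Hypothetical registry of declared branches and their parent links."""

    def __init__(self) -> None:
        self._branches: Dict[str, Branch] = {}

    def declare(self, name: str, kind: str, parent: Optional[str] = None) -> Branch:
        if parent is not None and parent not in self._branches:
            raise ValueError(f"unknown parent: {parent}")
        if parent is not None and self._branches[parent].kind != "tier":
            raise ValueError("only a tier can be a parent")
        branch = Branch(name, kind, parent)
        self._branches[name] = branch
        return branch

    def check_promotion(self, source: str, target: str) -> None:
        """Allow a promotion only from a branch into its declared parent."""
        declared_parent = self._branches[source].parent
        if declared_parent != target:
            raise PromotionError(
                f"{source} -> {target} refused: parent of {source} is {declared_parent}"
            )


# Assumed layout for illustration: main <- staging <- dev <- feature
h = Hierarchy()
h.declare("main", "tier")
h.declare("staging", "tier", parent="main")
h.declare("dev", "tier", parent="staging")
h.declare("feature/split-inventory-code", "feature", parent="dev")

h.check_promotion("feature/split-inventory-code", "dev")  # follows the chain
try:
    h.check_promotion("feature/split-inventory-code", "main")
    direct_merge_refused = False
except PromotionError:
    direct_merge_refused = True
```

Because the check reads only the declared parent link, renaming tiers or adding a new one changes the data, not the rule.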

The policy definitions enable many benefits for pipeline management:

  • One pipeline definition, branch-aware. The pr.yml introduced in Part 2 runs against every PR; the merge.yml runs against every promotion. The same workflow covers features, tiers, and the transitions between them.
  • Promotion is merge, not redeploy. Shipping from staging to production is a git merge whose downstream effect is a Lakebase branch promotion. The migration applies once at each tier, validated at the prior tier first, just as code is validated in earlier stages.
  • No drift between "the test environment" and production. Every tier descends from the same parent. Schema diff between any two tiers is a real, computable thing: the schema is one chain of pages with divergence markers, not two installs of database software. Teams no longer have to manage a fleet of database servers that need to be patched and upgraded.
  • Rollback by repoint. A bad promotion is recovered by pointing the application at the pre-promotion snapshot of the tier. The snapshot itself is another branch, allowing teams to recover from faulty deployments.
  • Cost attribution by project_id, branch_id, endpoint_id. Unity Catalog captures metadata automatically. A SQL query against the audit and billing tables answers who created which branch, when it was created, and what it costs to keep running.
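The cost-attribution query can be sketched as follows. The table names `branch_audit` and `branch_billing`, their columns other than project_id, branch_id, and endpoint_id, and the sample rows are all hypothetical stand-ins, not the actual system table schema; Python's built-in `sqlite3` is used only so the SQL can run on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE branch_audit (      -- hypothetical audit table
    project_id TEXT, branch_id TEXT, created_by TEXT, created_at TEXT
);
CREATE TABLE branch_billing (    -- hypothetical billing table
    project_id TEXT, branch_id TEXT, endpoint_id TEXT, cost_usd REAL
);
INSERT INTO branch_audit VALUES
    ('p1', 'feature-a', 'jen',     '2026-01-05'),
    ('p1', 'feature-b', 'agent-7', '2026-01-06');
INSERT INTO branch_billing VALUES
    ('p1', 'feature-a', 'ep-1', 1.50),
    ('p1', 'feature-a', 'ep-1', 0.50),
    ('p1', 'feature-b', 'ep-2', 0.25);
""")

# Who created which branch, when, and what it has cost so far.
rows = conn.execute("""
    SELECT a.branch_id, a.created_by, a.created_at, SUM(b.cost_usd) AS total_cost
    FROM branch_audit AS a
    JOIN branch_billing AS b
      ON a.project_id = b.project_id AND a.branch_id = b.branch_id
    GROUP BY a.branch_id, a.created_by, a.created_at
    ORDER BY a.branch_id
""").fetchall()
```

The join key is the pair (project_id, branch_id); adding endpoint_id to the GROUP BY would break the cost down per endpoint instead of per branch.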

The number of database instances also drops sharply. A six-environment world (prod, staging, UAT, QA, perf, demo) collapses to one Lakebase parent with a parent-linked hierarchy of long-running branches. The instance the team used to provision, monitor, and patch is now a logical branch with the same data shape as production, governed by the same policies as production, that resets to production state in one second when needed.

Different conventions allow you to create many different types of branches as parents. A common convention is to maintain a branch that holds the database schema and any reference data, so that anyone can branch from it and populate it with test data, or run automated tests that create real data, without conflicting with staging or other branches.

What to do now: the permission model

Practice #10 in Part 2's playbook is governance designed once, inherited per branch. Here is how it is implemented.

The design work is not runtime gating. It is a structural design that common tasks can then enforce.

The decisions to make now, before the team scales or agents are added:

  • Creating branches off each tier. Forking off production is a different permission from forking off staging or forking off a feature. The default should be: features fork off the entry (bottom) tier, never off production. Production forks are reserved for hotfix and recovery flows.
  • Promotion between tiers. A “feature to staging” promotion is a code review concern. A “staging to production” promotion is a release coordination concern. The two are separate gates with separate reviewers, which lets business and development teams work independently.
  • Read vs. write. Read access to production-shaped data on a branch is not the same permission as write access to that branch. Many engineering roles need the read; few need the write.
  • Unity Catalog policies. Unity Catalog policies like masking, row filters, and column-level permissions hold on production. Those policies are inherited on every descendant branch by default; tier-specific exceptions (for example a QA tier with synthesized PII for load testing) are declared once.
  • Audit trail captures. Every branch creation, every promotion, every migration application, every access pattern, in a single queryable place.

The principle that makes this work at team scale: roles declare; the policy enforces. The platform engineer declares the tier hierarchy, the permission model, the promotion paths, and the Unity Catalog policy posture. The policy refuses a transition that contradicts what was declared. There is no place where a human or an agent can override a declared boundary by retrying the operation in a different shape.
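As an illustration of "roles declare; the policy enforces", here is a minimal default-deny sketch. The role names, action names, and the `PermissionModel` class are assumptions made for the example; the real enforcement lives in the platform, not in application code like this.

```python
from dataclasses import dataclass, field
from typing import Set, Tuple


@dataclass
class PermissionModel:
    """Hypothetical declared model: (role, action, tier) triples."""

    grants: Set[Tuple[str, str, str]] = field(default_factory=set)

    def declare(self, role: str, action: str, tier: str) -> None:
        self.grants.add((role, action, tier))

    def is_allowed(self, role: str, action: str, tier: str) -> bool:
        # Default deny: anything that was not declared is refused.
        return (role, action, tier) in self.grants


model = PermissionModel()
model.declare("developer", "fork", "dev")                  # features fork off the entry tier
model.declare("developer", "read", "staging")              # read is granted, write is not
model.declare("agent", "fork", "dev")
model.declare("release-manager", "promote", "staging")     # staging -> production gate
model.declare("platform-engineer", "fork", "production")   # hotfix and recovery only
```

There is deliberately no override path: a request either matches a declared triple or is refused, so retrying "in a different shape" gets the same answer.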

This is the work to do today, before the team is fifty developers and agents are creating branches faster than any human could review them. The framework is the thing that holds the team together with shared conventions and guardrails. Everything else operates inside it.

The new role: from DBA to platform engineer

Twenty years ago, the 2003 Evolutionary Database Design essay closed with the following:

“Using the techniques we describe here may sound like it is a lot of work, but in fact it doesn't require a huge amount of people. On many projects we have had thirty-odd developers and a team size (including QA, analysts and management) of close to a hundred. On any given day we would have a hundred or so copies of various schemas out on people's workstations. Yet all this activity needed only one full time DBA with a couple of developers understanding the workings of the process and workflow.”

That argument carries forward into 2026 with five reinforcements.

1. The ratio holds, with more headroom per DBA. One full-time DBA per ~100 people, with the same hundred-plus concurrent branches in flight, carries less cost per branch because branch creation is now a one-second metadata operation. The ratio isn't the story. What the DBA is doing with the hours is.

2. The work shifts up the stack. Hours that went to infrastructure provisioning, schema provisioning, access control, and occasional manual intervention in 2003 now move to branching policy design, masking policies, promotion workflows, and audit trail observability. The concrete artifacts: schema-diff bots that post on every PR, scheduled jobs that reset development branches nightly, observability dashboards tracking branch lifecycle and TTL compliance, CI definitions that gate merges on schema validation. This is platform design work, of a much higher order than before.

3. Agents enter the equation. Something the 2003 essay didn't have to deal with was agents writing code. Neon reports about half a million branches a day, with over 80% of them created by agents. One DBA cannot ticket-gate that volume. Evolving the role to platform architect is the only approach that works at agent scale.

4. The numbers get concrete. A six-developer team typically generates 30+ operational tickets per sprint in the old model (provisioning, schema reviews, data refreshes, access grants). In the branch-native model: under 5 high-value policy reviews per sprint. The DBA toil drops from 20+ hours per week to under 5 and MTTR drops from 4+ hours to under 30 minutes. This reduction in toil can help the DBA pair with developers to arrive at optimal solutions for the features being developed.

5. The audit trail becomes a strategic dashboard. What used to require cross-referencing three services and three query languages is now a single SQL query against the platform's system tables. Every branch, every action, every cost, every governance event in one place. The DBA isn't manually building this view; the platform provides it.

In the foreword to Refactoring Databases (2006), Martin Fowler hoped the book represented "only a first step" toward tools that would automate database refactoring the way IDEs automate code refactoring. Branching is that next step. What Fowler hoped for in 2006, disciplined database change at the speed of code, is what the platform now provides. The DBA designs the discipline; the platform applies it.

The new role title varies (platform engineer, database platform owner, DBA in name still). The substance is the same: the person who designs the framework everyone else operates within.

Agents on the same capability

Practice #11 in Part 2 described the coding agent as a practitioner with the same branching capability. Here is what that means at team scale.

Agents get access to branches, not production. The same workflow rules that apply to Jen apply to the agent. Tests run against a real database on a branch, not against mocks an agent could modify or delete. Schema diffs land on every PR, regardless of who authored the migration. The policies that protect Jen protect the agent.

But the policies alone are not enough. Agents, left undirected, are like junior developers.

A junior developer, given a feature ticket and no further guidance, can produce code that compiles, tests that pass, and a migration script that applies cleanly. The code might also duplicate logic that already lives elsewhere in the codebase. The migration might add a column with the right name and the wrong type. The test might pass because it exercises only the happy path. None of these failures show up in the green CI run; they show up six weeks later when somebody else has to extend the work.

Agents do the same thing but much faster and at higher volume.

Without explicit guidance, an agent will:

  • Reinvent a pattern that the codebase already has.
  • Author a schema change that looks right but skips the named refactoring transition mechanics (e.g., drops a column without first moving the data, or adds a NOT NULL column without updating existing rows).
  • Write tests that pass against the data shape it imagined, not the data shape that exists in production.
  • Author migrations that apply but produce inconsistent state on rollback.
  • Layer abstractions on top of abstractions to satisfy a small change.

The substrate makes the green bar honest (no mocks; real database on a branch). What it doesn't do is make the code maintainable.

The team makes the code maintainable through four reinforcements:

  • Guardrails: the permission model. Agents cannot create branches off production, cannot promote between tiers, cannot apply migrations to a tier they don't own. The substrate refuses.
  • Patterns: the named refactorings. The 2006 catalog at databaserefactoring.com names 70+ refactorings with explicit transition mechanics. An agent guided to "apply the Split Column refactoring" produces a different migration than an agent guided to "split this column."
  • Workflow: the source-control-management (SCM) state machine. Agents follow a sequence of states with blocking gates between them. The substrate refuses transitions that don't satisfy the declared contract.
  • Review: humans on the PR loop. Schema-diff visible on every PR, with the DBA on the review path. Practice #1 re-cast from Part 2 made this asynchronous; at team scale, with agents in the mix, that asynchronous review is what catches the slippage tests didn't catch.

The SCM workflow is the load-bearing one. In the Lakebase App Dev Kit, source control management covers more than the code branch: it covers the paired branching (the code branch and the Lakebase branch managed as a single unit, as Part 1 introduced), the migration that travels with the branch, the CI gates, and the merge that applies the migration to the parent tier. Paired branching is provided as a feature of the kit's common substrate and enforces common guardrails, such as refusing merges that contradict the hierarchy. The dev kit ships this SCM workflow as an executable state machine.

Fig 4: The various states in an SCM workflow.

Fig 4 shows the five states of the development workflow: scaffold-complete, feature-claimed, pr-ready, ci-green, merged. Each transition between states is driven by a CLI command (lakebase-scm-claim-feature-branch, lakebase-scm-prepare-pr, lakebase-scm-wait-ci, lakebase-scm-merge). Each CLI command validates preconditions before doing work, performs the transition, and writes the new state to .lakebase/workflow-state.json (a schema-validated gate surface). A failed gate leaves the machine recoverable at the prior state. The gates are blocking, not advisory.

Agents call these CLIs; they cannot invent a parallel path. The substrate refuses to advance the state machine on a precondition failure: a feature branch parented on the wrong tier is rejected; an attempt to merge before CI is green is refused; an inconsistent state file blocks the next gate. The handoff contracts are owned by the scrum-master role; the substrate enforces them. Structural decisions (the tier hierarchy, the source tier for a feature, the promotion path) belong to the architect or scrum-master, are recorded, and are then honored by the substrate. The substrate never invents a tier or a parent; it honors what was declared and refuses transitions that contradict it.
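The blocking-gate behavior can be sketched in a few lines. This is not the dev kit's implementation: the `advance` function, the `{"state": ...}` file layout, and the `precondition_ok` flag are assumptions for illustration. Only the five state names and the idea of a state file on disk come from the text.

```python
import json
import tempfile
from pathlib import Path

STATES = ["scaffold-complete", "feature-claimed", "pr-ready", "ci-green", "merged"]


class GateRefused(Exception):
    """A precondition failed; the state on disk is left untouched."""


def read_state(path: Path) -> str:
    state = json.loads(path.read_text())["state"]
    if state not in STATES:
        raise GateRefused(f"inconsistent state file: {state!r}")
    return state


def advance(path: Path, target: str, precondition_ok: bool) -> str:
    """Move to `target` only if it is the next state and its gate passes."""
    current = read_state(path)
    nxt = STATES.index(current) + 1
    if nxt >= len(STATES) or STATES[nxt] != target:
        raise GateRefused(f"{current} -> {target} is not a declared transition")
    if not precondition_ok:
        raise GateRefused(f"precondition for {target} failed; staying at {current}")
    path.write_text(json.dumps({"state": target}))  # written only after the gate passes
    return target


# Stand-in for .lakebase/workflow-state.json, kept in a temp directory.
state_file = Path(tempfile.mkdtemp()) / "workflow-state.json"
state_file.write_text(json.dumps({"state": "scaffold-complete"}))

advance(state_file, "feature-claimed", precondition_ok=True)
advance(state_file, "pr-ready", precondition_ok=True)
try:
    advance(state_file, "merged", precondition_ok=True)    # tries to skip ci-green
except GateRefused:
    pass
try:
    advance(state_file, "ci-green", precondition_ok=False)  # CI is not green yet
except GateRefused:
    pass
final_state = read_state(state_file)
```

Both refused attempts leave the file at pr-ready, which is what "recoverable at the prior state" means in practice: the caller fixes the precondition and calls the same gate again.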

This is the framework that breaks how teams have been using agents so far. The naive integration treats an agent as a senior engineer in a chat window: dump context, ask for output, iterate. That works at one-developer scale but breaks at team scale, because the agent's "context" cannot be reviewed, governed, or replayed. Treat the agent as a junior developer instead: give it a narrow, documented job inside an executable state machine, validate its output against a schema, advance the gate, repeat. The substrate enforces; the workflow-state file is the API.

Playbook entries for the team-scale practices

Part 2 introduced Practices #10 and #11 under Emerging practices for 2026. Their playbook entries follow.

Practice #10: Governance designed once, inherited per branch

Rule. The permission model, the Unity Catalog policies that manage access control, and audit-trail capture are designed once on the trunk and inherited automatically by every descendant branch.

Why is this a durable habit now? At team scale, governance has to be a function of the DBA or platform engineer, not a discipline a developer remembers. Branches are created and destroyed in seconds; per-branch manual governance configuration would consume the time savings produced by branching.

Mechanics:

  • Declare the tier hierarchy: which long-running branches exist, what their parent links are, what governance posture each tier carries.
  • Declare the permission boundaries: who can create branches off each tier, who can promote between tiers, who can read vs. write.
  • Declare the Unity Catalog policy inheritance: masking, row filters, and column-level permissions inherit from the parent by default; tier-specific exceptions are declared once. Auto-propagation across all Unity Catalog policy types is still landing; design the framework for the destination state.
  • Declare the audit trail capture: every branch creation, every promotion, every migration application, every access pattern lands in queryable system tables automatically.
  • The DBA or platform engineer enforces through policy. A transition that contradicts the declared model is refused.
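The inheritance rule in the mechanics above can be sketched as a walk up the parent chain. The branch names, policy keys, and the `effective_policy` helper are hypothetical; the sketch assumes that a declaration nearer the branch overrides an inherited one, which is one reasonable reading of "tier-specific exceptions are declared once".

```python
from typing import Dict, List, Optional

# Hypothetical declarations: parent links and per-tier policy settings.
PARENT: Dict[str, Optional[str]] = {
    "production": None,
    "staging": "production",
    "qa": "staging",
    "feature/jen-123": "qa",
}
DECLARED: Dict[str, Dict[str, str]] = {
    "production": {"pii_masking": "mask", "row_filter": "by_region"},
    "qa": {"pii_masking": "synthetic"},  # tier-specific exception, declared once
}


def effective_policy(branch: str) -> Dict[str, str]:
    """Resolve a branch's policies by walking up the parent chain.

    Declarations nearer the branch win; everything else is inherited.
    """
    chain: List[str] = []
    node: Optional[str] = branch
    while node is not None:
        chain.append(node)
        node = PARENT[node]
    policy: Dict[str, str] = {}
    for name in reversed(chain):  # apply root first, the branch itself last
        policy.update(DECLARED.get(name, {}))
    return policy
```

A feature forked from the QA tier picks up the QA exception and the production row filter without any per-branch configuration, which is the point of declaring once.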

Anti-pattern. Configuring governance per branch at runtime. The point of declaring once is that the framework holds even when branches are created faster than a human can review them. Manual per-branch configuration recreates the bottleneck branching just removed.

Where Jen's team extends. Jen's platform engineer or DBA declared the tier hierarchy at project creation. Every branch Jen, her teammates, or the team's agents create inherits the declared masking, permissions, and audit configuration. When the team adds a new tier, the framework records the new parent link; in-flight features keep their original parent (no retroactive re-parenting); new features fork from the new entry tier.

Practice #11: Agent-as-practitioner with the same branching capability

Rule. Agents operate inside the SCM workflow's executable state machine: five states, blocking gates between them, schema-validated state files. The same workflow rules govern Jen and the agent; a common substrate enforces them regardless of who is acting.

Why is this a durable habit now? Branch creation is a metadata operation, so agent-driven volume is feasible. The substrate developed for agents to leverage can refuse transitions that contradict the declared tier hierarchy or the recorded gate state. There is no chat-window context to fall back on; only the artifact on disk (workflow-state.json) crosses the boundary between gate transitions.

Mechanics:

  • The SCM state machine has five states: scaffold-complete, feature-claimed, pr-ready, ci-green, merged. Each transition is driven by a CLI; each CLI validates preconditions before doing work.
  • The gate surface is .lakebase/workflow-state.json, validated against scm-workflow-state.schema.json. Every transition writes the new state and the invariants the next gate needs.
  • Structural decisions (tier hierarchy, source tier per feature, promotion path, handoff contracts) belong to roles (architect or scrum-master), are recorded, and are then enforced by the substrate.
  • Agents call the CLIs. The substrate enforces; agents cannot route around. A failed gate leaves the state machine recoverable at the prior state; the agent does not "retry in a different shape."
  • The CI gates that run inside pr-ready to ci-green exercise real Postgres on a branch, with schema diff posted on the PR. Real database state is what the agent's work is measured against.

Anti-pattern. Treating an agent as a senior engineer in a chat window. “Dump context and ask for output” works at single-developer scale but breaks at team scale because the "context" cannot be reviewed, governed, or replayed. Use the artifact-as-API model instead: agents READ workflow-state.json and the documented inputs for their phase; they WRITE the documented outputs; the validators check; the next gate fires only when the contract holds.

Where Jen's team extends. Every agent on Jen's team operates inside the same five-state machine that Jen and her teammates do. The scrum-master role owns the handoff contracts; the substrate refuses transitions that don't satisfy them. An agent cannot ship a feature that was forked from the wrong tier; an agent cannot merge before CI is green; an agent cannot bypass the schema-diff artifact. The framework holds regardless of who or what initiated the action.

TDD as an opt-in layer woven on top

Practice #11 establishes the SCM workflow as the baseline: every kit consumer follows it, agents and humans alike. TDD is a separate consideration that layers on top of that baseline for teams that want test-first discipline. It is opt-in; the SCM gates are mandatory regardless of path.

Why tests matter, even before TDD: when agents author code, tests are the only enforcement that scales. Kent Beck, in his 2026 Pragmatic Engineer interview, named the failure mode publicly: he's having trouble stopping AI agents from deleting tests in order to make them pass. The green bar is cheap to satisfy when nothing in the loop forces the agent to confront the real shape of the system. Mocks make this trivial. In-memory substitutes do too.

Branching makes the green bar honest at the data layer. A real database on a real branch is what the agent is testing against: schema constraints reject row inserts that don't match, foreign keys reject orphans, the real data shape exposes assumptions a mock would have absorbed silently, and the agent cannot drop tables. The cost of faking compliance rises with these guardrails.

But the substrate is necessary, not sufficient. Tests have to come from somewhere. If the agent writes them, the agent can also delete them. This is the gap TDD fills.

The TDD workflow layers on top of the SCM workflow. It fires between the SCM states feature-claimed and pr-ready; it calls down into SCM for branch operations (cycle experiment branches use the SCM primitive underneath); it does not call up into SCM. The dependency is one-way. Teams that don't want the TDD layer can ship features by raw editing on the feature branch and still satisfy every SCM gate.

The Lakebase App Dev Kit ships the TDD workflow as a second state machine with its own per-role agents and gate validators:

  • Spec-author turns a requester narrative into a structured feature artifact, schema validated.
  • Architect-reviewer maps the feature's non-functional requirements and architecture principles to architectural decisions, output as architecture.json plus prose.
  • Test-strategist produces the test list and per-acceptance-criterion scenarios, output as test-list.json plus rendered markdown. Every NFR has at least one acceptance criterion (AC); every AC has a scenario.
  • Scrum-master orchestrates the build cycles. Each cycle forks an experiment branch (using the SCM substrate underneath), runs a driver agent to implement the next AC, runs a navigator agent to review, and validates the outcome.
  • Driver and navigator are the inner loop test-writer and code-writer, paired, RED-GREEN-REFACTOR.

Each role has documented inputs and outputs, validated against a schema. Each agent gets only its documented inputs; outputs are validated before the next role runs. The artifact is the API between roles; the schema is the type check. A missing artifact is treated like a failed gate. A malformed artifact is treated like a missing one. The TDD layer borrows the same artifact-as-API model that Practice #11 established for SCM.
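A minimal sketch of the artifact-as-API check, under stated assumptions: the `REQUIRED` field list and the `gate_passes` helper are hypothetical, and the kit's real validators check against its own schemas rather than this hand-rolled type test. The sketch shows only the two rules named above, that a missing artifact fails the gate and a malformed one is treated the same way.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict

# Hypothetical required fields for a test-list artifact.
REQUIRED: Dict[str, type] = {"feature_id": str, "acceptance_criteria": list}


def gate_passes(artifact: Path, required: Dict[str, type]) -> bool:
    """Return True only if the artifact exists, parses, and has typed fields."""
    if not artifact.exists():
        return False  # missing artifact: failed gate
    try:
        data = json.loads(artifact.read_text())
    except json.JSONDecodeError:
        return False  # malformed artifact: treated like a missing one
    if not isinstance(data, dict):
        return False
    return all(isinstance(data.get(key), kind) for key, kind in required.items())


workdir = Path(tempfile.mkdtemp())
good = workdir / "test-list.json"
good.write_text(json.dumps({"feature_id": "F-1", "acceptance_criteria": ["AC-1"]}))
bad = workdir / "broken.json"
bad.write_text("{not json")
```

The next role runs only when `gate_passes` returns True for every input it documents, so nothing from the previous agent's conversation needs to cross the boundary.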

The full playbook for the TDD layer lives in the Companion: Lakebase App Dev Kit (open-source, with a companion e-book for human practitioners). The SCM and TDD state machines, the per-role agent contracts, the artifact conformance checks, and the gate validators all ship as CLIs that any orchestrator (the kit, the IDE extension, a CI job, a human shell session) can call.

The short version: SCM is the baseline (Practice #11). TDD is a layer on top. Branching makes tests honest; TDD makes tests come first; the kit makes both workflows executable.

What Jen's Team Shows

Part 1 walked Jen through one feature: she paired a code branch with a Lakebase branch, ran a real migration against production shaped data in seconds, tested without mocks, opened a PR with the schema diff posted inline for review, and merged with the migration applied and the ephemeral branches cleaned up. Database change became part of normal development.

Part 2 named the playbook: the seven practices from 2003, the limitations that kept five of them aspirational until 2026, the same seven re-cast once branching landed, plus two new practices that the technology enables for the individual developer. Nine practices in the day-to-day, two more emerging at team scale.

Part 3 took the playbook to the team. It defined the tier topology, describing how long-running branches reside inside one Lakebase parent and how the permission model becomes the platform engineer's design artifact, declared once and enforced by the substrate (Practice #10). It showed how the DBA's role completes its evolution to platform architect, with five reinforcements of the 2003 staffing argument. Agents enter the workflow on the same capability, inside the SCM workflow's executable state machine, with the substrate enforcing the gates regardless of who or what is acting (Practice #11). TDD is an opt-in layer woven on top: tests-first discipline with dedicated roles, gates, and artifact contracts, for teams that want it.

The Companion: Plugin Walkthrough covers the Lakebase SCM Extension for VS Code and Cursor end-to-end.

The Companion: Lakebase App Dev Kit, with a companion e-book for human practitioners, covers the TDD workflow above: the SCM and TDD state machines, the per-role agents, the gate validators, and the artifact contracts that make agents safe to put on the team.

The methodology was clear for twenty years. The technical capability landed in 2026. The playbook for both human and agent practitioners is now operational. Jen's team is fifty developers and a fleet of agents; the workflow is the same.

Conclusion: The capability to branch a database gives the development team immense flexibility: to provision databases, build tests against the real schema, run CI for each PR against its own database, and let agents work the same way, all with the governance framework of Unity Catalog enforcing policies.

 
