
How agentic software development will change databases

What agents actually need from database infrastructure, and what we learned building Lakebase

AI agents now create roughly 4x more databases than human users

Published: March 30, 2026

Announcements · 6 min read

In our previous blog, we introduced Lakebase, the third-generation database architecture that fundamentally separates storage and compute. In this blog, we explore a critical consequence of this shift: how are AI agents changing the software development lifecycle, and what kind of databases do AI agents actually need?

The software development lifecycle is undergoing a radical transformation. LLMs have enabled a new generation of agentic frameworks that can analyze requirements, write code, execute tests, deploy services, and iteratively refine applications, all at record speed. As a result, the marginal cost of building and deploying applications is plummeting.

Even though we are still at the early stages of agentic software development, we have consistently observed both within Databricks and among our customer base that the rate of experimentation is accelerating and the sheer volume of applications being built is exploding. As the world transitions from handcrafted software to agentic software development, we identify three emergent trends that will jointly redefine the requirements of modern database systems:

  1. Software development will shift from a conventional slow and linear process to a rapid evolutionary process.
  2. Software will become more valuable overall, but the value of each individual application will plummet as the marginal cost of developing software falls. We therefore need infrastructure that supports software development at minimal marginal cost. Crucially, the architecture must also account for the fact that any one of these small, ephemeral databases can become a production system with heavy traffic, making the ability to support seamless, elastic growth a fundamental architectural requirement.
  3. Open ecosystems will become a strict operational requirement, not just a preference.

Here is a deeper look at each of these trends and how Lakebase is uniquely architected to support them.

Rapid Evolutionary Software Development

Because a large part of the software development lifecycle was historically very costly (writing code, testing, operations), building and operating a new application required significant engineering investment. Consequently, traditional software development was optimized for careful planning and a relatively linear process.

Agents change this dynamic. Applications can now be generated, modified, and redeployed in minutes. Instead of building one carefully designed system, developers and agents increasingly explore large spaces of possible implementations. Development begins to resemble an evolutionary algorithm:

  1. Generate an initial version of an application.
  2. Rapidly create variants with different schemas, prompts, or logic.
  3. Evaluate the results.
  4. Continue development from the most successful versions.
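The loop above can be sketched in a few lines. This is a toy illustration of the evolutionary pattern, not anything Lakebase-specific: the `generate`, `mutate`, and `evaluate` callables are hypothetical stand-ins for an agent creating an application variant, modifying it, and scoring the result.

```python
import random

def evolve(generate, mutate, evaluate, generations=3, population=4, keep=2):
    """Minimal evolutionary development loop: generate variants,
    score them, and continue from the most successful survivors."""
    candidates = [generate() for _ in range(population)]
    for _ in range(generations):
        scored = sorted(candidates, key=evaluate, reverse=True)
        survivors = scored[:keep]
        # Each new generation branches off the best versions so far.
        candidates = survivors + [mutate(random.choice(survivors))
                                  for _ in range(population - keep)]
    return max(candidates, key=evaluate)

# Toy stand-ins: an "application" is just an integer; fitness is its value.
best = evolve(generate=lambda: random.randint(0, 10),
              mutate=lambda app: app + random.randint(-1, 3),
              evaluate=lambda app: app)
```

In a real agentic workflow, each candidate would carry both a code branch and a database branch, so that schema changes evolve alongside the logic that depends on them.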

Depending on the complexity, each evolutionary iteration can last from seconds to hours, which is 100x to 1000x faster than pre-LLM development cycles. In fact, our telemetry from Lakebase production environments shows that each database project has ~10 branches on average, and some databases have nested branches reaching depths of over 500 (i.e., 500 iterations in the evolution).

Code infrastructure such as Git already supports this workflow very well. Developers or agents can create a branch of the codebase with git checkout -b instantly. However, legacy database infrastructure offers no quick, cost-effective way to branch off the database state.

Lakebase is designed to support this agentic evolutionary workflow natively. Agents can create a branch of a production or test database instantly and at near-zero cost. Because Lakebase uses an O(1) metadata copy-on-write branching mechanism at the storage layer, no expensive physical data copying is required. You simply branch the data alongside the code and only pay for the database compute for the duration of the experiment.
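To make the copy-on-write idea concrete, here is a conceptual sketch of why metadata-only branching is O(1). This is an illustration of the general technique, not Lakebase's actual storage implementation: creating a branch records only a pointer to its parent, pages are duplicated lazily on first write, and reads walk up the branch chain.

```python
class BranchableStore:
    """Conceptual copy-on-write page store: a branch is just a metadata
    pointer to its parent, so creating one copies no data. A page is
    duplicated only when a branch first writes to it."""

    def __init__(self):
        self.pages = {"main": {}}    # branch -> {page_id: data} (local writes only)
        self.parent = {"main": None}  # branch -> parent branch

    def branch(self, name, source="main"):
        # O(1): record a pointer, copy nothing.
        self.pages[name] = {}
        self.parent[name] = source

    def write(self, branch, page_id, data):
        # Copy-on-write: only now does this page diverge from the parent.
        self.pages[branch][page_id] = data

    def read(self, branch, page_id):
        # Walk up the branch chain until some ancestor has the page.
        while branch is not None:
            if page_id in self.pages[branch]:
                return self.pages[branch][page_id]
            branch = self.parent[branch]
        return None

store = BranchableStore()
store.write("main", 1, "customers v1")
store.branch("experiment")                    # instant, no data copied
store.write("experiment", 1, "customers v2")  # only this page diverges
```

The branch shares every unmodified page with its parent, which is why an experiment costs nothing at creation time and only pays for the pages it actually changes.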

Cost Sensitivity

As mentioned earlier, although software will become more valuable overall, the value of each individual application will plummet as the marginal cost to develop software goes down. Many agent-generated services are small internal tools, prototypes, or narrow workflows. They may run only occasionally or serve highly bursty, event-driven workloads.

In this world, we need infrastructure that supports new software development at minimal marginal cost. Any database that imposes a baseline price floor of hundreds of dollars per month is impossible to justify when the application itself provides limited or experimental value. Our data shows that for about half of these agentic applications, the database compute lifetime is less than 10 seconds.

Traditional databases were designed as always-on infrastructure components with fixed provisioning and operational overhead. That model fits large, stable applications but fails economically when applications are numerous, ephemeral, and short-lived.

The serverless, elastic nature of Lakebase directly addresses this cost imperative. By fully decoupling the compute instances from the storage layer, Lakebase can automatically scale database compute based on the load in sub-second time. Crucially, it also scales the database down to zero when not utilized, completely eliminating the cost floor and achieving near-zero idle costs.
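A scale-to-zero policy can be sketched as a simple sizing function. This is a hypothetical toy model, not Lakebase's scheduler: `qps_per_unit` is an assumed throughput per compute unit, and the key property is that zero load yields zero provisioned capacity, eliminating the cost floor.

```python
def desired_capacity(load_qps, *, qps_per_unit=100, idle_scale_to_zero=True):
    """Toy autoscaling policy: size compute to the observed load and
    release everything when the database is idle (scale-to-zero)."""
    if load_qps <= 0:
        # No traffic: keep nothing warm, so idle cost is (near) zero.
        return 0 if idle_scale_to_zero else 1
    # Round up so provisioned capacity always covers the load.
    return -(-load_qps // qps_per_unit)

# Idle database costs nothing; a burst of 250 QPS needs 3 units.
idle_units = desired_capacity(0)
burst_units = desired_capacity(250)
```

In practice the hard part is the latency of the zero-to-one transition, which is why sub-second compute startup matters as much as the sizing rule itself.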

Growing From Small to Large

The nature of agent-driven development means that a huge volume of small, ephemeral databases are constantly being created for testing, prototyping, and narrow workflows. The crucial architectural challenge is that developers, and the agents themselves, cannot predict which of these nascent applications will suddenly take off and require massive production scale.

The database architecture must therefore inherently support seamless, elastic growth from a tiny, low-cost instance to a full-scale production system with heavy traffic. This transition must occur without requiring any manual re-platforming, provisioning, or complex migration steps from the user. The architecture alone should handle the evolution, making the ability to instantly scale from near-zero to massive capacity a fundamental requirement for a world where agentic exploration is the default development model.

Open Source Ecosystems

Agentic systems derive their capabilities from LLMs trained on extensive corpora of publicly available source code and technical documentation. This training bias gives them a deep, operational familiarity with open-source ecosystems, APIs, and error semantics.

Databases such as Postgres are deeply embedded in the open-source world. Their interfaces, behaviors, and error codes appear throughout the training data that modern models learn from. As a result, agents can generate queries, schemas, and integrations for them far more reliably. Proprietary databases face an inherent disadvantage because agents simply lack sufficient context to operate them effectively.

For agent-driven development, openness is no longer just a philosophical preference—it is a practical requirement for reliable automation. But this requirement must extend beyond just the query interface; it must reach the storage layer itself. While second-generation cloud databases might use open-source execution engines, they still lock your data in proprietary, internal storage formats.

Lakebase is built on Postgres, but takes openness a step further. It stores data in standard, open Postgres page formats directly in cloud object storage (the data lake). This allows agents, external analytical engines, and new tools to interact with the data natively, without ever being bottlenecked by a single, proprietary compute engine.
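What "open Postgres page formats" buys you is that any tool can interpret the bytes without going through a proprietary engine. As an illustration, the sketch below parses the documented 24-byte PageHeaderData that begins every standard Postgres heap page (layout from `src/include/storage/bufpage.h`); fetching the page bytes from object storage is elided, and little-endian byte order is assumed.

```python
import struct

PAGE_SIZE = 8192  # default Postgres block size

def parse_page_header(page: bytes) -> dict:
    """Parse the standard 24-byte PageHeaderData at the start of a
    Postgres page: pd_lsn (8B), pd_checksum, pd_flags, pd_lower,
    pd_upper, pd_special, pd_pagesize_version (2B each), pd_prune_xid (4B)."""
    (lsn_hi, lsn_lo, checksum, flags,
     lower, upper, special, pagesize_version, prune_xid) = struct.unpack_from(
        "<IIHHHHHHI", page, 0)
    return {
        "lsn": (lsn_hi << 32) | lsn_lo,
        "checksum": checksum,
        "pd_lower": lower,                          # end of line-pointer array
        "pd_upper": upper,                          # start of tuple data
        "page_size": pagesize_version & 0xFF00,     # size lives in the high byte
        "layout_version": pagesize_version & 0x00FF,
    }

# Demo: build a synthetic page whose header claims an 8 KB, layout-v4 page.
fake = struct.pack("<IIHHHHHHI", 0, 0x1000, 0, 0, 40, 8000, 8192, 8192 | 4, 0)
fake += b"\x00" * (PAGE_SIZE - len(fake))
hdr = parse_page_header(fake)
```

Because the format is public and stable, external analytical engines can read the same pages the transactional engine writes, which is the interoperability property the paragraph above describes.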

Databases for the Agentic Era

The shift is not hypothetical — it is already underway. In Databricks’s Lakebase service, AI agents now create roughly 4x more databases than human users.

This data point captures the trends described above. Agents are prolific creators of database environments — spinning up instances for experiments, branching for testing, and discarding them when done. The infrastructure serving these workloads must support this pattern economically and operationally.

Properties like cost efficiency, agility, and openness have always been desirable. But the rise of agentic software development has turned them from nice-to-haves into fundamental requirements. Databases that impose high cost floors, lack branching primitives, or lock data in proprietary formats will increasingly fall out of step with how software is being built.

This is precisely the design space of Lakebase. It was built for the specific economic and technical realities that AI-driven development creates: evolutionary branching at zero cost, true scale-to-zero elasticity, open Postgres storage on the lake, and self-managing operations. As agents increasingly participate in building and evolving software, the databases best suited for this new world are those designed for experimentation, openness, and elasticity from the ground up.
