Built on Databricks: Fueling Data and AI innovation in modern software products
May 26, 2023 in Data Strategy
The era of AI is upon us. Every product builder must ask themselves how to leverage new data and AI capabilities, or their product will not survive - even products as dominant as Google Search are being challenged. Traditional descriptive analytics are table stakes. Modern applications must incorporate real-time insights and AI-driven action to meet user expectations.
The cloud has enabled a dizzying array of data stack choices that significantly complicate the design and operations of software applications. Product builders that take the best-of-breed approach very quickly find themselves stitching together and managing multiple incompatible data silos. Developer productivity grinds to a halt, and data maintenance costs spiral out of control.
Top software companies - from startups like Abnormal Security to enterprises like Adobe - have built products on the Lakehouse. The Databricks Lakehouse combines the strengths of data lakes and traditional data warehouses. It unifies traditional analytics with modern AI and real-time capabilities so that product builders do not have to choose between the past and the future. This unified approach accelerates developer productivity, reduces costs, and enables leading-edge innovation.
To fuel this Data and AI revolution, Databricks is increasing our investments in companies that build on the Databricks Lakehouse platform. The Databricks for Startups and Built on Databricks programs provide financial, technical, and GTM investments to help product builders succeed.
Tale of too many databases
Product builders usually start with one data use case in mind, and they build quickly with one database.
But the data needs of an application grow continuously. It might start with embedded charts, then evolve to streaming alerts, and soon add generative AI. Before long, the product depends on a mosaic of multiple services - some open source, some native to the IaaS platform, and some proprietary. Each product feature takes longer to develop and manage, and innovation and productivity grind to a halt.
When building a new product, builders often make the mistake of kicking the can down the road, assuming they won't develop new data and AI capabilities until far down the roadmap. However, the database creep happens quickly for data-forward products. We often see startups requiring multiple data pipelines at a very early stage.
Building on the Databricks Lakehouse allows product builders to avoid this database creep.
Unlock productivity on a unified platform
The Lakehouse unifies data lake and data warehouse architectures by implementing data management characteristics of a data warehouse directly on a low-cost cloud data lake. As a result, applications built on the Lakehouse can access all data - structured, semi-structured, unstructured - and execute any processing the application needs - data engineering, BI, ML, and real-time streaming.
VIZIO adopted the Databricks Lakehouse Platform to unify the diverse needs of their data-as-a-service offering. VIZIO is the leading Smart TV manufacturer in America, harnessing data from TVs to power their platform business and create engaging customer experiences. Before Databricks Lakehouse, they had no single platform to run a data-as-a-service business at this scale. So they got creative by stitching together many data services and a data warehouse. As data volumes and the number of features grew, managing this system became prohibitively expensive and time-consuming. Furthermore, the data architecture limited their innovation potential. It would have been a massive undertaking to bolt a separate real-time streaming and production ML system on top of the data warehouse to support advanced new features.
Ultimately, Databricks was the only option that could handle ETL, monitoring, orchestration, streaming, ML, and data governance on a single platform. Not only was Databricks SQL + Delta able to run queries faster on real-world data (3x faster and 60% cheaper than any other data warehouse vendor), but VIZIO no longer needed to buy other services just to run the platform and add features in the future.
In short, Databricks Lakehouse enables building products with no compromise:
- Innovate at the leading edge - leverage leading-edge AI/ML and streaming capabilities (as well as traditional BI analytics).
- Move fast - develop new data and AI features rapidly as the product grows.
- Save costs - drastically reduce ETL and storage costs by running all use cases on a single source of truth.
- Improve efficiency - manage and collaborate on one platform, avoiding the complexity of multiple systems.
- Scale to your potential - deliver robust performance from gigabyte to petabyte scale.
Product builders free themselves from the constraints and complexities of specialized data stores and can innovate without limits.
Build on your Lakehouse, your customer's Lakehouse, or both
The Databricks Lakehouse platform enables a separation between the application and data processing, thereby offering flexible options to product builders. The Lakehouse processes data as a microservice that serves the application through APIs, connectors, and Delta Sharing.
When building a product, you can choose to build data processing on your Lakehouse, on your customer's Lakehouse, or share processing between both environments. Databricks Lakehouse empowers you to build the best architecture that meets customers' needs.
Hunters chose to build the Hunters SOC Platform on their customers' Lakehouse instances. Hunters SOC Platform is a modern SIEM alternative that transforms the customer's Lakehouse into a security data lake. It ingests all security-related data and performs the ETL into the customer's Databricks Lakehouse using the customer's cloud storage, so the customer retains ownership of all the security data. This involves terabytes of data from dozens of security products. Hunters ETL follows the Databricks Medallion Architecture model, storing the raw data and then normalizing it into a unified schema. While Hunters provides a rich set of analytical capabilities, customers with advanced cybersecurity analytics teams can augment Hunters' capabilities by leveraging Databricks Data Science and Machine Learning capabilities and the partner technologies in the Databricks ecosystem.
With this architecture, Databricks customers can attain an end-to-end security operations platform on their own Databricks Lakehouse Platform, while keeping the flexibility of owning all the data.
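The raw-then-normalized flow described above can be illustrated with a minimal sketch in plain Python. This is not Hunters' implementation or a Databricks API: the two security products, their field names, the severity mapping, and the NormalizedEvent schema are all hypothetical, chosen only to show raw ("bronze") records from different sources being mapped into one unified ("silver") schema while the raw records stay untouched.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical raw ("bronze") events from two made-up security products.
# Field names and values are illustrative assumptions, not real vendor formats.
RAW_EVENTS = [
    {"source": "edr_tool",
     "payload": {"ts": 1685059200, "host": "LAPTOP-7", "sev": "HIGH"}},
    {"source": "firewall",
     "payload": {"time": "2023-05-26T00:05:00+00:00", "device": "fw-edge-1", "level": 2}},
]


@dataclass(frozen=True)
class NormalizedEvent:
    """Hypothetical unified ("silver") schema shared by all sources."""
    source: str
    event_time: datetime
    asset: str
    severity: str


def normalize_edr(payload: dict) -> NormalizedEvent:
    # The assumed EDR format uses epoch seconds and upper-case severities.
    return NormalizedEvent(
        source="edr_tool",
        event_time=datetime.fromtimestamp(payload["ts"], tz=timezone.utc),
        asset=payload["host"].lower(),
        severity=payload["sev"].lower(),
    )


def normalize_firewall(payload: dict) -> NormalizedEvent:
    # The assumed firewall format uses ISO 8601 timestamps and numeric levels.
    levels = {1: "low", 2: "medium", 3: "high"}
    return NormalizedEvent(
        source="firewall",
        event_time=datetime.fromisoformat(payload["time"]),
        asset=payload["device"].lower(),
        severity=levels[payload["level"]],
    )


NORMALIZERS = {"edr_tool": normalize_edr, "firewall": normalize_firewall}


def to_silver(bronze: list) -> list:
    """Map raw events into the unified schema without modifying the raw records.

    Events from sources without a normalizer are skipped here; they remain
    available in the raw layer for later processing.
    """
    silver = []
    for event in bronze:
        normalizer = NORMALIZERS.get(event["source"])
        if normalizer is not None:
            silver.append(normalizer(event["payload"]))
    return silver


if __name__ == "__main__":
    for row in to_silver(RAW_EVENTS):
        print(row)
```

Keeping the raw layer intact means a normalizer can be corrected or extended later and the unified layer rebuilt from the original data. In a real Lakehouse, the same idea would typically be expressed as tables and pipelines rather than in-memory lists.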
Databricks invests in programs that help companies succeed
Databricks for Startups supports startups through their product journey. Databricks provides free credits that let startups explore without worry, along with technical resources to get them building fast. Once customers and partners have built a product, the Built on Databricks program offers joint marketing and sales collaboration opportunities to help them grow on the Databricks Lakehouse platform. Kubit's accelerated journey on the Databricks Lakehouse exemplifies the benefits of these programs.
Kubit is a startup that is disrupting the product analytics industry, as the first product analytics tool that leverages modern data-sharing capabilities and is designed for the unlimited volume and scale of data in today's enterprises. Established product analytics companies have built their products on proprietary data stacks, which restrict customers' analytics use cases in data silos. Kubit is built on the Databricks Lakehouse Platform to deliver superior flexibility, scalability, and performance. Kubit's platform utilizes Delta Sharing to enable secure and seamless access to customers' product analytics data and full data model. Kubit is able to process trillions of rows - petabytes - of data to serve their large-scale enterprise customers.
Building an application of this scale is a daunting endeavor. By joining the Databricks for Startups program, Kubit received free credits and prompt technical advice to immediately start building. The team developed a prototype in weeks and built an enterprise-ready product in 4 months. With a product built on the Databricks Lakehouse, Kubit also participates in the Built on Databricks program and enjoys Databricks support in reaching joint customers. "Our final product delivered 10x to 20x performance improvement relative to the MVP. Now we're exploring opportunities with joint customers. Databricks' support has been essential to our success, and we couldn't be happier with the outcome," said Alex Li, Kubit's CEO.
Get started with the Databricks Lakehouse
The rise of cloud platforms such as AWS, Azure, and Google Cloud freed software companies from the need to build infrastructure and engendered a decade of unprecedented software innovation. By giving developers a comprehensive unified data platform on which to build, the Databricks Lakehouse is unleashing the next wave of AI software innovation.
- Find more information on the Built on Databricks website.
- Startups building the next-generation apps - visit the Databricks for Startups website.
- Learn more about the Built on Databricks partner program.
- Join us at Data and AI Summit, where Adobe, Action IQ, Hunters, and many other Built on Databricks customers and partners will share their stories.