
This is a collaborative post from Databricks and Microsoft Azure. We thank Rajeev Jain, Senior Product Marketing Manager at Microsoft, for his contributions.

 

Data + AI Summit 2023: Register now to join this in-person and virtual event June 26-29 and learn from the global data community.

Microsoft is a Platinum Sponsor of Data + AI Summit 2023, the premier event for the global data community. Satya Nadella, Chairman and CEO of Microsoft, will be speaking at the Data + AI Summit Keynote Day 1. Join Data + AI Summit to learn from joint Databricks and Microsoft Azure customers like Akamai, Waterstons, Providence Health Services, ABN Amro, Bradesco, and Department of Veterans Affairs who have successfully leveraged the Databricks Lakehouse Platform for their business, bringing together data, AI and analytics on one common platform.

At Data + AI Summit, Databricks and Microsoft customers will take the stage for sessions to help you see how they achieved business results using the Databricks Lakehouse on Azure. Attendees will have the opportunity to hear data leaders from Akamai and Waterstons on Tuesday, June 27th, then join Providence Health Services, Bradesco, and ABN Amro on Wednesday, June 28th and Department of Veterans Affairs on Thursday, June 29th. You can also learn about the latest innovations and technologies, hear thought-provoking panel discussions, and take advantage of networking opportunities to connect with other data professionals in your industry.

The sessions below are a guide for everyone interested in Azure Databricks and span a range of topics -- from real-time clinical operations monitoring to shopper analytics to IoT data gathering. If you have questions about Azure Databricks or service integrations, connect with Azure Databricks Solutions Architects at Data + AI Summit at the Microsoft booth on the Expo floor.

Keynote

Satya Nadella Keynote

Wednesday, June 28 @8:15 AM

Visionary leaders will share insights on:

Data, analytics and AI landscape: Discover what's driving so much focus on data and why data professionals are zeroing in on new ways to tackle their database challenges. Learn why there is so much interest in LLMs, what is happening across the data, analytics and AI landscape and the future of the market.

Evolution of the lakehouse: Take a look at the larger universe that the lakehouse lives inside of, learn what's new and explore the evolution with us.

Open source technologies: Hear from the open source community about what's new and what's to come for Apache Spark™, Delta Lake and MLflow and learn how this affects the lakehouse and the overall market at large.

Learn more

Microsoft customer breakout sessions

Taking Your Cloud Vendor to the Next Level: Solving Complex Challenges with Azure Databricks

Tuesday, June 27 @1:00 PM

Akamai's content delivery network (CDN) processes about 30% of the internet's daily traffic, resulting in a massive amount of data that presents engineering challenges, both internally and with cloud vendors. In this session, we will discuss the barriers faced while building a data infrastructure on Azure, Databricks, and Kafka to meet strict SLAs, hitting the limits of some of our cloud vendors' services. We will describe the iterative process of re-architecting a massive scale data platform using the aforementioned technologies.

We will also delve into how Akamai is now able to quickly ingest terabytes of data and make it available to customers, as well as efficiently query petabytes of data and return results within 10 seconds for most queries. This discussion will provide valuable insights for attendees and organizations seeking to effectively process and analyze large amounts of data.

Learn more

Getting Insight From Anything: Gathering Data With IoT Devices and Delta Live

Tuesday, June 27 @4:00 PM

The drive to make data-driven decisions shapes the strategy of many companies. However, this dream has a problem: just because you need to make a decision does not mean you have the data to make it. Often these decisions require data from sources typically unconnected to the internet. This could be an architect determined to understand how the construction of her building is progressing, or a manufacturer wanting to live their best Industry 4.0 life. It's not uncommon for companies to find that the journey to become "smart" is a long one, with difficult ethical ground to cover or prohibitive costs associated with procuring the necessary equipment. However, the development of cheap and accessible IoT devices puts gathering data from anything more within reach than ever.

This session will look at efforts to build proof-of-concept IoT systems. We will discuss our (continuing) journey of using and managing hardware and the problems we faced along the way. The focus will be the software stack. Azure IoT Hub is used for device management, with Power BI supporting the front end. However, the centerpiece is Delta Live Tables, which gives us a method for near real-time analysis of our data. We will discuss the advantages of using Delta Live Tables for this, and how we wish to extend our IoT systems in the future.

Learn more

Improving Hospital Operations with Streaming Data and Real Time AI/ML

Wednesday, June 28 @11:30 AM

Over the past two years, Providence has developed a robust streaming data platform (SDP) leveraging Databricks in Azure. The SDP enables us to ingest and process real-time data reflecting clinical operations across our 52 hospitals and roughly 1,000 ambulatory clinics. The HL7 messages generated by Epic are parsed using Databricks in our secure cloud environment and used to generate an up-to-the-minute picture of exactly what is happening at the point of care.

We are already leveraging this information to minimize hospital overcrowding and have been actively integrating AI/ML to accurately forecast future conditions (e.g., arrivals, length of stay, acuity, and discharge requirements). This allows us both to improve resource utilization (e.g., nurse staffing levels) and to optimize patient throughput. The result is both improved patient care and operational efficiency.

In this session, we will share how these outcomes are only possible with the power and elegance afforded by our investments in Azure, Databricks, and increasingly Lakehouse. We will demonstrate Providence's blueprint for enabling real-time analytics which can be generalized to other healthcare providers.
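For readers unfamiliar with the HL7 messages mentioned above, here is a minimal sketch of what parsing one can look like. It is not Providence's implementation: it uses plain Python rather than Databricks, the sample message and the `parse_hl7` and `summarize` helpers are hypothetical, and it assumes the common HL7 v2 defaults (segments separated by carriage returns, fields separated by `|`).

```python
from typing import Dict, List

# Hypothetical sample admit message; real feeds carry many more segments and fields.
SAMPLE_ADT = "\r".join([
    "MSH|^~\\&|EPIC|HOSP_A|SDP|CLOUD|202306281130||ADT^A01|MSG0001|P|2.5",
    "PID|1||123456^^^HOSP_A^MR||DOE^JANE",
    "PV1|1|I|4WEST^412^A",
])


def parse_hl7(message: str) -> Dict[str, List[List[str]]]:
    """Split an HL7 v2 message into segments, keyed by segment ID (MSH, PID, ...)."""
    segments: Dict[str, List[List[str]]] = {}
    # Segments end with a carriage return; tolerate newlines from text files too.
    for line in message.replace("\n", "\r").split("\r"):
        if not line.strip():
            continue
        fields = line.split("|")
        segments.setdefault(fields[0], []).append(fields)
    return segments


def summarize(message: str) -> Dict[str, str]:
    """Pull out a few fields that an operations dashboard might care about."""
    segments = parse_hl7(message)
    msh = segments["MSH"][0]
    pv1 = segments.get("PV1", [[]])[0]
    return {
        # In MSH, the separator itself counts as MSH-1, so MSH-9 sits at index 8.
        "event": msh[8],
        "sent_at": msh[6],
        "patient_class": pv1[2] if len(pv1) > 2 else "",
        "location": pv1[3] if len(pv1) > 3 else "",
    }


print(summarize(SAMPLE_ADT))
```

In a streaming setting, a function like `summarize` would typically be applied to each incoming message so that admissions, transfers, and discharges can be counted per location as they happen.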

Learn more

ABN Story: Migrating to Future Proof Data Platform

Wednesday, June 28 @3:30 PM

ABN AMRO Bank is one of the leading banks in the Netherlands. It is the country's third-largest bank by revenue and by number of mortgages held, and has top management support for the objective of becoming a fully data-driven bank. ABN AMRO started its data journey almost seven years ago and has built a data platform on-premises with Hadoop technologies. This data platform has been used by more than 200 data providers and 150 data consumers, and holds more than 3,000 datasets.

Becoming a fully digital bank and addressing the limitations of the on-premises platform requires a future-proof data platform: DIAL (digital integration and access layer). ABN AMRO decided to build an Azure cloud-native data platform with the help of Microsoft and Databricks. Last year this cloud-native platform was ready for our data providers and data consumers. Six months ago we started the journey of migrating all the content from the on-premises data platform to the Azure data platform. This was a very large-scale migration, and it was achieved in six months.

In this session, we will focus on three things:

  1. The migration strategy going from on-premises to a cloud-native platform
  2. Which Databricks solutions were used in the data platform
  3. How the Databricks team assisted in the overall migration

Learn more

Scaling Up Healthcare Data Innovations: Lessons Learned from VA's Summit Data Platform Journey

Thursday, June 29 @2:30 PM

The Department of Veterans Affairs (VA) is the nation's largest health system with 170 VA Medical Centers and over 1,000 outpatient clinics, serving over 9 million beneficiaries and employing almost 400,000 people. Three years ago, the VA Office of Information and Technology (OIT) embarked on the development of the Summit Data Platform (SDP) to centralize and modernize data and analytics across the VA enterprise. Today, SDP supports over 30 use cases, including Long COVID care, suicide prevention, cancer care tracking, clinical deterioration, veteran experience, and many more. These use cases include some of the most progressive technological efforts at VA with the use of Large Language Models (LLMs), self-service analytics on the cloud, and real-time streaming analytics.

Learn more

Data Democratization with Lakehouse: An Open Banking Application Case

Wednesday, June 28 @11:30 AM

Banco Bradesco represents one of the largest companies in the financial sector in Latin America. They have more than 99 million customers, 79 years of history, and a legacy of data distributed in hundreds of on-premises systems. With the spread of data-driven approaches and the growth of cloud computing adoption, we needed to innovate and adapt to new trends and enable an analytical environment with democratized data.

We will show how more than eight business departments have already engaged in using the Lakehouse exploratory environment, with more than 190 use cases mapped and a multi-bank financial manager. Unlike with on-premises, the cost of each process can be isolated and managed in near real-time, allowing quick responses to cost and budget deviations, while increasing the deployment speed of new features 36 times compared to on-premises.

The data is now used and shared safely and easily between different areas and companies of the group. Also, the view of dashboards within Databricks allows panels to be efficiently "prototyped" with real data, allowing an easy interaction of the business area with its real needs and then creating a definitive view with all relevant points duly stressed.

Learn more

Microsoft-led breakout sessions

Sponsored by: Microsoft | Next-Level Analytics with Power BI and Databricks

Wednesday, June 28 @2:30 PM

The widely-adopted combination of Power BI and Databricks has been a game-changer in providing a comprehensive solution for modern data analytics. In this session, you'll learn how self-service analytics combined with the Databricks Lakehouse Platform can allow users to make better-informed decisions by unlocking insights hidden in complex data. We'll provide practical examples of how organizations have leveraged these technologies together to drive digital transformation, lower total cost of ownership (TCO), and increase revenue. By the end of the presentation and demo, you'll understand how Power BI and Databricks can help drive real-time insights at scale for organizations in any industry.

Learn more

Communications, Media & Entertainment Industry Forum

Wednesday, June 28 @3:30 PM

Data is at the core of nearly every innovation in the Communications, Media & Entertainment (CME) industry. Leaders across advertising, broadcast & film, communications, gaming, and sports are harnessing the power of data and analytics to digitally transform and make smarter decisions that minimize risk, accelerate innovation and drive sustainable value creation. As part of our industry programming at Summit, we're hosting a dedicated Forum for CME. This 90-minute capstone event brings together over 250 in-person attendees and more online for presentations, networking and engaging discussions on the trends shaping the use of data + AI across the industry.

Learn more

Processing Prescriptions at Scale at Walgreens

Thursday, June 29 @1:30 PM

We designed a scalable Spark Streaming job to manage hundreds of millions of prescription-related operations per day at an end-to-end SLA of a few minutes and a lookup time of one second using CosmosDB.

In this session, we will share not only the architecture, but also the challenges and solutions to using the Spark Cosmos connector at scale. We will discuss usages of the Aggregator API, custom implementations of the CosmosDB connector, and the major roadblocks we encountered with the solutions we engineered. In addition, we collaborated closely with the Cosmos development team at Microsoft and will share the new features which resulted. If you ever plan to use Spark with Cosmos, you won't want to miss these gotchas!

Learn more

Partner-led breakout sessions

Sponsored by: Avanade | Accelerating Adoption of Modern Analytics and Governance at Scale

Wednesday, June 28 @11:30 AM

To compress transformation, enable reinvention as the strategy, and use data as the primary source of competitive advantage, clients that we work with are investing into modernizing the foundational elements powering their digital core. These building blocks supporting the enterprise reinvention are the enterprise metadata and data management services, data management foundation, and data services and products that enable business units to fully use their data and analytics at scale.

In this session, Avanade data leaders will highlight how the Databricks modern data stack fits the Azure data management foundation ecosystem, how Unity Catalog metadata supports automated data operations scenarios, and how we are helping clients measure the business impact and value of modern analytics and governance.

Learn more

Sponsored by: Infosys | Topaz and Our AI First Approach

Wednesday, June 28 @12:30 PM

A discussion covering the journey of enterprise AI-first adoption for global organizations, including a standardized framework and toolset realized with Azure Databricks as a key framework component.

We will also cover how Infosys Topaz, our AI-first foundation, along with the Databricks platform, will help our customers navigate the AI journey for their organizations.

Learn more

Health Care Service Corporation's Cloud Migration and Data Analytics Modernization Journey

Wednesday, June 28 @12:30 PM

HCSC is a leading health care insurer that collaborated with Deloitte to modernize their data analytics and intelligence capabilities by taking the on-premises data lake and data warehouse platforms and moving them to the cloud.

In this session, Deloitte and HCSC present how to leverage Databricks and automation to migrate solutions from an on-premises Apache Hadoop platform to Microsoft Azure Cloud. This modernization provides HCSC greater agility with an elastic, scalable and responsive engineered solution approach, which is paramount as they strive to be the partner of choice for benefits and care coordination, improving their members' health and well-being through all phases of life.

Learn more

Sponsored by: Avanade | Enabling Real-Time Analytics with Structured Streaming and Delta Live Tables

Wednesday, June 28 @1:00 PM

Join the panel to hear how Avanade is helping clients enable real-time analytics and tackle the people and process problems that accompany technology, powered by Azure Databricks.

Learn more

Scaling MLOps for a Demand Forecasting Across Multiple Markets for a Large CPG

Wednesday, June 28 @1:30 PM

In this session, we look at how one of the world's largest CPG companies set up a scalable MLOps pipeline for a demand forecasting use case that predicted demand at 100,000+ DFUs (demand forecasting units) on a weekly basis across more than 20 markets. This implementation resulted in significant cost savings in terms of improved productivity, reduced cloud usage and faster time to value, among other benefits. You will leave this session with a clearer picture of the following:

  1. Best practices in scaling MLOps with Databricks and Azure for a demand forecasting use case with a multi-market and multi-region roll-out.
  2. Best practices related to model re-factoring and setting up standard CI-CD pipelines for MLOps.
  3. Pitfalls to avoid in such scenarios.

Learn more

Real-Time Streaming Solution for a Call Center Analytics: Business Challenges and Technical Enablement

Wednesday, June 28 @1:30 PM

A large international client with a business footprint in North America, Europe and Africa reached out to us with an interest in having a real-time streaming solution designed and implemented for its call center handling incoming and outgoing client calls. The client had a previous bad experience with another vendor, who overpromised and underdelivered on the latency of the streaming solution. The previous vendor delivered an over-complex streaming data pipeline resulting in the data taking over five minutes to reach a visualization layer. The client felt that architecture was too complex and involved many services integrated together.

Our immediate challenges involved gaining the client's trust and proving that our design and implementation quality would surpass their previous experience. To resolve the immediate challenge of the overly complicated pipeline design, we deployed a Databricks Lakehouse architecture with Azure Databricks at the center of the solution. Our reference architecture integrated Genesys Cloud -> App Services -> Event Hub -> Databricks <-> Data Lake -> Power BI.

The streaming solution proved to be low latency (seconds) during the POV stage, which led to subsequent productionalization of the pipeline with deployment of jobs, DLT pipelines, and a multi-notebook workflow, along with business and performance metrics dashboards that the call center staff rely on for day-to-day performance monitoring and improvements.

Learn more

Sponsored by: Capgemini | Driving Speed to Insights with an Extended Lakehouse Architecture

Thursday, June 29 @1:30 PM

Enterprises have long struggled to enable data as an enterprise asset for the full range of data consumption patterns. While we continue to face obstacles with the traditional patterns that include reporting, ad-hoc analytics, real-time analytics and AI/ML, we also now face an entirely new pattern that we call autonomous analytics.

In one example, within the context of the Azure Public Cloud, we illustrate how firms can leverage the open data format of the Databricks Lakehouse, using Delta Lake, combined with other cloud native and ISV solutions to organize and distribute enterprise data in an insights first architecture. Learn how to leverage this architecture to provide an expanded business semantic data layer with richer data in real-time that drives critical insights at the speed of business.

Learn more

Sponsored by: Qlik | Extracting the Full Potential of SAP Data for Global Automotive Manufacturing

Thursday, June 29 @1:30 PM

Every year, organizations lose millions of dollars due to equipment failure, unscheduled downtime, or unoptimized supply chains because business and operational data is not integrated. During this session you will hear from experts at Qlik and Databricks on how global luxury automotive manufacturers are accelerating the discovery and availability of complex data sets like SAP. Learn how Qlik, Microsoft, and Databricks together are delivering an integrated solution for global luxury automotive manufacturers that combines the automated data delivery capabilities of Qlik Data Integration with the agility and openness of the Databricks Lakehouse platform and AI on Azure Synapse.

We'll explore how to leverage the IT and OT data convergence to extract the full potential of business-critical SAP data, lower IT costs and deliver real-time prescriptive insights, at scale, for more resilient, predictable, and sustainable supply-chains. Learn how organizations can track and manage inventory levels, predict demand, optimize production and help their organizations identify opportunities for improvements.

Learn more

Delta Sharing: The Key Data Mesh Enabler

Thursday, June 29 @2:30 PM

Data Mesh is an emerging architecture pattern that challenges the centralized data platform approach by empowering different engineering teams to own the data products in a specific business domain. One of the keys to the success of any Data Mesh initiative is selecting the right protocol for Data Sharing between different business data domains that could potentially be implemented through different technologies and cloud providers.

In this session you will learn about how the Delta Sharing protocol and the Delta table format have enabled the historically stuck-in-the-past energy and construction industry to be catapulted to the 21st century by way of a modern Data Mesh implementation based on Azure Databricks.

Learn more

We also invite you to visit the Microsoft booth on the Expo floor, where you'll get to talk 1:1 with Azure data engineering on how to address your toughest analytics challenges with Azure.

Register now to join this free virtual event and join the data and AI community. Learn how companies are successfully building their Lakehouse architecture with Azure Databricks to create a unified, open, and scalable data platform. Get started using Databricks with $200 in Azure credits and a free trial.

