Modern Architecture of a Cloud-Enabled Data and Analytics Platform
On Demand
Type
- Session
Format
- In-Person
Track
- Industry and Business Use Cases
Industry
- Healthcare and Life Sciences
Difficulty
- Intermediate
Room
- Moscone South | Level 3 | 314
Duration
- 35 min
Overview
As part of a strategic decision to adopt cloud and to make data FAIR (findable, accessible, interoperable, and reusable), the data team at Bayer Enabling Functions decided to build a new data and analytics platform with a Data Lakehouse as its core element. The platform was built to support the creation of Data Products with key architectural principles in mind: a high level of automation, flexibility, scalability, and security. From an automation perspective, extraction of data from sources happens in near real-time and is controlled via metadata. The platform is highly flexible in the solutions it can support, ranging from near real-time to batch-based access and from data science to business intelligence applications. Another important aspect of the platform is the complete separation of data from compute. The key company-wide data sets are centrally managed through complex data pipelines and made accessible to teams, which can build solutions using their own independent compute. This simple concept brings data warehousing, data mart, and data sharing concepts into a single solution. It allows central data management of Data Products while leaving teams and functions complete freedom to develop and manage individual solutions, thereby promoting agility together with proper governance of data sets. The presentation will highlight how Databricks is used to create near real-time data pipelines, the use of workbooks for complex transformations, the use of security principles and endpoints for secure data access, and the use of workspaces to manage specific solutions as independent project spaces.
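The idea of extraction "controlled via metadata" can be illustrated with a small, purely hypothetical sketch. The overview does not describe the actual implementation, so every name below (`SourceConfig`, `plan_extractions`, the handler functions, the example sources and tables) is an assumption made for illustration only. The sketch shows one way a metadata record per source could decide whether that source is handled as a near real-time stream or as a batch load, so that adding a source means adding metadata rather than writing new pipeline code.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class SourceConfig:
    """Hypothetical metadata record describing one source system."""
    name: str
    mode: str          # assumed values: "streaming" or "batch"
    target_table: str  # assumed name of the lakehouse table to load


def streaming_handler(cfg: SourceConfig) -> str:
    # Stand-in for starting a near real-time ingestion job.
    return f"stream {cfg.name} -> {cfg.target_table}"


def batch_handler(cfg: SourceConfig) -> str:
    # Stand-in for scheduling a periodic batch load.
    return f"batch load {cfg.name} -> {cfg.target_table}"


def plan_extractions(
    configs: List[SourceConfig],
    handlers: Dict[str, Callable[[SourceConfig], str]],
) -> List[str]:
    """Dispatch each source to a handler chosen purely from its metadata."""
    plan = []
    for cfg in configs:
        if cfg.mode not in handlers:
            raise ValueError(f"unknown mode {cfg.mode!r} for source {cfg.name!r}")
        plan.append(handlers[cfg.mode](cfg))
    return plan


# Illustrative metadata only; these sources and tables are invented.
HANDLERS = {"streaming": streaming_handler, "batch": batch_handler}
SOURCES = [
    SourceConfig(name="erp_orders", mode="streaming", target_table="bronze.orders"),
    SourceConfig(name="hr_master", mode="batch", target_table="bronze.employees"),
]
PLAN = plan_extractions(SOURCES, HANDLERS)

if __name__ == "__main__":
    for step in PLAN:
        print(step)
```

In a design along these lines, the pipeline code stays generic and the metadata table becomes the single place where sources are onboarded, which is one plausible reading of the automation principle described in the overview.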