John Haddad is vice president at Informatica, where he runs product and technical marketing for the Big Data, Enterprise Data Catalog and Cloud/Hybrid data management product portfolios. He has over 25 years’ experience developing and marketing enterprise software, focusing on enterprise cloud data management over the last 10 years. Previously, John held various positions in product marketing, R&D, and management at Oracle and Right Hemisphere (acquired by SAP). John holds an AB in applied mathematics from UC Berkeley.
October 16, 2019 05:00 PM PT
Interacting with customers in the moment and in a relevant, meaningful way can be challenging for organizations faced with hundreds of data sources at the edge, on-premises, and in multiple clouds.
To capitalize on real-time customer data, you need a data management infrastructure that allows you to do three things:
1) Sense: Capture event data and stream data from a source, e.g., social media, web logs, machine logs, or IoT sensors.
2) Reason: Automatically combine and process this data with existing data for context.
3) Act: Respond appropriately in a reliable, timely, and consistent way.
In this session, we'll describe and demo an AI-powered streaming solution that can tackle the entire end-to-end sense-reason-act process at any latency (real-time, streaming, and batch) using Spark Structured Streaming.
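To make the sense-reason-act steps concrete, here is a minimal plain-Python sketch of the three stages chained as generators. It is not the solution demoed in the session and does not use Spark; the `Event` type, the `CUSTOMER_PROFILES` lookup, the `customer_id,action` line format, and the action names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, List


@dataclass
class Event:
    customer_id: str
    action: str


# Hypothetical reference data standing in for "existing data" (e.g., a CRM table).
CUSTOMER_PROFILES: Dict[str, Dict[str, str]] = {
    "c1": {"tier": "gold"},
    "c2": {"tier": "basic"},
}


def sense(raw_lines: Iterable[str]) -> Iterator[Event]:
    """Sense: parse raw 'customer_id,action' lines, skipping malformed ones."""
    for line in raw_lines:
        parts = line.strip().split(",")
        if len(parts) == 2 and all(parts):
            yield Event(customer_id=parts[0], action=parts[1])


def reason(events: Iterable[Event]) -> Iterator[Dict[str, str]]:
    """Reason: enrich each event with existing customer data for context."""
    for event in events:
        profile = CUSTOMER_PROFILES.get(event.customer_id, {"tier": "unknown"})
        yield {
            "customer_id": event.customer_id,
            "action": event.action,
            "tier": profile["tier"],
        }


def act(enriched: Iterable[Dict[str, str]]) -> Iterator[str]:
    """Act: map enriched events to a next best action (assumed rules)."""
    for record in enriched:
        if record["action"] != "cart_abandoned":
            continue  # no response defined for other actions in this sketch
        if record["tier"] == "gold":
            yield f"offer_discount:{record['customer_id']}"
        else:
            yield f"send_reminder:{record['customer_id']}"


def run_pipeline(raw_lines: Iterable[str]) -> List[str]:
    """Chain the three stages; each record flows through as it arrives."""
    return list(act(reason(sense(raw_lines))))
```

Because each stage is a generator, records move through the pipeline one at a time, which loosely mirrors how a streaming engine processes events incrementally rather than waiting for a complete data set.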
The solution uses AI (e.g., A* and NLP for data structure inference, and machine learning algorithms for ETL transform recommendations) and metadata to automate data management processes (e.g., parsing, ingesting, integrating, and cleansing dynamic and complex structured and unstructured data) and to guide user behavior for real-time streaming analytics. It's built on Spark Structured Streaming to take advantage of its unified APIs, multi-latency and event-time-based processing, handling of out-of-order data, and other capabilities.
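The idea behind event-time processing with out-of-order data can be illustrated with a small plain-Python sketch. It is a simplified model, not Spark's API or its exact semantics: events are assigned to tumbling windows by their own timestamps, and a watermark (the largest event time seen so far minus an allowed lateness) decides when a late event is dropped. The function name, the integer timestamps, and the default window and lateness values are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

EventRecord = Tuple[int, str]  # (event_time, key), in arrival order


def windowed_counts(
    events: Iterable[EventRecord],
    window_size: int = 10,
    allowed_lateness: int = 5,
) -> Tuple[Dict[Tuple[int, str], int], List[EventRecord]]:
    """Count events per (window_start, key) using event time.

    An event is dropped when its window already ended at or before the
    current watermark, i.e. max event time seen minus allowed_lateness.
    """
    counts: Dict[Tuple[int, str], int] = defaultdict(int)
    dropped: List[EventRecord] = []
    max_event_time = None

    for event_time, key in events:
        # Advance the watermark based on the latest event time observed.
        if max_event_time is None or event_time > max_event_time:
            max_event_time = event_time
        watermark = max_event_time - allowed_lateness

        # Assign the event to a tumbling window by its own timestamp.
        window_start = (event_time // window_size) * window_size
        window_end = window_start + window_size

        if window_end <= watermark:
            dropped.append((event_time, key))  # too late: window is closed
            continue
        counts[(window_start, key)] += 1

    return dict(counts), dropped
```

In this sketch, an event with timestamp 8 that arrives after an event with timestamp 12 is still counted in the window starting at 0, because the watermark (12 - 5 = 7) has not yet passed that window's end at 10. An engine such as Spark typically advances its watermark per micro-batch rather than per event, so real behavior differs in detail.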
You will gain a clear understanding of how to use Spark Structured Streaming for data engineering with an intelligent data streaming solution that unifies fast-lane data streaming and batch-lane data processing to deliver in-the-moment next best actions that improve customer experience.