Eugene Mandel is Head of Product at Superconductive and a core contributor to the Great Expectations open source library. Prior to Superconductive, Eugene led data science at Directly, was a lead data engineer on the Jawbone data science team, and co-founded three startups that used data in diverse fields: internet telephony, marketing surveys, and social media. Eugene’s core interest has been turning data into real products that make users happy.
June 25, 2020 05:00 PM PT
Untested, undocumented assumptions about data in data pipelines create risk, waste time, and erode trust in data products. Automated testing has been one of the biggest productivity boosters in modern software development and is essential for managing complex codebases, yet data science and engineering have largely missed out on it. This talk introduces Great Expectations, an open-source Python framework for bringing data pipelines and products under test. Like assertions in traditional Python unit tests, Expectations provide a flexible, declarative language for describing expected behavior. Unlike traditional unit tests, Great Expectations applies Expectations to data instead of code. We strongly believe that most of the pain caused by accumulating pipeline debt is avoidable.
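To make the idea of asserting on data rather than code concrete, here is a minimal, standard-library-only sketch. It is not the Great Expectations implementation or API: the function names only echo the library's `expect_...` naming style, and `ExpectationResult`, the sample rows, and the column names are hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

Row = Dict[str, Any]


@dataclass
class ExpectationResult:
    """Hypothetical result object: did the data meet the expectation?"""
    success: bool
    unexpected_rows: List[Row] = field(default_factory=list)


def expect_column_values_to_not_be_null(rows: List[Row], column: str) -> ExpectationResult:
    # Collect every row whose value in `column` is missing or None.
    unexpected = [row for row in rows if row.get(column) is None]
    return ExpectationResult(success=not unexpected, unexpected_rows=unexpected)


def expect_column_values_to_be_between(
    rows: List[Row], column: str, min_value: float, max_value: float
) -> ExpectationResult:
    # Nulls are skipped here; only non-null values outside the range count as unexpected.
    unexpected = [
        row
        for row in rows
        if row.get(column) is not None
        and not (min_value <= row[column] <= max_value)
    ]
    return ExpectationResult(success=not unexpected, unexpected_rows=unexpected)


# Hypothetical batch of pipeline output: the last row violates both assumptions.
rows = [
    {"user_id": 1, "age": 34},
    {"user_id": 2, "age": 27},
    {"user_id": None, "age": 212},
]

not_null = expect_column_values_to_not_be_null(rows, "user_id")
in_range = expect_column_values_to_be_between(rows, "age", 0, 120)

for name, result in [("user_id not null", not_null), ("age in [0, 120]", in_range)]:
    status = "PASS" if result.success else "FAIL"
    print(f"{name}: {status} ({len(result.unexpected_rows)} unexpected rows)")
```

Each check states an assumption about the data declaratively and reports the offending rows instead of raising on the first failure, so the same checks can run against every new batch that flows through a pipeline.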
We built Great Expectations to make it very, very simple to:
We hope it helps you as much as it's helped us. Main takeaways: