With a strong foundation in Arts and Bioinformatics, Oliver has gathered further knowledge in web development and data science at various (startup) companies and research institutions. In his previous internships and positions, he focused on bioinformatics algorithms and NLP on social media data.
After that, he made a sharp change of industry and has now been working for two years as lead data science engineer at ENGEL, Austria’s largest machine manufacturer. Since then, he has faced new challenges every day (e.g., understanding injection moulding machines) and works to organize and make sense of machine data, while trying to bring the value of data science and creativity to the traditional machine manufacturing sector.
ENGEL, which was founded in 1945, is now the leading manufacturer of injection moulding machines on the global market. Since its founding, and especially in recent years, the amount of data has grown immensely and has also become more and more heterogeneous due to newer generations of machine controls. Taking a closer look at the collected log files of each machine, one finds 13 different types of timestamps, different archive types and further peculiarities of each control generation. Evidently, this has made it difficult to process and analyse the data automatically.
In this talk, you will explore how ENGEL managed to centralise this data in one place, how ENGEL set up a data pipeline to ingest batch-oriented data in a streaming fashion and how ENGEL migrated its pipeline from an on-premises Hadoop setup to the cloud using Databricks.
Together with Oliver Lemp, Data Scientist at ENGEL, dive into the journey of integrating legacy data, where you will learn how to manage the following aspects:
Speaker: Oliver Lemp