Connor McCambridge is a Senior Data Scientist at T-Mobile on the Fraud Reporting & Analytics Team, where he leverages technology and machine learning to better understand and prevent fraud. Previously, he worked at Sprint as a Data Scientist in Fraud Management before the merger with T-Mobile, and he started his data science career as an intern for Sprint’s Prepaid Division. He holds an MS in Business Intelligence and Analytics from Rockhurst University and a BS in Finance and Economics from the University of Missouri. He enjoys the complex problem solving and unique coding challenges that the field of data science provides.
May 26, 2021 05:00 PM PT
Here at T-Mobile, when a new account is opened, fraud checks occur both pre- and post-activation. Fraud that is missed tends to fall into first payment default, where it looks like a delinquent new account. The objective of this project was to investigate newly created accounts headed towards delinquency to find additional fraud.
For the longevity of this project, we wanted to implement it as an end-to-end automated solution for building and productionizing models, one that included multiple modeling techniques and hyperparameter tuning.
We wanted to use MLflow for model comparison and graduation to production, and Hyperopt for parallel hyperparameter tuning. To achieve this goal, we created multiple machine learning notebooks, each of which tuned one type of model over its own specific parameters. These models were saved into a training MLflow experiment, after which the best-performing model from each notebook was saved to a model comparison MLflow experiment.
In the second experiment, the newly built models were compared with each other as well as with the models currently and previously in production. Once the best-performing model was identified, it was saved to the MLflow Model Registry to be graduated to production.
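The comparison logic described above can be sketched in plain Python. This is a hedged illustration only: the `Candidate` fields, the model names, the metric values, and the tie-breaking rule that favors the current production model are all assumptions made for the example, not details from the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    run_id: str    # hypothetical MLflow run ID
    source: str    # "new", "production", or "previous"
    metric: float  # higher is better, scored on a shared holdout set


def pick_champion(candidates):
    """Return the best candidate; the production model wins exact ties."""
    return max(candidates, key=lambda c: (c.metric, c.source == "production"))


def needs_promotion(champion):
    """Only a non-production winner has to be graduated to production."""
    return champion.source != "production"


# Made-up scores, purely for illustration.
candidates = [
    Candidate("gbt_v7", "run_a", "production", 0.91),
    Candidate("gbt_v6", "run_b", "previous", 0.89),
    Candidate("random_forest_new", "run_c", "new", 0.93),
    Candidate("logistic_new", "run_d", "new", 0.88),
]

champion = pick_champion(candidates)
if needs_promotion(champion):
    # With MLflow, this is where the winner could be registered, e.g.
    # mlflow.register_model(f"runs:/{champion.run_id}/model", "fraud_model"),
    # where "model" and "fraud_model" are assumed artifact and registry names.
    print(f"Promote {champion.name} (metric={champion.metric})")
else:
    print(f"Keep {champion.name} in production")
```

Including the current and previous production models in the candidate list means a retrained model is promoted only when it actually beats what is already deployed.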
We were able to execute the multi-notebook solution above as part of a regularly scheduled Azure Data Factory pipeline, making model building and selection completely hands-off.
Every data science project has its nuances; the key is to leverage available tools in a customized approach that fits your needs. We hope to give the audience a view into our advanced, custom approach to using the MLflow infrastructure and leveraging these tools through automation.