CVS Health delivers millions of offers to over 80 million customers and patients on a daily basis to improve the customer experience and put patients on a path to better health. In 2018, CVS Health embarked on a journey to personalize the customer and patient experience through machine learning on a Microsoft Azure Databricks platform. This presentation will discuss how the Microsoft Azure Databricks environment enabled rapid in-market deployment of the first machine learning model within six months on billions of transactions using Apache Spark. It will also discuss several use cases for how this has driven and delivered immediate value for the business, including test and learn experimentation for how to best personalize content to customers. The presentation will also cover lessons learned on the journey in the evolving industries of cloud computing and machine learning in a dynamic healthcare environment.
– Hey everyone, and welcome to Spark Summit. My name’s Michelle, and today we’re going to be talking about our journey of bringing personalization to customers and patients at CVS Health.
I’ve been with CVS just under five years and currently lead the data engineering team for the pharmacy personalization initiative, where my team supports the productionalization of machine learning models to personalize and improve the experience for retail pharmacy patients. Prior to working at CVS, I worked in the public policy and nonprofit sectors, but always with a passion for using data to solve challenging problems. I’m here today with Raghu Nakka, who’s going to introduce himself as well.
– Hello everyone. I’m very excited to be here. My name is Raghu Nakka. I lead the data engineering team for the CVS Health retail and store business. I have been at CVS Health for the past seven years, and before that I worked as a consultant for large healthcare and financial organizations. Working for CVS has been a very satisfying experience for me personally. I like the fact that I can do what I love, which is working on innovative technological products, using cutting-edge technologies, to achieve our company’s healthcare mission, which is helping people on their path to better health. At CVS Health, I build platforms, cloud infrastructure, and machine learning at scale. My typical day involves solving the challenges of handling big data in the healthcare space. Today we’re going to cover CVS’s personalization journey, the lessons we learned during the process, and the challenges posed by constant growth and an ever-changing technological space. And finally, we’re going to give a sneak peek at how we are going to tackle these challenges in the future using new tools we are currently exploring to strengthen our data architecture.
– Great, so before we jump into the personalization, I just want to give everyone a little bit of background on CVS Health as an enterprise.
So CVS has a diverse set of assets, all focused on driving our mission of becoming a healthcare innovation company, with the goal of making quality care more affordable, accessible, and simple for customers and patients. At the core of that, we have advanced analytics to help drive and improve the customer experience across the enterprise, whether that’s from CVS Pharmacy, the PBM Caremark business, MinuteClinic, or Aetna. Today’s focus for the presentation will be on the front store and pharmacy businesses, because Raghu and I support the pharmacy and front store businesses. We have about 10,000 retail locations across the country, and so our focus will be on the journey to drive personalization within those areas.
– With all of those business lines that Michelle mentioned earlier, it poses a unique set of problems to do a machine learning project of any kind. For any personalization project, understanding the customer behavior is really the key, and CVS customer behavior is really unique. We have a unique set of customers, because we have a large number of micro segments, which poses challenges to understanding the overall behavior of the customer. Like Michelle mentioned, we have about 10,000 stores across the United States, and our pharmacy patients’ health needs are varied and sometimes unpredictable. So all of these challenges make our customer really unique, and our data is unique too. We are not dealing with a typical grocery store customer. It is really hard to predict the behavior of a convenience shopper, because a customer could go to a convenience store just because he forgot milk, or he could go to a convenience store like CVS to pick up his prescription and stop by to grab candy. So the behavior can be unpredictable, which is why our data sparsity and dimensionality lead to overfitting issues for any kind of machine learning model. The third thing I would like to point out is the situations that we typically deal with in a retail store. For example, the COVID situation has instantly made our static machine learning models either outdated or invalid, because customer foot traffic has significantly decreased or purchase behavior has significantly changed. So all of the historical behavior that we use to model the customer is completely useless. So these are the three unique things that we would like to point out.
And before we jump into the personalization goals of CVS, we thought it was really important to understand the CVS customer behavior. When it comes to the personalization goals of CVS, personalization is delivering the right individual experience in the right channel at the right time.
So we have to provide the right experience to the customer or patient, whether it is a pharmacy product, a reward or a coupon that we offer, a MinuteClinic service or HealthHUB, or whether they’re enrolled in a loyalty program like the ExtraCare card. So it’s all about providing the right experience. It is also important for us to understand which channel a customer would respond to better, whether it’s a text message, a phone call, or an in-store offer. So optimizing our process to come up with the right channel that a customer or patient would respond to is also really important for us. And the timing of the offer as well, right? A patient could be up for a prescription renewal, and that could be the right time to send an offer to the customer, rather than when the customer has already filled a prescription and there is no reason for the customer to make the trip. So this is one example of the significance of the right timing for the customer. And then the tailored message to the customer is always important for us. That is one important goal, because a customer’s preferred language could be different from the one we’re interacting in. So we define all these goals and we use a testing and experimentation framework, which we’re going to cover in a later slide, to come up with these various attributes, which are going to help us define the 360-degree view of a customer.
So in order to support those goals, here is a quick overview of our existing tech stack that we started building over the past year and a half. Like any traditional company, we have traditional relational data warehouses, and we have several disparate files coming over from POS sales and several other places. We have used cloud technologies to ingest all of the data. In this case, our entire workload is in Microsoft Azure, and our entire engine runs on Databricks, which of course is built on Apache Spark. We use Airflow and GitLab for our orchestration and DevOps as well. We use Tableau for reporting our operational and financial metrics.
So as we can see in the top right corner, our process is pretty similar to a typical machine learning project. We create a unified data layer that feeds into feature engineering, where we create our features. Then we create opportunities, which are tied into the machine learning models. And then we send these offers out (inaudible) daily, based on the cadence.
– So, as Raghu mentioned, our personalization journey started in mid 2018, when initial use case development started. We were able to build our first pilot on a Hadoop environment within just a couple of months of that first ideation, and launched our first personalized offers to 1% of customers a few months later. We quickly hit some roadblocks when we tried to scale up from there, from 1% to 5%. There was an actual physical constraint on being able to build additional hardware to support the scale that we wanted to get to. At that point, we made the decision to transition to a cloud-based environment using Azure Databricks, as Raghu had mentioned. We made that transition in early 2019. From there, we were able to expand the number of use cases that we could support, double the number of personalized products, and then eventually scale to offer personalized offers to patients by the end of 2020. In parallel, the focus was specifically on building this test and experimentation framework so that we could rapidly iterate and test lots of different ways of giving different experiences to patients, and incrementally make improvements to the experience we’re providing to patients and customers. As you can see, this red line shows the size of the team. We started off with a very small team of just a handful of data engineers and data scientists. As the use cases and the number of products that we were offering through the personalization application grew, so did our team size. Luckily, the cloud-based infrastructure was able to support that and help us to scale up. We’ll talk a little bit more later in the presentation about what that looked like.
So just to give a little bit more context on what personalization really means at CVS, I’m going to give a quick overview of some of the use cases and solutions, the ways that we approach personalization. And then we’ll dive into two specific use cases, one on the pharmacy side and one on the front store side. So a couple of examples here, which I think Raghu had mentioned earlier. We have many of these clinical products, just as we might have many of these coupon offers that we can offer to ExtraCare customers. And really, these are a scarce resource that we want to prioritize. So what better way to prioritize than using machine learning to determine the propensity of a patient to accept a certain offer? But to start, the offers that we were sending out were very rules driven. Everyone got the same message at the same time through the same channel. So there wasn’t actually a lot of data at our disposal to make those predictions. So we started off with randomization, testing different channel assignments and timings, so that we could collect information about patients’ preferences and their behaviors. From there, we moved to support both an experimentation perspective and a machine learning perspective. We can use the data that we’ve collected to implement different experiences or different experiments, segmenting off different populations, maybe by age and gender, or by age and engagement with a channel, and then customize the experience within those segments.
That’s kind of the concept of experimentation that we’re trying to drive. As we were able to collect more and more information, we could use more advanced analytics, like machine learning propensity models, to find opportunities where we can provide an offer that the patient is likely to engage with. Some of the outcomes that we’re trying to drive are really centered around healthcare and, at least from the pharmacy side, around medication adherence, which I’ll speak to in the next slide. The goal is to increase engagement with the products that we have, but also to create a better experience for patients, both in person and at the pharmacy.
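To make the randomization step concrete, here is a minimal sketch of assigning patients to channel and timing arms. It is an illustration only, not the team's implementation: the arm names, the `assign_arm` function, and the hashing approach are all assumptions.

```python
import hashlib

# Hypothetical experiment arms; the talk does not list the real ones.
CHANNELS = ["sms", "phone_call", "in_store"]
TIMINGS = ["before_due_date", "on_due_date"]


def assign_arm(patient_id: str, experiment: str) -> tuple:
    """Deterministically map a patient to a (channel, timing) arm.

    Hashing the experiment name together with the patient id gives a
    stable, roughly uniform assignment without storing any state, and a
    different experiment name reshuffles the assignment.
    """
    arms = [(c, t) for c in CHANNELS for t in TIMINGS]
    digest = hashlib.sha256(f"{experiment}:{patient_id}".encode()).hexdigest()
    return arms[int(digest, 16) % len(arms)]
```

Because the assignment is a pure function of the ids, a patient keeps the same arm for the life of an experiment, which makes the collected responses easier to attribute later.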
And so to go into a little bit more depth on one example use case within the pharmacy space: as I mentioned, one of the problems that the pharmacy is well situated to solve, and to support patients with, is medication nonadherence. CVS has different products and services that we can offer to patients that help with barriers they might face to medication adherence. Barriers may include forgetfulness, cost, or access, especially now with COVID. Examples of those services could be reminding a patient to fill their medication, reminding a patient to pick up their medication, providing counseling when we see an indication that the patient is having side effects and has dropped off therapy, or proactively offering to refill or get new prescriptions when a patient has run out of fills on their prescription. And in times like right now with COVID, offering delivery, or even free delivery, on medications is a great service for patients that might have some kind of access barrier. The goal is to first understand the patient’s profile. So what is their adherence profile? Are they truly a nonadherent patient, or is it possible that they no longer need the medication, or that their prescriber has put them on a lower dosage, so they don’t need to come to the pharmacy as often? Those are all conditions in the data that we can leverage to better understand the timing of when patients need to come to the pharmacy, so that we can better prompt them in the way that’s most meaningful and relevant for the patient.
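As a rough illustration of the timing logic described here, the sketch below estimates when to prompt a patient from the last fill date and the days of supply. The function names, the three-day lead time, and the seven-day grace period are assumptions made for the example. It also deliberately ignores the cases mentioned above, such as a prescriber lowering the dosage or the patient no longer needing the medication.

```python
from datetime import date, timedelta


def next_outreach_date(last_fill: date, days_supply: int, lead_days: int = 3) -> date:
    """Date to prompt the patient: `lead_days` before the supply runs out."""
    run_out = last_fill + timedelta(days=days_supply)
    return run_out - timedelta(days=lead_days)


def is_late(last_fill: date, days_supply: int, today: date, grace_days: int = 7) -> bool:
    """True if the supply ran out more than `grace_days` before `today`."""
    run_out = last_fill + timedelta(days=days_supply)
    return today > run_out + timedelta(days=grace_days)
```

A late refill on its own is only a signal; deciding whether the patient is truly nonadherent would need the additional context the talk describes.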
And then, once we know the kind of service or intervention that the patient might need, we also want to package that for the patient in a way that makes the most sense to them. So things like choosing the best channel to reach out to the patient, like Raghu mentioned, and customizing the messaging so that we’re reaching a patient in the language that they prefer and using content that speaks to the patient and really helps them to engage. A lot of this work has also been about creating a longer-term approach for streamlining and offering a more omnichannel experience for patients. So we’ve partnered very closely with the IT systems at CVS (mumbles) the delivery channels to ensure a more streamlined delivery system, as well as increasing the personalized content through these channels.
– So, here’s an example of a typical retail front store use case, right?
We send an offer to the customer through one of the channels that we covered before, and that influences the next trip to the store. The customer presents that offer, redeems that coupon, and then also expands the basket size during that trip. So this is just a typical retail customer journey.
And the problem statement that we have is this: growing the engagement of our most valuable customers is really important for any retail organization, right? Reengaging the lapsed customers and engaging all active product shoppers is really the core of the problem. That’s what we would like to solve using the personalization we have built. So the solution that we came up with was, okay, we need to personalize the communication and provide the most relevant and exciting offers using the data from the customer profile, the 360-degree view of the customer that I talked about earlier. So how do we do this? First, we track the customer’s behavior for the past year, and based on the transactions we understand that behavior in terms of the brands and categories that the customer already purchases. Then we figure out the customer’s affinity to various other brands and categories. Next, we identify their recent purchase patterns on the affinity brands identified earlier, and we evaluate which offers match or correlate with the customer behavior that we predicted in the earlier steps. And finally, we take all of the data and use it to identify the customer’s probability of buying a particular product or redeeming a coupon, using various machine learning algorithms. So that’s really the core of our personalization in general: to predict the behavior of the customer and then make our offers more relevant and more personable.
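The first steps above, tracking a year of transactions and deriving category affinity, can be sketched as follows. This is a simplified stand-in and not the production feature logic: it treats affinity as a plain share of purchases, and the function name and record layout are assumptions.

```python
from collections import Counter, defaultdict
from datetime import date, timedelta


def category_affinity(transactions, today, lookback_days=365):
    """Share of each customer's purchases per category in the lookback window.

    `transactions` is an iterable of (customer_id, category, purchase_date).
    Returns {customer_id: {category: share}}, where shares sum to 1.
    """
    cutoff = today - timedelta(days=lookback_days)
    counts = defaultdict(Counter)
    for customer_id, category, purchased_on in transactions:
        # Ignore purchases that fall outside the one-year window.
        if cutoff <= purchased_on <= today:
            counts[customer_id][category] += 1
    return {
        cust: {cat: n / sum(cats.values()) for cat, n in cats.items()}
        for cust, cats in counts.items()
    }
```

In a fuller pipeline, shares like these would be one group of input features to a propensity model rather than the final score.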
So we started out using linear models, like logistic regression, but due to the performance issues, which we are going to cover in the next slide, we switched to ensemble models, which are nonlinear and are actually performing really well. And finally, we take all of these outputs and we optimize based on the constraints we have, such as budgetary constraints. Obviously, we cannot give unlimited coupons. So we optimize accordingly and then send out those offers to the customer.
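The talk does not say how the optimization under budgetary constraints is done. As one possible illustration, the sketch below uses a simple greedy heuristic: rank candidate offers by expected value per unit of cost and issue them until the budget runs out, with at most one offer per customer. The function name and the record fields are hypothetical, and `cost` is assumed to be positive.

```python
def allocate_offers(candidates, budget):
    """Greedy selection of offers under a total budget.

    `candidates` is a list of dicts with keys:
      customer_id, offer_id, propensity (0..1), value, cost (> 0).
    Returns (list of (customer_id, offer_id), total cost spent).
    """
    # Highest expected value per unit of cost first.
    ranked = sorted(
        candidates,
        key=lambda c: c["propensity"] * c["value"] / c["cost"],
        reverse=True,
    )
    chosen, spent, served = [], 0.0, set()
    for c in ranked:
        # Skip customers who already have an offer and offers that do not fit.
        if c["customer_id"] in served or spent + c["cost"] > budget:
            continue
        chosen.append((c["customer_id"], c["offer_id"]))
        spent += c["cost"]
        served.add(c["customer_id"])
    return chosen, spent
```

A greedy pass like this is easy to reason about but is not guaranteed to be optimal; an integer programming formulation would be the usual next step if the gap mattered.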
So typically, at CVS, we don’t measure our performance using sales metrics. As a healthcare-focused company, we always try to measure our performance using healthcare adherence metrics, and we saw a huge uptick of 1.6% in adherence, based on various analyses that our measurement team has performed. So yeah, we always evaluate our performance on healthcare adherence metrics versus any sales scores.
– So now that we’ve given you an in-depth look at personalization at CVS, we’ll use the next couple of minutes to talk about the growth that was enabled by the cloud-based environment and transition, as well as some of the challenges that we’ve faced along the way.
So first, as I mentioned in the personalization journey, we made that transition to a cloud-based environment in 2019. And as I had mentioned, that transition definitely helped us both to get to market quickly and to expand the application as the use cases grew. Through Azure Databricks, we have the flexibility to spin up clusters that meet the unique needs that we have to support various business use cases. We’re also not restricted by the physical hardware constraints that I had mentioned. And we really don’t need to spend as much time worrying about tuning as we did when we were very constrained. Another thing to call out is just the ease of use.
Databricks, in general, centralizes all of the assets that developers need to make their jobs easy. Putting interactive notebooks, cluster management, performance monitoring, and the data metastore in one place really makes it easy to onboard people and quickly grow the team.
It also makes supporting growing use cases that much easier, with less of a focus on infrastructure support and more focus on the work itself.
Obviously, with the growth that we’ve experienced over the past couple of years, we’ve certainly encountered many challenges along the way, many of which we continue to tackle to this day. Even though we have transitioned to a cloud-based environment, there are always going to be challenges with handling big data. So the team has explored different optimizations like Delta and partition sizing, really focusing on optimal cluster usage.
And we’ll continue to push forward in that regard. A second area that has been a core focus of the team is cost management. As the team has grown quickly, as you saw in that graph, so have our cloud costs. There are a lot of different things that we can be doing on our side, and Azure Databricks helps to enable some of those things. We can spin up different types of clusters for different types of jobs. Your ETL jobs can use a different cluster and a different amount of compute capacity than your model training and your feature creation jobs. There’s also a lot that we can do from a developer standpoint. How do we promote best practices within our development team, such as code optimization and leveraging a sample data environment, which we’ve created for the team to use? We have the capability to auto-scale our clusters, but we are really trying to max out cluster usage to the best of our ability. Then there are also some great features provided by Databricks, with more continually added. So things like cluster policies and pools, and job clusters versus interactive clusters, are things that we want to continue to explore to really make sure that we’re making the most of the environment and features that are available to us.
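As a small illustration of sizing clusters by job type, the sketch below returns a different specification for ETL, feature creation, and model training jobs. The field names mirror those used in Databricks cluster specifications (`node_type_id`, `autoscale`, `min_workers`, `max_workers`), but the VM sizes, worker counts, and the `cluster_spec` helper itself are made-up examples rather than the team's settings.

```python
# Illustrative profiles only; real sizing depends on data volume and budget.
PROFILES = {
    "etl": {"node_type_id": "Standard_DS3_v2", "min_workers": 2, "max_workers": 8},
    "feature_creation": {"node_type_id": "Standard_E8s_v3", "min_workers": 4, "max_workers": 16},
    "model_training": {"node_type_id": "Standard_F8s_v2", "min_workers": 2, "max_workers": 12},
}


def cluster_spec(job_type: str) -> dict:
    """Build a cluster specification dict for the given job type."""
    if job_type not in PROFILES:
        raise ValueError(f"unknown job type: {job_type}")
    profile = PROFILES[job_type]
    return {
        "cluster_name": f"{job_type}-job-cluster",
        "node_type_id": profile["node_type_id"],
        # Auto-scaling bounds cap the spend while letting busy jobs grow.
        "autoscale": {
            "min_workers": profile["min_workers"],
            "max_workers": profile["max_workers"],
        },
    }
```

Keeping the profiles in one place makes it easier to review cost-related settings across a team than having every notebook or job define its own cluster.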
Another area that has been a challenge is the evolving nature of the technology itself. While cloud-based technology has been around for many years, it’s still relatively new within the healthcare industry and at CVS specifically. So while we always want to push the boundaries of innovation within CVS, we still have to take into consideration security, compliance, and healthcare regulation, all of which it is critical that we meet guidelines for. So we want to test and try new services, but we make sure that we’re doing that within the constraints of the environment that we’re in.
And then one final challenge area, just broadly, and Raghu will probably jump into a little bit more around some of the challenges specific to the machine learning journey itself, has been finding talent and continuing to develop the talent on the team. With the technology being relatively new and constantly changing, it’s difficult to find qualified people to join the team, especially when we’re growing super quickly, who have that cross-functional skill set of data science and data engineering, but also an understanding of DevOps best practices. So we’ve been working both to grow the training opportunities that we present to the team and to give more informal opportunities to play around with new technology, test it, and share across the team. So as Raghu’s team is focusing on some new services within Azure, or other types of tools and technology, sharing that information across the teams helps to cross-train and skill everybody up.
– Thank you, Michelle. So while we had our own challenges in terms of build and performance, we also had a few new challenges during our machine learning journey, which I’m going to cover quickly in this slide.
I would like to highlight four areas where we had our own set of challenges, how we overcame them, and what kinds of solutions we came up with, so that it would be helpful for other organizations that are starting their machine learning journey.
So when it comes to feature engineering, we have our own pipeline, and with any feature engineering that is heavy and complex, you need quite a lot of computing power.
So having a centralized feature store, which can be leveraged across many use cases, is going to be really, really helpful.
Otherwise you would be recreating the same features, or using more features than needed, which could lead to overfitting issues. In the model training area, like I mentioned, we have a lot of segments, micro segmentation, so each use case requires many models to be trained and implemented in parallel.
That requires a parallelized pipeline, and we cannot generalize these use cases. A robust production pipeline is required as well, for which we were using a native Databricks jobs pipeline. And then when it comes to the model selection process, we initially went with, for example, K-means and logistic regression. However, we quickly realized that logistic regression doesn’t do well with sparse data and leads to overfitting issues, and similarly K-means is not suited to solving the high dimensionality problem. So, like I mentioned earlier, we switched to nonlinear models like XGBoost. In terms of implementation, tuning, and pipeline integration, there were various challenges involved that we were able to solve.
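The pattern of training many per-segment models in parallel can be sketched in a few lines. This is a toy illustration and not the team's pipeline: the "model" here is just a smoothed redemption rate standing in for a real learner such as XGBoost, a thread pool stands in for a distributed job, and all names and the prior values are assumptions.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def train_segment_model(rows, prior_rate=0.05, prior_weight=20):
    """Toy 'model': a smoothed redemption rate for one segment.

    Shrinking toward a prior keeps tiny segments from fitting a handful
    of observations too closely.
    """
    redeemed = sum(r["redeemed"] for r in rows)
    return (redeemed + prior_rate * prior_weight) / (len(rows) + prior_weight)


def train_all_segments(rows, max_workers=4):
    """Group rows by segment and train one model per segment concurrently."""
    by_segment = defaultdict(list)
    for r in rows:
        by_segment[r["segment"]].append(r)
    segments = list(by_segment)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with `segments`.
        models = pool.map(train_segment_model, [by_segment[s] for s in segments])
        return dict(zip(segments, models))
```

The same group, train, and collect shape carries over when each segment's training function is a heavier model and the executor is replaced by a cluster scheduler.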
And then in terms of implementation, initially we went with a very manual process, and we recently started exploring MLflow to be able to manage our machine learning life cycle.
So implementation is really key as we grow as a team.
And when it comes to collaboration, one of the things that we quickly identified is that people always think of data scientists and data engineers, but there is a role in between, which is the ML engineer role. We quickly identified that and increased the cross-training to fill that gap as well, which has clearly helped us scale much more quickly and close the gap between data engineering and data science.
So with those challenges in mind, I would like to give you a quick sneak peek at how we are going to solve the challenges that I have just explained in the future.
After reviewing our challenges and how our core pipeline was performing, we quickly realized that we need to use the right tool for the right job.
So we have a lot of tools that we’re exploring. I will quickly point out one huge change in thinking, which is that we’re going to switch from CPU-based machine learning to GPU-based machine learning. We’re going to use GPUs as a compute resource for both our training and inference, and we’re exploring RAPIDS. Another area that we’re exploring is Kubernetes. With Kubernetes orchestration, we could integrate multiple tools into our production pipeline without being boxed into one single tool. So that is an area we are exploring as well.
– And from the pharmacy side, as the use cases continue to evolve, there’s an increasing interest in enabling more real-time use cases. So we’ve been exploring things like Azure Event Hubs, Kafka, Azure Stream Analytics, and Azure Functions to try to enable a more real-time, personalized approach for patients. So with that, we hope that this presentation was helpful in providing a bit of context on how we think about personalization through the healthcare and retail lens at CVS Health.
And with that, we’ll be happy to take any questions.
Michelle Un is a Director of Data Engineering at CVS Health. Michelle is currently focused on the productionalization and deployment of machine learning models to personalize and improve the patient experience for CVS™ retail pharmacy customers. Previously, Michelle supported CVS pharmacy's compliance analytics workstreams, including outlier detection and visualization, and has also worked in the public policy and non-profit sector, always with a passion for using data to answer challenging problems.
Raghu Nakka leads the Front Store Data Engineering team at CVS Health and is responsible for building and maintaining the Front Store Personalization Engine. His big data journey with CVS Health began in 2013, when he initially built the big data framework for PBM Book of Business and Finance applications and later served as a lead big data architect for Front Store personalization. Before CVS Health, Raghu worked as a consultant with finance and healthcare companies.