Background: Classifying diseases into ICD codes has mainly relied on humans reading many written materials, such as discharge diagnoses, chief complaints, medical history, and operation records, as the basis for classification. Coding is both laborious and time-consuming: a professionally trained disease coder takes about 20 minutes per case on average. An automatic code classification system can therefore significantly reduce human effort. ICD-10 (International Classification of Diseases, 10th Revision) is a classification of diseases, symptoms, procedures, and injuries. Diseases are often described in patients’ medical records in free text, using terms, phrases, and paraphrases that differ significantly from those used in the ICD-10 classification.
Objectives: This paper aims to construct a machine learning model for ICD-10 coding that automatically determines the corresponding diagnosis codes based solely on free-text medical notes.
Methods: This paper applies Natural Language Processing (NLP) and a Recurrent Neural Network (RNN) architecture with a self-attention mechanism and transformers to classify ICD-10 codes from natural language texts with supervised learning.
Results: Our predictions reach an F1-score of 0.82 on ICD-10-CM codes in experiments on extensive teleconsultation data.
Conclusion: The developed model can significantly reduce the coding time required compared with a professional coder.
Nirav Kumar: Hello friends. This is Nirav Kumar. I lead data science at Halodoc. First of all, I would like to thank Databricks for organizing this and giving us an opportunity to present our paper here. I will be joined by my colleague, Joinal, who has been a data scientist at Halodoc, and we both are going to talk about how, through AI, we are assigning ICD-10 codes to tele-consultations. As part of the agenda, first, I will take you through what we are doing at Halodoc and give you some statistics on how we fare in Indonesia. Second, I will take you through the machine learning aspects, the work that we are doing using AI. And third, we are going to go through how we are utilizing AI for insurance claim adjudication and how we use the International Classification of Diseases for claim processing. I will take you on this journey up to the third point, insurance claim adjudication, whereas the rest of the journey, with regard to building the solution, will be taken over by my colleague Joinal.
Let me give you an idea about Halodoc. We, as an organization, are working towards simplifying access to healthcare, and as part of that we are one of the most predominant players in the Indonesian market. We are the number one health tech company, and we have 18 million active users and we have 38 million monthly active users. And when we compare that to the population of Indonesia, we are well spread across the entire country. What are the services we as an organization are providing? As part of tele-consultation, we have chatting with the doctor. As part of pharmacy delivery, we have health stores where people can come in, buy medicine for their healthcare needs, and have it delivered to their home. We provide appointments, we provide lab tests. We are also an insurance TPA service, by means of which we work with almost all of the insurance providers in Indonesia.
And we do the claim adjudication on behalf of the insurance provider. I just wanted to give you an idea of how our tele-consultation services are provided across the entire country. As you can see, these red dots cover practically the entire country, and this surely gives you an idea of the access as well as the demand for Halodoc within the Indonesian market. As part of machine learning at Halodoc, we ensure quality of care through the SOAP protocol, which is subjective, objective, assessment, and plan, as well as etiquette, as a process that we follow. We have developed article-related recommendation systems. We do order fulfillment, wherein the order comes from the nearest store. We identify cohorts of users for marketing. Those could be mothers, those could be caretakers.
As part of personalization, we give the user coming onto our platform an opportunity to interact with the set of doctors who are related to the ailment that the person is having. We use ICD-10 for insurance adjudication. We have also been utilizing optical character recognition for automatically identifying users when they upload their KTP cards. And we filter obscenity when people come for tele-consultation with the doctors, and we do upselling or cross-selling when people put items into the cart. Machine learning at Halodoc has been at the forefront of all the technology development that we do. And a good part of the company is utilizing this as a differentiator across the Indonesian market.
This is where I really want to take you through how we are utilizing ICD-10 for claim adjudication. ICD is the International Classification of Diseases. And as part of the industry, we have manual coders who identify these codes for any insurance claim that comes in. Typically, for every ailment that a person comes in with, and the doctor notes that exist for it, it takes around 20 minutes for a coder to identify the right codes. This is where machine learning at Halodoc makes a difference, wherein, within seconds, using AI, we are able to infer the code based on the notes that the doctor has prepared while chatting with the patient.
Let me give you an idea of how the ICD-10 code is organized. Generally it tells us about the diagnosis and the procedures. Here, since we are on the tele-consultation side, we are more focused on the diagnosis side, and the use cases that we have are mostly towards billing and insurance. And as you can see, there is a well-defined approach to representing diseases. If we look at the anatomy of the code, the first part gives the section, the second part gives the part of the body, and similarly, what kind of body region, what is the approach, what is the device, and the qualifier. So these specifically represent the category, the etiology, and the extension. So just for an example, if the code is S86.011D, S signifies the injuries, the poisoning that would have happened. Now, 86 specifically gives an idea about injury of the muscle, the fascia, and the tendon at the lower leg.
When you look at the second part, which is after the dot, the 011 tells me that it is a strain of the right Achilles tendon. So as I was saying, it tells us about the body part, which body part. This is what is captured here, and when we look at the extension, D signifies that this is not the first time you had this, it’s a subsequent encounter. So this code gives full information about the ailment the person is having. And this is what our model predicts automatically. Now I want to take you through how we have designed our system. As part of the training, we take the doctor notes.
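The anatomy of the example code can be illustrated with a small parser. This is a minimal sketch, not Halodoc's implementation: the `IcdCode` type, its field names, and the `parse_icd10cm` function are hypothetical, and the split simply assumes the layout described in the talk (three-character category, up to three characters after the dot, optional seventh-character extension).

```python
from typing import NamedTuple, Optional


class IcdCode(NamedTuple):
    """Hypothetical container for the three parts described in the talk."""
    category: str             # first three characters, e.g. "S86"
    detail: str               # characters 4 to 6 (after the dot), e.g. "011"
    extension: Optional[str]  # seventh character, e.g. "D", if present


def parse_icd10cm(code: str) -> IcdCode:
    """Split an ICD-10-CM style code into category, detail, and extension.

    Dots and spaces are ignored, so "S86.011D" and "S86.011 D" parse the same.
    This checks only the length, not whether the code actually exists.
    """
    compact = code.replace(".", "").replace(" ", "").upper()
    if not 3 <= len(compact) <= 7:
        raise ValueError(f"unexpected code length: {code!r}")
    category = compact[:3]
    detail = compact[3:6]
    extension = compact[6] if len(compact) == 7 else None
    return IcdCode(category, detail, extension)


example = parse_icd10cm("S86.011D")
print(example)  # IcdCode(category='S86', detail='011', extension='D')
```

For the example from the talk, `category` is `S86` (injury of muscle, fascia, and tendon at the lower leg), `detail` is `011` (strain of the right Achilles tendon), and `extension` is `D` (subsequent encounter).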
We have labeled our doctor notes with different ICD codes, and this labeling has been done on approximately 150K records. Once we have the labels on the doctor notes, we feed them into the neural network model that we have, and the result is the ICD-10 multi-level classifier model that we have developed. Then, as part of inference, when a doctor note becomes available, the same set of feature extraction steps is applied and this ICD-10 multi-level classifier gives us the predicted ICD-10 code. Now I will pass the baton on to Joinal, who will take you through the journey of classification. Over to you, Joinal.
Joinal Ahmed: Thank you, Nirav, for the introduction and for explaining the solution, why we are building it, and also what we are doing at Halodoc for simplifying healthcare in Indonesia. Going forward, I’ll walk everyone through how we are creating the models and getting the data to train the models. As Nirav said, we have about 150K consultations tagged, and we are also generating a huge number of tagged consultations and insurance claims every day. To do this, we are leveraging AWS technologies. Specifically, since that data is stored in multiple RDS instances, we first use AWS Database Migration Service to dump the raw data to an S3 bucket. From there we use Lambda to trigger our data processor, for which we are also leveraging Apache Hudi and Spark to process our data. We then bring all the data together, apply basic transformations, and write the data to an S3 bucket. Here, all our insurance claims, which are tagged by maybe a claim analyst or a doctor, are picked up by these services and then processed on an EMR cluster using Hudi and Spark and then dumped to an S3 bucket.
And from there, the data scientists pick up the data and train our models. And this is also used for real-time inference. A bit on, or a bird’s-eye view of, [inaudible]: here we use deep neural nets with an attention mechanism. So to start with, you have an embedding layer. Since these are texts [inaudible] in Bahasa, we first convert these texts into vectors, and then we feed these to a BiLSTM [inaudible] so that we capture the context information from these combinations of words. And once we have the vectors generated by the BiLSTM layer, we then feed these to a Label Attention mechanism. Why we do this: the Label Attention mechanism helps us to capture all the important words in the sentences. And once we have these words, we then use sigmoid to make n binary classification decisions, where for each of the ICD-10 codes we are deciding if this code can be assigned to a consultation or not.
This also serves another purpose: because of the sheer number of ICD-10-CM codes, it would be hard to do multi-class classification. And also, by their nature these consultations can mostly have more than one ICD-10 code, so these binary decisions help us to tag more than one code to a consultation. Going a bit deeper into each of the layers, first, we leverage Gensim, thanks to them for the wonderful library, and we use [inaudible] Word2Vec. For this, we use a combination of both the doctor notes and the standard definitions for each of the codes as a corpus. And then we train our Word2Vec model of size 256. For this number we ran experiments and came to the decision that 256 is the best dimension, which both gives us… They can see for this model [inaudible] also represents the context well, and then from this model, we generate the vector embeddings for all the doctor notes.
Yeah, a bit on how Word2Vec works. So we use a CBOW Word2Vec network. What it does is, given a word, it uses n future words as well as some past words to generate an embedding for that word. And for our use case, we had to train this model from scratch and not use any existing models, because all our consultations or transactions that are happening on the platform are in Bahasa. Yeah. Going forward. Once we have these vector embeddings generated using the Word2Vec model, we then pass these vector embeddings on to a BiLSTM layer. This BiLSTM layer then helps us to capture the contextual information of the words.
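The CBOW idea described here, using surrounding past and future words to learn about a center word, can be sketched in plain Python. This is only an illustration of how CBOW training examples are formed, not the model used in the talk: `cbow_pairs`, `average_vectors`, the window size, and the toy Bahasa tokens and vectors are all assumptions made up for the example.

```python
from typing import Dict, List, Tuple


def cbow_pairs(tokens: List[str], window: int) -> List[Tuple[List[str], str]]:
    """Build (context words, target word) pairs for CBOW.

    For each position, the context is up to `window` past words and up to
    `window` future words; the word at that position is the target.
    """
    pairs = []
    for i, target in enumerate(tokens):
        past = tokens[max(0, i - window):i]
        future = tokens[i + 1:i + 1 + window]
        pairs.append((past + future, target))
    return pairs


def average_vectors(vectors: List[List[float]]) -> List[float]:
    """Average the context word vectors, as CBOW does before predicting the target."""
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


# Made-up tokenized note ("patient fever high and cough") and toy 2-d vectors.
tokens = ["pasien", "demam", "tinggi", "dan", "batuk"]
toy_vectors: Dict[str, List[float]] = {
    "pasien": [0.1, 0.3], "demam": [0.9, 0.2], "tinggi": [0.7, 0.1],
    "dan": [0.0, 0.0], "batuk": [0.8, 0.6],
}

for context, target in cbow_pairs(tokens, window=2):
    hidden = average_vectors([toy_vectors[w] for w in context])
    print(target, context, hidden)
```

In a real Word2Vec model the averaged context vector is passed through an output layer that is trained to predict the target word, and the learned input vectors become the embeddings (256-dimensional in the talk).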
And also, since BiLSTM or LSTM by nature takes care… they have some hidden states which capture the past information. So if a consultation has, by nature, multiple listings or multiple sentences, this information is kept intact, and then we are able to make a better decision. Next, when we have the output from the BiLSTM layer, which captures the context, we then pass these vectors on to the Label Attention layer. This works on top of the attention mechanism. So what this does is, for a given input and for a specific label vector, it tries to give the words in the sentence which have more contextual information, which are more meaningful… For example, if a sentence has a word like fever or maybe high temperature, these words would be highlighted.
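A label-wise attention step of this kind can be sketched as follows. The talk does not give the exact formulation, so this assumes a common dot-product variant: each word's hidden state is scored against a label vector, the scores are normalized with softmax, and the weights are used to build a label-specific summary. The function names, the toy hidden states, and the label vector are hypothetical.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def label_attention(
    hidden_states: List[List[float]], label_vector: List[float]
) -> Tuple[List[float], List[float]]:
    """Return (attention weights per word, label-specific context vector).

    Assumed form: score_t = label_vector . h_t, weights = softmax(scores),
    context = sum_t weights_t * h_t.
    """
    scores = [sum(h_i * u_i for h_i, u_i in zip(h, label_vector)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return weights, context


# Toy stand-ins for BiLSTM outputs of three words and one label vector.
words = ["pasien", "demam", "tinggi"]
hidden = [[0.1, 0.0], [2.0, 0.5], [1.0, 0.2]]
fever_label = [1.0, 0.0]

weights, context = label_attention(hidden, fever_label)
for word, weight in zip(words, weights):
    print(f"{word}: {weight:.2f}")
```

The per-word weights are what makes it possible to show which words drove a particular code: the word with the largest weight for a label is the one that label's attention highlighted.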
And then we can also use these later on to explain why a specific code was assigned. So we can use explainable AI in this case too, to figure out why a decision has been made or why a specific code has been assigned to a consultation. At last, once we have these vectors, we then pass them on to a label classification layer. [inaudible] for this, so this label classification layer basically makes binary decisions. So for each of the codes in our corpus or in our data, it will make a binary decision on whether this code can be assigned or not. And this also helps us to tag more than one code. For example, if someone has a fever, by nature he might also have a headache. He might also have nausea, so we can tag a consultation with multiple secondary diagnoses as well. And this helps in insurance, because if we only tag one code, most of the other medicines on his prescription might not be covered. But once you tag all the possible diagnoses that have been made here, the insurance claim the patient gets is larger. And, yeah.
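The per-code binary decision can be sketched with a sigmoid and a threshold. This is a minimal illustration under stated assumptions: the `assign_codes` function, the 0.5 threshold, and the logit values are made up, and the three codes are used only as illustrative labels for fever, headache, and nausea.

```python
import math
from typing import Dict, List


def sigmoid(x: float) -> float:
    """Map a raw score (logit) to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def assign_codes(logits: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Make one independent yes/no decision per code.

    Unlike a softmax over all codes, several codes can pass the threshold
    at once, so a consultation can receive more than one code.
    """
    return [code for code, logit in logits.items() if sigmoid(logit) >= threshold]


# Hypothetical model outputs for one consultation.
logits = {"R50.9": 2.1, "R51": 0.4, "R11.0": -1.5}
print(assign_codes(logits))  # ['R50.9', 'R51']
```

Raising the threshold trades recall for precision: with `threshold=0.7` only the first code in this toy example would be assigned.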
Yeah. And for the whole training of the model, we leveraged binary cross-entropy loss and the [inaudible] optimizer. So it helps us to optimize the model further for the specifics of the codes. And on the BiLSTM layer, we also experimented with LSTM. We also experimented with GRUs and [inaudible] layers, so all of them have their own advantages, but in our case, BiLSTM layers helped us the most because they were able to hold more past information and make a better decision for us, and yeah. Going a bit further. Yeah.
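Binary cross-entropy for a multi-label target can be written out directly. This is a sketch of the standard loss, averaged over the codes of one consultation; the function name, the clipping constant `eps`, and the example values are assumptions, and the optimizer used in the talk is not shown because it is inaudible in the recording.

```python
import math
from typing import List


def binary_cross_entropy(targets: List[int], probs: List[float], eps: float = 1e-12) -> float:
    """Mean binary cross-entropy over all codes for one consultation.

    targets[i] is 1 if code i was tagged, else 0; probs[i] is the model's
    sigmoid output for code i. Probabilities are clipped to avoid log(0).
    """
    total = 0.0
    for y, p in zip(targets, probs):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(targets)


# Three codes, two of them tagged for this consultation (made-up values).
targets = [1, 1, 0]
confident = binary_cross_entropy(targets, [0.9, 0.8, 0.1])
unsure = binary_cross_entropy(targets, [0.5, 0.5, 0.5])
print(confident, unsure)
```

Because each code contributes its own term, the loss matches the layer of independent binary decisions: a confident, correct prediction for every code drives the loss toward zero, while predicting 0.5 everywhere gives a loss of ln 2.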
For the training, testing, and validation on the data, we were fortunate enough to tune the model to a validation accuracy of 96%. And in production, the model is also giving very good results, and we are sustaining around 94% in production. And we are continuously working on it to make the predictions better and also to add more ICD-10 codes that we can support. [inaudible] Yeah. Here’s a bit about ourselves. Yeah. We also run a blog where we document all the work we do and how this is impacting, or at least the tech part of it, how it is impacting the health care remuneration. Yeah. That’s all. Yeah. We are open to questions now, if…
Senior Data Scientist with 4+ years of experience in Data Science with strong technical understanding. Experience ranging from understanding business requirements to translating into data-driven solut...
Nirav Kumar is the Director of Data Science at Halodoc. With 8+ years of experience across both Data Science and Data Engineering, responsible for development of new insights, advanced modeling techni...