Accelerating clinical trials to improve patient outcomes
Mendel uses Databricks Mosaic AI to improve clinical trial matching
Model training to provide patient insights
Mendel, an AI leader in healthcare and life sciences, tackles the challenges of fragmented clinical data with their HyperCube platform, which enables precision in tasks like clinical trial matching. However, the complexity of building disease-specific ontologies and managing data across multiple systems slowed progress. To overcome these hurdles, Mendel leveraged Databricks Mosaic AI to accelerate model training, reducing the timeline from three months to just one, while ensuring HIPAA compliance and enhancing the accuracy of their AI solutions.
Overcoming the complexities of ontology building
Co-founded by a physician and a computer scientist in 2017, Mendel is at the forefront of using data and AI to revolutionize the Healthcare and Life Sciences (HLS) sector by simplifying the search and analysis of clinical data. Their mission is to accelerate clinical data workflows, enabling healthcare providers and researchers to quickly and accurately identify patient cohorts, match candidates for clinical trials and support precision medicine initiatives. By working closely with diagnostic companies that partner with biopharmaceutical firms, Mendel plays a crucial role in streamlining the complex processes involved in drug development and patient care.
However, despite their advanced AI capabilities, Mendel was unable to fully realize their mission due to several key challenges. One of the most pressing issues was the fragmentation of clinical data across various systems, including electronic medical records (EMRs) and different health networks. This data was often siloed, making it difficult for users to perform real-time queries or extract actionable insights within Mendel’s HyperCube product. For example, when clinical researchers needed to find specific patient cohorts based on factors like cancer stage or medication usage, the lack of a unified data structure made it nearly impossible to retrieve relevant information in HyperCube efficiently. This fragmentation not only slowed down clinical workflows for Mendel’s clients but also limited the accuracy and effectiveness of the insights provided to their partners in diagnostics and biopharma.
Adding to the challenge, Mendel faced the daunting task of building disease-specific ontologies, which are essential for semantic querying of clinical data. Developing these ontologies required extensive resources, including teams of data engineers, data scientists and medical experts. The process was both time-consuming and complex, involving the mapping of medical concepts and their relationships into a cohesive knowledge graph. Without a scalable approach to building these ontologies, Mendel struggled to conduct the precise data analysis necessary for tasks such as clinical trial matching and the advancement of precision medicine. Mendel’s customers had requirements of their own as well. Caleb Li, Director of Partnerships and Corporate Development at Mendel, explained, “We wanted to work with Databricks because we found our customers have two key challenges: they want to make sure everything is compliant with the Health Insurance Portability and Accountability Act (HIPAA) when managing data egress, and they want to explore analytics.” Together, these challenges in scaling their AI capabilities kept Mendel from delivering on their promise of accelerating clinical data workflows and supporting the HLS industry.
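To make the idea of semantic querying over an ontology concrete, here is a minimal Python sketch. It is an illustration only and not Mendel's implementation: the concept names, the IS_A table, the patient records and the helper functions descendants and match_cohort are all hypothetical. It shows how a query for a broad concept such as "stage III NSCLC" can be expanded to its more specific sub-concepts before matching patients.

```python
from collections import defaultdict

# Hypothetical toy ontology: child concept -> parent concept ("is_a" edges).
IS_A = {
    "stage_iiia_nsclc": "stage_iii_nsclc",
    "stage_iiib_nsclc": "stage_iii_nsclc",
    "stage_iii_nsclc": "nsclc",
    "nsclc": "lung_cancer",
    "osimertinib": "egfr_inhibitor",
    "erlotinib": "egfr_inhibitor",
}


def descendants(concept, is_a=IS_A):
    """Return the concept plus every concept that is transitively a kind of it."""
    children = defaultdict(set)
    for child, parent in is_a.items():
        children[parent].add(child)
    found, stack = {concept}, [concept]
    while stack:
        for child in children[stack.pop()]:
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def match_cohort(patients, required_concepts, is_a=IS_A):
    """Return ids of patients whose records satisfy every required concept."""
    expanded = [descendants(c, is_a) for c in required_concepts]
    return [
        pid
        for pid, concepts in patients.items()
        if all(concepts & group for group in expanded)
    ]


# Hypothetical patient records, already normalized to ontology concepts.
PATIENTS = {
    "p1": {"stage_iiia_nsclc", "osimertinib"},
    "p2": {"stage_iiib_nsclc"},
    "p3": {"nsclc", "erlotinib"},
}

# Cohort: stage III NSCLC patients who received any EGFR inhibitor.
cohort = match_cohort(PATIENTS, ["stage_iii_nsclc", "egfr_inhibitor"])
print(cohort)
```

In this sketch only patient p1 matches, because the record mentions a specific sub-stage and a specific drug that the ontology links to the broader query terms. A plain keyword search for "stage III NSCLC" would miss that record, which is the gap a disease-specific ontology is meant to close.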
Enhancing model accuracy with Databricks Mosaic AI
To accelerate their generative AI (GenAI) research and better serve their customers, Mendel turned to Databricks, attracted by the company’s strong reputation and the hands-on support provided by the Mosaic AI engineering team. The collaboration was crucial for Mendel’s AI initiatives, particularly their HyperCube product, which required advanced tools for pretraining custom large language models (LLMs) like Llama 3.1. The Mosaic AI engineering team helped Mendel fine-tune these models for greater accuracy and relevance, significantly boosting Mendel’s ability to deliver precise AI solutions in clinical settings.
Mendel started with a partially trained model, continuing its training on public datasets before integrating their proprietary data (around 200 billion raw tokens) into the process. With help from Databricks, Mendel was able to unify their data and ETL pipelines, which further optimized the performance of their AI solutions. The integration of data federation capabilities through the Databricks Data Intelligence Platform also allowed Mendel to seamlessly run queries across multiple data sources, ensuring that models are trained on the most comprehensive and accurate datasets available. As Li noted, “Even though we’re a smaller company, Databricks doesn’t treat us that way. We’re getting the personal attention that a larger company would. Their engineers are almost like an extension of our team and have enabled us to have confidence that we’re working on the right approach and training on the right dataset.”
Investing in advanced AI projects
Looking ahead, Mendel plans to leverage Lakehouse Federation capabilities within Unity Catalog. This will enable Mendel to manage and audit data access seamlessly across multiple sources, ensuring that their federated queries are both secure and efficient. The Mendel team anticipates that Databricks will continue to play a critical role in integrating new technologies and helping Mendel navigate complex regulatory environments. The close collaboration positions Mendel to further solidify their standing as a leader in healthcare AI, with Databricks providing the essential tools and expertise needed to succeed. Li concluded, “As Databricks expands their reach in life science and clinical data, it gives us confidence that our partnership is headed in the right direction.” To learn more about Mendel’s mission to help healthcare practitioners learn from patient journeys in real time, visit www.mendel.ai.