Training 10,000 Anomaly Detection Models on One Billion Records with Explainable Predictions

Published: June 26, 2025

Energy · 4 min read

Summary

  • DAXS (Detection of Anomalies, eXplainable and Scalable) uses transparent, scalable machine learning to detect equipment failures in manufacturing, enabling proactive maintenance and reducing downtime.
  • The approach leverages the ECOD algorithm for clear, actionable insights and employs Databricks’ cloud platform to efficiently process billions of sensor records across thousands of assets.
  • By standardizing and scaling predictive maintenance, manufacturers can improve quality control, cut costs, and quickly adapt the solution across multiple sites and asset types.

The Power of Anomaly Detection Across Industry

Anomaly detection is a crucial technique for identifying unusual patterns that could signal potential problems or opportunities. Early uses included detecting intrusions in cybersecurity and identifying potential fraud in finance; today its applications span healthcare patient monitoring, telecommunications network maintenance, and more. In manufacturing specifically, anomaly detection has transformed quality control and operational efficiency by identifying deviations from expected patterns in real-time production data.

Advancing Data and Analytics in Manufacturing

Manufacturers have embraced data analytics for decades, using statistical process control and Six Sigma methodologies to optimize production, and change point detection to guide machinery maintenance. While these approaches revolutionized quality in the 1980s and 90s, today's connected machinery generates orders of magnitude more data, from vibration sensors to thermal readings. This exponential increase in real-time data has pushed manufacturers to adopt sophisticated techniques that analyze thousands of variables simultaneously, extending Six Sigma principles to a scale impossible with traditional statistical methods. For instance, vibration and tension sensors on elevators can reveal early signs of mechanical wear, while turbines equipped with temperature and speed sensors can flag performance drops that might indicate impending part failure. Addressing these issues ahead of time reduces downtime, keeps equipment running smoothly, and makes critical production deadlines easier to meet.

The Challenges Moving Beyond Statistics

Despite these potential benefits, implementing machine learning for predictive maintenance presents several challenges:

  1. Scalability: Industrial environments generate massive amounts of data, often reaching billions of records. Creating and managing thousands of individual models across numerous assets or facilities requires both substantial computational resources and efficient algorithms to keep processing costs from becoming prohibitive.
  2. Explainability: Many advanced machine learning models operate as "black boxes," offering little insight into how they make predictions. For maintenance engineers and operators, understanding which specific component is causing an anomaly is crucial for timely and effective interventions. Sensor-level explanations supply that understanding: knowing that "Sensor 5's temperature is above 80°C" tells an engineer where to look and what to act on.
  3. Cost and Complexity: The computational costs and complexity associated with large-scale machine learning can be substantial. Organizations need solutions that are not only effective but also cost-efficient to implement and maintain.

The DAXS Methodology

To address these challenges, DAXS (Detection of Anomalies, eXplainable and Scalable) has been developed as an anomaly detection technique that provides an explainable, scalable, and cost-effective approach to predictive maintenance in manufacturing. DAXS utilizes the ECOD (Empirical Cumulative Distribution Functions for Outlier Detection) algorithm to detect anomalies in sensor data. Unlike traditional black-box models, ECOD offers transparency by identifying which specific sensors or features contribute to an anomaly prediction. DAXS can handle datasets with over a billion records and train thousands of models efficiently, leveraging distributed computing platforms to ensure reliable performance and cost efficiency.
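To make the idea behind ECOD concrete, here is a minimal, simplified sketch in plain Python. It is not the DAXS implementation: the function names `fit_ecod` and `score_ecod`, the sensor names, and the synthetic data are all hypothetical. It follows the published outline of the algorithm: estimate left-tail and right-tail probabilities for each sensor from its empirical distribution, take negative logs, and sum across sensors. The per-sensor terms double as the explanation.

```python
import math
import random
from bisect import bisect_left, bisect_right


def fit_ecod(train_rows):
    """Store each sensor's sorted training values and the sign of its skewness."""
    n_features = len(train_rows[0])
    columns = [sorted(row[j] for row in train_rows) for j in range(n_features)]
    skew_signs = []
    for col in columns:
        mean = sum(col) / len(col)
        # Third central moment: only its sign is needed to pick a tail.
        third = sum((v - mean) ** 3 for v in col)
        skew_signs.append(-1 if third < 0 else 1)
    return {"columns": columns, "skew_signs": skew_signs}


def score_ecod(model, x):
    """Return (anomaly score, per-sensor contributions) for one reading x."""
    left, right, auto = [], [], []
    for col, sign, value in zip(model["columns"], model["skew_signs"], x):
        n = len(col) + 1  # count the scored point itself so probabilities stay > 0
        p_left = (bisect_right(col, value) + 1) / n             # share of values <= value
        p_right = (len(col) - bisect_left(col, value) + 1) / n  # share of values >= value
        l, r = -math.log(p_left), -math.log(p_right)
        left.append(l)
        right.append(r)
        # Skewness picks the tail: left for negative skew, right otherwise.
        auto.append(l if sign < 0 else r)
    # The score is the largest of the three aggregated tail scores.
    best = max((left, right, auto), key=sum)
    return sum(best), best


# Hypothetical "normal" history: temperature (°C), vibration (g), rotor speed (rpm).
sensor_names = ["temperature", "vibration", "rotor_speed"]
rng = random.Random(0)
train = [
    [rng.gauss(60, 2), rng.gauss(0.5, 0.05), rng.gauss(1500, 20)]
    for _ in range(500)
]
model = fit_ecod(train)

normal_score, _ = score_ecod(model, [60.5, 0.5, 1495.0])
anomaly_score, contributions = score_ecod(model, [95.0, 0.5, 1500.0])
culprit = sensor_names[contributions.index(max(contributions))]
print(f"normal={normal_score:.2f} anomalous={anomaly_score:.2f} driven by {culprit}")
```

Because the score is a plain sum of per-sensor terms, the largest term names the sensor to inspect first. There are no hyperparameters to tune, which is one reason the approach suits training thousands of models unattended.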

Wind Turbine Demonstration

In this series of notebooks, we show how DAXS can be applied at scale. The task involves monitoring thousands of turbines in the field for potential failures. We demonstrate how 1,440 readings from each of 100 sensors on each of 10,000 turbines can be used to train 10,000 models and make predictions on new readings, all in under 5 minutes. This is achieved through the efficient implementation of ECOD, combined with Databricks' robust capabilities for scaling compute operations.
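The scale of the demonstration can be checked with a few lines of arithmetic, followed by a sketch of the "one model per turbine" pattern. This is an illustration under stated assumptions, not the notebooks' code: it assumes each sensor value counts as one record, and `fit_ranges`, `train_per_turbine`, and the turbine IDs are hypothetical names. The toy `fit_ranges` stands in for a real detector such as ECOD.

```python
from collections import defaultdict

# Figures from the demonstration.
READINGS_PER_SENSOR = 1_440
SENSORS_PER_TURBINE = 100
TURBINES = 10_000

values_per_model = READINGS_PER_SENSOR * SENSORS_PER_TURBINE
total_values = values_per_model * TURBINES
print(f"{values_per_model:,} values per turbine model")  # 144,000
print(f"{total_values:,} values across the fleet")       # 1,440,000,000


def fit_ranges(readings):
    """Toy stand-in for a real detector: the observed min/max of each sensor."""
    return [(min(col), max(col)) for col in zip(*readings)]


def train_per_turbine(rows, fit):
    """Group (turbine_id, reading) rows by turbine and fit one model per group."""
    groups = defaultdict(list)
    for turbine_id, reading in rows:
        groups[turbine_id].append(reading)
    return {turbine_id: fit(readings) for turbine_id, readings in groups.items()}


# Tiny hypothetical sample: two turbines, two sensors each.
rows = [
    ("T-0001", [61.0, 0.48]),
    ("T-0002", [58.5, 0.55]),
    ("T-0001", [59.2, 0.51]),
    ("T-0002", [60.1, 0.53]),
]
models = train_per_turbine(rows, fit_ranges)
print(models["T-0001"])
```

Each turbine's model depends only on that turbine's rows, so the groups can be trained independently and in parallel. On Spark, one common way to express this is a grouped pandas function (`groupBy(...).applyInPandas(...)`); whether the notebooks use that exact mechanism is an assumption here.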

Why Databricks?

Databricks provides an ideal platform for implementing DAXS due to its robust capabilities in handling big data and advanced analytics. With Databricks, organizations can leverage:

  • Unified Analytics Platform: A collaborative environment that integrates data engineering, data science, and machine learning, streamlining workflows and improving productivity.
  • Scalability and Performance: Databricks' scalable computing resources and optimized Spark engine enable rapid processing of large datasets, essential for training models on billions of records.
  • Cost Efficiency: By optimizing resource allocation and utilizing cloud-based infrastructure, Databricks helps reduce operational costs, aligning with DAXS's goal of providing a low-cost solution.
  • Advanced Tooling: Support for popular machine learning libraries and frameworks, allowing for seamless integration of the ECOD algorithm and other advanced analytics tools.

Summary

DAXS (Detection of Anomalies, eXplainable and Scalable) anomaly detection offers a standardized approach to monitoring manufacturing operations at scale. By training models on normal equipment behavior, manufacturers can deploy this technique cost-effectively across multiple production lines, facilities, and asset types. This reusability enables enterprises to quickly implement predictive maintenance and quality control, driving consistent improvements in efficiency and output quality across their operations.
 

Start monitoring your operations for anomalies at scale with DAXS and its explainable anomaly detection.
