CUSTOMER STORY

Accelerating the future of railways with AI

INDUSTRY: Manufacturing
PLATFORM USE CASE: Delta Lake
CLOUD: Azure

Rail transportation plays a pivotal role globally, serving as a lifeline for economies through the efficient and reliable movement of people and goods. Hitachi embodies this mission with innovative railway solutions that contribute to an efficient rail infrastructure and an enhanced passenger experience. But one of the big challenges with railways is maintaining infrastructure. Monitoring and tracking potential issues with overhead lines is traditionally a manual task, which is why Hitachi chose Databricks Lakehouse as the platform on which to build digital solutions and transform the way the company works with data. Now, Hitachi uses AI to identify potential issues in near real-time and alert operators and railway workers, helping rail network owners keep their infrastructure safe. Not only has Databricks Lakehouse helped Hitachi save their customers significant maintenance costs, but it has also greatly sped up the company’s time to market with their ML models and improved the efficiency and effectiveness of their data engineers and scientists. With Databricks Lakehouse, Hitachi is making railways safer and more reliable for their customers everywhere.

Maintaining railways is a constant battle

Keeping railway infrastructure running smoothly is a costly and labor-intensive endeavor. In the past, network operators had to maintain infrastructure manually: walking the tracks to inspect overhead lines on foot, visually checking for potential issues from the window of a moving train, or relying on measurement-taking trains that ran infrequently and could detect only certain types of issues. And any time an overhead line broke, it resulted in delays to train schedules and unplanned disruption costs. Needless to say, this was a time-consuming and costly effort.

Hitachi realized they needed to help network operators by modernizing their trains and rail connecting equipment, so they built an end-to-end solution by installing cameras on trains and uploading the videos to the cloud. But this created another challenge: the company was now generating a huge volume of video data from their trains, capturing images from a total of 40,000 km of overhead lines (a distance roughly equal to one lap around the world!) that they’d have to analyze for line issues. They needed to plan their data strategy carefully, taking into account total cost of ownership, without sacrificing the ability to efficiently scale data engineering given their available resources.

“We set out to innovate and disrupt traditional railway maintenance, leveraging existing train fleets and empowering them with state-of-the-art AI technology,” said Andreas Herman, Lead Data & AI Architect at Hitachi. “For that, we needed a digital solution that was flexible, scalable and cost-efficient.” That’s why Hitachi looked to Databricks Lakehouse.

Lakehouse powers safer and more reliable railways

For Hitachi, using Databricks Lakehouse was a no-brainer. Databricks was a fully managed platform, which meant Hitachi wouldn’t need to support Apache Spark™ or the data infrastructure, and it also addressed all of their AI needs. On top of that, Databricks was simple to use, and its scalability would allow the platform to handle the large amounts of video data coming from the trains. Other solutions, like cloud data warehouses, were too limited in their ability to support a variety of data types and, most importantly, could not cope with Hitachi’s large amount of data and their requirements around near real-time insights and AI.

Hitachi’s customers can now perform predictive maintenance of the railway overhead lines rather than having to react to outages, the way they did in the past. Hitachi uses Delta Lake, the default storage format for all operations on Databricks, for data storage and pipelines, and MLflow for model management, training, deployment and experiment tracking. The team uses Databricks SQL for monitoring dashboards, which provide stats on alerts in production, model performance and more. Plus, with shared workspaces and notebooks, the team can collaborate better on code, reuse models and work together more seamlessly in general.

“Our clients can now be much more proactive instead of reactive,” said Avinash Kumar, a Data Scientist at Hitachi. “Now the rail network owners we support can identify issues much faster to prevent costly disruptions to their services.”

All of this culminates in more reliable railway management. Now, as video data gets ingested and analyzed, the system generates alerts on defects or displacement of overhead lines, and a worker gets sent to the affected area to fix the problem.

Saving costs and driving innovation in transportation

Databricks helped Hitachi greatly speed up their time to market for new solutions. The company could now operate with a smaller team that was more flexible, allowing them to focus on solving the problem for the customer rather than worrying about the data infrastructure.

“Using Databricks has made us more efficient with a smaller team, which is incredibly powerful,” said Herman. “Now, a single team can create impactful innovations that benefit thousands of customers every day.”

And these solutions have already had a real impact for Hitachi, their customers, and all their customers’ passengers. What used to take days (engineers traveling to sites to look for faults) can now be done in near real-time, remotely. Hitachi has been able to discover thousands of equipment risks and line displacements that their clients could then fix proactively. The net result is estimated to be millions of pounds in savings for Hitachi’s customers, and many more happy passengers.

For more information, please contact Ben Earl, IM and Digital Services Manager.