Equipment Failure Diagnosis and Analysis Agent Technology That Identifies the Causes of New Failures with the Speed and Accuracy of Experienced Personnel

2026-05-21

The control and operational technologies used to run systems and equipment in manufacturing and other on-site settings are collectively called Operational Technology (OT). OT generates a wide range of knowledge, data, and know-how on the front lines every day, and this knowledge, which often depends on individual workers, needs to be turned into value. Hitachi, Ltd. has developed AI agent technology that estimates the causes of manufacturing equipment failures with high accuracy. The system first converts design drawing data into knowledge graphs that generative AI can read. It then analyzes this data together with OT skill data using the systematic safety analysis method STAMP/CAST. By mimicking the thought processes of skilled technicians, the system can handle even unknown failures. Its effectiveness has already been confirmed in a joint demonstration with Daikin Industries, Ltd. We interviewed Dr. Kentaro Yoshimura, a Principal Researcher at the Autonomous Control Research Department, Mobility & Automation Innovation Center, Digital Innovation R&D, Research & Development Group, Hitachi, Ltd.

Written by: Kazumichi Moriyama (science writer)

How can we quickly and accurately estimate failures and their causes in equipment composed of “machines that shouldn’t break” and develop countermeasures?

Enabling even inexperienced maintenance personnel to perform advanced maintenance work has long been a challenge at manufacturing sites. The issue is becoming increasingly urgent as skilled technicians age and retire and as global expansion creates talent shortages. Approaches that train AI on data such as maintenance records have been attempted for some time and have shown some effectiveness. However, the results have been limited: the level of detail recorded is unclear and inconsistent, much tacit knowledge has never been put into words, and some entries are understandable only to the person who wrote them.

Since around 2022, methods such as retrieval-augmented generation (RAG) using generative AI have emerged. While these can handle known events documented in past records and manuals, they often fail to handle events that differ slightly from past cases, leaving the fundamental problem unresolved.

Factory equipment is fundamentally a system composed of highly reliable machinery designed for continuous operation. Consequently, the failures that do occur are typically new, unknown failures with no similar precedent. These are cases of “something that shouldn’t break has broken,” which methods that search records for similar cases cannot, in principle, address.

Furthermore, because the AI has not learned the physical connections within the equipment or the flow of signals and fluids as data, it can only generate abstract responses such as “the valve is broken.” Unable to pinpoint the specific problem among dozens of parts and propose effective countermeasures, these methods remain impractical for on-site use. Solving this problem required an AI agent built on a new approach.

The mechanism of an AI agent capable of mimicking the thinking and analytical methods of experts

To overcome the limitations of conventional AI, Hitachi adopted a unique approach that analyzes the very thought processes used by skilled technicians when analyzing the causes of failures, and reproduces them using AI. This AI agent system works based on the hypothesis that experts identify root causes through analysis using their OT knowledge (i.e., target OT data [design data] and OT skills founded on that information). Therefore, the core technology consists of two elements: data that enables the AI to think logically and skills representing the expert’s analysis process.

First, “data” refers to the understanding of various production equipment design information. Just as an expert opens a drawing to grasp the overall structure of the equipment, Hitachi’s proprietary technology is used to convert design drawings into a knowledge graph format that the AI can understand, thereby enabling it to comprehend the information.

Specifically, it extracts the connection relationships of parts and hierarchical structures (such as the relationships between pumps, filters, valves, etc.) from piping and instrumentation diagrams (P&IDs) and electrical drawings. By converting this extracted structural information into a knowledge graph or Bill of Materials (BOM) format, we enable the AI to logically understand the physical structure of the equipment. Currently, humans interpret the drawing information and provide the AI with structured data on part configurations and connection relationships; however, the goal for the future is complete automation using AI capable of interpreting drawings.
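As a rough illustration of what such structured data might look like, the following Python sketch encodes connection relationships and a BOM-style hierarchy as knowledge graph triples. It is a minimal sketch, not Hitachi's actual format: the part tags (`TANK-01`, `PUMP-01`, and so on), the relation names `feeds` and `contains`, and the helper functions are all assumptions made up for this example.

```python
# Hypothetical connection list, as a person might transcribe it from a P&ID.
# Each tuple is (upstream part, downstream part); all tags are invented.
CONNECTIONS = [
    ("TANK-01", "PUMP-01"),
    ("PUMP-01", "FILTER-01"),
    ("FILTER-01", "VALVE-01"),
    ("VALVE-01", "NOZZLE-01"),
]

# Hypothetical BOM-style hierarchy: assembly -> parts it contains.
BOM = {
    "SUPPLY-UNIT": ["TANK-01", "PUMP-01"],
    "DELIVERY-UNIT": ["FILTER-01", "VALVE-01", "NOZZLE-01"],
}


def to_triples(connections, bom):
    """Flatten connections and hierarchy into (subject, relation, object) triples."""
    triples = [(src, "feeds", dst) for src, dst in connections]
    for assembly, parts in bom.items():
        triples.extend((assembly, "contains", part) for part in parts)
    return triples


def parent_assembly(part, bom):
    """Return the assembly that contains the given part, or None if unknown."""
    for assembly, parts in bom.items():
        if part in parts:
            return assembly
    return None


if __name__ == "__main__":
    for triple in to_triples(CONNECTIONS, BOM):
        print(triple)
    print(parent_assembly("VALVE-01", BOM))
```

Triples of this kind can be serialized as text and passed to a language model, which is one plausible way of making the physical structure available for reasoning.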

“Skills” are acquired through learning the equipment failure cause analysis process. To logically analyze the causes of failures occurring in complex systems, the AI was trained on the system safety analysis method STAMP/CAST.

STAMP/CAST stands for System-Theoretic Accident Model and Processes and Causal Analysis based on System Theory. It is an analytical methodology originally developed for extremely complex and highly reliable systems such as NASA’s aerospace systems. Even specialists need time to master this methodology, but by training a large language model (LLM) with high reasoning capabilities on it as OT knowledge, the AI can now perform systematic and logical root cause analysis just like an expert.
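STAMP treats a system as a set of control loops and looks for the points where control was inadequate. The Python sketch below is a heavily simplified illustration of that idea, not the CAST procedure itself and not Hitachi's implementation: it models each loop as a controller, actuator, controlled process, and sensor, and generates one question per element for the loop that governs an affected process. The `ControlLoop` class, the equipment tags, and the question wording are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ControlLoop:
    """One control loop: a controller commands an actuator and reads a sensor."""
    controller: str
    actuator: str
    process: str
    sensor: str


# Hypothetical loops for an invented piece of equipment.
LOOPS = [
    ControlLoop("PLC-01", "VALVE-01", "coolant flow", "FLOW-SENSOR-01"),
    ControlLoop("PLC-01", "HEATER-01", "tank temperature", "TEMP-SENSOR-01"),
]


def causal_questions(loops, affected_process):
    """List the questions to ask about every loop controlling the affected process."""
    questions = []
    for loop in loops:
        if loop.process != affected_process:
            continue
        questions.append(
            f"Did {loop.controller} issue the correct command to {loop.actuator}?")
        questions.append(
            f"Did {loop.actuator} carry out the command it received?")
        questions.append(
            f"Did {loop.sensor} report the true state of {loop.process}?")
        questions.append(
            f"Did {loop.controller}'s internal model of {loop.process} match reality?")
    return questions


if __name__ == "__main__":
    for question in causal_questions(LOOPS, "coolant flow"):
        print(question)
```

The point of the structure is that every element of the loop is examined in turn, so the analysis does not stop at the first part that looks suspicious.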

A key technique playing a crucial role here is a paradigm called Retrieval Augmented Reasoning (RAR). By searching for appropriate drawings and standard operating procedures (SOPs) while combining the equipment structure data with the analytical skills of an expert, the system can propose specific inspections and safe countermeasures supported by evidence.

In other words, rather than merely searching for information, by combining structural analysis of the equipment with the diagnostic logic of experts, it is possible to derive evidence-based inspection points and corrective actions even in unknown scenarios.

Conventional RAG excels at searching for similar cases from manuals and past failure records. However, as mentioned earlier, it struggles to deal with new failures unseen in the past or events that are similar yet distinct. In mission-critical environments, it is rare for major failures to recur in the exact same form. RAR, however, goes beyond simple case searches; by analyzing the structure of the equipment, it can narrow down the cause even when an event occurs in an unknown scenario.
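The difference can be sketched in a few lines of Python. The example below does not search for similar past cases; it walks the equipment structure upstream from the part where the symptom appears and attaches a matching procedure to each candidate. This is a minimal sketch under stated assumptions, not the RAR implementation: the `UPSTREAM` map, the part tags, and the SOP titles are invented, and a plain dictionary lookup stands in for document retrieval.

```python
from collections import deque

# Hypothetical structure data: part -> parts directly upstream of it.
UPSTREAM = {
    "NOZZLE-01": ["VALVE-01"],
    "VALVE-01": ["FILTER-01"],
    "FILTER-01": ["PUMP-01"],
    "PUMP-01": ["TANK-01"],
    "TANK-01": [],
    "HEATER-01": [],  # not connected to the nozzle line
}

# Hypothetical standard operating procedures, keyed by part tag.
SOPS = {
    "VALVE-01": "SOP-12: check valve actuator stroke",
    "FILTER-01": "SOP-07: measure filter differential pressure",
    "PUMP-01": "SOP-03: check pump discharge pressure",
}


def candidates(symptom_part, upstream):
    """Breadth-first walk upstream; returns (part, distance) pairs, nearest first."""
    seen = {symptom_part}
    ordered = []
    queue = deque([(symptom_part, 0)])
    while queue:
        part, distance = queue.popleft()
        ordered.append((part, distance))
        for parent in upstream.get(part, []):
            if parent not in seen:
                seen.add(parent)
                queue.append((parent, distance + 1))
    return ordered


def inspection_plan(symptom_part, upstream, sops):
    """Pair each structurally reachable candidate with its procedure, if any."""
    return [
        (part, distance, sops.get(part, "no procedure on file"))
        for part, distance in candidates(symptom_part, upstream)
    ]


if __name__ == "__main__":
    for row in inspection_plan("NOZZLE-01", UPSTREAM, SOPS):
        print(row)
```

Because the candidate list comes from the structure rather than from past records, a part such as `HEATER-01` that cannot physically affect the nozzle is excluded even if no similar failure has ever been logged.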

Through this combination of OT data and OT skills—specifically, the creation of a knowledge graph, the learning of OT skills (equivalent to expert wisdom), and the use of methodologies like STAMP/CAST that analyze the causes of system abnormalities through the interaction of elements—AI no longer merely searches past cases. It can now logically infer the causes of unknown failures based on an understanding of the structure of manufacturing equipment and systemic causal relationships.

Image 2: The mechanism of an AI agent capable of mimicking the thinking and analytical methods of experts

Realizing a fast, highly accurate AI agent: “Thinking AI” receives high praise from the field

The AI Agent for Equipment Failure Diagnostics began a trial operation for practical use in April 2025 as part of a joint demonstration with Daikin Industries, Ltd. When used to identify the causes of equipment failures and present inspection items and corrective actions, the accuracy of cause estimation improved from 67% with general-purpose AI to over 90%. This directly reduces the time required to identify causes and plan countermeasures. Response speed also matters in practical use, and the agent generates a response containing the specific part name suspected of being faulty within approximately 10 seconds of the inquiry. This enables workers to move quickly to the next action.

Regarding the diagnostic level, experienced maintenance staff at Daikin Industries reportedly rated it highly, describing it as “equivalent to or better than a typical maintenance technician,” “clearly superior to a novice,” and “at a level that can be used in the field.” In particular, they valued the AI’s ability to correctly comprehend drawings and identify specific causes even for new failures.

At the end of 2025, joint verification with Mitsubishi Chemical Corporation also began. By enhancing advanced AI with Hitachi’s proprietary deep domain knowledge, the system has already achieved concrete results as part of digital service HMAX™ Industry—a suite of solutions that embodies Lumada 3.0.

Toward a future “super expert” built from the collective expertise of experts

This AI agent, which quickly and accurately estimates the causes of manufacturing equipment failures, has demonstrated its effectiveness as a “thinking AI” and “on-site answering AI” running on a cloud platform managed by Hitachi. Beyond factory equipment and production lines, we anticipate expanding into mission-critical areas that demand high reliability, such as power, railways, automobiles, and IT infrastructures. This technology contributes to improving the safety and reliability of the entire social infrastructure by supporting frontline workers.

With further development, it will be possible to absorb maintenance data from locations across the globe, transforming tacit knowledge from specific sites into global organizational knowledge. Even parameters that are currently difficult to determine, such as force adjustment, will be addressed in the future within the framework of physical AI, which utilizes on-site data to generate outputs.

Yoshimura envisions that further development of this AI technology could create “super experts.” By consolidating on-site knowledge from various domains and the OT skills of experts, it may be possible to construct an AI agent that possesses the collective expertise of experts. Experts themselves will also be able to learn the thought patterns of other specialists. This will result in the transformation of the tacit knowledge of the entire organization into explicit knowledge.

Once the skills of experts, born from highly reliable control and operational technologies, are widely available as AI, humans will be able to devote their time to more creative work that drives innovation, such as envisioning the future of the factory. By mastering AI as a tool that extends human capabilities, new experts who work alongside it may transform the manufacturing landscape in the future.

Related Links

Hitachi develops AI-based failure identification technology to support rapid recovery from equipment failures and continuous enhancement of diagnostic capabilities - Research & Development : Hitachi
