Domestic Passenger Vessel Accidents Are Preventable Using a Management System (Part One)

Dr. IJ Arora:

Think of any accident, mishap, or tragedy involving a passenger vessel through history (or in recent times) and then look at the post-event investigation report. If you do this, you will find one shortcoming common to these tragedies: a poor appreciation of risk and the practical nonexistence of a management system. Occasionally, in slightly less disastrous events, you may see the existence of a system, but it is usually poorly implemented.

This two-part article considers the domestic passenger vessel industry in the United States, where there have been several tragedies. I hope (although hope is not a plan) that this work will inspire the industry to look at the proper implementation of management systems. In trying to narrow the discussion, we will analyze and learn lessons from the 2019 sinking of the Conception and to a limited extent the 2023 fire aboard the Spirit of Boston cruise ship. I will mention a few other incidents as well to make the connection and bring out the failure of the various systems that broke down.

A systems-based approach in analyzing accidents in the domestic U.S. passenger vessel industry involves looking at the various components and process interactions that could potentially lead to incidents. This can include factors such as crew training, vessel design, regulatory compliance, maintenance practices, and emergency preparedness. However, the major factor is usually the absence of a management system (or a badly designed and/or poorly implemented one). This is a tragedy in the making.

I am studying these accidents to demonstrate how a systems approach could have helped prevent many of these mishaps. The reluctance to implement an effective management system pains me, as it surely pains the primary investigating agencies, such as the National Transportation Safety Board (NTSB) and the United States Coast Guard (USCG), and other responsible bodies.

Note that I am not discussing technical processes here. Yes, those often fall short of the mark as well, but the bigger issue is the failure to apply simple systematic thinking based on existing management system standards. This reluctance to work systematically surprises me. I’ve recently expressed my views on the Baltimore Bridge collapse, the implosion of the Titan submersible, the collision between an American Airlines flight and a military helicopter over the Potomac, and the Boeing 737 Max inspection failures. In all cases, I cannot understand why a simple, cost-effective action such as properly implementing a management system should be such a critical weakness within so many different organizations. It is a leadership flaw, for (as W. Edwards Deming said) “A bad system will let down a good person every time!”

Titanic and Herald of Free Enterprise

When discussing this topic, many will think back to the Titanic tragedy, which occurred more than 100 years ago. It is perhaps the most well-known sinking of all time, so I will not rehash the details, which are easily available online. However, I do want to mention that events like the sinking of the Titanic create the ultimate push: it caused a reaction and, ultimately, the creation of a workable system to help save lives and the vessels themselves. Depending on owners, operators, and masters to use their judgment and do the right thing at the time of crisis was no longer enough. What the Titanic demonstrated was that the industry needed enforceable regulations and requirements. The result was the Safety of Life at Sea (SOLAS) Convention, which formalized a systematic approach to safety.

Before studying incidents occurring in U.S. domestic waters, I also want to mention the tragedy of the Herald of Free Enterprise, which occurred on March 6, 1987, at Zeebrugge, Belgium. The Herald of Free Enterprise was a roll-on/roll-off ferry owned by the Townsend Thoresen company. On that day, the ship capsized shortly after leaving port and 193 people lost their lives. It had departed with its bow doors open, allowing seawater to flood the car deck. Within minutes, the ship was lying on its side in shallow water.

The tragedy exposed severe deficiencies in the company’s safety culture and operational practices. Justice Barry Sheen was appointed to head the official inquiry into the disaster. His report, published in October 1987, was scathing and unprecedented in its criticism of the ferry operator, management, and the broader safety practices in the maritime industry. Justice Sheen’s report identified a “… disease of sloppiness and negligence at every level of the hierarchy.” This became one of the most quoted phrases from the report. Sheen emphasized that the disaster was not due to a single act of negligence but rather a “… catalogue of failures…” including the failure to ensure the bow doors were closed, poor communication between crew and bridge, inadequate safety procedures, and the absence of proper checks before sailing.

The report placed heavy blame on the senior management, asserting that safety was not a high priority for the company. It also noted that management failed to implement procedures that could have prevented such a tragedy.

It is indeed shocking and surprising that even today, decades later, investigation reports are still pointing out these same drawbacks. Lessons learned seem to be forgotten. I particularly wanted to focus on this incident because Justice Sheen’s report was a turning point in maritime safety regulation. It directly influenced the creation of the ISM Code under the International Maritime Organization (IMO), which mandated formal safety procedures and accountability in international shipping operations.

Conception

The Conception was a dive boat that caught fire off the coast of California, resulting in the deaths of 34 people in 2019.

Investigations into this disaster revealed several deficiencies, including inadequate fire safety procedures, lack of a proper emergency escape route, and insufficient crew training. There were also issues related to the vessel’s sleeping arrangements, where most of the passengers were asleep below deck at the time of the fire.

A systems approach would emphasize the need for comprehensive safety protocols, regular training for crew members, proper vessel design for evacuation, and effective regulatory oversight to ensure the robust implementation of safety measures.

Spirit of Boston

This incident involved a fire that broke out on the dining cruise ship Spirit of Boston while docked in 2023.

The fire was linked to a potential electrical malfunction, but it highlighted issues related to maintenance practices and emergency response protocols.

By applying a systems approach, stakeholders could focus on root cause analysis, looking into how maintenance schedules, crew training, and emergency responses are integrated and managed.

Overall recommendations for the systems approach

There are several important elements to consider in favor of the systems approach, as follows:

  • Interdisciplinary collaboration. Promoting collaboration among various stakeholders, including regulatory bodies, ship management companies, and safety experts, to share information and best practices
  • Root cause analysis. Encouraging investigations that go beyond the immediate causes of accidents to identify systemic failures that could contribute to unsafe conditions
  • Regular training and drills. Implementing continuous training and emergency drills for crew members to ensure readiness and competence and to enhance situational awareness
  • Maintenance and safety protocols. Establishing stringent protocols for vessel maintenance and safety checks, with thorough documentation and compliance checks
  • Regulatory oversight. Advocating for robust regulatory frameworks that require adherence to safety standards and proactive risk management strategies
  • Cultural change. Fostering a safety-first culture within organizations, one that prioritizes safety above operational pressures

We can see in these two recent incidents that, as with the case of the Herald of Free Enterprise, a systems approach enables a comprehensive understanding of the complexities involved in maritime operations, leading to better prevention measures and enhanced safety outcomes in the passenger vessel industry.

Other examples

Over the years, the NTSB and its counterparts abroad have investigated numerous accidents involving passenger vessels. A few notable examples follow:

  • Estonia. Although this accident occurred in European waters, its implications affected international passenger shipping, including practices adopted in the United States. The Estonia sank in the Baltic Sea in 1994, resulting in the deaths of 852 people. The investigation revealed that the key issues were related to vessel design, including hull integrity and cargo securing. This incident led to enhanced safety regulations regarding passenger vessel construction and operational safety protocols.
  • Andrew J. McHugh. This collision involving the ferry Andrew J. McHugh and another vessel occurred in the narrow Houston Ship Channel, leading to the deaths of 17 passengers in 1980. The key factors included poor visibility, navigational errors, and inadequate communication between vessels. Subsequent recommendations from the NTSB aimed at improving navigational practices and vessel traffic control in critical areas.
  • Benson. The Benson, a tour boat in New York, capsized during a sudden storm. A total of 10 people died in this 2000 incident. The investigation pointed out questionable weather assessment practices and inadequate safety measures for handling sudden weather changes. The NTSB recommended better training for crew members regarding weather evaluation and emergency response.
  • Dawn Princess. A fire aboard this cruise ship in the South Pacific led to emergency evacuations in 2003. Although there were no fatalities, more than 150 passengers were affected. The fire was linked to flaws in electrical systems. The NTSB emphasized improved fire safety systems and crew training on firefighting and evacuation protocols.
  • Emotion. This fishing vessel capsized near Alaska in 2010, resulting in several fatalities. The investigation pointed out structural problems and issues with the vessel’s stability while loaded. Recommendations focused on vessel stability assessments and the importance of adherence to safety regulations during fishing operations.
  • Explorer. In 2007, the Explorer ran aground off the coast of the Antarctic Peninsula, leading to evacuations. All passengers were saved, but the incident raised alarms about navigational practices and inappropriate response to weather changes. The NTSB highlighted the need for enhanced navigational training and real-time communication.

For each of these incidents, a systems approach would involve comprehensive training programs for crew related to emergency preparedness, rigorous maintenance and operational checks, research and implementation of advanced technologies for navigation and safety, and collaboration among regulatory bodies to create uniform safety standards that encompass all aspects of vessel operation. These historical examples underscore the importance of a proactive stance on maritime safety, highlighting that every component of the system must work together to prevent accidents and improve safety outcomes in the passenger vessel industry.

A poor approach that fails to be proactive can significantly contribute to accidents such as these. When risks are not systematically identified and appreciated, several detrimental consequences can arise. Without a systematic approach to risk assessment, potential hazards may go unnoticed, increasing the likelihood of incidents. Vessels may not be adequately equipped to handle specific risks, such as extreme weather or equipment failures. What is needed instead are clear safety protocols, adequate training, and better communication.

A reactive approach also undermines effective communication within the organization and between vessels. Without established systems for reporting and discussing risks, lessons learned from previous incidents may be ignored.

Lapses in regulatory compliance are another factor. In the absence of a proactive culture, vessels may not adhere to regulatory requirements consistently or may develop a compliance mindset that prioritizes minimum standards over comprehensive safety practices. Neglecting lessons learned from past incidents is another flaw. A failure to learn from past accidents can lead to repeated mistakes. If organizations do not analyze historical incidents and implement changes based on those insights, they risk encountering similar situations again and again.

In the second part of this article, we will discuss the importance of using the Plan-Do-Check-Act cycle in embracing a safety management system.


Note – The above article was recently published in an Exemplar Global publication – ‘The Auditor’


What Is Risk-Based Thinking in ISO Standards?

Over the past two decades of working closely with clients in both the manufacturing and service sectors, I’ve witnessed firsthand the transformation that occurs when organizations stop treating compliance as a checklist exercise and start thinking in terms of risk and opportunity. With the 2015 revisions to many ISO standards, particularly ISO 9001, we saw a deliberate shift away from siloed “preventive actions” toward an integrated, strategic approach known as Risk-Based Thinking (RBT). 

This wasn’t just a semantic change. It marked a cultural evolution, an acknowledgment that uncertainty is inherent in every business process, and that success belongs to those who plan for it, not those who simply react to it. RBT has empowered organizations to navigate complexity with greater confidence, embedding foresight into their planning and decision-making at all levels. 

In this article, I’ll draw from real-world consulting experiences across diverse industries to demystify Risk-Based Thinking. We’ll explore what it really means, why it matters, how it supports proactive leadership, and what tools you can use to bring it to life within your own management system. Whether you’re guiding a mature enterprise or a fast-scaling startup, the principles of RBT are not only practical, but they’re also essential.

What Is Risk-Based Thinking (RBT)?

Risk-Based Thinking (RBT) is the proactive approach embedded in ISO standards like ISO 9001:2015, ISO 14001:2015, and ISO 45001:2018. Rather than treating risk as a separate component, RBT integrates it into every facet of an organization’s management system. This shift moves organizations from a reactive stance to a proactive culture, where potential issues are anticipated and addressed before they escalate. 

In my consulting journey, I’ve observed that organizations embracing RBT don’t just prevent problems, they identify opportunities for improvement and innovation. For instance, a manufacturing client leveraged RBT to streamline their supply chain, resulting in reduced lead times and increased customer satisfaction.

How Risk-Based Thinking Supports Proactive Decision-Making:

  • Identifying Potential Risks and Opportunities: By assessing both internal and external factors, organizations can foresee strategic and operational challenges and capitalize on opportunities. 
  • Integrating Risk Assessment into Planning: This ensures that objectives are achievable, and resources are allocated effectively. 
  • Enhancing Stakeholder Confidence: Demonstrating a proactive approach to risk management builds trust among customers, suppliers, and regulators.

A service industry client I worked with implemented RBT in their project management processes. This led to improved project delivery times and a significant reduction in unforeseen issues.

Key Objectives of Risk-Based Thinking:

The primary goals of RBT include: 

  • Enhancing Organizational Resilience: By anticipating potential disruptions, organizations can develop contingency plans. 
  • Promoting Continuous Improvement: Regular risk assessments lead to ongoing enhancements in processes and systems. 
  • Aligning Risk Management with Strategic Objectives: Ensuring that risk considerations are integral to achieving business goals. Read clause 6.1 together with clauses 4.1 and 4.2 of the ISO harmonized structure.
  • Fostering a Culture of Risk Awareness: Encouraging employees at all levels to consider risk in their daily activities. Clause 7.3 (Awareness) requires that employees understand how they contribute to the effectiveness of the system.

Practical Application of Risk-Based Thinking:

Implementing RBT involves: 

  1. Contextual Analysis: Understanding the organization’s internal and external environment. 
  2. Risk Identification: Recognizing potential events that could impact objectives. 
  3. Risk Assessment: Evaluating the likelihood and impact of identified risks. 
  4. Risk Treatment: Determining appropriate actions to mitigate or capitalize on risks. 
  5. Monitoring and Review: Continuously tracking risk factors and adjusting strategies accordingly.
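
Steps 2 through 4 above can be illustrated with a small scoring sketch. ISO 9001 does not prescribe any particular scoring method, so everything here is an assumption made for illustration: the 1 to 5 scales, the likelihood-times-impact product, the band thresholds, and the sample risks are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    description: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (almost certain)
    impact: int      # assumed scale: 1 (negligible) to 5 (severe)

    def __post_init__(self):
        # Reject ratings outside the assumed 1-5 scale.
        for name in ("likelihood", "impact"):
            value = getattr(self, name)
            if not 1 <= value <= 5:
                raise ValueError(f"{name} must be between 1 and 5, got {value}")

    @property
    def score(self) -> int:
        # Simple likelihood x impact product; one common convention among several.
        return self.likelihood * self.impact


def rating(risk: Risk) -> str:
    """Band a score. The thresholds are illustrative, not taken from any ISO standard."""
    if risk.score >= 15:
        return "high"
    if risk.score >= 6:
        return "medium"
    return "low"


def prioritize(risks):
    """Return risks ordered from highest to lowest score, ready for treatment decisions."""
    return sorted(risks, key=lambda r: r.score, reverse=True)


# Hypothetical entries for illustration only.
risks = [
    Risk("Key supplier fails to deliver on time", likelihood=3, impact=4),
    Risk("Calibration records out of date", likelihood=2, impact=2),
    Risk("Loss of main production line", likelihood=1, impact=5),
]

for r in prioritize(risks):
    print(f"{r.score:>2}  {rating(r):<6}  {r.description}")
```

A product score is only a starting point for discussion. A low-likelihood, high-impact item such as the loss of a production line may still deserve a contingency plan even though it ranks below more frequent risks.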

Comparison: Preventive Action (Old) vs. RBT (New):

Previously, ISO standards treated preventive action as a separate clause (8.5.3 in ISO 9001:2008). However, this often led to a checkbox mentality, where organizations implemented measures without truly integrating them into their processes.

With RBT: 

  • Integration: Risk considerations are embedded throughout the management system. 
  • Proactivity: Organizations anticipate and address potential issues before they occur. 
  • Flexibility: RBT allows for tailored approaches based on the organization’s specific context. 

This evolution encourages a more dynamic and effective approach to risk management. 

Tools & Techniques to Support Risk-Based Thinking:

1. SWOT Analysis (Strengths, Weaknesses, Opportunities, Threats) 

Use: SWOT analysis helps organizations evaluate their internal strengths and weaknesses, alongside external opportunities and threats. It’s particularly useful during strategic planning sessions or when entering new markets or launching new products. 

When to Use: Early in the business planning process or during the review of the organization’s context. 

Clause Alignment: ISO 9001:2015 – Clause 4.1 (Understanding the organization and its context) and Clause 6.1 (Actions to address risks and opportunities). This tool ensures that strategy and quality objectives are grounded in a realistic assessment of the internal and external environment. 

2. Failure Mode and Effects Analysis (FMEA) 

Use: FMEA systematically evaluates potential failure points in a product, process, or system and ranks them by severity, occurrence, and detection. It’s widely used in manufacturing, healthcare, and aerospace sectors. 

When to Use: During product design, process development, or when implementing changes that could introduce new risks. 

Clause Alignment: ISO 9001:2015 – Clause 8.3 (Design and development of products and services), Clause 6.1, and Clause 8.1 (Operational planning and control). It supports risk-based planning and preventive strategies by analyzing “what could go wrong” and mitigating those risks before implementation.
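
As a rough illustration of the ranking step, the sketch below uses the classical risk priority number (RPN), which multiplies the severity, occurrence, and detection ratings. The article names the three factors but not a formula, so the RPN approach, the 1 to 10 scales, and the failure modes and ratings shown are assumptions for illustration; some newer FMEA handbooks use other prioritization schemes.

```python
from typing import NamedTuple


class FailureMode(NamedTuple):
    name: str
    severity: int    # assumed scale: 1 (minor) to 10 (catastrophic)
    occurrence: int  # assumed scale: 1 (remote) to 10 (almost inevitable)
    detection: int   # assumed scale: 1 (almost certain to be caught) to 10 (unlikely to be caught)


def rpn(mode: FailureMode) -> int:
    """Classical risk priority number: severity x occurrence x detection."""
    return mode.severity * mode.occurrence * mode.detection


def rank(modes):
    """Highest RPN first, so the worst failure modes are reviewed first."""
    return sorted(modes, key=rpn, reverse=True)


# Hypothetical failure modes and ratings, invented for illustration.
modes = [
    FailureMode("Seal leaks under pressure", severity=8, occurrence=3, detection=4),
    FailureMode("Fastener under-torqued", severity=6, occurrence=5, detection=6),
    FailureMode("Label misprinted", severity=2, occurrence=4, detection=2),
]

for m in rank(modes):
    print(f"{rpn(m):>4}  {m.name}")
```

Because the RPN is a plain product, a failure mode with very high severity can rank below a less severe one. Teams often review high-severity items separately, whatever their RPN.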

3. Risk Registers 

Use: A risk register is a living document that captures identified risks, assesses their likelihood and impact, and outlines mitigation actions and responsible parties. It provides transparency and traceability for risk management activities. 

When to Use: Continuously throughout project lifecycles or operational management, especially in industries like construction, logistics, or IT. 

Clause Alignment: ISO 9001:2015 – Clause 6.1 and Clause 9.1 (Monitoring, measurement, analysis and evaluation). It helps document ongoing risk review processes and links actions to strategic and operational plans. ISO 9001 does not require a risk register, but maintaining one is beneficial.
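
To make the idea of a living document concrete, here is a minimal sketch of a register held as structured data, with a helper that lists open risks whose review date has passed. The field names, the 1 to 5 scales, and the sample entries are hypothetical; a spreadsheet with the same columns would serve equally well.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class RegisterEntry:
    risk_id: str
    description: str
    likelihood: int      # assumed 1-5 scale
    impact: int          # assumed 1-5 scale
    mitigation: str      # planned or completed action
    owner: str           # responsible party
    next_review: date    # when the entry is due to be looked at again
    closed: bool = False


def open_entries(register):
    """Entries that still need attention."""
    return [e for e in register if not e.closed]


def overdue_reviews(register, today: date):
    """Open entries whose scheduled review date has passed."""
    return [e for e in open_entries(register) if e.next_review < today]


# Hypothetical register contents for illustration.
register = [
    RegisterEntry("R-001", "Single-source supplier for critical part", 3, 4,
                  "Qualify a second supplier", "Purchasing manager", date(2025, 3, 1)),
    RegisterEntry("R-002", "Server room has no backup power", 2, 5,
                  "Install UPS", "IT lead", date(2025, 1, 10)),
    RegisterEntry("R-003", "Outdated work instruction on line 2", 2, 2,
                  "Revise and retrain", "Quality engineer", date(2024, 12, 1), closed=True),
]

# The review date is passed in explicitly so the check is repeatable.
for entry in overdue_reviews(register, today=date(2025, 2, 1)):
    print(entry.risk_id, entry.owner, entry.next_review.isoformat())
```

Recording an owner and a review date against each entry is what supports the traceability described above: each risk has a named person and a date by which it will be looked at again.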

4. Root Cause Analysis (RCA) 

Use: RCA investigates underlying causes of nonconformities, defects, or failures to prevent recurrence rather than just treating symptoms. It’s a staple in corrective action processes. 

When to Use: After incidents, near misses, or nonconformities—often triggered by audit findings or customer complaints. 

Clause Alignment: ISO 9001:2015 – Clause 10.2 (Nonconformity and corrective action). It supports continual improvement by ensuring lessons are learned and corrective actions address the source of problems. 

5. ISO/IEC 31010 – Risk Assessment Techniques 

Use: This standard outlines a variety of risk assessment tools including brainstorming, checklists, fault tree analysis, and bowtie analysis. It offers structured approaches tailored to industry-specific needs. 

When to Use: Depending on organizational maturity, criticality of operations, or regulatory environment. 

Clause Alignment: Supports ISO 9001:2015 – Clause 6.1, as well as clauses in ISO 14001 and ISO 45001 related to risk and opportunity planning. This framework provides flexibility for choosing appropriate methods suited to specific organizational risks. 

These tools, when chosen and applied correctly, don’t just satisfy audit checklists, they cultivate a culture of resilience and foresight. Over the years, I’ve seen organizations evolve by not just using these techniques mechanically, but integrating them into daily decision-making, making risk-based thinking a true operational philosophy rather than a compliance exercise. 

Understanding ISM Code Compliance for Maritime Operators

Having spent over 15 years in the maritime and compliance world, and a further decade working with various international Flag Administrations, I’ve seen firsthand the shift from traditional shipping operations to a more safety- and systems-driven industry. One of the major forces behind that transformation? The International Safety Management (ISM) Code. For maritime operators today, ISM Code compliance isn’t just about ticking boxes, it’s about embedding a culture of safety, responsibility, and continual improvement into every layer of their operation.

What is the ISM Code?

There is a saying that regulations are written in blood. The ISM Code was born out of hard lessons learned from major marine accidents. The major event that acted as a catalyst in its development was the capsizing of the MV Herald of Free Enterprise. Introduced by the International Maritime Organization (IMO) under the SOLAS Convention, the code mandates that every shipping company operating SOLAS-compliant vessels implement a Safety Management System (SMS), a system that governs practices for the safe operation of ships and prevention of marine pollution.

I remember when the ISM Code first rolled out in the ’90s. Many shipowners were skeptical, and some even resistant. Back then, I was sailing with a company that was navigating the early implementation. The real challenge was shifting the mindset from reactive firefighting to proactive risk management, and from a documentation exercise to a change in the way operations were done. That’s where I learned: policies are easy to write, but real compliance starts with people.

Why ISM Code Compliance Matters More Than Ever

Today, ISM Code compliance is not optional—it’s foundational. For operators navigating increasingly complex global regulations, it offers several key benefits:

  • Safety First: The SMS serves as a blueprint for safe operations at sea. I’ve seen it reduce incidents dramatically when implemented properly.
  • Environmental Responsibility: With public scrutiny and environmental regulations tightening, having structured pollution control measures is non-negotiable.
  • Credibility & Trust: In one of my past sailing tenures with a major operator, ISM compliance helped secure long-term contracts with charterers. Clients want to work with companies that can prove they’re managing risks responsibly.
  • Operational Clarity: When roles, responsibilities, and procedures are clearly outlined, decision-making becomes faster and more consistent.

The Core Objectives of the ISM Code

The ISM Code objectives listed in clause 1.2 remain as relevant now as when the code was first introduced. Clause 1.2 is about outcomes, not just documents. It’s about creating a system that actually prevents harm, not just reacts to it.

For me, ISM Code compliance under Clause 1.2 isn’t just about passing an audit, it’s about building a culture where every person onboard understands their role in safeguarding lives, the vessel, and the environment. It requires integrating risk assessments into planning, ensuring safe working practices, maintaining the ship properly, and always being prepared for emergencies.

I always emphasize these objectives when training ship and shore staff. It’s not about overwhelming them with paperwork, it’s about aligning them with a purpose. The code provides the structure; we provide the commitment.

Key Elements of ISM Code Compliance

A fully compliant SMS includes:

  • Safety and Environmental Protection Policy
  • Defined Roles and Responsibilities
  • Safe Operating Procedures
  • Emergency Preparedness
  • Reporting and Analysis of Incidents
  • Internal Audits and Continuous Improvement

One of the best implementations I facilitated was for a regional bulk carrier. We not only developed the vessel SMS but aligned office procedures, and built an SMS that didn’t just sit in a manual, it lived on the bridge, in the boardroom and in the daily practices of personnel.

The Compliance Process for Maritime Operators

Getting compliant involves more than a checklist. Here’s a simplified roadmap:

  1. Gap Analysis – Review what you already do and what the code expects. Does it reflect the operational reality or is it a fictional system?
  2. SMS Development/Update – Build or refine your safety management system. A comprehensive review, especially one done after many years, can reduce documentation by more than 20 percent.
  3. Training & Awareness – Everyone onboard and ashore must know their part. How do they contribute to the effectiveness of the system?
  4. Certification – Obtain the Document of Compliance (DOC) and Safety Management Certificate (SMC) through audits.
  5. Ongoing Monitoring – Regular internal audits and management reviews keep the system alive and evolving.

Common Challenges in ISM Code Compliance

Let’s be real, compliance has its hurdles:

  • Top-down Disconnect: Without leadership buy-in, the SMS becomes a box-ticking exercise.
  • Crew Resistance: “We’ve always done it this way” is a common attitude.
  • Training Gaps: If your crew doesn’t understand the ‘why’ behind procedures, they won’t follow them.
  • Audit Fatigue: Poor recordkeeping and rushed preparation can derail audits.

My advice? Keep it simple. Make procedures practical, not bureaucratic. Involve the crew in developing routines. That’s how you make compliance sustainable.

The Future of ISM Code Compliance and Technology’s Role

The maritime industry is changing fast. Digital tools are making compliance easier and smarter:

  • Cloud-based SMS systems offer real-time updates and reduce paperwork.
  • Remote audits became mainstream during the pandemic—and they’re here to stay. Where a full remote audit is not feasible consider hybrid audits.
  • Data analytics can identify patterns in incidents and help prevent them.
  • Mobile apps for onboard reporting are empowering seafarers to be active players in the compliance process.

Also look at mistake-proofing the system, so that even if a person is about to make an error, the system prevents it.

In conclusion, ISM Code compliance isn’t just about certificates. It’s about creating a safety culture that protects your people, your assets, and the environment. For maritime operators willing to invest the effort, the returns in safety, efficiency, and reputation are well worth it.

If you’re a maritime operator looking to simplify or strengthen your ISM safety management system, I’m happy to share more from my experiences. As someone who’s walked ship decks, sat in boardrooms, worked with Flag Administrations and led audits, I believe that compliance done right isn’t a burden—it’s a competitive advantage.

Internal vs External Audits: What Every Business Owner Should Know

The Strategic Importance of Audits for Business Owners

Audits are more than compliance checks; they are strategic tools that provide insights into performance, risk, and improvement opportunities. Engaged business leaders use audit results to drive better decision-making and long-term success. When conducted well, audits show leadership where they may have to re-prioritize or allocate resources, where policies may be in conflict, what is working well, and where the system needs their intervention.

What Are Internal and External Audits?

Internal Audits: Performed by or for the organization to check its own processes. These may be process audits or full system audits.

External Audits: These could be supplier audits (second party) or certification or regulatory audits (third party). Third-party audits are conducted by an independent certification body or regulator to verify compliance with standards.

Internal and external audits differ in breadth and depth of the audit based on scope and objective.

Why External Audits Should Be Taken Seriously

External audits affect certification, reputation, and client confidence. A successful external audit demonstrates credibility and reliability.

Tip: Be prepared, be honest, and see auditors as partners in your improvement journey.

How to Prepare for Both Audits

  • Keep documentation current
  • Review and close previous findings
  • Train staff on audit processes
  • Conduct mock audits
  • Engage leadership in the audit process

Conclusion:

ISO audits and their findings are not to be feared. They are valuable tools for identifying weaknesses and driving continuous improvement. With the right mindset and preparation, audits can move beyond mere compliance and become a core part of your strategic growth. Organizations that stay audit-ready show that they are not only compliant but also committed to excellence.

Human Error or a Bigger Problem? When to Dig Deeper

by Julius DeSilva

In the world of process improvement and problem-solving, human “user” error can often become the go-to explanation when things go wrong. A mis-entered data point, a forgotten step in a procedure, or a misconfigured setting—blaming the user is quick and easy. But how do you know when an issue is bigger than just user error?

Understanding when to dig deeper and identify systemic flaws is critical. By integrating structured approaches like Root Cause Analysis (RCA) and the PDCA (Plan-Do-Check-Act) cycle, organizations can shift from a reactive blame culture to a proactive, continual improvement mindset that eliminates recurring problems at their source.

The Prevalence of User Error in Different Industries

Human error has been identified as a significant contributor to operational failures across multiple sectors:

  • Cybersecurity: According to the World Economic Forum, 95% of cybersecurity breaches result from human error.
  • Manufacturing: A study by Vanson Bourne found that 23% of unplanned downtime in manufacturing is due to human error, making it a key contributor to production inefficiencies. The American Society for Quality (ASQ) reports that 33% of quality-related problems in manufacturing are due to human error.
  • Healthcare: The British Medical Journal (BMJ) estimates that medical errors—many due to human factors—cause approximately 250,000 deaths per year in the U.S. alone.
  • Aviation & Transportation: The Federal Aviation Administration (FAA) attributes 70-80% of aircraft incidents to human error, but deeper analysis often reveals process design issues, poor training, or missing safeguards.

These statistics reinforce a key point: Human error isn’t always the root cause—it’s often a symptom of a deeper, systemic issue.

Recognizing When to Look Beyond User Error

Here’s how to tell when an issue isn’t just a one-time mistake but a signal that the system itself needs improvement:

  1. Recurring Issues Across Multiple Users – If multiple employees are making the same mistake, the problem likely isn’t individual human error—it’s a flaw in the process, system design, or training. For example, if multiple operators incorrectly configure a machine setting, it might indicate confusing controls, inadequate training, or unclear documentation rather than simple user mistakes.
  2. Workarounds and Process Deviations – If employees consistently find alternative ways to complete a task, the system may not be designed for real-world conditions. If workers routinely bypass a safety feature because it “slows them down,” the process needs reevaluation, whether through retraining, redesign, or better automation. At QMII, we always reinforce building a system for the users, based on the as-is of how work is done, and then making incremental improvements.
  3. High Error Rates Despite Training – If errors persist even after proper training, the issue might be process complexity, unclear instructions, or a lack of intuitive system design. If employees consistently make minor mistakes, the system interface or workflow rules might need simplification rather than just retraining staff.
  4. Error Spikes in High-Stress Situations – Mistakes often increase under time pressure, fatigue, or stress. This suggests a workload or process issue rather than simple carelessness. In a maritime environment, high error rates during critical operations could signal staffing shortages, inefficient safety interlocks, or poor user interfaces on devices.
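The first signal in the list above, the same mistake recurring across multiple users, can be checked against an error log. The following is a minimal sketch rather than a prescribed method: the log format, the function name `flag_systemic_errors`, and the threshold of three distinct users are all assumptions made for illustration.

```python
from collections import defaultdict


def flag_systemic_errors(error_log, min_users=3):
    """Return error types reported by at least `min_users` distinct users.

    `error_log` is an iterable of (user, error_type) pairs. This format is
    hypothetical; adapt it to whatever your incident records contain.
    """
    users_by_error = defaultdict(set)
    for user, error_type in error_log:
        users_by_error[error_type].add(user)
    return sorted(
        error_type
        for error_type, users in users_by_error.items()
        if len(users) >= min_users
    )


# Illustrative data only, not real incident records.
log = [
    ("op_a", "wrong_machine_setting"),
    ("op_b", "wrong_machine_setting"),
    ("op_c", "wrong_machine_setting"),
    ("op_a", "missed_signature"),
    ("op_a", "missed_signature"),
]

systemic = flag_systemic_errors(log)
print(systemic)
```

Here the machine-setting error is flagged because three different operators made it, while the missed signature, repeated by a single operator, is not. The threshold is a judgment call and should reflect the size of the team and the criticality of the task.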

Instead of just fixing errors after they happen, organizations should use the PDCA (Plan-Do-Check-Act) cycle to continually improve processes and reduce the probability of recurring failures.

The PLAN-DO-CHECK-ACT Approach

PLAN – Identify the context and potential risks

  1. Identify the context of the process including the competence of personnel, user environment, complexity and influencing factors.
  2. Apply Failure Mode and Effects Analysis (FMEA) to predict where failures are likely to happen before they occur.
  3. Identify and involve representatives of users through the development of FMEAs and the process.
  4. When predicting controls and resources, determine the feasibility of implementing and providing them.
  5. Simplify procedures, redesign workflows, or introduce automation to eliminate failure points.
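As a rough illustration of the FMEA step above, failure modes are commonly ranked by a risk priority number (RPN), the product of severity, occurrence, and detection ratings. The sketch below assumes 1 to 10 rating scales; the failure modes and their ratings are invented for the example and would normally come from the users involved in the analysis.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    description: str
    severity: int    # 1 (negligible) to 10 (catastrophic)
    occurrence: int  # 1 (rare) to 10 (frequent)
    detection: int   # 1 (almost always detected) to 10 (rarely detected)

    @property
    def rpn(self) -> int:
        # Risk priority number: severity x occurrence x detection
        return self.severity * self.occurrence * self.detection


def rank_failure_modes(modes):
    """Sort failure modes from highest to lowest RPN."""
    return sorted(modes, key=lambda m: m.rpn, reverse=True)


# Hypothetical failure modes and ratings, for illustration only.
modes = [
    FailureMode("Operator enters wrong calibration value", 7, 5, 4),
    FailureMode("Checklist step skipped under time pressure", 8, 4, 6),
    FailureMode("Label misread on control panel", 5, 3, 2),
]

ranked = rank_failure_modes(modes)
for mode in ranked:
    print(mode.rpn, mode.description)
```

The ranking only suggests where to look first. A failure mode with a low RPN but a very high severity rating may still deserve controls of its own.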

DO – Implement the Process and Improvements

  1. Implement the process and test it to check its effectiveness. In the initial stages, more frequent monitoring and measurement will be required. The frequency of checks can be reduced as the process matures.
  2. Provide user training and assess its effectiveness. When errors occur, retrain personnel, but only if training is truly the issue; don’t use training as a Band-Aid for bad system design.
  3. Look beyond documented “standard operating” procedures. For example, the company posts a visual step-by-step guide near machines so that operators follow a standard calibration process.

CHECK – Evaluate the Results

  1. Track performance data to see if the changes have reduced errors.
  2. Get user feedback to ensure the new system is intuitive and efficient. For example, error rates drop by 40%, but operators still struggle with a specific step, prompting another refinement.
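The CHECK step can be made concrete by comparing error rates before and after a change, both overall and per step. This is a minimal sketch under stated assumptions: the step names, the counts, and the 25% cut-off used to flag a lagging step are hypothetical and chosen only to mirror the 40% example above.

```python
def error_rate(errors, attempts):
    """Errors per attempt for one step or for the whole process."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    return errors / attempts


def relative_reduction(before, after):
    """Fractional drop in error rate relative to the baseline."""
    if before == 0:
        raise ValueError("baseline error rate is zero")
    return (before - after) / before


def overall_rate(counts):
    """Pooled error rate across all steps."""
    errors = sum(e for e, _ in counts.values())
    attempts = sum(a for _, a in counts.values())
    return error_rate(errors, attempts)


# Hypothetical counts per step: step -> (errors, attempts)
before = {"load": (10, 200), "calibrate": (30, 200), "verify": (10, 200)}
after = {"load": (3, 200), "calibrate": (25, 200), "verify": (2, 200)}

overall_drop = relative_reduction(overall_rate(before), overall_rate(after))

# Steps whose own error rate improved by less than 25% (assumed cut-off)
lagging_steps = [
    step
    for step in before
    if relative_reduction(error_rate(*before[step]), error_rate(*after[step])) < 0.25
]

print(f"Overall reduction: {overall_drop:.0%}")
print("Steps needing another refinement:", lagging_steps)
```

An overall figure can hide a step that barely improved, which is why the per-step view is worth keeping alongside user feedback.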

ACT – Standardize & Scale

  1. If the improvement is successful, integrate it as the new standard process.
  2. Scale the change across other departments or sites where similar issues might exist. For example, the company implements the same calibration guide and training approach across all locations, preventing similar errors company-wide.

Conclusion: From Blame to Solutions

While human error is a reality, it’s often a symptom of a deeper process flaw, not the root cause. Those conducting a root cause analysis or an investigation must ask, “How did the system fail the individual?” and “Why did the system fail the individual?” By shifting from a blame mindset to a continual improvement approach, organizations can:

  • Reduce costly errors and downtime
  • Improve employee engagement (less frustration = higher productivity)
  • Enhance conformity and compliance
  • Increase process reliability and efficiency

Monitoring of the system must continue because, as the context changes, the controls implemented may not be as effective as before. A proactive system will not guarantee that things never go wrong. When they do, however, the key is to dig deeper. Using tools like PDCA, FMEA, and RCA will help in identifying long-term solutions to recurring problems. In most cases, fixing the system is better than blaming the human.

ISO 13485: QMS Requirements of Medical Devices for Regulatory Purposes

by Dr. IJ Arora

ISO 13485:2016 is a standard that addresses quality management system requirements for those within the medical device industry. It is based on the systems-based approach found in ISO 9001:2015, but because it emphasizes requirements for regulatory purposes, it does not fully align with ISO’s harmonized structure (HS). In many ways, however, ISO 13485 does align with the HS, particularly in the structure and foundational principles of quality management.

The introduction of ISO 13485 explicitly states that the standard is aligned with ISO 9001, and this connection is important for understanding how the two standards relate to each other. I am a bit surprised as to why ISO 13485 isn’t fully harmonized with the HS as defined in Annex SL, which is the specific document within ISO standards that outlines the HS. I believe that if this standard were aligned to the HS, it would make implementation much less laborious for all involved.

The ISO 9001 foundation

The 2015 version of ISO 9001, which is presently under revision, provides a good basis for all standards. As mentioned, ISO 13485 has its roots in ISO 9001, which is why the key QMS principles (e.g., customer focus, leadership, process approach, continual improvement, and evidence-based decision making) central to ISO 9001 are also embedded in ISO 13485.

ISO 13485 includes several core concepts and clauses from ISO 9001. Clause 4 on quality management systems (e.g., structure, documentation requirements, and the scope of the QMS); clause 5 on management responsibility (e.g., top management involvement, resource allocation, and internal audits); and clause 8 on measurement, analysis, and improvement (e.g., monitoring, corrective actions, and continual improvement) are just some examples.

As I study, teach, consult, and audit using ISO 13485, I wonder why the standard is not fully harmonized with similar standards as laid out in Annex SL. In consulting, I feel the pain of organizations that must meet regulatory requirements and so tend to overlook the process-based management system (PBMS) approach that is fundamental to the plan-do-check-act (PDCA) cycle. This regulatory focus is one reason why, although ISO 13485 shares many similarities with ISO 9001, it is not fully aligned with the HS. ISO 13485 places a strong emphasis on compliance with regulatory requirements specific to the medical device industry. The standard’s clauses addressing design and development, post-market surveillance, risk management, and traceability requirements are all far more extensive than those found in ISO 9001. Annex SL focuses more on general management practices and less on industry-specific regulatory controls. The detail and specificity required for medical device safety and compliance often necessitate a structure that goes beyond the framework of the HS.

Overcoming differences

Different scopes and audiences are also a consideration in that, while ISO 9001 is a general quality management standard applicable across industries, ISO 13485 is designed specifically for organizations that manufacture medical devices. These organizations must meet stringent regulatory requirements that go beyond what ISO 9001 addresses. Because of this, ISO 13485 requires more detailed processes related to product lifecycle management, post-market activities, risk management, and regulatory controls, which aren’t adequately covered under the more generalized HS. ISO 13485 includes a much stronger emphasis on managing the product’s entire lifecycle, from design and development to post-market activities (e.g., complaint handling and vigilance). Although ISO 9001 mentions product realization, ISO 13485 goes into much greater depth, including extensive requirements for design control and risk management. These elements reflect the higher level of scrutiny needed in the medical device industry, where safety and compliance are paramount.

With that said, I believe that these differences don’t prevent ISO 13485 from being organized according to the HS format. A harmonized standard would not only help medical device manufacturers’ management systems conform with specific regulatory requirements but also meet the obligations for continual improvement. After all, registered organizations in the aerospace and automotive industries already do just that via sector-specific management system standards that are harmonized with ISO 9001.

The structural differences in the clauses found in ISO 13485 and the standards adopting the HS are not too far apart. Although ISO 13485 is aligned with ISO 9001, it diverges when it comes to specifics that are unique to the medical device sector and regulatory requirements.

ISO 13485’s clause 7, “Product Realization” includes additional elements, such as design controls and regulatory compliance requirements, that are critical in the medical device industry. Post-market surveillance and complaint handling are central to ISO 13485, but the HS doesn’t go to the level of detail necessary for medical device manufacturers.

ISO 13485 emphasizes the need for continuous monitoring of device performance, even after devices are on the market, ensuring any issues are identified and addressed in a timely manner. I believe ISO 9001’s subclause 9.1.2, “Customer Satisfaction,” can be updated to incorporate this requirement.

Risk management is a vital consideration. ISO 13485 integrates risk management into the standard in a way that is far more structured and pervasive than what is found in ISO 9001. ISO 13485 has a more detailed approach to identifying, assessing, and mitigating risks throughout the lifecycle of medical devices. However, these requirements could be added to subclause 6.1.1 (“Actions to Address Risks and Opportunities”) or subclause 8.1.1 (“Operational Planning and Control”) found in the HS.

ISO 13485 includes specific requirements for design and development processes, which are critical in medical devices due to their complexity and potential risk to patient safety. The HS doesn’t provide this level of detail for other types of products or industries.

Identifying similarities

Notwithstanding the differences between ISO 13485 and the standards that align with the HS, there are also some key similarities. As with ISO 9001, ISO 13485 is built around seven quality management principles: customer focus, leadership, engagement of people, process approach, improvement, evidence-based decision making, and relationship management. Continual improvement of the quality management system is part of both standards, emphasizing the need for a strong focus on monitoring, auditing, corrective actions, and reviews. Document control is another similarity. Both ISO 13485 and ISO 9001 stress the importance of clear and accurate documentation to ensure that quality management processes are defined, monitored, and maintained effectively.

In keeping itself separate from the HS, ISO 13485’s clause structure, despite being based on ISO 9001, serves to meet the unique needs of the medical device industry. The decision not to fully harmonize the standard with the structure seen in Annex SL likely stems from the need to ensure a tailored regulatory focus. ISO 13485 is aligned with a variety of regulatory frameworks across different countries and regions (e.g., FDA, EU MDR, and TGA). These regulations require specific processes that go beyond the generic, high-level harmonized framework provided by Annex SL to facilitate combined/integrated management systems. The structure of ISO 13485 allows for a more detailed, industry-specific approach to product safety, efficacy, risk management, and compliance. Product lifecycle control is essential in the medical device industry, where the lifecycle is complex and includes design controls, manufacturing processes, and post-market activities that require more attention than the HS would provide.

Looking at a few additional clauses reveals that ISO 13485 follows a specific structure that allows it to emphasize the unique aspects of medical device quality management while maintaining consistency with other ISO standards.

For example, Clause 1, “Scope,” is relatively straightforward and outlines the scope of the standard, which is specific to organizations that design, manufacture, and maintain medical devices. The clause also highlights exclusions (for example, aspects not applicable to the organization), which is quite typical in a quality management standard.

Clause 2, “Normative References,” lists the documents referenced within ISO 13485, which is typical for any ISO management system standard. The important point here is that ISO 13485 requires compliance with relevant regulations and standards, particularly those in the medical device sector.

Clause 3, “Terms and Definitions,” is crucial because the terminology in the medical device industry can be very specific. Definitions clarify terms that might have different meanings in other industries (e.g., what qualifies as a “medical device,” “design verification,” or “post-market surveillance”). This ensures uniformity and understanding across the industry.

Clause 4, “Quality Management System (QMS),” describes the basic requirements for establishing and maintaining a QMS, which is a fundamental aspect of ISO 13485. This clause outlines the need for a quality policy, the establishment of objectives, and the requirement to continually improve the QMS. These are common in all ISO standards but are tailored here to fit the needs of the medical device industry.

Clause 5, “Management Responsibility,” covers executive involvement as a key theme. In ISO 13485, it emphasizes top management’s responsibility for ensuring that quality objectives are met. This clause also requires that management provide resources for quality activities and review the performance of the QMS regularly, ensuring alignment with regulatory requirements and customer needs.

Clause 6, “Resource Management,” could have been aligned to clause 7, “Support,” found in the HS. This clause in ISO 13485 requires the organization to manage resources effectively, which includes personnel training and competence (a critical area in the medical device industry). This ensures that employees have the skills needed to produce safe and effective devices. It also covers infrastructure and the control of the work environment, ensuring that conditions are suitable for maintaining product quality.

Clause 7, “Product Realization,” diverges further from the HS. Product realization in the medical device sector involves the entire lifecycle of the device—from planning, design, development, and manufacturing to service and post-market activities. This clause is extensive and includes requirements for design controls, risk management, validation, and traceability, all of which are critical in the medical device industry. The detailed focus on design and development, verification and validation, and product monitoring ensures that all aspects of a medical device’s journey, from conception to post-market surveillance, are covered.

Clause 8, “Measurement, Analysis, and Improvement,” requires organizations to evaluate the effectiveness of their QMS through regular monitoring, measurement, and audits. It also focuses on corrective and preventive actions (CAPA) to improve quality. Preventive action in the HS has not been thrown out like the proverbial baby with the bath water. It has instead been replaced by a requirement to appreciate risk. For medical devices, complaints and nonconformance reporting are key to ensuring ongoing safety and compliance. ISO 13485 could also have gone from preventive action to risk.

Post-market surveillance and vigilance is a requirement of the medical device standard. Unlike many other ISO standards, ISO 13485 places significant emphasis on post-market surveillance, which is the process of monitoring the performance of medical devices once they are in use. This is a major distinguishing factor from other ISO standards. Manufacturers are required to establish processes for post-market feedback, complaint handling, and field safety corrective actions (FSCA), which are essential for identifying and managing risks after the product is on the market.

In conclusion, I would opine and agree that although ISO 13485 is indeed based on ISO 9001, it diverges from the HS identified in Annex SL because the unique needs of the medical device industry—such as regulatory compliance, product lifecycle management, and patient safety—require a more detailed and specialized approach than the HS can provide. The clause structure of ISO 13485 reflects these specific requirements, making it a robust and industry-specific standard that ensures the safety and quality of medical devices while maintaining alignment with the foundational principles of quality management in ISO 9001.

This balance of maintaining core quality principles while addressing the needs of the medical device industry is why ISO 13485 has not fully adopted the HS but instead continues to incorporate elements of ISO 9001 alongside medical-device-specific regulatory needs. Even so, at least attempting to align the primary clauses, such as those on risk, with the HS would help all parties involved.

Note – The above article was recently featured in Exemplar Global’s publication called “The Auditor”.

One-Off or Systemic: The Search for Root Causes

by Julius DeSilva

Accidents and failures, whether in maritime, aviation, healthcare, or nuclear settings, are often subjected to intense scrutiny to determine their root causes. However, the challenge lies in distinguishing whether an event is an anomaly or a symptom of a deeper systemic issue. This analysis is crucial as it directly influences the actions taken to prevent a recurrence or occurrence elsewhere. A management system approach, such as those outlined in ISO 45001 for occupational health and safety, ISO 9001 for quality management, or ISO 14001 for environmental management, provides a structured framework for systematically and proactively addressing risks when data exists.

Analysis of root causes: systemic failures

Root cause analysis is a fundamental investigative tool used to trace an incident to its origins. However, many organizations focus on immediate, apparent causes rather than examining systemic contributors and true root causes. Systemic failures result from weaknesses in policies, processes, or culture, and therefore, often recur in different forms over time.

The management system approach advocated by ISO standards and other industry-specific standards like the ISM Code emphasizes continual improvement and risk-based thinking. The intent of these standards is to reduce the probability of systemic failures by integrating safety, quality, efficiency, security, and environmental management into everyday operations.

Systemic failure example: Chernobyl

I recently read the book Midnight in Chernobyl, which outlined the 1986 Chernobyl nuclear disaster and the underlying systemic failures that contributed to this incident. Unlike isolated accidents, Chernobyl resulted from a combination of design flaws, operational errors, and a deficient safety culture. Key systemic issues included:

  • Design flaws. The RBMK reactor used in Chernobyl had an inherent positive void coefficient, meaning an increase in steam production could accelerate the reaction uncontrollably.
  • Operational failures. A safety test was conducted under unsafe conditions, including a reduced power level and disengaged emergency shutdown mechanisms.
  • Cultural and regulatory gaps. A lack of safety culture, insufficient training (and thus competency), and an authoritarian management style amounting to complacency discouraged questioning of unsafe practices.

These root causes culminated in an explosion that released massive amounts of radioactive material. European countries are so tightly packed that winds freely spread the fallout across borders. The systemic nature of the disaster was later addressed through international nuclear safety reforms, including the establishment of the International Atomic Energy Agency’s safety standards and stricter ISO frameworks such as ISO 19443, which outlines quality management system requirements for organizations working within the nuclear sector.

Other systemic failures

Deepwater Horizon oil spill (2010)

Another example of a systemic failure is the Deepwater Horizon oil spill. This incident was not merely the result of a single mistake but a consequence of systemic lapses in safety practices, regulatory oversight, and risk management. Contributing factors included:

  • Cultural deficiencies. The organization prioritized cost cutting over risk mitigation.
  • Inadequate risk assessments. There was poor well-integrity testing and misinterpretation of pressure data.
  • Regulatory weaknesses. There was insufficient government oversight and a lack of stringent industrywide safety protocols.

This catastrophe led to significant regulatory changes, including the implementation of stricter safety and environmental policies within the oil and gas industry, aligned with ISO 45001 and ISO 14001.

The Boeing 737 MAX crashes (2018, 2019)

The Boeing 737 MAX crashes further illustrate systemic failure. Investigations revealed that flaws in the aircraft’s Maneuvering Characteristics Augmentation System (MCAS) were not adequately addressed due to:

  • Design and engineering oversights. Critical safety features were made optional rather than standard.
  • Regulatory gaps. The FAA relied excessively on Boeing’s self-certification.
  • Organizational pressures. The corporate culture emphasized speed-to-market delivery over comprehensive safety testing.

This resulted in significant regulatory reforms, including tighter oversight and compliance with international aviation safety standards.

Fixes vs. systemic longer-term improvement

Addressing failures can be approached through quick fixes or long-term systemic improvements. Each approach has its advantages and disadvantages:

Quick fixes

Pros:

  • Immediate resolution of pressing issues
  • Cost-effective in the short term
  • Prevents further damage or loss

Cons:

  • Does not address underlying systemic issues
  • Can lead to recurring problems if not supplemented with deeper analysis
  • Often reactive rather than proactive

Systemic longer-term improvements

Pros:

  • Addresses root causes, reducing the likelihood of recurrence
  • Enhances organizational resilience and safety culture
  • Aligns with ISO management systems, ensuring continuous improvement

Cons:

  • Requires significant time and resources
  • May face resistance from stakeholders due to cultural inertia
  • Implementation complexity can slow down immediate corrective actions

A balanced approach is often necessary—implementing short-term fixes to mitigate immediate risks while developing long-term systemic improvements to ensure sustainable safety and risk management practices.

What if we cannot foresee all risks?

Even with rigorous management systems and risk assessments, not all risks can be predicted. Organizations must be prepared to address unforeseen risks through:

  • Resilient systems. It is important to develop adaptable and robust safety management frameworks that can respond effectively to new threats.
  • Proactive learning. The organization can encourage a culture of continuous learning and scenario planning to anticipate emerging risks.
  • Redundancies and safeguards. Implementing fail-safe redundancies and contingency plans can mitigate the effects of unforeseen events.
  • Stakeholder collaboration. Engaging industry experts, regulators, and other stakeholders to share knowledge can help improve collective risk awareness.

Despite the lessons from Chernobyl, 25 years later the Fukushima disaster occurred. An earthquake of the magnitude that struck was not treated as a foreseeable risk, even though in 1896 (as highlighted by an engineer on the project) an earthquake of magnitude 8.5 hit near the coast where the reactor was to be built. After Chernobyl, the 1970s-built reactor in Fukushima was not upgraded with the latest safety features due to high costs. Japan’s nuclear industry had a history of regulatory complacency and reluctance to accept international recommendations.

ISO 31000, which addresses risk management, emphasizes the importance of resilience and adaptability in the face of unpredictable risks. By fostering a commitment to learning and preparedness across the organization, businesses can better navigate uncertainties while maintaining operational safety and efficiency.

The benefits of a management system approach

A management system approach, as defined by ISO standards, provides the following advantages:

  • Structured risk management. ISO 31000 ensures systematic identification, assessment, and mitigation of risks.
  • Continuous improvement. The Plan-Do-Check-Act (PDCA) cycle described in ISO 9001, ISO 45001, and ISO 14001 encourages learning from incidents to prevent recurrence.
  • Organizational culture change. Implementing ISO standards fosters a risk-oriented mindset, reducing the likelihood of systemic failures.

ISO management systems, when implemented and sustained, can act as a preventive tool to proactively manage risk.

Conclusion

Understanding whether an accident is an anomaly or a systemic failure is critical in determining the appropriate response. Sadly, at times industry must incur the cost of the nonconformity to learn the lesson. Organizational “can-do” attitudes lead to risk normalization, where dangerous conditions are seen as normal. Further, organizational and demographic cultures do not always encourage challenging authority or questioning decisions. An absence of accidents, incident reports, and near misses gives a false sense of security that things are working well. This may lead to overconfidence in decision making, lapses in regulatory oversight, and deferring of resource allocation to other “priorities.”

Systemic failures indicate deeper vulnerabilities requiring long-term corrective actions. The application of ISO management systems offers a proactive and structured approach to accident prevention, ensuring that organizations move beyond reactive responses to fostering a culture of continuous improvement and risk management. By embracing these principles, industries can mitigate systemic risks, ensuring safer and more resilient operations.

Note – The above article was recently featured in Exemplar Global’s publication ‘The Auditor’.

The Baltimore Bridge Collapse—Another Case of a Failed Management System

by Dr. IJ Arora

Can good management systems make organizations immune to disasters? The Baltimore bridge (or, more precisely, the Francis Scott Key Bridge) collapsed in 2024 after the container vessel MV Dali struck it. This was a tragedy, perhaps caused by the failure of several management systems: those of the ship, the port, the state, and whoever else was involved.

The National Transportation Safety Board (NTSB) investigation is ongoing, and will no doubt look at the part played by MV Dali, its crew, and its operator. However, my thought is that MV Dali or other ships plying the waters should have, by simple statistical probability, been considered as risks by the authorities. Between the water channel, the high number of ships sailing in and out regularly, and the bridge itself, there was likely to be a collision someday. Perhaps it was not a matter of if, but when! Therefore, should the bridge have been better designed and made safer based on these known and appreciated risks? After all, not all accidents can be completely avoided, but each tragedy yields lessons that should drive responsive action. The lessons become the data that drives risk identification and trends, thus making the system proactive. I am sure the NTSB is considering all this. In the meantime, without going into the ongoing investigation, there would seem to be some basics that are common indications of systemic failures. Be it the Titan submersible or the Boeing management system, as a subject-matter expert in process-based management systems, I see a common cause: the failure of the system to deliver conforming products and services.

In this short article, I want to discuss this bridge collapse in the context of the management system, considering ISO 9001:2015 generically and the requirements of ISO 55001:2024, “Asset management – Asset management system – Requirements,” specifically. ISO 55001 was first published in 2014. It was developed as a standalone standard for asset management, building upon the principles of ISO 9001 and other relevant standards.

Could simply designing a good system based on the standard have enabled the organization to better assess the associated risks? Perhaps they were assessed, and a bridge allision was considered an extremely low-probability occurrence. If that were the case, the discussion would be on prioritization of risks.

As of the time of this writing (September 2024), the investigation into the Baltimore bridge collapse is still ongoing, and the lawsuits are starting to fly. Although the exact cause of the collapse remains under investigation, we can consider several factors that might have contributed to the incident. MV Dali experienced a series of electrical blackouts before the allision. The implementation of the vessel’s safety management system (SMS, based on the ISM Code) could be a factor. The stability, age, and condition of the bridge are, I am sure, being investigated as potential contributing factors. Then, there is always the human element. There may have been errors on the part of the ship’s crew or the bridge’s operators. Was the SMS designed to support them in such a scenario? What factors may have caused operators at all levels to perhaps not follow requirements and mitigate the risks? The NTSB’s investigation will include a detailed analysis of the ship’s navigation systems, the bridge’s structural integrity, and the actions of the individuals involved in this tragedy. Its final report will provide a comprehensive understanding of the incident and may include recommendations to prevent similar occurrences in the future.

However, even at this stage we can agree that bridges in general are national assets. They are valuable infrastructure that provides essential services to communities. Although it is not publicly known whether the state of Maryland specifically implemented ISO 55001 for its bridges, the principles and practices outlined in this standard could have been beneficial in managing the risks associated with the Baltimore bridge. Through the implementation of this standard (and/or ISO 9001), the authorities could have performed:

  • Risk assessments. ISO 55001 requires organizations to conduct regular risk assessments to identify potential threats and vulnerabilities. A thorough assessment of the bridge’s condition, age, and traffic load could have helped identify potential risks and inform maintenance and repair decisions, as could have changes in procedures, protection of navigation channels, and so on.
  • Lifecycle management. The standard emphasizes the importance of managing assets throughout their entire lifecycle, from planning and acquisition to maintenance and disposal. By following ISO 55001, the state could have developed a comprehensive plan for the bridge’s maintenance, upgrades, and eventual replacement.
  • Performance measurements. ISO 55001 requires organizations to establish measurable objectives or key performance indicators (KPIs) to measure the effectiveness of their asset-management activities. This could have helped the state monitor the bridge’s condition and identify any signs of deterioration.
  • Continual improvement. The standard promotes a culture of continual improvement, encouraging organizations to learn from past experiences and make necessary adjustments to their asset-management practices.

It is impossible to say definitively whether ISO 55001 would have prevented the Baltimore bridge collapse. However, the principles and practices outlined in the standard could have helped to reduce the risk of such incidents. By adopting a systematic and proactive approach to asset management, organizations can improve the reliability and safety of their infrastructure. A systematic study must go beyond what the MV Dali contributed to the Baltimore bridge collapse; it must also consider the broader context and the potential contributions of other factors:

  • Bridge design and maintenance. The age and condition of the bridge are likely to be factors in the investigation. Older infrastructure may be more susceptible to damage or failure, especially if it has not been adequately maintained or upgraded.
  • Vessel traffic. The frequency and intensity of vessel traffic in the area can also influence the risk of allisions. The bridge was in a busy shipping channel; therefore, the likelihood of incidents was higher.
  • Safety measures. The presence or absence of safety measures such as buoys, warning systems, or restricted areas can also affect the risk of allisions. These need to be studied and are factors the authorities would know about.
  • Human elements and factors. Errors on the part of both the ship’s crew and bridge operators can contribute to accidents. Factors such as fatigue, inexperience, or inadequate training may play a role. What led to these issues? Error proofing, mistake proofing, and failure mode and effects analysis (FMEA) are tools that could be part of an effective management system.
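
As a minimal illustration of how FMEA can support such a system, the sketch below ranks failure modes by risk priority number (RPN), the product of severity, occurrence, and detection ratings commonly used in FMEA worksheets. The failure modes, the 1-to-10 ratings, and the names FailureMode and rank_by_rpn are hypothetical, chosen only to show the arithmetic; they are not drawn from any investigation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    """One row of an FMEA worksheet; each rating is on a 1-10 scale."""
    description: str
    severity: int    # 10 = most severe consequence
    occurrence: int  # 10 = most likely to occur
    detection: int   # 10 = least likely to be detected in time

    def rpn(self) -> int:
        """Risk priority number: severity x occurrence x detection."""
        for rating in (self.severity, self.occurrence, self.detection):
            if not 1 <= rating <= 10:
                raise ValueError("ratings must be between 1 and 10")
        return self.severity * self.occurrence * self.detection


def rank_by_rpn(modes):
    """Return failure modes sorted from highest to lowest RPN."""
    return sorted(modes, key=lambda m: m.rpn(), reverse=True)


# Hypothetical entries with invented ratings, for illustration only.
worksheet = [
    FailureMode("Loss of propulsion near a bridge pier", 10, 2, 6),
    FailureMode("Missed inspection of pier protection", 8, 3, 4),
    FailureMode("Watchkeeper fatigue on approach", 7, 4, 7),
]

for mode in rank_by_rpn(worksheet):
    print(f"{mode.rpn():4d}  {mode.description}")
```

With these invented ratings, the fatigue entry ranks first (RPN 196), ahead of the propulsion entry (120) and the inspection entry (96). A ranking like this is only a prompt for where to apply error proofing first; the ratings themselves must come from people who know the operation.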

Let us therefore consider ISO 55001 and the clauses of the standard that could apply to the collapse of the Baltimore bridge.

Clause 4—Context of the organization

  • Clause 4.1—Understanding the external context, such as the age of the bridge, traffic volume, and environmental factors, is crucial for risk assessment.
  • Clause 4.2—Identifying the needs and expectations of relevant interested parties, including the public, commuters, and regulatory bodies, is essential for effective asset management.

Clause 6—Planning

  • Clause 6.2.1—The bridge’s asset management plan should have included clear objectives for its maintenance, repair, and replacement.
  • Clause 6.2.2—Specific objectives related to safety, reliability, and cost-effectiveness should have been established.
  • Clause 6.2.3—Detailed planning for maintenance, inspections, and upgrades would have been necessary to ensure the bridge’s structural integrity.

Clause 7—Support

  • Clause 7.1—Adequate resources, including funding, personnel, and expertise, should have been allocated for bridge maintenance and inspection.
  • Clause 7.2—Ensuring that personnel involved in bridge management have the necessary competence and training is essential.
  • Clause 7.3—Raising awareness among all relevant stakeholders about the importance of bridge maintenance and safety is crucial.

Clause 8—Operation and maintenance

  • Clause 8.1—Regular inspections and monitoring of the bridge’s condition would have helped identify potential problems early on.
  • Clause 8.2—A well-defined maintenance schedule, including preventive and corrective maintenance, would have been necessary to address issues before they escalated.

Clause 9—Performance evaluation

  • Clause 9.1—Establishing KPIs to measure the bridge’s performance, such as safety records, traffic flow, and maintenance costs, would have provided valuable insights.
  • Clause 9.2—Regular monitoring and evaluation of these KPIs would have helped identify areas for improvement.

Clause 10—Improvement

  • Clause 10.2—The bridge’s management should have implemented a system for monitoring and measurement, including data collection and analysis.
  • Clause 10.3—Predictive maintenance techniques could have been used to identify potential failures before they occurred.
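
The KPI monitoring described under Clause 9 above can be sketched in a few lines. This is an illustration under stated assumptions, not a method prescribed by ISO 55001: the KPI names, targets, and measured values are hypothetical, as are the names Kpi and kpis_needing_attention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Kpi:
    """A measurable objective with a target and a direction."""
    name: str
    target: float
    higher_is_better: bool

    def meets_target(self, measured: float) -> bool:
        if self.higher_is_better:
            return measured >= self.target
        return measured <= self.target


def kpis_needing_attention(kpis, measurements):
    """Return names of KPIs that miss their target or were not measured."""
    flagged = []
    for kpi in kpis:
        value = measurements.get(kpi.name)
        if value is None or not kpi.meets_target(value):
            flagged.append(kpi.name)
    return flagged


# Hypothetical KPIs and readings, invented for illustration only.
kpis = [
    Kpi("inspections completed on schedule (%)", 100.0, True),
    Kpi("corrective actions open over 90 days", 0.0, False),
    Kpi("structural condition rating", 5.0, True),
]
measurements = {
    "inspections completed on schedule (%)": 92.0,
    "corrective actions open over 90 days": 0.0,
}

for name in kpis_needing_attention(kpis, measurements):
    print("Needs attention:", name)
```

A KPI with no recorded measurement is flagged as well, since an indicator that is not being measured cannot show that the objective is being met.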

My objective in writing this article is to help demonstrate that by applying the principles of a standard, be it the generic ISO 9001 or a more specific standard (in this case, the asset-management system standard ISO 55001), the organization (in this case, the state of Maryland) could have strengthened its asset-management practices and potentially mitigated the risks associated with the Baltimore bridge collapse.

The above article was recently published in the Exemplar Global publication – ‘The Auditor’.

Can We Trust AI? 

We see artificial intelligence (AI) all around us, in uses that are visible to us and in uses that are not. It is here to stay, and as we learn to live with it, there remains a concern about whether we can fully trust AI. Hollywood has painted a picture of the rise of the machines that may instill fear in some of us: fear of AI taking over jobs, of AI diminishing human intelligence, and of AI being used for illegal purposes. The fear is not new; it goes back at least to 1909 and E. M. Forster’s “The Machine Stops.” In this article we discuss what actions organizations can take to build trust in AI so that it becomes an effective asset.

What does it mean to trust an AI system? 

For people to begin to trust AI, there must be sufficient transparency about what information the AI has access to, what its capabilities are, and what programming it bases its outputs on. While I may not be a guru in AI systems, I have been following their development over the last seven to eight years, delving into several types of AI. IBM has an article outlining the several types of AI that may be helpful. I recently tried to use ChatGPT to provide me with information and realized the information was outdated by at least a year. To better understand how we can trust AI, let us look at the factors that contribute to AI trust issues.

Factors Contributing to AI Trust Issues 

A key trust issue arises from the algorithm used within the neural network that delivers the outputs. Another key factor is the data on which the outputs are based. Knowing what data the AI is using is important to being able to trust the output. It is also important to know how well the algorithm was tested and validated prior to release. AI systems are run through a test data set to determine whether the neural network produces the desired results. The system is then tested on real-world data and refined. AI systems may also have biases based on their programming and data set. Companies face security and data privacy challenges too when using AI applications. Additionally, as stated earlier, there remains the issue of misuse of AI, just as cryptocurrency was misused in its initial phases.
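
The testing step described above can be illustrated with a hold-out check: set aside a portion of labelled data, measure how often the system's outputs match the expected results, and compare that figure against a release threshold. The sketch below uses a toy rule in place of a real neural network, and the function names, the 20% test fraction, and the 95% threshold are hypothetical choices for illustration.

```python
import random


def split_holdout(examples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the examples and return (train_set, test_set)."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(predict, labelled_examples):
    """Fraction of (input, expected) pairs that predict() gets right."""
    if not labelled_examples:
        raise ValueError("need at least one labelled example")
    correct = sum(1 for x, expected in labelled_examples
                  if predict(x) == expected)
    return correct / len(labelled_examples)


def passes_validation(predict, test_set, required_accuracy):
    """True if accuracy on the held-out set meets the release threshold."""
    return accuracy(predict, test_set) >= required_accuracy


def is_even(n):
    """Toy stand-in for a trained model (assumption for illustration)."""
    return n % 2 == 0


# Labelled data: each input paired with the output we expect.
examples = [(n, n % 2 == 0) for n in range(50)]
train_set, test_set = split_holdout(examples)

print("held-out examples:", len(test_set))
print("passes 95% threshold:", passes_validation(is_even, test_set, 0.95))
```

A check like this says nothing about bias or about data the system has never seen in any form, which is why the real-world testing and refinement mentioned above still matter.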

What can companies do to improve trust in AI? 

While there is much to be done by organizations to address the issues listed above, and it may take a few years to improve public trust in AI, companies developing and using AI systems can take a systems-based approach to implementing them. The International Organization for Standardization (ISO) recently published ISO/IEC 42001, Information technology – Artificial intelligence – Management system. The standard provides a process-based framework to identify and address AI risks effectively with the commitment of personnel at all levels of the organization.

The standard follows the harmonized structure of other ISO management system standards such as ISO 9001 and ISO 14001. It also outlines 10 control objectives and 38 controls. The controls, based on industry best practices, ask the organization to consider a lifecycle approach to developing and implementing AI systems, including conducting an impact assessment, designing systems (to include verification and validation), controlling the quality of the data used, and establishing processes for the responsible use of AI, to name a few. Perhaps one of the first steps organizations can take to protect themselves is to develop an AI policy that outlines how AI is used within the ecosystem of their business operations.

Using a globally accepted standard can give customers confidence (and address trust issues) that the organization is using a process-based approach to responsibly perform its role with respect to AI systems.

To learn more about how QMII can support your journey should you decide to use ISO/IEC 42001, or to learn about our training options, contact our solutions team at 888-357-9001 or email us at info@qmii.com.  

-by Julius DeSilva, Senior Vice-President