Domestic Passenger Vessel Accidents Are Preventable Using a Management System (Part Two)

Dr. IJ Arora:

In the first part of this two-part article, we began to consider the key commonality of accidents involving domestic vessels such as the Conception and the Spirit of Boston, namely, the absence of a fully functional management system. Here in part two, we will examine this in more depth from the perspective of the Plan-Do-Check-Act (PDCA) cycle.

Emphasizing a proactive safety culture and systematically addressing risks can greatly enhance safety in the domestic passenger vessel industry. By being vigilant and forward-thinking, companies can significantly reduce the likelihood of accidents and ensure the well-being of both crew and passengers. A comprehensive systems approach that prioritizes safety at all levels is essential for fostering a resilient maritime environment.

As a consultant with almost four decades of experience, I feel that my emphasis on fostering a proactive safety culture within the domestic passenger vessel industry is both timely and essential. The sector has historically witnessed incidents that stem not just from operational failures but from lapses in systematic risk management. The simple PDCA cycle makes risk appreciation essential and helps create a proactive management system. A proactive safety culture is not reactionary, but anticipatory. It is focused on identifying and mitigating risks before they evolve into incidents.

In domestic passenger operations, where crew and passengers coexist in dynamic and sometimes unpredictable environments, the safety culture must be leadership-driven, with management exemplifying and enforcing safety values. It must also be behavior-based, encouraging crew to speak up about near-misses or unsafe practices. An environment for quality, health, safety, and security must be built and maintained. The overall management system must be systems-supported, with procedures that make it easy to report, track, and correct hazards. A genuine safety culture is evident when every level of the organization—from executives to deckhands—considers safety an integral part of their responsibilities, not an afterthought.

Right at the start of the PDCA cycle, at the Plan stage, organizations must commit to identifying, evaluating, and mitigating risks. This is not just a best practice, but a requirement under clause 6.1 of ISO 9001:2015, which requires “… actions to address risks and opportunities.” It emphasizes understanding internal and external issues and planning actions accordingly to mitigate risk. In a similar vein, clause 8 of the ISM Code requires organizations to evaluate all identified risks to their ships, personnel, and the environment and establish appropriate safeguards. Failure to account for risks at this stage can cascade into the Do stage, with flawed procedures or untrained personnel resulting in increased chances of accidents.

In a systems approach it should be completely unacceptable to transfer uncertainty to the crew. Uncertainty in procedures, poorly defined emergency roles, or ambiguous hazard controls lead to hesitation and confusion during critical moments. The vessel crew should never be the first line of discovery for unanticipated risks. The shore-based organization must do the heavy lifting in identifying, documenting, and training for these risks. This principle aligns with clause 5 of the ISM Code, which mandates the establishment of safe practices in ship operations and a safe working environment.

Systemic safety as a shield against repetition must be created from lessons learnt. Clause 7.6 of ISO 9001 on knowledge is relevant and a requirement. As can be seen from various NTSB investigation reports, many vessel accidents share common causal factors: complacency, procedural lapses, miscommunication, or design flaws. These can be mitigated when a systems approach is employed linking technical systems, human factors, procedures, and training into one cohesive safety net. Lessons learned from past accidents are institutionalized not just in the safety management system (SMS) but in organizational memory and training routines.

Most importantly, risk appreciation must be the foundation of resilience. The ability to appreciate (not just assess) risk is what distinguishes a compliant company from a truly resilient one. Appreciating risk means embedding foresight into the organizational DNA, training teams to ask, “What if?” before a situation turns critical. This should holistically lead to and support the creation of maritime systems that do more than tick boxes—they save lives.

Applying the PDCA Cycle

Connecting these insights to the 2019 Conception tragedy not only reinforces the urgency of implementing a proactive safety culture but also illustrates precisely how systemic failures in risk appreciation, planning, and organizational accountability can lead to devastating outcomes.

As you will recall, the dive boat Conception caught fire while anchored off Santa Cruz Island, California. This resulted in the deaths of 34 people, which was the deadliest domestic maritime disaster in modern California history. The victims were asleep in a bunkroom below deck, and none of them survived. Only five crew members escaped. This tragedy was a catastrophic failure of planning, risk management, and safety culture.

The Conception disaster links clearly to a breakdown in the PDCA cycle, as follows:

  • Plan. Inadequate risk appreciation was a vital failure. There was no comprehensive risk assessment identifying the dangers of leaving charging lithium-ion batteries unattended overnight in a confined space. The lack of clearly marked and accessible escape routes was a known risk that was neither mitigated nor escalated. There was no SMS, nor was one legally required for that vessel. Still, a proactive operator would have voluntarily implemented one. As has been said, “Failing to plan is planning to fail,” and in this case, a lack of foresight into fire hazards, emergency egress, and nighttime watchkeeping was fatal.
  • Do. Lapses in implementation are apparent and have been pointed out in the NTSB report. A night watchman was required by regulation and the vessel’s certificate of inspection but was not on duty. The crew had no fire detection system below deck that could alert sleeping occupants of danger. Emergency drills and preparedness procedures were either nonexistent or insufficiently enforced.
  • Check. The investigators saw no monitoring or audit mechanisms. The vessel operator, Truth Aquatics, had no self-checking mechanism for compliance with watchkeeping requirements. There was no internal audit or reporting structure that caught repeated violations, such as skipping the night watch.
  • Act. This final stage of the PDCA cycle is intrinsically connected to leadership both ashore and at sea. However, there was almost a complete absence of any corrective action, despite past observations and near-miss warnings about battery charging risks and poor escape routes. The organization normalized deviation, operating under the illusion of safety through habit.

Failure to appreciate risk is a violation of ISO 9001 and ISM principles. The Conception incident demonstrates how not appreciating risk in the Plan stage—especially related to emerging threats like battery fires—can result in fatal vulnerabilities. Had a formal risk-based approach been followed, battery charging, watchkeeping, and egress issues would have been flagged and corrected.

Mitigating risks with an SMS

Although not mandated for this class of vessel, the absence of an SMS and risk-based approach violated the spirit of the ISM Code. Clause 8 calls for evaluating all risks and preparing for emergencies. The lack of a nighttime watch, poor escape design, and no contingency procedures represent failures in both design and culture.

The failure to appreciate hazards and risks by the organization on shore was passed to the crew and passengers, who paid for it with their lives. Passengers had no idea there was no overnight watch, a basic safety expectation. The crew was not empowered with procedures or tools to manage an emergency, placing them in an impossible position once the fire began. I therefore emphasize “companies cannot pass uncertainty to those on board.” The burden of risk must be identified, mitigated, and managed ashore, before the ship even leaves port. All that was required was a proper management system, resourced and implemented effectively and efficiently.

By not having an SMS, organizations are ensuring that there is no safety net in case the worst occurs! A comprehensive, systems-based approach could have identified the risk of charging batteries and flammable materials in confined quarters and ensured continuous watchkeeping practices were in place. The SMS would have required mandated drills, escape route evaluations, and fire detection systems. Simple internal audits would have perhaps given the management the inputs to ensure continual improvement and planned a system to ensure compliance. This would have embodied the PDCA cycle, where each stage feeds the next with learning, foresight, and action.

Conclusion

My final thought on lessons written in loss and tragedy are that having a system is the least those charged with entertaining people can do to guarantee that lives are not lost. The Conception tragedy in particular is a grim testament to what happens when safety is assumed rather than engineered. The call for a systems approach rooted in proactive risk appreciation is exactly the kind of thinking needed to prevent another such disaster.

My argument for the mandated or voluntary adoption of an SMS in the domestic passenger vessel sector draws on evidence from NTSB investigations and international best practices. Domestic passenger vessels, though subject to U.S. Coast Guard inspection regimes, are often not required to implement a formal SMS. This omission has led to repeated safety lapses where identifiable risks were not systematically mitigated. As we have seen, the consequences of such lapses can often be fatal.

It is time for the overall national policy to encourage the U.S. Coast Guard to extend SMS requirements to large domestic passenger vessels and establish tiered SMS models scalable by vessel type and operation. To the industry czars my recommendations are to encourage industry bodies to provide incentives and recognition for SMS adopters and promote voluntary adoption through education and resource support. To the organizations and companies operating in the domestic U.S. waters, I suggest these company-level actions:

  • Begin voluntary SMS implementation aligned with ISO or ISM principles.
  • Train personnel in the PDCA methodology.
  • Perform internal audits and hazard reviews regularly.

The tragedy of the Conception and the other incidents we have discussed reveal that compliance alone does not ensure safety. Only a structured, systems-based approach can prevent recurrence. It is time for the domestic passenger vessel industry to adopt SMS—not only as a regulatory checkbox but as a foundational safety ethos.

Note – The above article (Part 2) was recently published in an Exemplar Global publication – ‘The Auditor’

Click here to read the article.

Click here to read part 1 of the article

Domestic Passenger Vessel Accidents Are Preventable Using a Management System (Part One)

Dr. IJ Arora:

Think of any accident, mishap, or tragedy involving a passenger vessel through history (or in recent times) and then look at the post-event investigation report. If you do this, you will find one shortcoming common to these tragedies: a poor appreciation of risk and the practical nonexistence of a management system. Occasionally, in slightly less disastrous events, you may see the existence of a system, but it is usually poorly implemented.

This two-part article considers the domestic passenger vessel industry in the United States, where there have been several tragedies. I hope (although hope is not a plan) that this work will inspire the industry to look at the proper implementation of management systems. In trying to narrow the discussion, we will analyze and learn lessons from the 2019 sinking of the Conception and to a limited extent the 2023 fire aboard the Spirit of Boston cruise ship. I will mention a few other incidents as well to make the connection and bring out the failure of the various systems that broke down.

A systems-based approach in analyzing accidents in the domestic U.S. passenger vessel industry involves looking at the various components and process interactions that could potentially lead to incidents. This can include factors such as crew training, vessel design, regulatory compliance, maintenance practices, and emergency preparedness. However, the major factor is usually the absence of a management system (or a badly designed and/or poorly implemented one). This is a tragedy in the making.

I am studying these accidents to demonstrate how a systems approach could have helped prevent many of these mishaps. The reluctance to implement an effective management system pains me, not to mention primary investigation agencies like the National Transportation Safety Board (NTSB), the United States Coast Guard (USCG), and other responsible bodies.

Note that I am not discussing technical processes here. Yes, those often fall short of the mark as well, but the bigger issue is the failure to apply simple systematic thinking based on existing management system standards. This reluctance to work systematically surprises me. I’ve recently expressed my views on the Baltimore Bridge collapse, the implosion of the Titan submersible, the collision between an American Airlines flight and a military helicopter over the Potomac, and the Boeing 737 Max inspection failures. In all cases, I cannot understand why a simple, cost-effective action such as properly implementing a management system should be such a critical weakness within so many different organizations. It is a leadership flaw, for (as W. Edwards Deming said) “A bad system will let down a good person every time!”

Titanic and Herald of Free Enterprise

When discussing this topic, many will think back to the Titanic tragedy which goes back more than 100 years. This is of course perhaps the most well-known sinking of all time, so I will not rehash the details, which are easily available online. However, I do want to mention that events like the sinking of the Titanic create the ultimate push—it caused a reaction and, ultimately, the creation of a workable system to help save lives and the vessels themselves. Depending on owners, operators, and masters, to use their judgment and do the right thing at the time of crisis was no longer enough. What the Titanic demonstrated was that the industry needed enforceable regulations and requirements. The result was the Safety of Life at Sea (SOLAS) Convention, which formalized a systematic approach to safety.

Before studying incidents occurring in U.S. domestic waters, I also want to mention the tragedy of the Herald of Free Enterprise, which occurred on March 6, 1987, at Zeebrugge, Belgium. The Herald of Free Enterprise was a roll-on/roll-off ferry owned by the Townsend Thoresen company. On that day, the ship capsized shortly after leaving port and 193 people lost their lives. It had departed with its bow doors open, allowing seawater to flood the car deck. Within minutes, the ship was lying on its side in shallow water.

The tragedy exposed severe deficiencies in the company’s safety culture and operational practices. Justice Barry Sheen was appointed to head the official inquiry into the disaster. His report, published in October 1987, was scathing and unprecedented in its criticism of the ferry operator, management, and the broader safety practices in the maritime industry. Justice Sheen’s report identified a “… disease of sloppiness and negligence at every level of the hierarchy.” This became one of the most quoted phrases from the report. Sheen emphasized that the disaster was not due to a single act of negligence but rather a “… catalogue of failures…” including the failure to ensure the bow doors were closed, poor communication between crew and bridge, inadequate safety procedures, and the absence of proper checks before sailing.

The report placed heavy blame on the senior management, asserting that safety was not a high priority for the company. It also noted that management failed to implement procedures that could have prevented such a tragedy.

It is indeed shocking and surprising that even today, decades later, investigations reports are still pointing out these same drawbacks. Lessons learned seem to be forgotten. I particularly wanted to focus on this incident because Justice Sheen’s report was a turning point in maritime safety regulation. It directly influenced the creation of the ISM Code under the International Maritime Organization (IMO), which mandated formal safety procedures and accountability in international shipping operations.

Conception

The Conception was a dive boat that caught fire off the coast of California, resulting in the deaths of 34 people in 2019.

Investigations into this disaster revealed several deficiencies, including inadequate fire safety procedures, lack of a proper emergency escape route, and insufficient crew training. There were also issues related to the vessel’s sleeping arrangements, where most of the passengers were asleep below deck at the time of the fire.

A systems approach would emphasize the need for comprehensive safety protocols, regular training for crew members, proper vessel design for evacuation, and effective regulatory oversight to ensure the robust implementation of safety measures.

Spirit of Boston

This incident involved a fire that broke out on the dining cruise ship Spirit of Boston while docked in 2022.

The fire was linked to a potential electrical malfunction, but it highlighted issues related to maintenance practices and emergency response protocols.

By applying a systems approach, stakeholders could focus on root cause analysis, looking into how maintenance schedules, crew training, and emergency responses are integrated and managed.

Overall recommendations for the systems approach

There are several important elements to consider in favor of the systems approach, as follows:

  • Interdisciplinary collaboration. Promoting collaboration among various stakeholders, including regulatory bodies, ship management companies, and safety experts, to share information and best practices
  • Root cause analysis. Encouraging investigations that go beyond the immediate causes of accidents to identify systemic failures that could contribute to unsafe conditions
  • Regular training and drills. Implementing continuous training and emergency drills for crew members to ensure readiness, competence and enhance situational awareness
  • Maintenance and safety protocols. Establishing stringent protocols for vessel maintenance and safety checks, with thorough documentation and compliance checks
  • Regulatory oversight. Advocating for robust regulatory frameworks that require adherence to safety standards and proactive risk management strategies
  • Cultural change. Fostering a safety-first culture within organizations that prioritize safety above operational pressures

We can see in these two recent incidents that, as with the case of the Herald of Free Enterprise, a systems approach enables a comprehensive understanding of the complexities involved in maritime operations, leading to better prevention measures and enhanced safety outcomes in the passenger vessel industry.

Other examples

Over the years, the NTSB has investigated numerous accidents involving passenger vessels. A few notable examples follow:

  • Estonia. Although this accident occurred in European waters, its implications affected international passenger shipping, including practices adopted in the United States. The Estonia sank in the Baltic Sea in 1994, resulting in the deaths of 852 people. The investigation revealed that the key issues were related to vessel design, including hull integrity and cargo securing. This incident led to enhanced safety regulations regarding passenger vessel construction and operational safety protocols.
  • Andrew J. McHugh. This collision involving the ferry Andrew J. McHugh and another vessel occurred in the narrow Houston Ship Channel, leading to the deaths of 17 passengers in 1980. The key factors included poor visibility, navigational errors, and inadequate communication between vessels. Subsequent recommendations from the NTSB aimed at improving navigational practices and vessel traffic control in critical areas.
  • Benson. The Benson, a tour boat in New York, capsized during a sudden storm. A total of 10 people died in this 2000 incident. The investigation pointed out questionable weather assessment practices and inadequate safety measures for handling sudden weather changes. The NTSB recommended better training for crew members regarding weather evaluation and emergency response.
  • Dawn Princess. A fire aboard this cruise ship in the South Pacific led to emergency evacuations in 2003. Although there were no fatalities, more than 150 passengers were affected. The fire was linked to flaws in electrical systems. The NTSB emphasized improved fire safety systems and crew training on firefighting and evacuation protocols.
  • Emotion. This fishing vessel capsized near Alaska in 2010, resulting in several fatalities. The investigation pointed out structural problems and issues with the vessel’s stability while loaded. Recommendations focused on vessel stability assessments and the importance of adherence to safety regulations during fishing operations.
  • Explorer. In 2007, the Explorer ran aground off the coast of the Antarctic Peninsula, leading to evacuations. All passengers were saved, but the incident raised alarms about navigational practices and inappropriate response to weather changes. The NTSB highlighted the need for enhanced navigational training and real-time communication.

For each of these incidents, a systems approach would involve comprehensive training programs for crew related to emergency preparedness, rigorous maintenance and operational checks, research and implementation of advanced technologies for navigation and safety, and collaboration among regulatory bodies to create uniform safety standards that encompass all aspects of vessel operation. These historical examples underscore the importance of a proactive stance on maritime safety, highlighting that every component of the system must work together to prevent accidents and improve safety outcomes in the passenger vessel industry.

A poor approach that fails to be proactive can significantly contribute to accidents such as these. When risks are not systematically identified and appreciated, several detrimental consequences can arise. Without a systematic approach to risk assessment, potential hazards may go unnoticed, increasing the likelihood of incidents. Vessels may not be adequately equipped to handle specific risks, such as extreme weather or equipment failures. There is a requirement for safety protocols, adequate training, and improvement of communications.

On the other hand, a reactive approach undermines effective communication within the organization and between vessels. Without established systems for reporting and discussing risks, lessons learned from previous incidents may be ignored.

The other factors are regulatory compliance lapses. In the absence of a proactive culture, vessels may not adhere to regulatory requirements consistently or may develop a compliance mindset that prioritizes minimum standards over comprehensive safety practices. Neglecting lessons learned from past incidents is another flaw. A failure to learn from past accidents can lead to repetitive mistakes. If organizations do not analyze historical incidents and implement changes based on those insights, they risk encountering similar situations again and again.

In the second part of this article, we will discuss the importance of using the Plan-Do-Check-Act cycle in embracing a safety management system.

To read Part 2 of the article – Click here

Note – The above article was recently published in an Exemplar Global publication – ‘The Auditor’

Click here to read the article.

What Is Risk-Based Thinking in ISO Standards?

Over the past two decades of working closely with clients in both the manufacturing and service sectors, I’ve witnessed firsthand the transformation that occurs when organizations stop treating compliance as a checklist exercise and start thinking in terms of risk and opportunity. With the 2015 revisions to many ISO standards, particularly ISO 9001, we saw a deliberate shift away from siloed “preventive actions” toward an integrated, strategic approach known as Risk-Based Thinking (RBT). 

This wasn’t just a semantic change. It marked a cultural evolution, an acknowledgment that uncertainty is inherent in every business process, and that success belongs to those who plan for it, not those who simply react to it. RBT has empowered organizations to navigate complexity with greater confidence, embedding foresight into their planning and decision-making at all levels. 

In this article, I’ll draw from real-world consulting experiences across diverse industries to demystify Risk-Based Thinking. We’ll explore what it really means, why it matters, how it supports proactive leadership, and what tools you can use to bring it to life within your own management system. Whether you’re guiding a mature enterprise or a fast-scaling startup, the principles of RBT are not only practical, but they’re also essential.

What Is Risk-Based Thinking (RBT)?

Risk-Based Thinking (RBT) is the proactive approach embedded in ISO standards like ISO 9001:2015, ISO 14001:2015, and ISO 45001:2018. Rather than treating risk as a separate component, RBT integrates it into every facet of an organization’s management system. This shift moves organizations from a reactive stance to a proactive culture, where potential issues are anticipated and addressed before they escalate. 

In my consulting journey, I’ve observed that organizations embracing RBT don’t just prevent problems, they identify opportunities for improvement and innovation. For instance, a manufacturing client leveraged RBT to streamline their supply chain, resulting in reduced lead times and increased customer satisfaction.

How Risk-Based Thinking Supports Proactive Decision-Making:

  • Identifying Potential Risks and Opportunities: By assessing both internal and external factors, organizations can foresee strategic and operational challenges and capitalize on opportunities. 
  • Integrating Risk Assessment into Planning: This ensures that objectives are achievable, and resources are allocated effectively. 
  • Enhancing Stakeholder Confidence: Demonstrating a proactive approach to risk management builds trust among customers, suppliers, and regulators.

A service industry client I worked with implemented RBT in their project management processes. This led to improved project delivery times and a significant reduction in unforeseen issues.

Key Objectives of Risk-Based Thinking:

The primary goals of RBT include: 

  • Enhancing Organizational Resilience: By anticipating potential disruptions, organizations can develop contingency plans. 
  • Promoting Continuous Improvement: Regular risk assessments lead to ongoing enhancements in processes and systems. 
  • Aligning Risk Management with Strategic Objectives: Ensuring that risk considerations are integral to achieving business goals. Read clause 6.1 connected to clause 4.1 and 4.1 per ISO harmonized structure. 
  • Fostering a Culture of Risk Awareness: Encouraging employees at all levels to consider risk in their daily activities. Clause 7.3 drives awareness to employees on how they can contribute to the system.

Practical Application of Risk-Based Thinking:

Implementing RBT involves: 

  1. Contextual Analysis: Understanding the organization’s internal and external environment. 
  2. Risk Identification: Recognizing potential events that could impact objectives. 
  3. Risk Assessment: Evaluating the likelihood and impact of identified risks. 
  4. Risk Treatment: Determining appropriate actions to mitigate or capitalize on risks. 
  5. Monitoring and Review: Continuously tracking risk factors and adjusting strategies accordingly.

Comparison: Preventive Action (Old) vs. RBT (New):

Previously, ISO standards emphasized preventive actions as separate clauses. However, this often led to a checkbox mentality, where organizations implemented measures without truly integrating them into their processes. 

With RBT: 

  • Integration: Risk considerations are embedded throughout the management system. 
  • Proactivity: Organizations anticipate and address potential issues before they occur. 
  • Flexibility: RBT allows for tailored approaches based on the organization’s specific context. 

This evolution encourages a more dynamic and effective approach to risk management. 

Tools & Techniques to Support Risk-Based Thinking:

1. SWOT Analysis (Strengths, Weaknesses, Opportunities, Threats) 

Use: SWOT analysis helps organizations evaluate their internal strengths and weaknesses, alongside external opportunities and threats. It’s particularly useful during strategic planning sessions or when entering new markets or launching new products. 

When to Use: Early in the business planning process or during the review of the organization’s context. 

Clause Alignment: ISO 9001:2015 – Clause 4.1 (Understanding the organization and its context) and Clause 6.1 (Actions to address risks and opportunities). This tool ensures that strategy and quality objectives are grounded in a realistic assessment of the internal and external environment. 

2. Failure Mode and Effects Analysis (FMEA) 

Use: FMEA systematically evaluates potential failure points in a product, process, or system and ranks them by severity, occurrence, and detection. It’s widely used in manufacturing, healthcare, and aerospace sectors. 

When to Use: During product design, process development, or when implementing changes that could introduce new risks. 

Clause Alignment: ISO 9001:2015 – Clause 8.3 (Design and development of products and services) and Clause 6.1 and 8.1. It supports risk-based planning and preventive strategies by analyzing “what could go wrong” and mitigating those risks before implementation. 

3. Risk Registers 

Use: A risk register is a living document that captures identified risks, assesses their likelihood and impact, and outlines mitigation actions and responsible parties. It provides transparency and traceability for risk management activities. 

When to Use: Continuously throughout project lifecycles or operational management, especially in industries like construction, logistics, or IT. 

Clause Alignment: ISO 9001:2015 – Clause 6.1 and Clause 9.1 (Monitoring, measurement, analysis and evaluation). It helps document ongoing risk review processes and links actions to strategic and operational plans. While not a requirement it is beneficial. 

4. Root Cause Analysis (RCA) 

Use: RCA investigates underlying causes of nonconformities, defects, or failures to prevent recurrence rather than just treating symptoms. It’s a staple in corrective action processes. 

When to Use: After incidents, near misses, or nonconformities—often triggered by audit findings or customer complaints. 

Clause Alignment: ISO 9001:2015 – Clause 10.2 (Nonconformity and corrective action). It supports continual improvement by ensuring lessons are learned and corrective actions address the source of problems. 

5. ISO/IEC 31010 – Risk Assessment Techniques 

Use: This standard outlines a variety of risk assessment tools including brainstorming, checklists, fault tree analysis, and bowtie analysis. It offers structured approaches tailored to industry-specific needs. 

When to Use: Depending on organizational maturity, criticality of operations, or regulatory environment. 

Clause Alignment: Supports ISO 9001:2015 – Clause 6.1, as well as clauses in ISO 14001 and ISO 45001 related to risk and opportunity planning. This framework provides flexibility for choosing appropriate methods suited to specific organizational risks. 

These tools, when chosen and applied correctly, don’t just satisfy audit checklists, they cultivate a culture of resilience and foresight. Over the years, I’ve seen organizations evolve by not just using these techniques mechanically, but integrating them into daily decision-making, making risk-based thinking a true operational philosophy rather than a compliance exercise. 

Internal vs External Audits: What Every Business Owner Should Know

The Strategic Importance of Audits for Business Owners

Audits are more than compliance checks; they are strategic tools that provide insights into performance, risk, and improvement opportunities. Engaged business leaders use audit results to drive better decision-making and long-term success. When conducting well, they provide leadership insights into where they may have to re-prioritize or allocate resources, where policies may be in conflict, what may be working well and where the system needs their leadership intervention.

What Are Internal and External Audits?

Internal Audits: Performed by or for the organization to check its own processes. These may be process audits or full system audits.

External Audits: These could be supplier audits (second party) or certification regulatory audits (third party). Third party audits are conducted by a third-party or certification body to verify compliance with standards.

Internal and external audits differ in breadth and depth of the audit based on scope and objective.

Why External Audits Should Be Taken Seriously?

External audits affect certification, reputation, and client confidence. A successful external audit demonstrates credibility and reliability.

Tip: Be prepared, be honest, and see auditors as partners in your improvement journey.

How to Prepare for Both Audits?

  • Keep documentation current
  • Review and close previous findings
  • Train staff on audit processes
  • Conduct mock audits
  • Engage leadership in the audit process

Conclusion:

ISO audit and their findings are not to be feared. They are valuable tools for identifying weaknesses and driving continuous improvement. With the right mindset and preparation, audits can move beyond mere compliance and become a core part of your strategic growth. Organizations that stay audit-ready show that they are not only compliant but also committed to excellence.

Top 10 Common ISO Audit Findings and How to Avoid Them

Importance of Being Audit-Ready:

Audits serve a critical role in verifying that an organization’s processes are aligned with established standards and functioning as intended. Far from being a punitive exercise, audits offer valuable insight into the strengths and weaknesses of a management system.

In my three decades of working with organizations across industries, one universal truth remains. An audit is not a surprise inspection, it’s a mirror. It reflects your organization’s systems, leadership engagement, and cultural commitment to quality and improvement. 

However, many organizations approach audits reactively, preparing only when one is imminent. This mindset often leads to unnecessary stress, inefficiencies, and missed opportunities for improvement. Being audit-ready means that compliance and performance monitoring are built into everyday operations, not treated as one-time events.

When an organization maintains a state of readiness, it reflects a culture of discipline, transparency, and continual improvement. Employees are aware of their responsibilities and of their processes, documentation is up-to-date, and leadership is engaged in the oversight of the system. This proactive approach not only supports successful audit outcomes but also enhances organizational resilience, stakeholder trust, and long-term sustainability.

Understanding ISO Audit Findings: What They Are and Why They Matter:

ISO audit findings are the documented results of an audit. Specifically, they identify areas where an organization’s management system either conforms to or deviates from the requirements of the ISO standard being audited. Findings can range from conformities, to observations (areas for potential improvement), to nonconformities, which indicate a failure to meet a specific requirement.

Audit findings are like diagnostic tools. Much like a physician’s report, they highlight where systems are healthy and where they need attention. Nonconformities, in particular, require careful attention. They are typically classified as minor or major. Left unaddressed, even minor nonconformities can escalate and lead to reputational damage, customer dissatisfaction, or even loss of certification.
In essence, audit findings are not setbacks, they are stepping stones toward improvement.

1. Poor Document Control

Uncontrolled, outdated, or missing documents can quickly lead to findings. Document control is critical for ensuring staff use the correct and current information. Organization can avoid this ISO Audit finding by implementing version control, limiting access to documentation, voiding printed copies of documentation, training employees on document management and regularly reviewing and updating procedures

2. Incomplete or Missing Records

Auditors expect to see evidence that procedures are being followed. If records are absent, it creates doubt about system effectiveness. Was the work really done? Further incomplete records are not able to evidence if the process step was followed as required by the procedure.

Organization can avoid this ISO Audit finding by automating recordkeeping, performing regular record audits, employee awareness and assigning clear ownership for maintaining records

3. Lack of Management Review

Without regular management reviews, there’s no top-level oversight of the system’s performance and alignment with strategic goals. Clause 9.3 of the ISO standards requires these reviews to be done at planned intervals. In some cases the organization may evidence the inputs provided to management but the outputs (decisions and actions) fail to get recorded.

Organizations can avoid this ISO Audit finding by scheduling periodic reviews, using metrics to guide discussions, making sure the leadership participates and documenting decisions and follow-up actions.

4. Ineffective Internal Audits

Weak internal audits fail to uncover problems and leave issues for external auditors to find. This could be caused by  poorly trained and qualified auditors, poor audit planning, using ‘canned’ checklists and a fear of audits and non-conformities causing personnel to hide issues.

Organizations can avoid this ISO Audit finding by training auditors from recognized training providers like QMII, auditing processes and not just documents, closing out internal audit findings promptly.

5. Unclear Roles and Responsibilities

When staff are unsure of their responsibilities, process gaps and accountability issues arise. In companies I have worked with there sometimes arises a confusion from where it is not clear which operator will conduct the task since all have the same job descriptions. 

Organizations can avoid this ISO Audit finding by defining roles and responsibilities in a RACI matrix or in the documented procedure, communicating changes clearly and verifying understanding during onboarding and training.

6. Non-Conformance Not Properly Addressed

Failure to analyze root causes or verify corrective actions can lead to repeat findings. A common cause of this may be a poorly written non-conformity as also a lack of structured root cause analysis training.

Organizations can avoid this ISO Audit finding by following a structured corrective action process, using tools like 5 Whys or Fishbone diagrams and reviewing the effectiveness of corrections

7. Lack of Risk-Based Thinking

ISO standards expect organizations to identify and manage risks proactively. Many still rely too heavily on reactive approaches. In some cases, risks are known, but are not passed up the chain because no structure exists for this to occur. Organizations can avoid this ISO Audit finding by including risk assessments in the planning phase, training staff on risk identification and maintaining a risk register that is updated on a regular basis. 

8. Insufficient Training or Competence

Staff who aren’t trained properly or lack required skills pose a compliance risk. Organizations can avoid this ISO Audit finding by developing and using a skills matrix, providing refresher training, linking training to performance reviews. Once the training is complete organizations must have a process to verify that training resulted in competence. 

9. Failure to Meet Customer or Regulatory Requirements

Not understanding or failing to meet these requirements can lead to major nonconformities. This occurs when organizations do not have a robust process for determining new requirements that may impact them and planning ahead to mitigate the risks. 

Organizations can avoid this ISO Audit finding by reviewing customer contracts and regulations, staying updated on evolving regulations, conducting compliance checks and keeping requirements visible to relevant teams.

10. Lack of Continual Improvement Evidence

Without records of improvement, your ISO system can appear stagnant and ineffective. Organizations can demonstrate to auditors that they meet the intent of continual improvement by trending and tracking KPIs, logging and reviewing improvement initiatives and recognizing and rewarding improvements

How to Retain Auditor Training Knowledge When You Can’t Apply It Immediately 

Completing an auditor training course is an exciting milestone. You walk away with frameworks, methodologies to create checklists, audit question techniques, and—if you’re like most professionals—a head buzzing with new knowledge. Ideally, you’d jump right into an audit and apply your skills, reinforcing what you’ve learned while it’s still fresh. But what if that opportunity doesn’t come right away? 

At QMII, we recognize this common challenge among our alumni. Let’s explore effective strategies to bridge the gap between training and practice—so that knowledge doesn’t fade but instead becomes a solid foundation for your future audit work. 

1. Simulate Real-World Scenarios 

Action: Design mock audits for yourself or with peers. 

Even without access to an organization’s system, you can simulate an audit process by reviewing publicly available quality manuals, environmental reports, or sample procedures including your own. Pretend you’re preparing for an audit: write an audit plan, create checklists, additional documentation you would request and practice conducting document reviews. 

Tip: Use scenarios from your training or past experience and ask yourself: 

  • What would I ask as an auditor? 
  • What evidence would I seek? 
  • What risks could be present? 

2. Start a Learning Journal 

Action: Reflect on key concepts, standards clauses, and audit techniques by writing them down in your own words. 

Journaling isn’t just for reflection, it’s a brain-anchoring technique. When you write out what you remember and how you would apply it, you’re reinforcing neural pathways tied to that knowledge. 

Include: 

  • Summaries of ISO clause requirements. 
  • How you would handle nonconformities. 
  • Sample non-conformities within your organization and write down your assessment of them as also the effectiveness of corrective actions. 

3. Teach Others What You Learned 

Action: Participate in knowledge-sharing sessions. 

There’s no better way to solidify your understanding than teaching others. Reach out to other auditors in your organization and discuss applicability and interpretation of a clause. Participate and contribute to discussions on LinkedIn forums. Search the web for interpretation of clauses and see the differences as opined by various different personnel. 

Bonus: You’re also building your credibility and visibility as an auditor. 

4. Stay Active in the QMII Alumni Network 

Action: Engage with blog articles, LinkedIn posts, ask questions, and share insights. 

QMII’s alumni network offers a treasure trove of experience. Staying engaged keeps you in the loop on best practices and might even lead to mentoring or shadowing opportunities. React to blogs written by QMII, contribute articles for QMII blog, comment on QMII posts and connect to QMII alumni. 

Don’t hesitate to: 

  • Ask others how they’re maintaining their skills. 
  • Request mock audit partnerships. 
  • Share resources and templates you’ve created. 

5. Continue the Learning Loop 

Action: Sign up for webinars, read audit case studies, and revisit your course materials regularly. 

Audit skills are built not just on knowledge, but on judgment, observation, and communication. You can sharpen these even while waiting for your first official audit assignment. 

Suggested activities

  • Attend QMII webinars or ISO updates. 
  • Subscribe to quality-focused newsletters. 
  • Read ISO audit case studies and identify what went wrong—and why. 

6. Request to Observe Internal Audits 

Action: If you’re part of an organization, ask to shadow an experienced auditor. 

Even if you’re not leading, observing an audit helps you internalize the structure, flow, and behavioral nuances of auditing. Jot down observations on auditor behavior, techniques, and interaction styles. Create your own checklists and then compare it to that prepared by the lead auditor. Discuss the differences after the audit. 

If your organization doesn’t have an active program, this is a great opportunity to propose starting one—a value-added initiative from a proactive auditor-in-training. 

Final Thoughts: Don’t Let the Gap Become a Gully 

Skills fade when left idle, but they flourish with even light engagement. Whether it’s through simulation, teaching, journaling, or community interaction, there are numerous ways to keep your audit knowledge sharp and ready. 

At QMII, we believe that continual improvement isn’t just for organizations, it’s a personal practice. Stay connected, stay curious, and keep that audit mindset active until your next assignment arrives. 

Have your own tips for retaining training knowledge? 
Join the conversation by commenting on this blog or drop us a line—we’d love to feature your story! 

ISO 13485: QMS Requirements of Medical Devices for Regulatory Purposes

by Dr. IJ Arora

ISO 13485:2016 is a standard that addresses quality management system requirements for those within the medical device industry. It is based on the systems-based approach found in ISO 9001:2015, but because it emphasizes requirements for regulatory purposes, it does not align with ISO’s harmonized structure (HS). In many ways, ISO 13485 does align with the HS, particularly in the structure and foundational principles of quality management.

The introduction of ISO 13485 explicitly states that the standard is aligned with ISO 9001, and this connection is important for understanding how the two standards relate to each other. I am a bit surprised as to why ISO 13485 isn’t fully harmonized with the HS as defined in Annex SL, which is the specific document within ISO standards that outlines the HS. I believe that if this standard were aligned to the HS, it would make implementation much less laborious for all involved.

The ISO 9001 foundation

The 2015 version of ISO 9001, which is presently under revision, provides a good basis for all standards. As mentioned, ISO 13485 has its roots in ISO 9001, which is why the key QMS principles (e.g., customer focus, leadership, process approach, continual improvement, and evidence-based decision making) central to ISO 9001 are also embedded in ISO 13485.

ISO 13485 includes several core concepts and clauses from ISO 9001. Clause 4 on quality management systems (e.g., structure, documentation requirements, and the scope of the QMS); cause 5 on management responsibility (e.g., top management involvement, resource allocation, and internal audits); and clause 8 relating to measurement, analysis, and improvement (e.g., monitoring, corrective actions, and continual improvement), are just some of these examples.

As I study, teach, consult, and audit using ISO 13485, I wonder why the standard Is not fully harmonized with similar standards as laid out in Annex SL. In consulting, I feel the pain of organizations that must meet regulatory requirements and so tend to overlook the process-based management system (PBMS) approach as the fundamental to the plan-do-check-act (PDCA) cycle. This regulatory focus is one reason why, although ISO 13485 shares many similarities with ISO 9001, it is not fully aligned with the HS. ISO 13485 places a strong emphasis on compliance with regulatory requirements specific to the medical device industry. The standard’s clauses addressing design and development, post-market surveillance, risk management, and traceability requirements are all far more extensive than those found in ISO 9001. Annex SL focuses more on general management practices and less on industry-specific regulatory controls. The detail and specificity required for medical device safety and compliance often necessitates a structure that goes beyond the framework of the HS.

Overcoming differences

Different scopes and audiences are also a consideration in that, while ISO 9001 is a general quality management standard applicable across industries, ISO 13485 is designed specifically for organizations that manufacture medical devices. These organizations must meet stringent regulatory requirements that go beyond what ISO 9001 addresses. Because of this, ISO 13485 requires more detailed processes related to product lifecycle management, post-market activities, risk management, and regulatory controls, which aren’t adequately covered under the more generalized HS. ISO 13485 includes a much stronger emphasis on managing the product’s entire lifecycle, from design and development to post-market activities (e.g., complaint handling and vigilance). Although ISO 9001 mentions product realization, ISO 13485 goes into much greater depth, including extensive requirements for design control and risk management. These elements reflect the higher level of scrutiny needed in the medical device industry, where safety and compliance are paramount.

With that said, I believe that these differences don’t prevent ISO 13485 from being organized according to the HS format. The standard would not only help medical device manufacturers’ management systems conform with specific regulatory requirements but also meet the obligations for continual improvement. After all, registered organizations in the aerospace and automobile industries already do just that via sector-specific management system standards that are harmonized with ISO 9001.

The structural differences in the clauses found in ISO 13485 and the standards adopting the HS are not too far apart. Although ISO 13485 is aligned with ISO 9001, it diverges when it comes to specifics that are unique to the medical device sector and regulatory requirements.

ISO 13485’s clause 7, “Product Realization” includes additional elements, such as design controls and regulatory compliance requirements, that are critical in the medical device industry. Post-market surveillance and complaint handling are central to ISO 13485, but the HS doesn’t go to the level of detail necessary for medical device manufacturers.

ISO 13485 emphasizes the need for continuous monitoring of device performance, even after they are on the market, ensuring any issues are identified and addressed in a timely manner. I believe ISO 9001’s subclause 9.1.2, “Customer Feedback,” can be updated to incorporate this requirement.

Risk management is a vital consideration. ISO 13485 integrates risk management into the standard in a way that is far more structured and pervasive than what is found in ISO 9001. ISO 13485 has a more detailed approach to identifying, assessing, and mitigating risks throughout the lifecycle of medical devices. However, these added requirements could be added to subclause 6.1.1 (““Actions to Address Risks and Opportunities”) or subclause 8.1.1 (“Operation Planning and Control”) found in the HS.

ISO 13485 includes specific requirements for design and development processes, which are critical in medical devices due to their complexity and potential risk to patient safety. The HS doesn’t provide this level of detail for other types of products or industries.

Identifying similarities

Notwithstanding the differences between ISO 13485 and the standards that align with the HS, there are also some key similarities. As with ISO 9001, ISO 13485 is built around seven quality management principles: customer focus, leadership, engagement of people, process approach, improvement, evidence-based decision making, and relationship management. Continual Improvement of the quality management system is part of both standards, emphasizing the need for a strong focus on monitoring, auditing, corrective actions, and reviews. Document control is another similarity. Both ISO 13485 and ISO 9001 stress the importance of clear and accurate documentation to ensure that quality management processes are defined, monitored, and maintained effectively.

In keeping itself separate from the HS, ISO 13485’s clause structure, despite being based on ISO 9001, serves to meet the unique needs of the medical device industry. The decision not to fully harmonize the standard with the structure seen in Annex SL likely stems from the need to ensure a tailored regulatory focus. ISO 13485 is aligned with a variety of regulatory frameworks across different countries and regions (e.g., FDA, EU MDR, TGA, etc.). These regulations require specific processes that go beyond the generic, high-level harmonized framework provided by Annex SL to facilitate combined/ integrated management systems. The structure of ISO 13485 allows for a more detailed, industry-specific approach to product safety, efficacy, risk management, and compliance. Product lifecycle control is an essential part of the medical device industry, and it has a complex lifecycle that includes design controls, manufacturing processes, and post-market activities that require more attention than the HS would provide.

Looking at a few additional clauses reveals that ISO 13485 follows a specific structure that allows it to emphasize the unique aspects of medical device quality management while maintaining consistency with other ISO standards.

For example, Clause 1, “Scope,” is relatively straightforward and outlines the scope of the standard, which is specific to organizations that design, manufacture, and maintain medical devices. The clause also highlights exclusions (for example, aspects not applicable to the organization), which is quite typical in a quality management standard.

Clause 2, “Normative References,” lists the documents referenced within ISO 13485, which is typical for any ISO management system standard. The important point here is that ISO 13485 requires compliance with relevant regulations and standards, particularly those in the medical device sector.

Clause 3, “Terms and Definitions,” is crucial because the terminology in the medical device industry can be very specifically. Definitions clarify terms that might have different meanings in other industries (e.g., what qualifies as a “medical device,” “design verification,” or “post-market surveillance”). This ensures uniformity and understanding across the industry.

Clause 4, “Quality Management System (QMS),” describes the basic requirements for establishing and maintaining a QMS, which is a fundamental aspect of ISO 13485. This clause outlines the need for a quality policy, the establishment of objectives, and the requirement to continually improve the QMS. These are common in all ISO standards but are tailored here to fit the needs of the medical device industry.

Clause 5, “Management Responsibility,” covers executive involvement as a key theme. In ISO 13485, it emphasizes top management’s responsibility for ensuring that quality objectives are met. This clause also requires that management provide resources for quality activities and review the performance of the QMS regularly, ensuring alignment with regulatory requirements and customer needs.

Clause 6, “Resource Management,” could have been aligned to clause 7, “Support,” found in the HS. This clause in ISO 13485 requires the organization to manage resources effectively, which includes personnel training and competence (a critical area in the medical device industry). This ensures that employees have the skills needed to produce safe and effective devices. It also covers infrastructure and the control of the work environment, ensuring that conditions are suitable for maintaining product quality.

Clause 7, “Product Realization,” diverges further from the HS. Product realization in the medical device sector involves the entire lifecycle of the device—from planning, design, development, and manufacturing to service and post-market activities. This clause is extensive and includes requirements for design controls, risk management, validation, and traceability, all of which are critical in the medical device industry. The detailed focus on design and development, verification and validation, and product monitoring ensures that all aspects of a medical device’s journey, from conception to post-market surveillance, are covered.

Clause 8, “Measurement, Analysis, and Improvement,” requires organizations to evaluate the effectiveness of their QMS through regular monitoring, measurement, and audits. It also focuses on corrective and preventive actions (CAPA) to improve quality. Preventive action in the HS has not been thrown out like the proverbial baby with the bath water. It has instead been replaced by requirement to appreciate risk. For medical devices, complaints and nonconformance reporting are key to ensuring ongoing safety and compliance. ISO 13485 could also have gone from preventive action to risk.

Post-market surveillance and vigilance is a requirement of the medical device standard. Unlike many other ISO standards, ISO 13485 places significant emphasis on post-market surveillance, which is the process of monitoring the performance of medical devices once they are in use. This is a major distinguishing factor from other ISO standards. Manufacturers are required to establish processes for post-market feedback, complaint handling, and field safety corrective actions (FSCA), which are essential for identifying and managing risks after the product is on the market.

In conclusion, I would opine and agree that although ISO 13485 is indeed based on ISO 9001, it diverges from the HS identified in Annex SL because the unique needs of the medical device industry—such as regulatory compliance, product lifecycle management, and patient safety—require a more detailed and specialized approach than the HS can provide. The clause structure of ISO 13485 reflects these specific requirements, making it a robust and industry-specific standard that ensures the safety and quality of medical devices while maintaining alignment with the foundational principles of quality management in ISO 9001.

This balance of maintaining core quality principles while addressing the needs of the medical device industry is why ISO 13485 has not fully adopted the HS but instead continues to incorporate elements of ISO 9001 alongside medical-device-specific regulatory needs. That it could still at the least attempt to align the primary clauses as risk to the HS would help all parties involved.

Note – The above article was recently featured in Exemplar Global’s publication called “The Auditor”. Click here to read it.

One-Off or Systemic: The Search for Root Causes

by Julius DeSilva

Accidents and failures, whether in maritime, aviation, healthcare, or nuclear settings, are often subjected to intense scrutiny to determine their root causes. However, the challenge lies in distinguishing whether an event is an anomaly or a symptom of a deeper systemic issue. This analysis is crucial as it directly influences the actions taken to prevent a recurrence or occurrence elsewhere. A management system approach, such as those outlined in ISO 45001 for occupational health and safety, ISO 9001 for quality management, or ISO 14001 for environmental management, provides a structured framework for systematically and proactively addressing risks when data exists.

Analysis of root causes: systemic failures

Root cause analysis is a fundamental investigative tool used to trace an incident to its origins. However, many organizations focus on immediate, apparent causes rather than examining systemic contributors and true root causes. Systemic failures result from weaknesses in policies, processes, or culture, and therefore, often recur in different forms over time.

The management system approach advocated by ISO standards and other industry-specific standards like the ISM code emphasize continual improvement and risk-based thinking. The intent of these standards is to reduce the probability of systemic failures by integrating safety, quality, efficiency, security, and environmental management into everyday operations.

Systemic failure example: Chernobyl

I recently read the book Midnight in Chernobyl, which outlined the 1986 Chernobyl nuclear disaster and the underlying systemic failures that contributed to this incident. Unlike isolated accidents, Chernobyl resulted from a combination of design flaws, operational errors, and a deficient safety culture. Key systemic issues included:

  • Design flaws. The RBMK reactor used in Chernobyl had an inherent positive void coefficient, meaning an increase in steam production could accelerate the reaction uncontrollably.
  • Operational failures. A safety test was conducted under unsafe conditions, including a reduced power level and disengaged emergency shutdown mechanisms.
  • Cultural and regulatory gaps. A lack of safety culture, insufficient training (and thus competency), and an authoritarian management style amounting to complacency discouraged questioning of unsafe practices.

These root causes culminated in an explosion that released massive amounts of radioactive material. European countries are so tightly packed that winds freely spread the outfall without borders. The systemic nature of the disaster was later addressed through international nuclear safety reforms, including the establishment of the International Atomic Energy Agency’s safety standards and stricter ISO frameworks such as ISO 19443, which outlines quality management system requirements for organizations working within the nuclear sector.

Other systemic failures

Deepwater Horizon oil spill (2010)

Another example of a systemic failure is the Deepwater Horizon oil spill. This incident was not merely the result of a single mistake but a consequence of systemic lapses in safety practices, regulatory oversight, and risk management. Contributing factors included:

  • Cultural deficiencies. The organization prioritized cost cutting over risk mitigation
  • Inadequate risk assessments. There was poor well-integrity testing and misinterpretation of pressure data.
  • Regulatory weaknesses. There was insufficient government oversight and a lack of stringent industrywide safety protocols.

This catastrophe led to significant regulatory changes, including the implementation of stricter safety and environmental policies within the oil and gas industry, aligned with ISO 45001 and ISO 14001.

The Boeing 737 MAX crashes (2018, 2019)

The Boeing 737 MAX crashes further illustrate systemic failure. Investigations revealed that flaws in the aircraft’s Maneuvering Characteristics Augmentation System (MCAS) were not adequately addressed due to:

  • Design and engineering oversights. Critical safety features were made optional rather than standard.
  • Regulatory gaps. The FAA relied excessively on Boeing’s self-certification.
  • Organizational pressures. The corporate culture emphasized speed-to-market delivery over comprehensive safety testing.

This resulted in significant regulatory reforms, including tighter oversight and compliance with international aviation safety standards.

Fixes vs. systemic longer-term improvement

Addressing failures can be approached through quick fixes or long-term systemic improvements. Each approach has its advantages and disadvantages:

Quick fixes

Pros:

  • Immediate resolution of pressing issues
  • Cost-effective in the short term
  • Prevents further damage or loss

Cons:

  • Does not address underlying systemic issues
  • Can lead to recurring problems if not supplemented with deeper analysis
  • Often reactive rather than proactive

Systemic longer-term improvements

Pros:

  • Addresses root causes, reducing the likelihood of recurrence
  • Enhances organizational resilience and safety culture
  • Aligns with ISO management systems, ensuring continuous improvement

Cons:

  • Requires significant time and resources
  • May face resistance from stakeholders due to cultural inertia
  • Implementation complexity can slow down immediate corrective actions

A balanced approach is often necessary—implementing short-term fixes to mitigate immediate risks while developing long-term systemic improvements to ensure sustainable safety and risk management practices.

What if we cannot foresee all risks?

Even with rigorous management systems and risk assessments, not all risks can be predicted. Organizations must be prepared to address unforeseen risks through:

  • Resilient systems. It is important to develop adaptable and robust safety management frameworks that can respond effectively to new threats.
  • Proactive learning. The organization can encourage a culture of continuous learning and scenario planning to anticipate emerging risks.
  • Redundancies and safeguards. Implementing fail-fail safe redundancies and contingency plans can mitigate the effects of unforeseen events.
  • Stakeholder collaboration. Engaging industry experts, regulators, and other stakeholders to share knowledge can help improve collective risk awareness.

Despite the lessons from Chernobyl, 25 years later the Fukushima disaster occurred. An earthquake of this magnitude was not foreseen as a risk even though in 1896 (as highlighted by an engineer on the project) an earthquake of magnitude 8.5 hit near the coast where the reactor was to be built. After Chernobyl, the 1970s-built reactor in Fukushima was not upgraded with the latest safety features due to high costs. Japan’s nuclear industry had a history of regulatory complacency and reluctance to accept international recommendations

ISO 31000, which addresses risk management, emphasizes the importance of resilience and adaptability in the face of unpredictable risks. By fostering a commitment to learning and preparedness across the organization, businesses can better navigate uncertainties while maintaining operational safety and efficiency.

The benefits of a management system approach

A management system approach, as defined by ISO standards, provides the following advantages:

  • Structured risk management. ISO 31000 ensures systematic identification, assessment, and mitigation of risks.
  • Continuous improvement. The Plan-Do-Check-Act (PDCA) cycle described in ISO 9001, ISO 45001, and ISO 14001 encourages learning from incidents to prevent recurrence.
  • Organizational culture change. Implementing ISO standards fosters a risk-oriented mindset, reducing the likelihood of systemic failures.

ISO management systems, when implemented and sustained, can act as a preventive tool to proactively manage risk.

Conclusion

Understanding whether an accident is an anomaly or a systemic failure is critical in determining the appropriate response. Sadly, at times industry must incur the cost of the nonconformity to learn the lesson. Organizational “can-do” attitudes lead to risk normalizations where dangerous conditions are seen as normal. Further, organizational and demographic cultures do not encourage challenging authority or questioning of decisions. Absence of accidents, incident reports, and near misses give a false sense of complacency that things are working well. This may lead to over-confidence in decision making, lapses in regulatory oversight, and deferring of resource allocation to other “priorities.”

Systemic failures indicate deeper vulnerabilities requiring long-term corrective actions. The application of ISO management systems offers a proactive and structured approach to accident prevention, ensuring that organizations move beyond reactive responses to fostering a culture of continuous improvement and risk management. By embracing these principles, industries can mitigate systemic risks, ensuring safer and more resilient operations.

Note – The above article was recently featured in Exemplar Global’s publication ‘The Auditor’. Click here to read.

The Baltimore Bridge Collapse—Another Case of a Failed Management System

By – Dr. IJ Arora

Can good management systems make organizations immune to disasters? The Baltimore bridge (or, more precisely, the Francis Scott Key Bridge) collapsed in 2023 because the container vessel MV Dali collided with it. This was a tragedy, perhaps caused by the failure of several management systems, the ship, the port, the state, and whoever else was involved.

The National Transportation Safety Board (NTSB) investigation is ongoing, and will no doubt look at the part played by MV Dali, its crew, and its operator. However, my thought is that MV Dali or other ships plying the waters should have, by simple statistical probability, been considered as risks by the authorities. Between the water channel, the high number of ships sailing in and out regularly, and the bridge itself, there was likely to be an collision someday. Perhaps it was not a matter of if, but when! Therefore, should the bridge have been better designed and made safer based on these known and appreciated risks? After all, not all accidents can be completely avoided, but each tragedy has lessons learned as responsive action. The lessons become the data that drives risk identification and trends, thus making the system proactive. I am sure the NTSB is considering all this. In the meantime, without going into the ongoing investigation, there would seem to be some basics which are common indications of systemic failures. Be it the Titan submersible, or the Boeing management system,  as a subject-matter experts in  process-based management systems, I see a common cause: the failure of the system to  deliver conforming products and services.

In this short article, I want to discuss this bridge collapse in the context of the management system, considering ISO 9001:2015 generically and the requirements of ISO 55001:2024—“Asset management—Vocabulary, overview and principles” specifically. ISO 55001 was first published in 2014. It was developed as a standalone standard for asset management, building upon the principles of ISO 9001 and other relevant standards.

Could simply designing a good system based on the standard have enabled the organization to better assess the associated risks? Perhaps they were assessed, and a bridge allision was considered an extremely low-probability occurrence. If that were the case, the discussion would be on prioritization of risks.

As of the time of this writing (September 2024), the investigation into the Baltimore bridge collapse is still ongoing, and the lawsuits are starting to fly. Although the exact cause of the collapse remains under investigation, we can consider several factors that might have contributed to the incident. MV Dali experienced a series of electrical blackouts before the allision. The implementation of the vessel’s safety management system (SMS, based on the ISM Code) could be a factor. The stability, age, and condition of the bridge are, I am sure, being investigated as a potential contributing factor. Then, there is always human element. There may have been errors on the part of the ship’s crew or the bridge’s operators. Was the SMS designed to support them in such a scenario? What factors may have caused operators at all levels to perhaps not follow requirements and mitigate the risks? The NTSB’s investigation will highlight a detailed analysis of the ship’s navigation systems, the bridge’s structural integrity, and the actions of the individuals involved in this tragedy. Their final report will provide a comprehensive understanding of the incident and may include recommendations to prevent similar occurrences in the future.

However, even at this stage we can agree that bridges in general are national assets. They are valuable infrastructure that provides essential services to communities. Although it is not publicly known whether the state of Maryland specifically implemented ISO 55001 for its bridges, the principles and practices outlined in this standard could have been beneficial in managing the risks associated with the Baltimore bridge. Through the implementation of this standard (and/or ISO 9001), the authorities could have performed:

  • Risk assessments. ISO 55001 requires organizations to conduct regular risk assessments to identify potential threats and vulnerabilities. A thorough assessment of the bridge’s condition, age, and traffic load could have helped identify potential risks and inform maintenance and repair decisions, as could have changes in procedures, protection of navigation channels, and so on.
  • Lifecycle management. The standard emphasizes the importance of managing assets throughout their entire lifecycle, from planning and acquisition to maintenance and disposal. By following ISO 55001, the state could have developed a comprehensive plan for the bridge’s maintenance, upgrades, and eventual replacement.
  • Performance measurements. ISO 55001 requires organizations to establish measurable objectives or key performance indicators (KPIs) to measure the effectiveness of their asset-management activities. This could have helped the state monitor the bridge’s condition and identify any signs of deterioration.
  • Continual improvement. The standard promotes a culture of continual improvement, encouraging organizations to learn from past experiences and make necessary adjustments to their asset-management practices.

It is impossible to say definitively whether ISO 55001 would have prevented the Baltimore bridge collapse. However, the principles and practices outlined in the standard could have helped to reduce the risk inherent in such incidents. By adopting a systematic and proactive approach to asset management, organizations can improve the reliability and safety of their infrastructure. A systematic study must go beyond what the MV Dali contributed to the Baltimore bridge collapse; it is also important to consider the broader context and the potential contributions of other factors:

  • Bridge design and maintenance. The age and condition of the bridge are likely to be factors in the investigation. Older infrastructure may be more susceptible to damage or failure, especially if it has not been adequately maintained or upgraded.
  • Vessel traffic. The frequency and intensity of vessel traffic in the area can also influence the risk of allisions. The bridge is in a busy shipping channel; therefore, the likelihood of incidents was higher.
  • Safety measures. The presence or absence of safety measures such as buoys, warning systems, or restricted areas can also affect the risk of allisions. This needs to be studied and are factors the authorities would know.
  • Human elements and factors. Errors on the part of both the ship’s crew and bridge operators can contribute to accidents. Factors such as fatigue, inexperience, or inadequate training may play a role. What led to these issues? Error proofing, mistake proofing, and failure mode and effects analysis (FMEA) are tools that could be part of the effective management system.

Let us therefore consider ISO 55001 and the relevant clauses of the standard which could apply to the collapse of the Baltimore bridge.

Clause 4—Context of the organization

  • Clause 4.1—Understanding the external context, such as the age of the bridge, traffic volume, and environmental factors, is crucial for risk assessment.
  • Clause 4.2—Identifying the needs and expectations of relevant interested parties, including the public, commuters, and regulatory bodies, is essential for effective asset management.

Clause 6—Planning

  • Clause 6.2.1—The bridge’s asset management plan should have included clear objectives for its maintenance, repair, and replacement.
  • Clause 6.2.2—Specific objectives related to safety, reliability, and cost-effectiveness should have been established.
  • Clause 6.2.3—Detailed planning for maintenance, inspections, and upgrades would have been necessary to ensure the bridge’s structural integrity.

Clause 7—Support

  • Clause 7.1—Adequate resources, including funding, personnel, and expertise, should have been allocated for bridge maintenance and inspection.
  • Clause 7.2—Ensuring that personnel involved in bridge management have the necessary competence and training is essential.
  • Clause 7.3—Raising awareness among all relevant stakeholders about the importance of bridge maintenance and safety is crucial.

Clause 8—Operation and maintenance

  • Clause 8.1—Regular inspections and monitoring of the bridge’s condition would have helped identify potential problems early on.
  • Clause 8.2—A well-defined maintenance schedule, including preventive and corrective maintenance, would have been necessary to address issues before they escalated.

Clause 9—Performance evaluation

  • Clause 9.1—Establishing KPIs to measure the bridge’s performance, such as safety records, traffic flow, and maintenance costs, would have provided valuable insights.
  • Clause 9.2—Regular monitoring and evaluation of these KPIs would have helped identify areas for improvement.

Clause 10—Improvement

  • Clause 10.2—The bridge’s management should have implemented a system for monitoring and measurement, including data collection and analysis.
  • Clause 10.3—Predictive maintenance techniques could have been used to identify potential failures before they occurred.

My objective in writing this article is help demonstrate that by applying the principles of a standard, be it generic ISO 9001 or a more specific standard (as in this case, the asset-management system standard ISO 55001) the organization (in this case the state of Maryland) could have strengthened its asset-management practices and potentially mitigated the risks associated with the Baltimore bridge collapse.

The above article was recently published in the Exemplar Global publication – ‘The Auditor’.