
System Safety Services

System Safety and System Reliability 

System safety is an engineering discipline separate from system reliability and maintainability (R&M). While R&M focuses on failure mitigation, safety focuses on hazard mitigation. These two outlooks do not necessarily coincide.

MIL-STD-882C defines safety as "freedom from those conditions that can cause death, injury, occupational illness, or damage to or loss of equipment or property, or damage to the environment". Similar definitions (e.g. IEEE STD-1228) emphasize elimination of hazards and accidents. Although in all these definitions the injury and damage are caused accidentally, they do not refer directly to system failures or system reliability. Failures can cause injury and damage, but system safety is more general in that it also covers conditions that are not necessarily the result of failure. Moreover, there are clear cases where increasing safety (through the elimination or mitigation of a hazard) decreases the reliability and maintainability of a system. Examples include such mundane systems as elevators with automated trip mechanisms that are triggered by a potentially hazardous condition (e.g. door malfunction) that does not necessarily affect the operational state. Under such circumstances the elevator is safely disabled but completely unreliable, as it is not functioning at all.

There are of course many systems and conditions where reliability and safety do align, where proper, straightforward functioning of a system (without failures) is enough for both reliable and safe operation. However, this is not the general case, and having a good reliability program does not necessarily mean that system safety is being effectively managed and satisfied.

Hazard Analysis
Preliminary Hazard Analysis (PHA)
System Hazard Analysis (SHA)
Subsystem Hazard Analysis (SSHA)
Event Tree Analysis (ETA)
Risk Assessment
Safety Management


Hazard Analysis                                                                                                                                     

There are some variations on the definition of the "hazard" concept, depending on whether it is considered an intrinsic property of an item or a set of conditions that involves both the item and the use environment. A definition that represents a wide spectrum of views describes a hazard as a "State or set of conditions of an item that, together with other conditions in the environment of the item, will lead to an accident". Here "item" may refer to a system, subsystem, or object, and variations can include qualifications such as "inevitably". Clearly, however, a hazard must be defined with respect to the environment in which the item exists and/or is operating. This definition also illuminates the fact that a hazard can exist without a system failure: it can be entirely due to a combination of operational system state and environmental conditions (e.g. landing in bad weather).

Hazard analysis should take place, iteratively, over the entire lifecycle of the system and typically will yield different types of results at the different stages.

Preliminary Hazard Analysis (PHA)                                                                                                   

PHA can start as early as concept exploration or at the very early design stage. The PHA identifies the critical system functions and broad system hazards. The results are used to include safety considerations in concept trade-off analyses and design alternative comparisons. Naturally, hazards that are related to implementation details and fall outside the critical system functions will not be identified at this stage. The results are qualitative, and risk assessment is usually not complete at this stage.

The PHA is evaluated and updated iteratively as the initial design steps are taken. The PHA also provides input to later stage analyses.

Despite limited detail, the PHA provides critical input at a critical time. A decision to skip the preliminary hazard analysis and wait for a time "when we know more about the details of the system" can lead to costly results as safety is not included in the concept and design tradeoff analyses. It can lead to:

  •  Significant late stage design/engineering modifications

  • Costly operational and maintenance requirements in order to mitigate hazards

  •  Failure at market due to cost, safety and liability issues

Although later-stage hazard analysis yields more detail on the hazards, the preliminary analysis offers information that is more likely to critically affect the success of the program in the long run.

SoHaR will work with your design team and requirement/design documents to assess, list, and prioritize hazards at the early stages of the program lifecycle. Our engineers are experienced at identifying, even at the early stages of design, potential hazards that may be overlooked by design engineers intent on delivering the best functionality. We will bring to the table a safety-centric outlook to complement your design efforts and ensure that your safety requirements are not left for last.

System Hazard Analysis (SHA)                                                                                                      

System Hazard Analysis (SHA) is commonly performed once the design is fleshed out, in parallel with the preliminary design review. The analysis is iteratively updated with the design. Whereas Preliminary Hazard Analysis focuses on critical functionality and broad system hazards, System Hazard Analysis focuses on the details. Specifically, we are interested in:

  • Overall system operation with attention to users, modes and varying environments

  • Interfaces between subsystems and their interdependent compliance with the overall safety requirements

  •  Whether design changes have affected safety

The results of the SHA are used to recommend changes, identify required controls, and evaluate how the design responds to safety requirements.

SHA requires attention to details of the design, knowledge of operational environments and system mode changes that can lead to unforeseen combinations of conditions. It requires extensive experience with "typical" hazard scenarios combined with detailed knowledge of the domain. SoHaR safety engineers will collaborate with your design and system engineers to ensure no hazards are overlooked and their causes and consequences are adequately identified.

SHA will often include quantitative assessment of hazard: probability of occurrence and severity of consequence. These are required for follow-up risk assessment.
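As a rough illustration of how probability and severity can be recorded per hazard and used to order a hazard list, consider the sketch below. It is not a SoHaR tool or a procedure taken from any standard: the category names, the numeric ranks, the `Hazard` class and the severity-first ordering in `prioritize` are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical ordinal scales; a real program would take these from its own
# safety plan or governing standard.
SEVERITY = {"catastrophic": 4, "critical": 3, "marginal": 2, "negligible": 1}
LIKELIHOOD = {"frequent": 5, "probable": 4, "occasional": 3, "remote": 2, "improbable": 1}


@dataclass
class Hazard:
    description: str
    severity: str    # key of SEVERITY
    likelihood: str  # key of LIKELIHOOD

    def rank(self):
        """Sort key: severity first, likelihood as tie-breaker (an assumption)."""
        return (SEVERITY[self.severity], LIKELIHOOD[self.likelihood])


def prioritize(hazards):
    """Return the hazards ordered from highest to lowest rank."""
    return sorted(hazards, key=Hazard.rank, reverse=True)


if __name__ == "__main__":
    # Severity and likelihood values below are invented for illustration only.
    log = [
        Hazard("elevator door malfunction", "marginal", "occasional"),
        Hazard("cargo door unlatched during emergency landing", "catastrophic", "remote"),
    ]
    for hazard in prioritize(log):
        print(hazard.severity, hazard.likelihood, hazard.description)
```

Ranking severity ahead of likelihood is only one possible policy; a program may equally use a lookup matrix that maps each severity and likelihood pair to a risk level.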

Subsystem Hazard Analysis (SSHA)                                                                                             

Subsystem Hazard Analysis (SSHA) is similar to SHA in its goals and methods; however, its scope is limited to subsystems as components. It is often initiated at a later stage, when details of the subsystems become available. At the subsystem level the focus is on failure modes that contribute to hazards, and the detailed interfaces between components are investigated for possible conditions leading to hazards. Here we investigate how each single component affects the safety of the entire system, while in the SHA we focus on the collaborative effects of components working together.

Hazards identified in the SHA and linked to specific conditions of subsystems are investigated, and their probabilities of occurrence are estimated based on such input as component reliability and human error. Quantifiable input is added as the specifics of the design emerge.

Subsystems may include a single "media-type" (electronics, software, mechanical) but are often integrated. Embedded software-hardware systems or electromechanical actuators are examples of mixed-media subsystems that require an integrated SSHA. Even when a subsystem belongs purely to one engineering field, it is still recommended that the SSHA be performed by safety engineers rather than design or system engineers. The goal of the analysis is to isolate the hazards and safety issues from the design and functional operation of the system. A design engineer with a strong view of the design of the subsystem will have difficulty looking away from mainline operation, as will a system engineer. It is the role of the safety engineer to provide the unique view that focuses on potential mishaps and hazardous conditions.

Event Tree Analysis                                                                                                                          

Event Tree Analysis (ETA) is a bottom-up technique for analyzing the various outcomes of initiating events. It is often used in conjunction with Fault Tree Analysis, which is a top-down method for analyzing and quantifying system failures. If a system is small or simple enough to allow for a complete Fault Tree Analysis, we may not have to analyze Event Trees. However, in most cases a system cannot be fully analyzed top-down, and the Event Tree allows us to section off parts of the system for the Fault Tree Analysis.

The Event Tree begins with an initiating event and searches forward for all possible outcomes branching out at nodes signifying:

  • possible conditions (e.g. windy conditions during a gas leak)

  • possible states (e.g. cargo door unlatched during emergency landing)

  • possible malfunction of mitigating factors (e.g. sprinklers in the case of fire)

 

The Event Tree allows us to estimate the probability of each outcome based on the relative probabilities of the branches leading up to that outcome.
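A minimal sketch of this calculation follows. It assumes a tree in which every node has two branches (the mitigating factor works or fails) and in which each failure probability already accounts for the path leading to that node. The node names and numbers in `NODES` and the function `outcome_probabilities` are hypothetical and serve only to show the arithmetic.

```python
from itertools import product

# Hypothetical branch nodes following an initiating event (e.g. a fire).
# Each value is the assumed probability that the mitigating factor FAILS.
NODES = [
    ("sprinkler", 0.05),
    ("alarm", 0.01),
]


def outcome_probabilities(nodes):
    """Enumerate every path through the tree and return {path: probability}.

    The probability of a path is the product of the branch probabilities
    along it; the probabilities of all paths sum to 1.
    """
    outcomes = {}
    for states in product(("works", "fails"), repeat=len(nodes)):
        prob = 1.0
        for (name, p_fail), state in zip(nodes, states):
            prob *= p_fail if state == "fails" else 1.0 - p_fail
        path = tuple(f"{name} {state}" for (name, _), state in zip(nodes, states))
        outcomes[path] = prob
    return outcomes


if __name__ == "__main__":
    for path, prob in outcome_probabilities(NODES).items():
        print(", ".join(path), f"{prob:.4f}")
```

Multiplying fixed per-node values is a simplification. Where the probability at one node depends on what happened at an earlier node, that node needs a separate conditional value for each incoming path.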

 Although the structure of an Event Tree is simple, there are several challenges in performing a useful Event Tree Analysis:

  • Maintaining a global view of the system as a whole and not only of its various functions. To this end it is preferable that the analysis be performed by a safety engineer rather than by a design or system engineer, who is preoccupied with correct function. Safety issues are often not related to a system malfunctioning but rather to a negative synergistic effect of system mode, system state, environmental conditions and user actions. A safety engineering view considers all these inputs and is aware of interfaces that may not be up to the challenge when certain conditions and states align.
     
  • One of the most difficult elements in the ETA is listing the correct initiating events and the correct branching nodes: whether these take place in a nuclear power plant or onboard a space shuttle, the progression of an event to a possible outcome has to be related to conditions we can interpret. As an example, consider microscopic crack formation. The Event Tree for this scenario should not (and cannot) track the physics of crack propagation to a disastrous accident. Rather, the Event Tree should model how our mitigation elements deal with the crack. For example: a branching node may correspond to the probability of the crack going unnoticed in routine maintenance when it is 1mm or shorter; a second node would correspond to the probability of it being noticed at 3mm; and so on. For these nodes we should have reasonably good estimates and interpretations.
     
  • Quantitative input: often the probabilities available for the node outcomes are approximate or based on broad assumptions. This uncertainty propagates from the nodes to the possible outcomes and should be taken into account in the resulting outcome probabilities.   
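One way to carry node uncertainty through to an outcome is simple Monte Carlo sampling, sketched below. This is an illustration under stated assumptions rather than a prescribed method: the ranges in `NODE_RANGES`, the choice of a uniform distribution, and the helper names `sample_both_fail` and `percentile` are all hypothetical.

```python
import random

# Hypothetical (low, high) bounds on each node's failure probability.
NODE_RANGES = {"sprinkler": (0.02, 0.10), "alarm": (0.005, 0.02)}


def sample_both_fail(ranges, trials=10_000, seed=0):
    """Sample the probability of the outcome in which every node fails.

    Each trial draws one failure probability per node, uniformly within its
    range, and multiplies them. Returns the sampled values sorted ascending.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    samples = []
    for _ in range(trials):
        prob = 1.0
        for low, high in ranges.values():
            prob *= rng.uniform(low, high)
        samples.append(prob)
    samples.sort()
    return samples


def percentile(sorted_samples, q):
    """Return the value at fraction q (0..1) of an ascending sample list."""
    index = min(len(sorted_samples) - 1, int(q * len(sorted_samples)))
    return sorted_samples[index]


if __name__ == "__main__":
    samples = sample_both_fail(NODE_RANGES)
    for q in (0.05, 0.50, 0.95):
        print(f"{int(q * 100)}th percentile: {percentile(samples, q):.6f}")
```

Reporting a range of percentiles instead of a single number makes it visible how much of the outcome probability rests on broad assumptions at the nodes.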

The Event Tree Analysis is very useful in identifying the areas where we should use Fault Trees or FMECAs to investigate further. Fault Trees are often used to evaluate the probabilities at specific nodes (e.g. probability of a sprinkler not functioning).

Event Trees bring to the surface the protection system features that are most crucial to eliminating risk, allowing us to take steps to reduce their failure probability. Although they have a simple form, they are a very powerful tool for an overall assessment of system and subsystem safety.

Risk Assessment                                                                                                                       

Risk assessment in the safety context directly relates to the hazard. Risk combines the hazard level with the likelihood of it leading to an accident (danger) and the duration of or exposure to the hazard.

The Event Tree Analysis can provide direct input to Risk Assessment.

The evaluation of risk almost always requires qualitative input and judgment. Often the probabilities involved in the three components are interdependent, and one cannot assume a simple product of independent probabilities. A meaningful risk assessment requires experience and a very good understanding of system usage. Risk assessment is complementary to the design effort: it requires looking at negative outcomes. Here too the perspective of a safety engineer is healthy, not only because of the specialized experience but also because of the ability to disconnect from mainline function and focus on negative outcomes, whether anticipated or unexpected.

Safety Management                                                                                                                    

In many industries, managing the safety aspects of the design, implementation and operation of a system or installation requires specialized tools that help ensure that no hazard, safety task or design concern is overlooked. Safety Management Systems (SMS), software applications developed to manage this effort, can assure management, customers, and regulatory organizations that the safety requirements of the program are successfully and continuously met.

In the aviation industry in particular, current guidelines for the establishment of SMS at airports include:

ICAO Annex 14: A systematic approach to managing safety, including the necessary organizational structures, accountabilities, policy and procedures.

FAA AC 150/5200-37: A formal, businesslike approach to managing safety risk. It includes systematic procedures, practices and policies for the management of safety.

The main goals of a safety management system are to:

  1. identify possible hazards with significant risk of an accident, injury or damage;
     
  2. select appropriate corrective action to eliminate this risk or to reduce it to acceptable levels;
     
  3. monitor the corrective action taken and test its effectiveness.

These goals can be managed through a single SMS application that is flexible enough to meet these requirements throughout the program lifecycle.

Establishing a formal reporting procedure within safety management is an important element: it allows the organization to monitor the level of safety performance achieved, to stay aware of possible threats and risks, and to take appropriate corrective action to minimize these risks.

Our FavoWeb FRACAS software tool provides the four essential requirements of a successful SMS:

1.         Collection and management of operational data

FavoWeb FRACAS functionality and configuration allow for uniform and easy data collection. Forms are flexible and allow customization of the fields and data to be collected and tracked. FavoWeb FRACAS is user friendly and, wherever possible, operates with "point-and-click" options, so that data collection is easy and therefore accurate. This also facilitates analysis and reporting, as "free-text" input is minimized.

2.         Analysis of the data


FavoWeb FRACAS offers numerous reliability and statistics trending reports combined with a flexible query mechanism.

3.         Risk Assessment
Our RAM Commander Safety Assessment Software Module, linked to the FavoWeb application, implements all qualitative and quantitative tasks for safety assessment required during system development:

  • Generation and verification of safety requirements;
     

  • Identification of all relevant failure conditions;
     

  •  Consideration of all significant combinations of failures causing failure conditions;
     

  • Generation of output reports beginning with Functional Hazard Analysis (FHA/PHA) and ending with System Safety Assessment (SSA), verifying that every aspect of the design meets safety requirements.

     

4.         Corrective Action Effectiveness Assessment 
The FavoWeb FRACAS Corrective Actions module provides full support for all corrective action activities.


 
 
 