
 

CRONER

INDUSTRIAL HEALTH AND SAFETY

Human Factors in Maintenance

Mike Everley

2655 Words

 

Recently, a leading industrialist was killed in an accident at his home only a few hours after meeting the Prime Minister and welcoming him to a project forming part of an urban renewal scheme. The industrialist was killed when sandpapering wooden surfaces on the roof of a stable block, at his home, ready for painting. Apparently he was taking advantage of the fine weather to carry out the maintenance work and was alone at the time of his tragic fall.

Although not strictly a workplace accident, the fatality clearly demonstrates some of the factors which make maintenance work such a high risk activity. It also illustrates the role which human factors play in bringing about such tragic events.

According to the Health and Safety Executive (HSE), major accidents resulting from human error in industrial maintenance are on the increase even though the general accident trend is downwards. It is also the case that maintenance error as a root or contributory cause of major accidents has increased. High profile examples include: Clapham Junction, Bhopal and Piper Alpha.

Dr Paul Davies, HSE's Chief Scientist and Head of the Hazardous Installations Directorate, stresses that: "Traditional approaches to safety have focussed on engineering and process risks, and sought hardware solutions to them. However, studies show that 'human factors' contribute to up to 80% of workplace accidents and incidents. HSE is actively tackling this area by developing its own human factors guidance and expertise, and applying it directly in its inspection and enforcement activities". Recent HSE guidance in this area includes the three publications discussed in this article: HSG 48, CRR 175 and Improving Maintenance.

Together, they form a major initiative in the attempt to reduce maintenance accidents. The key message is that human error in maintenance is largely predictable and can therefore be identified and managed.

Modelling Human Failure

What HSG 48 aims to provide is a model for predicting human failure. When such a model is combined with a model for predicting equipment failure, such as a Failure Mode and Effect Analysis (FMEA), it becomes possible to predict the majority of failures that can occur in the person-equipment interface. Such predictive analysis is useful both for designers of equipment and for the management of high risk activities such as maintenance.

The HSG 48 model treats human failure as arising from either errors or violations. Violations are either routine, situational or exceptional. Errors, on the other hand, are either skill-based errors or mistakes. Skill-based errors can be slips of action or lapses of memory, whereas mistakes can be categorised as rule-based or knowledge-based. The model therefore allows for four types of error and three types of violation. However, it should be remembered that further analysis may reveal sub-causes leading to slips, lapses or mistakes, such as distractions or a lack of full information. The model provides a starting place for a full analysis of predictable human failure rather than a complete and exhaustive outline.

A key distinction in the model is drawn between errors, which are actions or decisions that are not intended but which lead to a deviation from an accepted standard and to an undesired outcome, and violations, which are deliberate deviations from rules or procedures. The following key definitions are worth remembering when applying the HSG 48 model:

Slips: Failures in carrying out the actions of a task. For example, operating the wrong switch.

Lapses: Failing to carry out an action. For example, not operating the switch at all.

Rule-Based Mistakes: When behaviour is based on remembered rules or familiar procedures, we may use these rules or procedures even when they are not the most convenient or efficient.

Knowledge-Based Mistakes: In unfamiliar situations we have to develop plans and procedures from first principles or utilise analogies based upon past experience; this can lead to misdiagnoses and miscalculation.

Routine Violations: Breaking the rules or procedures becomes a normal way of working within the work group.

Situational Violations: Breaking the rules is due to pressures from the job, such as time pressure, lack of sufficient numbers of staff or the correct equipment for the job.

Exceptional Violations: The rules are rarely broken, except when something goes wrong.
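For anyone recording incidents against these categories, the seven definitions can be held in a small lookup structure. The following is only a minimal sketch of one possible encoding: the names TAXONOMY, classify and count_by_kind are hypothetical and do not come from HSG 48, and the sample records are purely illustrative.

```python
# Hypothetical encoding of the HSG 48 categories as described in the text.
# Each leaf category maps to (top-level kind, intermediate group).
TAXONOMY = {
    "slip": ("error", "skill-based error"),
    "lapse": ("error", "skill-based error"),
    "rule-based mistake": ("error", "mistake"),
    "knowledge-based mistake": ("error", "mistake"),
    "routine violation": ("violation", "violation"),
    "situational violation": ("violation", "violation"),
    "exceptional violation": ("violation", "violation"),
}


def classify(category: str) -> tuple[str, str]:
    """Return (kind, group) for a leaf category; raise on unknown names."""
    try:
        return TAXONOMY[category.strip().lower()]
    except KeyError:
        raise ValueError(f"unknown failure category: {category!r}") from None


def count_by_kind(categories: list[str]) -> dict[str, int]:
    """Tally incident records by top-level kind (error versus violation)."""
    counts = {"error": 0, "violation": 0}
    for category in categories:
        kind, _group = classify(category)
        counts[kind] += 1
    return counts


if __name__ == "__main__":
    # Illustrative records only, not real incident data.
    records = ["Slip", "routine violation", "lapse", "knowledge-based mistake"]
    print(count_by_kind(records))
```

Keeping the top-level kind alongside each category reflects the distinction the model draws between unintended errors and deliberate violations, which usually call for different controls.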

It is worth recalling that human factors involve the interface between the individual, the job that they are performing and the organisation in which the interface takes place, including its culture and its commitment to health and safety management. Only by tackling all three elements can major reductions in accidents and cases of ill-health be achieved.

Accident Proneness or Safety Critical Tasks?

CRR 175 makes the point that "most accident causation models are developed within a frame of reference determined by current events, culture and the level of technology of the time". Early research was coloured by the "nature versus nurture" debate over human nature and by the alarming increase in the number of accidents in British factories prior to and during the First World War. "Two important changes in British industrial practices were held responsible for the dramatic increase in accidents. The first was the increased pressure of work and the speeding up of machines, and the second was the drafting of younger and older workers and also women into the workplace, while 'able bodied men' joined the Services". The early research aimed to clarify whether the rise in accidents was due to the fact that the demands of the workplace were now beyond the capabilities of ordinary persons to meet, or whether the problem was limited to a few individuals who could not cope with the new demands due to personal factors.

Neither earlier nor more recent research has provided conclusive evidence for the notion of accident proneness. What has been shown, however, is that those carrying out certain high hazard tasks may be more likely to be involved in accidents. A study of the accident data collected at the Shell Oil Company's manufacturing unit in Texas between 1981 and 1986 revealed, for example, that 3.4% of employees accounted for 21.5% of the minor accidents. The job categories involved were operations, electrical crafts, process crafts, maintenance crafts and miscellaneous crafts.
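The scale of that concentration can be made concrete with a little arithmetic. The sketch below uses only the two percentages quoted above; the function names are hypothetical, and the second calculation assumes that the remaining 96.6% of employees accounted for the remaining 78.5% of minor accidents.

```python
def over_representation(share_of_people: float, share_of_accidents: float) -> float:
    """Ratio of a group's share of accidents to its share of the workforce."""
    if not 0 < share_of_people <= 1:
        raise ValueError("share_of_people must be a fraction in (0, 1]")
    if not 0 <= share_of_accidents <= 1:
        raise ValueError("share_of_accidents must be a fraction in [0, 1]")
    return share_of_accidents / share_of_people


def relative_rate(share_of_people: float, share_of_accidents: float) -> float:
    """Per-person accident rate of the group relative to everyone else."""
    if not 0 < share_of_people < 1:
        raise ValueError("share_of_people must be a fraction in (0, 1)")
    if not 0 <= share_of_accidents < 1:
        raise ValueError("share_of_accidents must be a fraction in [0, 1)")
    group_rate = share_of_accidents / share_of_people
    rest_rate = (1 - share_of_accidents) / (1 - share_of_people)
    return group_rate / rest_rate


if __name__ == "__main__":
    # Figures quoted in the text: 3.4% of employees, 21.5% of minor accidents.
    print(round(over_representation(0.034, 0.215), 1))
    print(round(relative_rate(0.034, 0.215), 1))
```

On these figures the group had roughly 6.3 times its proportional share of minor accidents, or roughly 7.8 times the per-person rate of the rest of the workforce. The calculation says nothing about why; it is equally consistent with exposure to more hazardous tasks as with any individual characteristic.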

Interestingly, the model for predicting accident liability due to individual differences, contained in CRR 175, suggests that those with an unstable extravert type of personality are more likely to commit routine, situational and exceptional violations, whereas those with an unstable introvert personality are more likely to commit slips, lapses and mistakes. These findings can therefore be used to underpin the model for predicting human failures contained in HSG 48.

Reducing Human Error in Maintenance

HSE's focus on human error during maintenance activities, in Improving Maintenance, comes from the recognition that maintenance is largely a human activity. Therefore, traditional safety strategies, based upon a safe place and a safe person approach, may prove to be less effective - particularly in the case of breakdown or emergency maintenance where a safe place of work cannot always be assured.

Although it may prove impossible to remove human error completely, good maintenance management should aim to control the likelihood of error and to limit the severity of the consequences resulting from such error. Often the likelihood of human error during maintenance is foreseeable at the design stage, and suitable controls to limit the likelihood of such error, or to limit the severity of the resulting consequences, can be introduced most cost-effectively at this stage.

In one such example, a component in an aircraft engine had the same thread at both ends and could therefore be replaced, during a maintenance operation, in the reverse direction. If this was done, the engine would leak fuel during flight. The component had an arrow stamped upon it showing the correct direction for assembly; however, this could easily be missed, for example under time pressure. The component was therefore redesigned with a different thread on each end. If this problem had been picked up at the design stage, perhaps through use of an FMEA or HAZOP technique, then considerable costs would have been saved.

As with HSG 48, the three factors identified by Improving Maintenance as affecting the performance of a maintenance activity are: individual factors, organisational factors and job factors.

Organisational factors have an impact on individual performance, particularly through the prevailing culture. Maintenance and production departments, for example, often have differing priorities and this can result in conflicts and time pressures with regard to the maintenance being carried out.

It is also important that the requirements of particular maintenance tasks are correctly assessed and that the competencies of the individual sent to carry out these tasks are carefully matched to these requirements. In this way, the requirement of Regulation 13 of the Management of Health and Safety at Work Regulations 1999, to take into account the capabilities of employees as regards health and safety when entrusting them with tasks, is satisfied. When carrying out this matching exercise, it is important to consider job, workplace and environmental factors. Such a matching of requirements and capabilities should not prove a problem for routine or planned maintenance; however, it can prove more problematic for breakdown or emergency maintenance, which is often carried out under severe time constraints. It is important in such circumstances that a procedure exists, and is understood by the maintenance worker, covering what they need to do should they feel that the limits of their competence are being reached in the circumstances of the particular activity.
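The matching exercise described above can be thought of as a comparison between two sets. The following is a minimal sketch under that assumption; the function names and the competence labels are hypothetical, and a real assessment would also weigh job, workplace and environmental factors that a simple set comparison cannot capture.

```python
def competence_gaps(task_requirements: set[str], worker_competencies: set[str]) -> set[str]:
    """Return the requirements the worker does not hold (empty set means a match)."""
    return task_requirements - worker_competencies


def assign_or_escalate(task_requirements: set[str], worker_competencies: set[str]) -> str:
    """Assign the task only if every requirement is covered, otherwise escalate."""
    gaps = competence_gaps(task_requirements, worker_competencies)
    if not gaps:
        return "assign"
    # Sorting keeps the message stable regardless of set ordering.
    return "escalate: missing " + ", ".join(sorted(gaps))


if __name__ == "__main__":
    # Hypothetical labels for illustration only.
    task = {"isolation", "hot work"}
    worker = {"isolation", "confined space entry"}
    print(assign_or_escalate(task, worker))
```

The escalation branch stands in for the procedure the worker should follow when the limits of their competence are being reached, which matters most in breakdown or emergency work.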

Finally, the attributes and capabilities of the individual concerned need to be considered. These include their attitude, habits, personality, skills and competence.

All of these factors need to be managed within an effective safety management system, as required by the Management of Health and Safety at Work Regulations 1999 (a suggested model for such a system being contained in Successful Health and Safety Management (HSG 65) and The British Standard on Occupational Health and Safety Management Systems (BS 8800)).

Improving Maintenance suggests the following three stage assessment method with regard to maintenance management:

Stage 1: Identification. Identify areas for assessment. Depending upon the size of the maintenance activity, this may be a single process or a series of processes concentrating upon areas of specific concern. Qualitative, semi-quantitative and quantitative approaches may be adopted depending upon the circumstances.

Stage 2: Assessment. Assess the identified areas for the key maintenance issues of concern. The two complementary assessment approaches recommended are incident review and workforce questionnaire. In both cases this will involve the development of forms and scoring schemes. (Examples and worked examples are contained in Improving Maintenance.)

Stage 3: Implementation. Prioritise areas for improvement and develop an action plan. The information obtained can also be utilised in risk assessments to identify issues requiring particular attention. The cost-effectiveness of suggested improvements to maintenance activities should also be considered.
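To show how scores from the two assessment approaches might feed the prioritisation step, the sketch below ranks areas by a combined score. It is not the scoring scheme contained in Improving Maintenance: the 0 to 10 scales, the simple addition of the two scores, the class and function names and the sample areas are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AreaAssessment:
    """Scores for one maintenance area (hypothetical 0-10 scales, higher is worse)."""
    area: str
    incident_score: int       # from the incident review
    questionnaire_score: int  # from the workforce questionnaire

    @property
    def combined(self) -> int:
        # Assumed rule: the two scores are weighted equally and summed.
        return self.incident_score + self.questionnaire_score


def prioritise(assessments: list[AreaAssessment]) -> list[AreaAssessment]:
    """Order areas so that the highest combined concern comes first."""
    return sorted(assessments, key=lambda a: a.combined, reverse=True)


if __name__ == "__main__":
    # Illustrative values only, not real assessment data.
    results = [
        AreaAssessment("permit-to-work", 4, 7),
        AreaAssessment("shift handover", 8, 6),
        AreaAssessment("spares control", 2, 3),
    ]
    for item in prioritise(results):
        print(item.area, item.combined)
```

A ranking of this kind only orders the candidates for attention; the cost-effectiveness of each proposed improvement still has to be judged separately.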

Maintenance Policy

According to HSE, "the importance of developing policies covering critical business activities is increasingly recognised. However, the need for a maintenance policy is often neglected. Even where there is such a policy, it is often produced without consideration of other business objectives, eg production. Problems frequently arise when the responsibilities for maintenance are uncertain or where the maintenance policy is not compatible with the organisation's business plan. In these cases it is common for the maintenance function to have difficulties in securing adequate resources".

Only when there is correct alignment between the maintenance policy and the business plan will adequate resources, in terms of people, equipment and time allocation, be provided. Much therefore depends upon the attitude of senior management and the resulting culture within the organisation. Where an organisation is unduly production driven, maintenance activities will often struggle along with inadequate resources and accidents will frequently occur. Human errors will be involved as maintenance workers attempt to cope by taking shortcuts and leaving tasks incomplete.

The maintenance policy should make clear who has overall responsibility for the maintenance programme and the responsibilities and structure within the maintenance department. Where certain maintenance tasks require authorisation, the policy should clarify who is responsible for providing such authorisation. The interface between maintenance and operational departments should also be clearly defined and understood and accepted by all involved.

Where multi-skilled workers, either from maintenance or operational departments, carry out maintenance activities, it is vital that the policy stipulates that their levels of competence be assessed and matched with the requirements of the tasks they are being asked to perform.

Additionally, where contracting staff are relied upon to carry out maintenance operations, clients should not merely assume that the contractors have the appropriate resources to work safely, as the contractors may be relying upon the client's site to provide some of those resources. It is important, before work commences, for the client to ensure that the contractor provides a method statement and that agreement is reached as to who will be responsible for the resources outlined in the method statement.

Communications

Many maintenance accidents result from inadequate communications between the parties involved. For example, a brewery and an architect were prosecuted for failing to tell a construction company of the presence of asbestos in the cellar of a public house that was being refurbished. The asbestos was clearly indicated on the plans held by both the brewery and the architect. In another example a mechanic was killed during the maintenance of a road tanker. He was removing a faulty valve using an oxyacetylene torch when an explosion took place. He had not been informed that the tanker had been carrying flammable liquids.

Good communication often involves a mixture of communication methods, rather than over-reliance on one particular method which might prove ambiguous or fail to reach the correct individual at the appropriate time. Such communication methods include: briefings, conversations, notices on notice-boards, telephone or tannoy messages, pagers or two-way radio messages, documents, logs, work orders, checklists and permit-to-work systems. (It should be remembered that maintenance procedures need to provide sufficient information to allow the worker to carry out tasks safely, while permits and isolation certificates need to ensure that the appropriate safeguards are in place to allow the task to be carried out safely. Procedures and permits are therefore quite distinct both in function and in content.) The emphasis with all forms of communication should be on clarity and lack of ambiguity within and between messages. It is also vitally important that communications between and to shifts during shift changeovers in maintenance work are clear and unambiguous. A failure of this kind was the root cause of the Piper Alpha disaster.

Monitoring and Reviewing

The performance of the maintenance department needs to be routinely monitored in order to identify poor maintenance practices and other areas where improvements can be made. Such monitoring may utilise the following main performance measures: efficiency measures (repair times), equipment-reliability measures (mean time between repairs), staff morale (absenteeism) and safety measures (accident frequency rate during maintenance). According to HSE, the monitoring of maintenance performance needs to take the following two forms:

The Performance of Significant Maintenance Tasks: highly repetitive tasks, long and detailed tasks, rarely undertaken tasks, tasks dependent upon non-maintenance staff and tasks undertaken in poor environmental conditions.

The Overall Performance of the Maintenance Department: frequency of error-causing conditions identified during inspections, frequency of incidents and near misses, frequency of revisits due to maintenance errors, mean time between repairs, frequency of equipment failure attributable to human error during maintenance, accident frequency rate, absenteeism, staff morale and the percentage of maintenance which is preventative rather than breakdown.
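Several of these measures are simple ratios that can be calculated from routine records. The sketch below shows three of them under stated assumptions: the function names are hypothetical, mean time between repairs is taken as operating hours divided by the number of repairs, and the accident frequency rate is expressed per 100,000 hours worked, a commonly used multiplier that an organisation may choose to replace with its own.

```python
def mean_time_between_repairs(operating_hours: float, repairs: int) -> float:
    """Average operating hours between repairs (assumed definition)."""
    if repairs <= 0:
        raise ValueError("at least one repair is needed to compute the measure")
    return operating_hours / repairs


def accident_frequency_rate(accidents: int, hours_worked: float,
                            per_hours: float = 100_000) -> float:
    """Accidents per `per_hours` hours worked (the multiplier is an assumption)."""
    if hours_worked <= 0:
        raise ValueError("hours_worked must be positive")
    return accidents * per_hours / hours_worked


def preventative_share(preventative_jobs: int, breakdown_jobs: int) -> float:
    """Percentage of maintenance jobs that were preventative rather than breakdown."""
    total = preventative_jobs + breakdown_jobs
    if total == 0:
        raise ValueError("no maintenance jobs recorded")
    return 100.0 * preventative_jobs / total


if __name__ == "__main__":
    # Illustrative figures only, not real performance data.
    print(mean_time_between_repairs(1200, 4))
    print(accident_frequency_rate(3, 150_000))
    print(preventative_share(30, 10))
```

Tracked over successive periods, figures like these give the trend information that the monitoring described above relies on; a single value in isolation says little.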

Finally, the performance of the maintenance function needs to be periodically reviewed. Such reviews can include audits, safety-culture surveys and benchmarking with other organisations. The results of the review should be turned into prioritised action.
