Reliability engineering key to resilience
As a result of the 2014 Water Act and Ofwat, water companies are increasingly studying how resilient they might be to stresses and strains, both now and in the future. Reliability engineering is the answer, says Alec Erskine, of MWH.
The 2014 Water Act specifically added a fifth clause to the water industry regulator Ofwat’s purpose, namely to secure:
• “(a) the long-term resilience of [water & sewerage] systems as regards environmental pressures, population growth and changes in consumer behaviour, and (b) that undertakers take steps… to meet, in the long term, the need for the supply of water and the provision of sewerage services to consumers,”
Following the 2014 Water Act, Ofwat spent time interpreting its implications and in its Towards Resilience document (December 2015), it provided a definition: “Resilience is the ability to cope with, and recover from, disruption, and anticipate trends and variability in order to maintain services for people and protect the natural environment, now and in the future.”
Given that avoiding “disruption” is pretty much the same as providing reliability, this signals that Ofwat wants to ensure reliability now and in the future. To do this we need the ability to firstly assess our reliability, and secondly predict the future.
For the legislators of the Water Act, there was perhaps an assumption that we were on top of reliability now, and they wanted to make sure we were taking a long-term view of it. The truth is that there is still work to do to ensure we are on top of our reliability now.
In Towards Resilience, Ofwat also makes the point that measuring resilience (or reliability) would be a good idea, and recommends that the water companies consider how best to do this. But guidance on measuring resilience is scarce.
There is a relevant British Standard (BS 65000:2014, Guidance on Organisational Resilience) which is mainly focused on the cycle of learning from your mistakes. However, estimating reliability is an area of engineering roughly a century old.
Reliability engineering was first used shortly after the First World War in the context of aeroplane safety. Engineers working on the German V-1 missile programme worked out the basic theory during the Second World War.
Space research in the 1950s and ’60s pushed the theory further forward and the first journal emerged in 1963 (Institute of Electrical and Electronics Engineers Transactions). In 1965 Richard Barlow and Frank Proschan wrote the seminal text entitled Mathematical Theory of Reliability.
The oil and gas and nuclear industries employ the techniques of reliability routinely, but in our industry they are rarely seen, with the notable exceptions of their close relative the HAZOP Workshop, and the odd Safety Integrity Level (SIL) assessment for safety systems.
So what is reliability engineering? Reliability engineering is a whole collection of techniques intended to help us determine whether an item or a system is going to function or not. It can be split into two approaches: the physical and the actuarial.
The physical approach is to do with variation in “load” – if we understand the variation in load and we understand the load at which the system fails, then we understand the reliability. This is fairly comfortable terrain for engineers and fits well with the “variability” part of the resilience definition. Loads might be interpreted as water demand, wastewater load, weather and so on.
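As a minimal sketch of the physical approach, assume (purely for illustration) that the load and the capacity at which the system fails are independent and normally distributed. The failure probability is then the chance that load exceeds capacity. The function name and every figure below are hypothetical and are not taken from any water company.

```python
from statistics import NormalDist

def failure_probability(load_mean, load_sd, capacity_mean, capacity_sd):
    """P(load > capacity) for independent, normally distributed load and capacity."""
    # The safety margin (capacity - load) is itself normal; failure means margin < 0.
    margin_mean = capacity_mean - load_mean
    margin_sd = (load_sd ** 2 + capacity_sd ** 2) ** 0.5
    return NormalDist(margin_mean, margin_sd).cdf(0.0)

# Hypothetical works: peak demand averages 80 Ml/d (sd 10),
# deliverable capacity averages 110 Ml/d (sd 5).
p_fail = failure_probability(80.0, 10.0, 110.0, 5.0)
print(f"chance that peak demand exceeds capacity: {p_fail:.2%}")
```

Real loads such as demand or rainfall are often skewed, so the normal assumption here is a convenience rather than a recommendation.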
The actuarial approach is more to do with lifetimes and deterioration and captures the asset performance in terms of its “time to failure” or “failure probability”.
Failures happen, especially when you have huge numbers of things. Predicting which ones are going to fail is really difficult but predicting how many are going to fail is surprisingly easy – just look at a graph of the monthly pipe bursts for a water company.
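The reason the total is so predictable can be sketched with a little arithmetic. Assuming failures are independent and individually rare, the monthly count is approximately Poisson, so its standard deviation is the square root of its mean and the relative scatter shrinks as the asset base grows. The asset count and burst probability below are invented for illustration.

```python
import math

def expected_failures(n_assets, failure_prob_per_month):
    """Return (mean, standard deviation) of the monthly failure count.

    Assumes failures are independent and individually rare, so the
    count is approximately Poisson and its variance equals its mean.
    """
    mean = n_assets * failure_prob_per_month
    return mean, math.sqrt(mean)

# Hypothetical network: 50,000 pipe lengths, each with a 0.4% chance
# of bursting in a given month.
mean, sd = expected_failures(50_000, 0.004)
relative_spread = sd / mean
print(f"expected bursts: {mean:.0f} +/- {sd:.0f} ({relative_spread:.0%})")
```

Under these assumptions the network sees roughly 200 bursts a month, give or take about 14, even though nobody can say which pipes will go. In practice weather and season break the independence assumption, so real burst counts scatter more than this.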
The last step is understanding how and whether individual asset failures escalate to system failures. When does the asset failure lead to an impact on the customer and when does the standby just kick in so the customer never knows it happened? How resilient are our systems and our networks to the inevitable breakdowns? Have we got enough standby, enough cross-connection to be able to cope? Have we got too much?
Techniques such as Reliability Block Diagrams and Fault Tree Analysis, and the software that implements them, are designed to make precisely this calculation, turning an asset failure rate into a system failure rate – a quantity representing the reliability in failures per unit time.
The techniques and the software are readily available. Failure rate estimates at asset level are not. The big databanks that store data and regularly re-assess failure rates are mostly maintained by the oil industry.
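The core arithmetic of a Reliability Block Diagram can be sketched in a few lines. This is a simplified illustration, not a substitute for the commercial tools: it assumes constant failure rates, independent failures and no repair during the period, and the layout and rates are hypothetical.

```python
import math

def survival(failure_rate, period):
    """Probability that an item with a constant failure rate survives the period."""
    return math.exp(-failure_rate * period)

def series(*reliabilities):
    """Blocks in series: the system works only if every block works."""
    return math.prod(reliabilities)

def parallel(*reliabilities):
    """Blocks in parallel: the system fails only if every block fails."""
    return 1.0 - math.prod(1.0 - r for r in reliabilities)

# Hypothetical pumping station over one year: two pumps in parallel
# (0.5 failures/year each) feeding a single rising main (0.02 failures/year).
r_pump = survival(0.5, 1.0)
r_main = survival(0.02, 1.0)
system_reliability = series(parallel(r_pump, r_pump), r_main)
print(f"pump {r_pump:.3f}, main {r_main:.3f}, system {system_reliability:.3f}")
```

The redundant pair is more reliable than a single pump, but the system can never be more reliable than the single main in series with it, which is the kind of insight these diagrams are meant to surface.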
Water data is scarce and we generally have to assume the equivalents in the oil industry have similar failure rates – a weak assumption.
We may get some supporting evidence from company failure records but there is not enough of this data.
Reliability is a means by which to measure our resilience both now and in the future as required by Ofwat. Using standardised or observed rates of failure and calculating the chances of the rare simultaneous failures needed to cause system failure, we can arrive at an objective measure. This helps with many decisions including those difficult calls involving safeguarded systems.
If the pump fails, the standby kicks in and there is no consequence, so there is no risk, so how can we justify replacing the pumps? Well, there is still system risk which will increase as the pumps age – and reliability engineering can quantify it.
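A rough sketch of how that growing system risk might be quantified follows. It assumes a Weibull wear-out model for each pump, that duty and standby are the same age, that the standby always starts on demand, and that the system fails only if the standby also fails while the duty pump is being repaired. The parameter values are made up for illustration.

```python
def weibull_hazard(age_years, shape, scale_years):
    """Failure rate (per year) of one pump at a given age."""
    return (shape / scale_years) * (age_years / scale_years) ** (shape - 1)

def standby_system_failure_rate(age_years, shape, scale_years, repair_years):
    """Approximate rate (per year) at which duty and standby are both down.

    The duty pump fails at the hazard rate; the system fails only if the
    standby also fails during the repair window, which has probability of
    roughly hazard * repair time when that product is small.
    """
    h = weibull_hazard(age_years, shape, scale_years)
    return h * (h * repair_years)

# Hypothetical values: shape 2 (wear-out), scale 15 years, one-week repair.
SHAPE, SCALE, REPAIR = 2.0, 15.0, 1.0 / 52.0
rate_young = standby_system_failure_rate(5.0, SHAPE, SCALE, REPAIR)
rate_old = standby_system_failure_rate(20.0, SHAPE, SCALE, REPAIR)
print(f"system failures per year: age 5 = {rate_young:.2e}, age 20 = {rate_old:.2e}")
```

With these invented parameters the system failure rate at age 20 is about 16 times that at age 5, even though a single pump failure still has no visible consequence for the customer.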
So not only does reliability allow us to satisfy the requirements of the law but it may help us see where we have enough standby and where we need more. And that is a benefit worth chasing.
Alec Erskine is senior principal consultant at MWH, now part of Stantec.
This article first appeared in the July 2017 issue of WET News.