Risk! Engineers Talk Governance
Due Diligence and Risk Engineers Richard Robinson and Gaye Francis discuss governance in an engineering context.
Richard & Gaye are co-directors at R2A and have seen the risk business industry become very complex. The OHS/WHS 'business', in particular, has turned into an industry that appears to be costing an awful lot of organisations an awful lot of money for very little result.
Richard & Gaye's point of difference is that they come from the Common Law viewpoint of what would be expected to be done in the event that something happens. Which is very, very different from just applying the risk management standard (for example).
They combine common law and risk management to come to a due diligence process to make organisations look at what their risk issues are and, more importantly, what they have to have in place to manage these things.
Due diligence is a governance exercise. You can't always be right, but what the courts demand of you is that you're always diligent.
Safety Integrity Levels (SIL) Allocation and its implications under WHS Legislation
In this episode of Risk! Engineers Talk Governance, due diligence engineers Richard Robinson and Gaye Francis discuss Safety Integrity Level (SIL) allocation and its implications under WHS legislation.
They discuss their long history of working with IEC/AS 61508 and that their biggest caution is around Part 5 because it uses target levels of risk and safety as the basis for the SIL allocations.
They share that they’ve seen SIL ratings work really well and, at other times, where there's been a misunderstanding of what SIL is. They explain the importance of getting the context right, put your hazard in, identify what your critical hazard is, and then look at all the controls that can be put in place. And often they're the civil design sort of things and mechanical designs before you even go to the electronic systems.
If you’d like more information about Richard and Gaye’s work, head to www.r2a.com.au
Megan (Producer) (00:01):
Welcome to another episode of Risk! Engineers Talk Governance. In this episode, due diligence engineers Richard Robinson and Gaye Francis talk about Safety Integrity Levels (SIL) allocation. This was actually a suggestion from Ben on YouTube, so we thank him for the idea.
If you enjoy the episode, please give us a rating and help us spread the word and also subscribe on your favourite podcast platform. Enjoy and just like Ben, if you've got any feedback, please drop us a line.
Gaye Francis (00:37):
Hi Richard, welcome to another podcast session.
Richard Robinson (00:40):
Hello, pre-holiday Gaye. How are you?
Gaye Francis (00:41):
I know, I know. It's coming around fast.
Gaye Francis (00:45):
Today we're going to talk about, it's actually a request for a podcast topic, which is very exciting. So we're going to talk about Safety Integrity Level Allocation, so SIL Allocation, and the implications of that, especially under the WHS legislation (OHS Act). So as always, Richard, hit us hard first and then I'll chip in as we go!
Richard Robinson (01:06):
Excellent Gaye!
Richard Robinson (01:08):
Well there's a whole chapter in our larger textbook (Engineering Due Diligence) about this because this is something that's been a frustration for us for a long time. I don't know if anyone's aware, but R2A was the functional safety assessor under IEC 61508 when it was drafted in the late 1990s, I guess. And in fact, I think that was your first job (Gaye) when you signed up because we were literally doing the functional safety assessment for how two trains would get past each other on single line track in New South Wales. And that was your job; to test that their allocation and the way in which the watchdog, in particular, was going to work and would deal with every crossing, turnout and every other aspect of the system. So that started life as TOCS -- Train Order Control System -- and turned into TMACS -- Train Management and Control System. And I think our certification stopped about 2015.
Gaye Francis (02:03):
Yeah, it was about 15 years (ago). I think.
Richard Robinson (02:04):
It was only meant to be 10 years, as I recall. And they kept asking us to extend it. Obviously we were slightly anxious about this because if there ever had been a railway collision in that time, I doubt that we'd still be in business.
Gaye Francis (02:19):
Yes.
Richard Robinson (02:20):
Anyway, so at the time my signature was on the train control room at Orange, which I gather has now moved somewhere else, together with my then business partner, Kevin Anderson.
Richard Robinson (02:29):
Now this means we sort of spent a lot of time on this 61508 and it didn't become formalized I think until about 2001/2002, and they're now up to the second edition, which is 2010. Although because when Australia adopts it about a year later it turns into IEC 61508 or AS 61508 2011, which is incredibly frustrating.
Richard Robinson (02:52):
Anyway, I think it's, from memory in seven parts. Part 0 I think has been a more recent thing. The first four parts relate to a general functional safety assessment standard, which is used for the 61511. The functional safety assessment standard for control systems 62065, I think it is, for safety systems and I can't remember what the nuclear one is, but since that's going to be one of our next podcasts, you might recall that.
Gaye Francis (03:16):
I'll make a note.
Richard Robinson (03:18):
Excellent!
Richard Robinson (03:19):
Now the bit that's caused us the greatest grief, and which the chapter is all about, is basically the SIL allocation, because it uses target levels of risk and safety as the basis for the SIL allocations. So what it asks you to do, and this is in Part 5, is a hazard and risk analysis to work out what the current risk levels are. Then you have to work out what your tolerable or target risk levels are. Then you look at what the different control aspects are, like the existing external ??? I think they call them (they change their name between the editions), the external risk reduction facilities. And then whatever's left, that becomes your SIL allocation. So if you need another two orders of magnitude risk reduction, well then arguably that would be a SIL 2 depending on whether it's continuous or low demand and all that sort of thing.
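[Editor's sketch] The Part 5 arithmetic Richard describes here can be illustrated roughly as follows. This is a minimal Python sketch under stated assumptions, not text from the standard and not R2A's method: the function names and the example risk figures are hypothetical, and the band limits are the commonly quoted low demand figures (average probability of failure on demand). Note that the hosts go on to caution against this target-risk approach under the WHS legislation.

```python
# Commonly quoted low demand bands: SIL n covers an average probability of
# failure on demand (PFD) from 10^-(n+1) up to, but not including, 10^-n.
def sil_for_low_demand_pfd(pfd):
    """Return the SIL (1 to 4) whose low demand band contains pfd, else None."""
    for sil in (1, 2, 3, 4):
        if 10 ** -(sil + 1) <= pfd < 10 ** -sil:
            return sil
    return None


def residual_sil(unmitigated_per_year, tolerable_per_year, other_measures_rrf):
    """Hypothetical helper: whatever risk reduction is left becomes the SIL.

    unmitigated_per_year: estimated hazard frequency with no controls
    tolerable_per_year:   the chosen target (tolerable) frequency
    other_measures_rrf:   risk reduction factor credited to other controls
    """
    total_rrf_needed = unmitigated_per_year / tolerable_per_year
    remaining_rrf = total_rrf_needed / other_measures_rrf
    if remaining_rrf <= 1:
        return remaining_rrf, None  # other controls already meet the target
    return remaining_rrf, sil_for_low_demand_pfd(1 / remaining_rrf)


# Illustrative numbers only: 0.1/yr unmitigated, 1e-5/yr target,
# other controls credited with a factor of 50.
rrf, sil = residual_sil(0.1, 1e-5, 50)
print(f"remaining risk reduction factor ~{rrf:.0f}, SIL {sil}")
```

With these made-up figures the remaining gap is a factor of about 200, a little over two orders of magnitude, which falls in the SIL 2 low demand band.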
Richard Robinson (04:07):
Now so far as we're concerned, and that's where the crunch comes, using target levels of risk and safety, which was what the previous ALARP discussion was all about, is absolutely verboten under the provisions of the WHS legislation. And that's why we've always felt that 61508 was a particular problem.
Richard Robinson (04:27):
Now I think it's worth just commenting. I remember looking up the websites internationally a while back just confirming about 61508 and I remember reading a website, which I thought was particularly helpful, where it made the point that it's not a standard that's recognised good practice, it's the worthwhile ideas in the standard. So saying that the SIL allocation is an error in 61508, which is one section of one part of seven parts, doesn't mean that the whole thing's futile or anything like that. I mean if you have to go through the process of actually realising the SIL level that you've actually selected, then so far as we know the standard works particularly well. But the problem is the initial SIL allocation, which as we've demonstrated in all the studies we've done, is usually where the greatest cost occurs in the sense that it commits you to a huge cost. I mean basically, as far as we're aware, going from SIL 1 to SIL 2 to SIL 3 to SIL 4, pretty much it means an order of magnitude increase for every SIL number that you go up by.
Gaye Francis (05:33):
In cost. That is.
Gaye Francis (05:36):
But we've also seen it being used in a number of other ways and I think this is where people get confused about it, isn't it? We've often seen it used to allocate a reliability target for a system and we've also seen it being used in isolation and allocating SIL levels to the electronic components without taking into account all of the other controls that are in the system to begin with. So they're often higher SIL ratings than may necessarily be required.
Richard Robinson (06:07):
Well, the WHS legislation is particularly clear. You've got to eliminate if you can, and that's an absolute categorical imperative. And then if you can't eliminate, then you reduce. And it's weird because you find this remark, and I sort of dug it out again, it's about halfway into Part 1, and it says something to the effect that, although not within the scope of this standard, it is of primary importance that the determined hazards of the equipment under control are eliminated at source, for example, by the application of inherent safety principles and the application of good engineering practice. But that's just a note, like a footnote. It's not saying you start with asking, can we eliminate this thing? And you might recall the Gateway bridge example in Queensland. We got called in to do a SIL study on an anemometer, an electronic variable message system for the top of the Gateway bridge, which is up in the air. So you can get high winds up there and high-sided vehicles, vans and so forth can get knocked around by wind at that elevation if you're not expecting it. And so they wanted to provide people with...
Gaye Francis (07:14):
Information of what the...
Richard Robinson (07:16):
About wind speeds. The problem was of course, if that information is in error, i.e. too high or too low or not there at all, and people are relying on it, that could then cause...
Gaye Francis (07:24):
An incident or an accident. And they were wondering what SIL level that this needed to be.
Richard Robinson (07:30):
That's right. And this is where the IT people sort of encouraged this process, as I recall.
Gaye Francis (07:33):
They do.
Richard Robinson (07:34):
Now this... Perhaps you should take over here.
Gaye Francis (07:38):
So we sort of spent the morning workshopping this and arguing for about a morning and then we didn't really come to any landing on it at lunchtime. And then somebody asked, well you've got the Westgate Bridge in Melbourne, what do you use in Melbourne? I said, there's flags on the top of it. So we don't have an electronic system saying what the wind speeds are in Melbourne, but you can tell by the direction of the flags which way the wind's going, whether it's a strong wind, whether it's not a strong wind -- a visual indication. So we came back after lunch and we sort of landed on the control of putting a wind sock on the Gateway bridge that if it was missing, it didn't give incorrect information. It couldn't give incorrect information. And if it was missing then the wrong information wasn't given to the drivers. So I think that's what they ended up landing on, wasn't it? And the electronic system was ditched altogether.
Richard Robinson (08:42):
Yeah, that's right. And so basically just put up a standard wind sock. You see it at an airfield with a light on it so you can see it at night and if it's obviously blowing in one direction and it's rock hard and sitting out to one side, it's blowing a gale in that direction. And that's what was done. And in all the SIL studies we've done, with rare exception, we've downgraded the SIL rating by at least an order of magnitude, very often by two.
Gaye Francis (09:07):
Yes. Or no SIL rating at all.
Richard Robinson (09:09):
Because it just didn't make sense. And the only way to do that, and particularly we do it within the context of the WHS legislation. Now you might remember that other study we did in Queensland where the tunnel, the jet fans, in order to sort of manage a fire in the tunnels and it had been given a SIL rating and the contractor, because it had to work for the OMCS and I can't remember what OMCS stands for, but it's an operating management system for the tunnel. The OMCS ran the ventilation system and the idea was, I think it was sized so that if you turn the jet fans on, you could actually manage a 50 megawatt fire from memory. Design was for a, I think was for a heavy commercial vehicle type fire.
Richard Robinson (09:56):
We turned up to do this review and it was kind of strange actually because first of all, they'd obviously called it up as a reliability standard as I recall, but they'd completely ignored the fire suppression system, which is an utterly independent system. And so when you looked at both these systems combined, there was a SIL rating in effect on the fire system which could already be incorporated. And therefore from our point of view, the OMCS didn't actually need a SIL rating at all. But what was particularly entertaining, if you recall, this was just before the WHS Act commenced in Brisbane on the 1st of January, 2012. So we were working on this in November/December, 2021.
Gaye Francis (10:37):
2011.
Richard Robinson (10:38):
2011, yeah, that's right. And I remember distinctly saying, guys, if we sign off under the old tolerable risk basis of things and we sign off, there's a clause in the legislation that says anything that's signed off beforehand doesn't have to be done under the new one as long as you've sort of got it underway. But if you fail to sign it off before Christmas this year and the legislation commences on the 1st of January, we'll all be back here again in the new year doing it again. And all the technical people laughed at us. Remember? <Yes.> They said, oh, you're kidding. And that was before the lawyer turned up and told them all to come back.
Gaye Francis (11:10):
Yes, we were back in the February of 2012, weren't we?
Richard Robinson (11:14):
Yeah we were.
Gaye Francis (11:15):
But again, because they hadn't sort of had the context right of the two systems operating independently, they had over allocated a SIL level to the OMCS as you said, but they didn't take into account the other system.
Richard Robinson (11:32):
Well, a fire system's normally a low demand system, that means it acts less than once per year and it normally has about a SIL 2 low demand rating. Now the ventilation system with regards to fire would have sort of the same frequency, low demand. And so if you add two systems together, you're sort of up to SIL 4, which obviously doesn't make any sense at all for something like that. So anyway, I can't actually recall what we did there. I think we said it didn't need a SIL rating and I think we worked out, we actually did the numbers to work out what the reliability was and the reliability was considered so high. So whilst it wasn't formally SIL rated it was sort of seen to be adequately reliable as I recall.
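[Editor's sketch] Richard's point about two low demand systems "adding" up to roughly SIL 4 can be checked with some rough arithmetic. The Python sketch below is illustrative only: it assumes the two systems are fully independent (no common cause failures), the PFD figures are hypothetical mid-band SIL 2 values rather than numbers from the tunnel study, and the function names are made up for this example.

```python
def combined_pfd(*layer_pfds):
    """Probability that every independent layer fails on the same demand."""
    product = 1.0
    for pfd in layer_pfds:
        product *= pfd
    return product


def low_demand_band(pfd):
    """Return the SIL (1 to 4) whose low demand band contains pfd, else None."""
    for sil in (1, 2, 3, 4):
        if 10 ** -(sil + 1) <= pfd < 10 ** -sil:
            return sil
    return None


# Hypothetical figures: both systems sit mid-way in the SIL 2 low demand band.
fire_suppression_pfd = 5e-3
ventilation_pfd = 5e-3

together = combined_pfd(fire_suppression_pfd, ventilation_pfd)
print(f"combined PFD ~{together:.1e}, equivalent band: SIL {low_demand_band(together)}")
```

Under these assumptions the pair behaves like something in the SIL 4 band, which is the sense in which a separately allocated rating on the second system looks excessive once the first is taken into account.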
Gaye Francis (12:13):
That's my recollection of it as well, that when we actually dug deeper into it, it was more a reliability target for operations rather than a SIL rating per se. Just before, I guess we move on from the SIL ratings. I know that you've said we really don't know how they got the numbers.
Richard Robinson (12:31):
No, what the 10 to the minus 1, 2, 3, 4 low demand then 5, 6, 7, 8 for high demand.
Gaye Francis (12:38):
They just seem to roll over from each other. So we've never been able to ascertain where those numbers were derived from or how they came about.
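[Editor's sketch] The "roll over" Gaye mentions can be seen by listing the commonly quoted band limits side by side. The Python sketch below stores only the powers of ten; the function name and the mode labels are assumptions made for this example. The low demand measure is an average probability of failure on demand, while the high demand or continuous measure is a frequency of dangerous failure per hour, so the two ranges meet numerically at 10^-5 but are in different units.

```python
def band_exponents(sil, mode):
    """Return (lower, upper) powers of ten for the band of a given SIL.

    mode is "low_demand" (probability of failure on demand) or
    "high_demand" (dangerous failures per hour). The band is
    10^lower <= measure < 10^upper.
    """
    if sil not in (1, 2, 3, 4):
        raise ValueError("SIL must be 1 to 4")
    offset = {"low_demand": 0, "high_demand": 4}[mode]
    upper = -(sil + offset)
    return upper - 1, upper


for mode in ("low_demand", "high_demand"):
    for sil in (1, 2, 3, 4):
        lower, upper = band_exponents(sil, mode)
        print(f"{mode} SIL {sil}: 10^{lower} <= measure < 10^{upper}")
```

The upper limits run 10^-1 to 10^-4 for low demand and then 10^-5 to 10^-8 for high demand, matching the sequence Richard recites.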
Richard Robinson (12:47):
No, I've asked at a number of conferences I've been to on SIL studies and things like that and given papers and I've always asked people and it's just one of those things. I mean I don't think we have a philosophical problem with orders of magnitude. I think we've always found it to be quite useful.
Gaye Francis (12:59):
Yes.
Richard Robinson (13:00):
But I've never seen any sort of scientific basis, as it were, for that understanding.
Gaye Francis (13:06):
Yes. Now we have seen SIL levels work well in other cases and we did do a bypass again in Queensland and it was a very narrow bypass that if a wider vehicle, wider truck was going along the bypass, it was one vehicle at a time, they needed traffic signals at either end to stop the traffic. And that was a blind spot. So they did end up SIL rating that, but it didn't have a super high SIL rating.
Richard Robinson (13:34):
That's correct.
Gaye Francis (13:35):
But it was a combination of safety integrity levels as well as a reliability target that these things worked when they were required. So we've seen SIL ratings work really well. We've seen other times where there's been a misunderstanding of what SIL is and they've been more aimed at reliability targets. So I guess with SIL it's one of those things that you have to have an understanding of what you're trying to achieve. Always look at it in the context of all the other controls that you've got in place when you're trying to solve the hazard that you're doing, not just looking at the electronic component of it.
Richard Robinson (14:11):
Well that was the other one, the tunnel under the freeway, remember, under the airport.
Gaye Francis (14:18):
The Tugun Bypass at the Gold Coast airport.
Richard Robinson (14:21):
They wanted to SIL rate the VMS system for that, you might recall. And again, it was IT people working in isolation because they failed to reflect on the fact that these civil engineers had gone to a lot of trouble to design the tunnel, so as you came around the corner, you've got a clear view from one side of the tunnel to the other. And from a very practical viewpoint, if a fireball erupts in the tunnel, do you believe what you see through the windscreen or do you believe the variable message on the side of the road? And the answer is you look through the windscreen and respond to that. So if there's a fireball in the tunnel that you can see, don't go in the tunnel. You don't need a variable message sign to be a high reliability saying don't go there.
Gaye Francis (15:02):
And I think that's what it is. A lot of the time the SIL reviews have been done out of context.
Richard Robinson (15:07):
I think that's correct.
Gaye Francis (15:08):
So that would be probably our takeaway from this SIL podcast is make sure you've got the context right...
Richard Robinson (15:15):
Because that will normally reduce the SIL level and save you generally an order of magnitude of cost.
Gaye Francis (15:20):
Reduce the SIL level or eliminate the requirement for a SIL level sometimes, many times actually. So get the context right, put your hazard in, identify what your critical hazard is, and then look at all the controls that can be put in place. And often they're the civil design sort of things and mechanical designs before you even go to the electronic systems.
Richard Robinson (15:42):
And don't do the target level of risk approach described in 61508 Section 5 because so far as we know, you'll be setting yourself up for some kind of potential criminal charge under the WHS legislation, the way the world has currently gone.
Gaye Francis (15:58):
Alright. So I hope you found that interesting and as we said, again, if anyone's got any topics they'd like to shoot us, we're happy to talk about those as well. So thank you Richard.
Richard Robinson (16:07):
Pleasure Gaye.
Gaye Francis (16:10):
See you next time.
Richard Robinson (16:11):
Indeed.