1
00:00:00,000 --> 00:00:10,000
[Music]

2
00:00:10,000 --> 00:00:12,000
Chain of events, cause and effect.

3
00:00:12,000 --> 00:00:14,000
We analyze what went right and what went wrong,

4
00:00:14,000 --> 00:00:19,000
as we discover that many outcomes can be predicted, planned for, and even prevented.

5
00:00:19,000 --> 00:00:21,000
I'm John Chidgey, and this is Causality.

6
00:00:21,000 --> 00:00:23,000
Causality is supported by you, our listeners.

7
00:00:23,000 --> 00:00:27,000
If you'd like to support the show, you can by becoming a Premium Subscriber.

8
00:00:27,000 --> 00:00:31,879
Premium subscribers have access to early release, high quality, ad-free episodes, as well as

9
00:00:31,879 --> 00:00:35,299
bonus episodes and to Causality Explored.

10
00:00:35,299 --> 00:00:40,700
You can do this via Patreon or, if you prefer, via our website, visit https://engineered.network/causality

11
00:00:40,700 --> 00:00:44,780
to learn how you can help this show to continue to be made.

12
00:00:44,780 --> 00:00:46,140
Thank you.

13
00:00:46,140 --> 00:00:47,140
Walkerton

14
00:00:47,140 --> 00:00:54,619
Walkerton, Ontario, Canada has a population of about 5,000 people and lies approximately

15
00:00:54,619 --> 00:01:00,780
2 hours drive from Toronto, which by contrast is Canada's largest metropolitan city at 6 million

16
00:01:00,780 --> 00:01:09,019
people. In the early 2000s, Walkerton's water was supplied by 3 groundwater wells

17
00:01:09,019 --> 00:01:16,859
numbered 5, 6 and 7. Well 5 was 15 metres deep, capable of supplying 56% of the town's water

18
00:01:16,859 --> 00:01:24,379
supply needs in isolation; Well 6 was 72 metres deep with a maximum supply of 52% in

19
00:01:24,379 --> 00:01:31,819
isolation, and Well 7 was 76 metres deep, the deepest, with a maximum supply of 140%

20
00:01:31,819 --> 00:01:40,459
of the town's water supply needs. Well 5 used sodium hypochlorite solution for chlorine dosing

21
00:01:40,459 --> 00:01:44,700
whereas Wells 6 and 7 both used chlorine gas for disinfection.

22
00:01:44,700 --> 00:01:51,260
The water distribution system had approximately 42 kilometres or 26 miles of water mains with

23
00:01:51,260 --> 00:01:55,980
two standpipes providing pressure equalisation and approximately 20 hours of reserved storage.

24
00:01:55,980 --> 00:02:01,900
A basic control system and SCADA controlled and monitored the wells and standpipes.

25
00:02:01,900 --> 00:02:09,340
The Walkerton Public Utilities Commission or PUC for short were charged with the safe operation

26
00:02:09,340 --> 00:02:14,939
and maintenance of the water supply system in Walkerton. So let's talk about the incident itself.

27
00:02:14,939 --> 00:02:22,060
Starting on the 8th of May 2000, a series of storms and steady rain over a five-day period totaled some

28
00:02:22,060 --> 00:02:29,419
134 millimeters of rainfall. The volume of rainfall accumulation led to inevitable surface

29
00:02:29,419 --> 00:02:36,780
saturation and subsequent runoff which led to some minor localized flooding in the area. On the 12th

30
00:02:36,780 --> 00:02:43,900
of May some of that rainfall runoff entered Well 5. On Monday the 15th of May

31
00:02:43,900 --> 00:02:50,139
Stan Koebel returned on shift and noted that Well 7 was not operating. Every Monday,

32
00:02:50,139 --> 00:02:55,740
and this was no different, the operators collected their weekly samples from each of the wells and

33
00:02:55,740 --> 00:03:01,180
from key sample points in the distribution system. Around this time a construction project was

34
00:03:01,180 --> 00:03:07,340
underway that required the installation of 615 metres or 2,000 feet of replacement

35
00:03:07,340 --> 00:03:14,400
water mains on Highway 9 in South West Walkerton between Wallace Street and Circle Drive.

36
00:03:14,400 --> 00:03:22,580
On Wednesday 17 May at 9:14am the A&L Laboratory faxed the results from the Highway 9 project

37
00:03:22,580 --> 00:03:31,300
water samples to the PUC. All three samples indicated positive for total coliforms and E. coli.

38
00:03:31,300 --> 00:03:39,219
At 2:37pm that afternoon, the remaining tests from the A&L lab were faxed to the PUC

39
00:03:39,219 --> 00:03:47,860
with a sample labeled Well 7 treated, positive for total coliform and E. coli. Further, the tests

40
00:03:47,860 --> 00:03:55,620
indicated coliform bacteria greater than 200 CFU/100mL, E. coli greater than 200 CFU/100mL,

41
00:03:55,620 --> 00:04:04,099
and a heterotrophic plate count of 600 CFU/mL. By Thursday, the 18th of May, the number of

42
00:04:04,099 --> 00:04:09,460
illnesses had increased significantly, with a seven-year-old and nine-year-old admitted to the

43
00:04:09,460 --> 00:04:14,740
Owen Sound Hospital, and about 20 students from the Mother Teresa School reported in sick.

44
00:04:15,860 --> 00:04:19,939
Members of the public, including concerned parents, had contacted the Walkerton PUC

45
00:04:19,939 --> 00:04:25,540
to confirm the water was safe to drink; however, they were not told that anything was wrong.

46
00:04:25,540 --> 00:04:31,220
By Friday 19 May, 8 people had a documented three-day history of symptoms,

47
00:04:31,220 --> 00:04:36,339
with now more than 25 absent from the Mother Teresa School, 8 from Walkerton Public,

48
00:04:36,339 --> 00:04:40,180
and 3 residents from the Maple Court Villa retirement home were also affected.

49
00:04:40,180 --> 00:04:45,060
Dr Kristen Hallett, a paediatrician from the Grey-Bruce Health Services,

50
00:04:45,060 --> 00:04:48,339
had two patients referred to her from the hospital with similar symptoms.

51
00:04:48,339 --> 00:04:55,540
At approximately 9:00am that day, Dr Hallett contacted Dr Murray Quigg, the local medical

52
00:04:55,540 --> 00:05:00,980
officer of health (MOH), to inform him that her food history investigation of those patients

53
00:05:00,980 --> 00:05:06,740
indicated contaminated water was the most likely cause, with E. coli the most likely pathogen.

54
00:05:06,740 --> 00:05:13,139
During that day, James Schmidt, the public health inspector in Walkerton, received multiple calls

55
00:05:13,139 --> 00:05:19,220
and proceeded to call Mr. Koebel at 2:21pm directly, asking him about any issues with

56
00:05:19,220 --> 00:05:23,860
the water supply, to which Stan Koebel indicated he thought the water was 'OK'.

57
00:05:23,860 --> 00:05:29,939
On Sunday, the 21st of May, at approximately 1:30pm, a public health advisory was issued by

58
00:05:29,939 --> 00:05:34,500
the Health Unit to the Walkerton community not to drink municipal water from the tap,

59
00:05:34,500 --> 00:05:41,860
recommending boiling all water before consumption. The MOH also took their own independently

60
00:05:41,860 --> 00:05:47,860
collected water samples from multiple locations in Walkerton and their results on the 23rd of May

61
00:05:47,860 --> 00:05:53,459
all showed E. coli contamination, leading to all schools being closed the following day.

62
00:05:53,459 --> 00:05:59,540
On the 25th of May, the Regional Police Force directed the Ontario Provincial Police to begin

63
00:05:59,540 --> 00:06:06,579
a criminal investigation into the incident. With incidents such as these, the health impacts can

64
00:06:06,579 --> 00:06:12,980
take weeks, months or even years to fully play out, and those most at risk are our most vulnerable

65
00:06:12,980 --> 00:06:18,100
in society: children, the elderly, and those with pre-existing medical conditions that are

66
00:06:18,100 --> 00:06:24,100
ill-equipped to fight off an illness like this. The number of people killed directly or indirectly

67
00:06:24,100 --> 00:06:28,740
due to this incident has been debated since shortly after the incident, with either two,

68
00:06:28,740 --> 00:06:34,339
three or four others indirectly linked, and not all are conclusively proven to have been linked

69
00:06:34,339 --> 00:06:39,699
to the incident though there is strong evidence to suggest that they were. The following people

70
00:06:39,699 --> 00:06:47,459
lost their lives either in whole or in part due to this incident. Melville Dawe, 69 years old, May 19th.

71
00:06:47,459 --> 00:07:00,100
Lenore Al, 66 years old, May 22nd. Mary Rose Raymond, 2 years old, May 23rd. Robert Brodie, 89 years old,

72
00:07:00,100 --> 00:07:08,660
May 24th. Edith Pearson, 82 years old, also May 24th. Vera Coe, 75 years old, also May 24th.

73
00:07:08,660 --> 00:07:17,540
Laura Rowe, 84 years old, on May 29th. Betty Trushinski, 56 years old, May 31st.

74
00:07:17,540 --> 00:07:24,339
So what on earth went wrong? The Honourable Dennis R O'Connor was appointed to lead the

75
00:07:24,339 --> 00:07:29,139
Walkerton Commission into this incident, producing a final report in two parts,

76
00:07:29,139 --> 00:07:32,420
the first of which was released in January 2002.

77
00:07:32,420 --> 00:07:35,779
The source of the contamination was found to be Well 5,

78
00:07:35,779 --> 00:07:39,939
with runoff from a farmer's paddock, with the fecal coliforms from the

79
00:07:39,939 --> 00:07:43,379
livestock excrement being washed into the drinking water.

80
00:07:43,379 --> 00:07:47,540
The investigation uncovered several disturbing behaviors and events

81
00:07:47,540 --> 00:07:52,740
leading up to the incident. Prior to the rainfall event, on the 5th of May, Stan

82
00:07:52,740 --> 00:07:55,779
Koebel left Walkerton to attend a conference in

83
00:07:55,779 --> 00:08:01,060
Windsor for which he was away until the 14th of May during which time the rainfall event had

84
00:08:01,060 --> 00:08:07,300
occurred. In his absence his brother Frank Koebel was in charge of the PUC in Walkerton. When Stan

85
00:08:07,300 --> 00:08:12,180
left for the conference he was aware that the chlorinator on Well 7 was not functioning

86
00:08:12,180 --> 00:08:17,860
correctly since Well 7 was brought back into service on the 2nd of May. In fact the well hadn't

87
00:08:17,860 --> 00:08:22,819
been in service since the 10th of March. It wasn't uncommon to rotate water supply from each of the

88
00:08:22,819 --> 00:08:28,980
wells. But rather than shut Well 7 off, Stan Koebel instructed Frank to replace

89
00:08:28,980 --> 00:08:33,580
the chlorinator in Well 7 with the replacement unit that had been on the

90
00:08:33,580 --> 00:08:39,080
PUC premises in Walkerton for nearly one and a half years. Upon Stan's return

91
00:08:39,080 --> 00:08:42,860
he found that Frank had still not fitted the replacement chlorinator to Well 7

92
00:08:42,860 --> 00:08:46,620
and that Well 7 had still been running during that time. Well 7 pumped

93
00:08:46,620 --> 00:08:51,580
unchlorinated water into the system from the 3rd of May to the 9th of May. As it was

94
00:08:51,580 --> 00:08:56,820
the only well being used during that period, there was no new chlorine being injected into

95
00:08:56,820 --> 00:09:02,240
the water main system for that time. The correct course of action would have been to leave

96
00:09:02,240 --> 00:09:07,940
Wells 5 and/or 6 running while the chlorinator in Well 7 was replaced and then to return

97
00:09:07,940 --> 00:09:13,200
Well 7 to service, an activity that would have taken about a day of end-to-end activities

98
00:09:13,200 --> 00:09:18,659
to complete...absolute maximum. Chlorine residual needs to be maintained to ensure that any

99
00:09:18,659 --> 00:09:26,340
bacteria are killed before the water is consumed, but that's something we'll "explore" separately.

100
00:09:26,340 --> 00:09:31,620
Despite the fact that both Stan and Frank Koebel were aware that chlorination was required

101
00:09:31,620 --> 00:09:39,820
at all times as mandated in the Ontario Drinking Water Objectives and Bulletin 65-W-4, "Chlorination

102
00:09:39,820 --> 00:09:45,399
of Potable Water Supplies", when interviewed following the event, they believed that unchlorinated

103
00:09:45,399 --> 00:09:52,399
water from Well 7 was safe because it was from a deeper well. In addition, PUC staff

104
00:09:52,399 --> 00:09:58,360
would regularly drink raw unchlorinated water at the well because it was cold, clear and

105
00:09:58,360 --> 00:10:06,039
clean and "tasted better" than chlorinated water did. Multiple years of reinforcing the

106
00:10:06,039 --> 00:10:12,000
idea that it's safe to drink it today so it'll be safe to drink it tomorrow led to a mistaken

107
00:10:12,000 --> 00:10:18,480
belief that chlorination was in fact optional for Well 7. In the history of the plant,

108
00:10:18,480 --> 00:10:23,600
there had been no incidents like this, which also fed a mistaken belief that nothing like

109
00:10:23,600 --> 00:10:29,919
this could actually happen. Every day operators were required to visit each well and make a

110
00:10:29,919 --> 00:10:36,320
recording of the following: the water flow totalizer, the chlorine chemical usage and

111
00:10:36,320 --> 00:10:43,039
the current chlorine residual. On the 13th of May at 4:10pm 0.75mg/L

112
00:10:43,039 --> 00:10:49,120
of chlorine concentration was recorded for Well 5. There was no entry for Well 6.

113
00:10:49,120 --> 00:10:55,759
By cross-checking the water volume recorded against the amount of Hypochlorite dosed during

114
00:10:55,759 --> 00:11:01,840
that period it was calculated that it was completely impossible to have a chlorine

115
00:11:01,840 --> 00:11:08,960
residual that high given how little Hypo was dosed during that period. The investigation

116
00:11:08,960 --> 00:11:14,320
also found that for more than 20 years it had been regular practice for PUC operators

117
00:11:14,320 --> 00:11:19,519
to not measure the actual chlorine residual but instead write down a fictitious value

118
00:11:19,519 --> 00:11:26,000
to put an entry in the box. Reviews of the logs showed a significant number of readings of 0.5

119
00:11:26,000 --> 00:11:33,840
and 0.75mg/L despite there being no correlation between the documented chlorine residual levels

120
00:11:33,840 --> 00:11:37,279
and chemicals consumed during those respective periods.
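
As a rough illustration of that kind of cross-check (a sketch only, not the Commission's actual calculation): the highest chlorine residual you could possibly measure is the mass of chlorine dosed divided by the volume of water pumped, because any chlorine demand in the water can only pull the residual lower. The function names, the hypochlorite strength and the volumes below are all hypothetical, chosen purely to show the arithmetic.

```python
def max_possible_residual_mg_per_l(hypo_litres, available_chlorine_g_per_l, water_m3):
    """Upper bound on chlorine residual in mg/L.

    Assumes every gram of chlorine dosed is still present in the water,
    which is the most optimistic case possible.
    """
    if water_m3 <= 0:
        raise ValueError("water volume must be positive")
    chlorine_mg = hypo_litres * available_chlorine_g_per_l * 1000.0  # g -> mg
    water_litres = water_m3 * 1000.0  # m3 -> L
    return chlorine_mg / water_litres


def reading_is_plausible(recorded_mg_per_l, hypo_litres, available_chlorine_g_per_l, water_m3):
    """A logged residual above the upper bound cannot be a real measurement."""
    upper_bound = max_possible_residual_mg_per_l(
        hypo_litres, available_chlorine_g_per_l, water_m3
    )
    return recorded_mg_per_l <= upper_bound


if __name__ == "__main__":
    # Hypothetical day: 2 L of hypochlorite at an assumed 120 g/L available
    # chlorine, dosed into 1000 m3 of pumped water.
    bound = max_possible_residual_mg_per_l(2.0, 120.0, 1000.0)
    print(f"upper bound: {bound:.2f} mg/L")
    print("0.75 mg/L plausible?", reading_is_plausible(0.75, 2.0, 120.0, 1000.0))
```

With those made-up numbers, a logged value of 0.75mg/L sits well above what the chemical usage could ever deliver, which is the same kind of contradiction the investigators found in the logs.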

121
00:11:37,279 --> 00:11:42,399
Testimony from Stan Koebel was that multiple PUC staff had been filling sample containers from the

122
00:11:42,399 --> 00:11:48,240
PUC workshop which was down the line from Well 5 and labelled them as taken from other locations

123
00:11:48,240 --> 00:11:52,879
in the network. During the inquiry, when he was asked to explain why sample bottles had been

124
00:11:52,879 --> 00:11:57,840
submitted with the incorrect source information written on them, he answered, and I quote,

125
00:11:57,840 --> 00:12:04,240
"simply convenience or just couldn't be bothered." One more point about the behavior of

126
00:12:04,240 --> 00:12:09,840
both Stan and Frank Koebel that was uncovered in the investigation. Frank Koebel, on his brother's

127
00:12:09,840 --> 00:12:17,120
instructions, altered the daily operating sheet for Well 7 on May 22-23 in an apparent

128
00:12:17,120 --> 00:12:22,559
attempt to conceal from the MOE that Well 7 had been operating without a chlorinator for an

129
00:12:22,559 --> 00:12:27,200
extended period, and that demonstrates that they were fully aware that running without

130
00:12:27,200 --> 00:12:32,559
a chlorinator was not an acceptable practice and yet they did it anyway.

131
00:12:32,559 --> 00:12:39,100
The Walkerton PUC operators therefore in summary, firstly, set inadequate doses of chlorine

132
00:12:39,100 --> 00:12:44,519
based on the water flows, secondly, they did not repair the faulty chlorination equipment

133
00:12:44,519 --> 00:12:49,240
in a timely manner, thirdly, they didn't regularly monitor chlorine residual every

134
00:12:49,240 --> 00:12:53,799
day. Fourthly, they made false entries in their daily logs, for days where readings

135
00:12:53,799 --> 00:12:58,679
were not taken. Fifthly, they intentionally mislabeled the locations from which microbiological

136
00:12:58,679 --> 00:13:03,600
samples were taken. And finally, they attempted to conceal facts after the event to protect

137
00:13:03,600 --> 00:13:09,279
themselves. The operators were fully aware their practices did not follow the Ministry

138
00:13:09,279 --> 00:13:14,720
of the Environment (MOE) guidelines and their directives. And having said that, the A&L

139
00:13:14,720 --> 00:13:19,720
laboratory also failed by not reporting their findings of potentially unsafe drinking water

140
00:13:19,720 --> 00:13:25,799
to the MOE. The A&L laboratory policy was only to send report results to their client

141
00:13:25,799 --> 00:13:31,159
directly and there was no requirement to notify the MOE or the local medical officer of health

142
00:13:31,159 --> 00:13:36,000
should they have found a problem in their tests. Mr Robert Deakin, the laboratory manager

143
00:13:36,000 --> 00:13:43,080
at A&L claimed he was unaware of section 4.1.3 of the ODWO guideline stating that the lab

144
00:13:43,080 --> 00:13:49,120
should notify the MOE District Office of indications of unsafe drinking water were they found.

145
00:13:49,120 --> 00:13:54,220
On Wednesday 17 May, the alarm could have been raised by the A&L Laboratory alerting

146
00:13:54,220 --> 00:13:59,779
the MOE or the MOH which would have resulted in a boil water notice being issued four days

147
00:13:59,779 --> 00:14:07,500
earlier that would have significantly reduced the spread of the outbreak. But they didn't.

148
00:14:07,500 --> 00:14:12,179
It's unclear how many lives would have been saved had that happened. However, there's

149
00:14:12,179 --> 00:14:16,019
no question the death toll would not have been as high. So let's talk a little bit about E. coli

150
00:14:16,019 --> 00:14:22,500
and what the problem with it is. Escherichia coli, or E. coli for short because it's a lot

151
00:14:22,500 --> 00:14:31,620
easier to say, technically O157:H7, was the primary pathogen. The other was Campylobacter

152
00:14:31,620 --> 00:14:36,820
jejuni. These were the two bacteria most responsible for the majority of deaths and

153
00:14:36,820 --> 00:14:43,779
illnesses in this incident. Once infected with E. coli, the intestinal symptoms last for about four

154
00:14:43,779 --> 00:14:50,580
days and can persist for longer. After 24 hours, bloody diarrhea is common and in some cases severe

155
00:14:50,580 --> 00:14:56,179
abdominal pains and cramping. Generally, it resolves itself without treatment other than

156
00:14:56,179 --> 00:15:02,500
just rehydration and the replacement of the body's electrolytes. However, for some people, particularly

157
00:15:02,500 --> 00:15:07,539
children under five years of age and the elderly, E. coli infection can be far more serious,

158
00:15:07,539 --> 00:15:15,539
causing hemolytic uremic syndrome, HUS, after five to ten days of infection, leading to anemia,

159
00:15:15,539 --> 00:15:21,620
low platelet counts, and in some cases kidney failure. In the most extreme of cases,

160
00:15:21,620 --> 00:15:26,659
these complications can result in death. Campylobacter jejuni is the most common

161
00:15:26,659 --> 00:15:33,059
variant and it was implicated in the Walkerton incident as well. And in that case, diarrhea

162
00:15:33,059 --> 00:15:39,059
usually lasts 2-7 days with a significantly lower probability of fatality than for E. coli.

163
00:15:39,059 --> 00:15:46,500
The report had many recommendations, 28 in fact, but we'll look at one specific one and four others

164
00:15:46,500 --> 00:15:54,419
that fall broadly under the same key category. The first is recommendation 11, continuous

165
00:15:54,419 --> 00:15:59,940
monitoring. From the report I quote, "The MOE should require continuous chlorine and

166
00:15:59,940 --> 00:16:04,259
turbidity monitors for all groundwater sources that are under the direct influence of surface

167
00:16:04,259 --> 00:16:09,299
water or that serve municipal populations greater than a size prescribed by the MOE."

168
00:16:09,299 --> 00:16:17,539
This happened in 2000. So, a bit of history. In 1996, I worked at the Stanwell

169
00:16:17,539 --> 00:16:23,059
Power Station. That's a 1.4GW baseload power plant outside of my hometown of Rockhampton,

170
00:16:23,059 --> 00:16:27,860
in Queensland. When I joined, the so-called Effluent Outfall was being monitored with

171
00:16:27,860 --> 00:16:31,600
a local data logger for monitoring a small number of water quality measurements. The

172
00:16:31,600 --> 00:16:36,659
project I was asked to execute was for the continuous monitoring of both the inlet and

173
00:16:36,659 --> 00:16:42,299
outlet of the Northern Stormwater Dam to bring the data back into the plant DCS, Distributed

174
00:16:42,299 --> 00:16:48,379
Control System. At the time, the EPA was requiring hourly water quality samples be taken, however,

175
00:16:48,379 --> 00:16:54,100
the system that I installed would take multiple samples every single minute, far exceeding

176
00:16:54,100 --> 00:16:55,879
the requirements.

177
00:16:55,879 --> 00:17:01,759
That was 4 years before the Walkerton incident and the community of Stanwell had one third

178
00:17:01,759 --> 00:17:05,140
of the number of residents living there.

179
00:17:05,140 --> 00:17:10,339
The point though is that continuous monitoring using either a local data logger or a centralised

180
00:17:10,339 --> 00:17:16,000
SCADA system was well and truly tested and available technology that was not that expensive

181
00:17:16,000 --> 00:17:20,519
that could have been fitted easily into the Walkerton water treatment system had they

182
00:17:20,519 --> 00:17:22,519
wanted to.
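
To make the idea concrete, here is a minimal sketch of the sort of check a data logger or SCADA system can run on a continuous chlorine residual signal. It is not how any particular product does it; the function name, the alarm threshold and the "three low readings in a row" rule are all assumptions for illustration.

```python
from typing import Iterable, List, Tuple


def low_residual_alarms(
    samples: Iterable[Tuple[str, float]],
    minimum_mg_per_l: float,
    consecutive_needed: int = 3,
) -> List[str]:
    """Return the timestamps at which a low-residual alarm would be raised.

    An alarm is raised once when `consecutive_needed` readings in a row fall
    below `minimum_mg_per_l`. It re-arms after any reading at or above the
    minimum, so a single noisy sample does not trigger it.
    """
    alarms: List[str] = []
    run_length = 0
    for timestamp, value in samples:
        if value < minimum_mg_per_l:
            run_length += 1
            if run_length == consecutive_needed:
                alarms.append(timestamp)
        else:
            run_length = 0
    return alarms


if __name__ == "__main__":
    # Hypothetical one-minute samples of chlorine residual in mg/L.
    readings = [
        ("09:00", 0.6), ("09:01", 0.4), ("09:02", 0.3),
        ("09:03", 0.2), ("09:04", 0.1), ("09:05", 0.7),
    ]
    # 0.5 mg/L is an assumed illustrative setpoint, not a quoted requirement.
    print(low_residual_alarms(readings, minimum_mg_per_l=0.5))
```

The point of a check like this is that nobody has to remember to take the reading, and nobody gets to write down a number that wasn't measured.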

183
00:17:22,519 --> 00:17:27,079
In the past 20 years working in water treatment facilities of all different sizes, but particularly

184
00:17:27,079 --> 00:17:32,680
in South East Queensland, I've never seen a system that relied solely on manual measurements,

185
00:17:32,680 --> 00:17:36,480
except occasional cross-checking for equipment calibration.

186
00:17:36,480 --> 00:17:38,000
Better safe than sorry.

187
00:17:38,000 --> 00:17:42,319
Moving on to the other recommendations, of which there's four: 20, 21, 22 and 23, and

188
00:17:42,319 --> 00:17:47,680
they all broadly discuss training. So I'll read each and comment on it, and then

189
00:17:47,680 --> 00:17:53,440
I'll summarise all of them at the end. Recommendation 20, I quote, "The government should require

190
00:17:53,440 --> 00:17:57,940
all water system operators, including those who now hold certificates voluntarily obtained

191
00:17:57,940 --> 00:18:03,000
through the grandparenting process, to become certified through examination within two years

192
00:18:03,000 --> 00:18:10,000
and to be periodically recertified." So yes, please ensure the people that are in charge

193
00:18:10,000 --> 00:18:14,539
of the plant are actually certified to do so and you have two years to get it done by

194
00:18:14,539 --> 00:18:18,759
the way and plan to recertify them every so often, that's a good idea, you should get

195
00:18:18,759 --> 00:18:23,660
on that and spoiler alert, they did following the incident.

196
00:18:23,660 --> 00:18:28,940
Recommendation 21, and I quote, "The materials for the water operator course

197
00:18:28,940 --> 00:18:33,579
examinations and continuing education courses should emphasize in addition to the technical

198
00:18:33,579 --> 00:18:38,140
requirements necessary for performing the functions of each class of operator" and

199
00:18:38,140 --> 00:18:44,539
this part is in italics, "the gravity of the public health risks" back to normal text, "associated

200
00:18:44,539 --> 00:18:48,700
with a failure to treat and/or monitor drinking water properly, the need to seek appropriate

201
00:18:48,700 --> 00:18:53,819
assistance when such risks are identified and the rationale for and importance of regulatory measures

202
00:18:53,819 --> 00:18:58,859
designed to prevent or identify those public health risks." So in other words, make sure your

203
00:18:58,859 --> 00:19:04,299
operators understand that they could kill people if they don't do their jobs properly. A little dose

204
00:19:04,299 --> 00:19:09,460
of fear when you're dosing chlorine goes a long way.

205
00:19:09,460 --> 00:19:10,460
Recommendation 22:

206
00:19:10,460 --> 00:19:17,099
I quote, "The government should amend Ontario Regulation 435/93 to define 'training'

207
00:19:17,099 --> 00:19:21,660
clearly for the purposes of 40 hours of annual mandatory training with an emphasis on the

208
00:19:21,660 --> 00:19:24,099
subject matter described in Recommendation 21."

209
00:19:24,099 --> 00:19:25,339
End quote.

210
00:19:25,339 --> 00:19:28,740
Now this is subtle, but it's really important.

211
00:19:28,740 --> 00:19:33,180
I'm an RPEQ, Registered Professional Engineer in Queensland and a Chartered Professional

212
00:19:33,180 --> 00:19:36,180
Engineer in Australia (CPEng), and

213
00:19:36,180 --> 00:19:40,059
in order to maintain those qualifications and certifications, I'm required to undergo

214
00:19:40,059 --> 00:19:45,220
recordable and audited Continuing Professional Development or CPD for short.

215
00:19:45,220 --> 00:19:49,099
Now, that CPD could include training, but it stipulates that it must be training that's

216
00:19:49,099 --> 00:19:53,980
relevant to my discipline amongst other things.

217
00:19:53,980 --> 00:19:58,740
It's not like I spoke to this guy in the corridor and he taught me how chlorine works so I'm

218
00:19:58,740 --> 00:20:00,140
like trained now.

219
00:20:00,140 --> 00:20:06,059
No, it needs to be structured, reviewed, relevant training that's recorded and tested, otherwise

220
00:20:06,059 --> 00:20:07,299
there's no point.

221
00:20:07,299 --> 00:20:13,140
And when I write down my CPD, I guarantee you, Engineers Australia check it.

222
00:20:13,140 --> 00:20:16,980
Alright, Recommendation 23, last one.

223
00:20:16,980 --> 00:20:21,579
I quote, "The government should proceed with the proposed requirement that operators

224
00:20:21,579 --> 00:20:27,700
undertake 36 hours of MOE-approved training every three years as a condition of certification

225
00:20:27,700 --> 00:20:29,339
or renewal. Such

226
00:20:29,339 --> 00:20:33,339
courses should include training in emergency issues with water treatment and pathogen risks,

227
00:20:33,339 --> 00:20:38,619
emergency and contingency planning, the gravity of public health risks associated with the failure

228
00:20:38,619 --> 00:20:43,740
to treat and/or monitor drinking water properly, the need to seek appropriate assistance when such

229
00:20:43,740 --> 00:20:48,380
risks are identified, and the rationale for and importance of regulatory measures designed to

230
00:20:48,380 --> 00:20:54,700
prevent or identify public health risks." That was a long couple of sentences but this kind of

231
00:20:54,700 --> 00:20:58,700
repeats and expands on the previous three points which I think probably could have been worded in

232
00:20:58,700 --> 00:21:06,460
a more intertwined way but in essence yes, make the training regular: every 3 years.

233
00:21:06,460 --> 00:21:09,059
The key training points are very good though.

234
00:21:09,059 --> 00:21:13,640
They're focused on abnormal operation, how to deal with emergencies and yes let's remind

235
00:21:13,640 --> 00:21:20,299
them again and again that they could kill people if they don't do their job properly.

236
00:21:20,299 --> 00:21:24,539
To wrap up on training we did actually speak about that on Episode 27 about Gare de Lyon

237
00:21:24,539 --> 00:21:27,180
and it's worth repeating here.

238
00:21:27,180 --> 00:21:31,680
When operators are asked to operate any kind of plant, they need to be taught the consequences

239
00:21:31,680 --> 00:21:33,119
of incorrect operation.

240
00:21:33,119 --> 00:21:37,420
And whilst it sounds obvious for something like water treatment, we all drink water,

241
00:21:37,420 --> 00:21:41,079
and hence we could make a lot of people sick or even kill them if we make mistakes in how

242
00:21:41,079 --> 00:21:46,319
we treat, or in this case don't treat, our water before it's consumed.

243
00:21:46,319 --> 00:21:49,720
People think that training's about learning how to do something correctly over and over

244
00:21:49,720 --> 00:21:52,200
and over, and yeah, that's part of it...

245
00:21:52,200 --> 00:21:56,039
but the most important part of operator training isn't how to start it up, shut it down, or

246
00:21:56,039 --> 00:22:02,359
run or test or maintain it necessarily. It's how you handle upsets, unplanned activities,

247
00:22:02,359 --> 00:22:08,799
worst case scenarios, and in this case an E. coli outbreak. Training in this case would

248
00:22:08,799 --> 00:22:13,920
have been as soon as the results came back and they were bad, shut it down. Shut it all

249
00:22:13,920 --> 00:22:20,680
down. Warn people. But they didn't. Understanding the importance of the laboratory testing as

250
00:22:20,680 --> 00:22:24,160
a measure of water quality rather than thinking that you can tell there's E. coli in the water

251
00:22:24,160 --> 00:22:30,680
just by tasting it, this is a huge knowledge and competency gap that's honestly very hard

252
00:22:30,680 --> 00:22:32,400
to fathom.

253
00:22:32,400 --> 00:22:37,039
So it brings me to the final question, possibly the most important question in this whole

254
00:22:37,039 --> 00:22:38,980
incident.

255
00:22:38,980 --> 00:22:45,079
How the hell did Stan and Frank Koebel end up running a water utility in the first place?

256
00:22:45,079 --> 00:22:47,559
So let's talk about Stan for a second.

257
00:22:47,559 --> 00:22:52,480
Stan was a certified Class 3 operator of a water distribution system.

258
00:22:52,480 --> 00:22:56,140
He joined Walkerton PUC in 1972 when he was 19 years old.

259
00:22:56,140 --> 00:23:00,200
His father was the foreman of the Walkerton Works Department at the time, and Stan had an

260
00:23:00,200 --> 00:23:01,740
11th grade education.

261
00:23:01,740 --> 00:23:06,259
For the first 4 years of his career, he worked under Ian McLeod, the then General

262
00:23:06,259 --> 00:23:11,900
Manager of the PUC, before changing to Electrical Supply and Distribution, completing a linesman

263
00:23:11,900 --> 00:23:13,000
apprenticeship.

264
00:23:13,000 --> 00:23:17,220
In 1981, he was promoted to foreman and was responsible for both water and electricity

265
00:23:17,220 --> 00:23:23,859
at PUC, and when Mr. McLeod retired in 1988, he was promoted to the General Manager position.

266
00:23:23,859 --> 00:23:29,460
The only course Stan Koebel attended following the most recent promotion was a leadership

267
00:23:29,460 --> 00:23:31,660
training course.

268
00:23:31,660 --> 00:23:37,059
In 1987, the MOE introduced a grandfathering program for water operators regarding their

269
00:23:37,059 --> 00:23:38,940
certifications.

270
00:23:38,940 --> 00:23:44,819
For those unfamiliar, a grandfather policy is a provision in which an old rule continues

271
00:23:44,819 --> 00:23:51,460
to apply to some existing situations while a new rule will apply to all future situations.

272
00:23:51,460 --> 00:23:56,960
Those exempt from the new rule are said to have grandfathered rights or acquired rights

273
00:23:56,960 --> 00:24:00,299
or to have been grandfathered in, depending on who you speak to.

274
00:24:00,299 --> 00:24:04,960
In the context of the certification, in this case, operators were deemed through experience

275
00:24:04,960 --> 00:24:11,319
to have implicit certification through demonstrated capability and therefore could be safely granted

276
00:24:11,319 --> 00:24:16,440
a certification using experience as their sole measure for qualification.

277
00:24:16,440 --> 00:24:20,319
I'll talk about that a little bit more in a minute.

278
00:24:20,319 --> 00:24:21,319
Back to Stan Koebel.

279
00:24:21,319 --> 00:24:27,400
So at the time, Mr. McLeod submitted Stan Koebel's name to the MOE and he was certified

280
00:24:27,400 --> 00:24:33,319
as a Class 2 operator, although he had never been required to pass an examination.

281
00:24:33,319 --> 00:24:38,680
He was recertified as a Class 3 operator in 1996, when the Walkerton water system was reclassified

282
00:24:38,680 --> 00:24:40,279
as Class 3.

283
00:24:40,279 --> 00:24:44,960
Again, without any MOE assessment of knowledge or skills.

284
00:24:44,960 --> 00:24:52,740
During the testimony, Stan Koebel stated that he did not know what E. coli was, nor did he understand its

285
00:24:52,740 --> 00:24:55,180
implications for human health.

286
00:24:55,180 --> 00:24:59,220
He did not fully understand turbidity or organic Nitrogen.

287
00:24:59,220 --> 00:25:08,619
Consequently, he did not always fully comprehend portions of MOE inspection reports and correspondence.

288
00:25:08,619 --> 00:25:10,460
That's not good.

289
00:25:10,460 --> 00:25:11,460
Frank Koebel.

290
00:25:11,460 --> 00:25:17,140
In 1983, he completed courses to qualify as a journeyman linesman at the Ontario Hydro

291
00:25:17,140 --> 00:25:18,140
Training Centre.

292
00:25:18,140 --> 00:25:23,019
Prior to 1988, approximately one quarter of Frank's time was spent working on hydroelectricity

293
00:25:23,019 --> 00:25:24,940
with the remainder on the water system.

294
00:25:24,940 --> 00:25:28,940
In 1988, he was promoted to foreman in the same time period that his brother was promoted

295
00:25:28,940 --> 00:25:30,880
to general manager.

296
00:25:30,880 --> 00:25:36,519
Frank obtained his Class 2 certification via grandfathering and later his Class 3 without

297
00:25:36,519 --> 00:25:40,980
being required to complete any courses, with no competency testing or examinations just

298
00:25:40,980 --> 00:25:46,220
like his brother. During testimony, Frank Koebel also admitted to many knowledge gaps

299
00:25:46,220 --> 00:25:51,819
that matched Stan's, however, additionally, he was unaware of what Total Chlorine was

300
00:25:51,819 --> 00:25:59,039
(he didn't know what Free Chlorine was) nor was he aware of the Chlorination Bulletin

301
00:25:59,039 --> 00:26:07,680
nor Ontario Regulation 435/93 regarding requirements for the licensing and competency of operators.

302
00:26:07,680 --> 00:26:13,319
In the entirety of the 25 years he worked at Walkerton PUC, Frank Koebel admitted he

303
00:26:13,319 --> 00:26:18,559
had never attended a single training course about chlorination in any form.

304
00:26:18,559 --> 00:26:20,980
So let's talk about the fallout.

305
00:26:20,980 --> 00:26:27,359
The Ontario Government paid more than 72 million Canadian dollars just in compensation

306
00:26:27,359 --> 00:26:33,000
to the victims of the incident and their families, and the total economic impact of the incident

307
00:26:33,000 --> 00:26:37,160
was approximately $155M.

308
00:26:37,160 --> 00:26:41,359
The former manager of Walkerton's Utilities Commission, Stan Koebel, was jailed for one

309
00:26:41,359 --> 00:26:44,200
year for his role in this incident.

310
00:26:44,200 --> 00:26:49,359
The former foreman and Stan Koebel's brother, Frank Koebel, was sentenced to nine months

311
00:26:49,359 --> 00:26:51,480
of house arrest.

312
00:26:51,480 --> 00:26:58,859
A total of 10,189 claims were made, with 9,275 qualifying for compensation.

313
00:26:58,859 --> 00:27:04,480
Seven months after the Boil Water Advisory, and at a cost of $11 million, the Ontario

314
00:27:04,480 --> 00:27:09,359
Clean Water Agency finally announced the water was once again safe to drink.

315
00:27:09,359 --> 00:27:14,240
Despite that announcement, it took residents many years before they trusted the town water

316
00:27:14,240 --> 00:27:19,680
supply again, with many choosing to stick with bottled water instead.

317
00:27:19,680 --> 00:27:22,160
So what do we conclude from all of this?

318
00:27:22,160 --> 00:27:30,420
The depth of the ignorance, laziness and careless disregard for common sense would be almost laughable

319
00:27:30,420 --> 00:27:36,519
if they hadn't managed to kill people and make approximately 2,320 people sick, which

320
00:27:36,519 --> 00:27:39,799
was effectively half the town's population.

321
00:27:39,799 --> 00:27:44,319
There are definitely some similarities to Flint, Michigan insofar as the operators didn't

322
00:27:44,319 --> 00:27:45,880
understand what they were doing.

323
00:27:45,880 --> 00:27:50,519
They fit the dictionary definition of 'incompetent' and there's a link in the show notes if you

324
00:27:50,519 --> 00:27:52,279
don't believe me.

325
00:27:52,279 --> 00:27:57,079
Grandfathering a certification on the basis of demonstrated experience isn't a very good

326
00:27:57,079 --> 00:27:58,160
idea.

327
00:27:58,160 --> 00:28:02,319
If someone is competent through experience, then surely they wouldn't mind sitting a short

328
00:28:02,319 --> 00:28:04,119
test maybe.

329
00:28:04,119 --> 00:28:06,039
What's Free Chlorine for water treatment?

330
00:28:06,039 --> 00:28:07,039
I don't know.

331
00:28:07,039 --> 00:28:08,559
It could be relevant.

332
00:28:08,559 --> 00:28:13,960
The logical flaw where experience is used as a sole indicator of competence is this:

333
00:28:13,960 --> 00:28:18,960
"Just because you've been doing something for 25 years and you're very consistent at it

334
00:28:18,960 --> 00:28:22,880
because you've had lots of practice, that just might mean you're doing it consistently

335
00:28:22,880 --> 00:28:26,539
badly or wrong for 25 years..."

336
00:28:26,539 --> 00:28:28,660
That's all it tells you.

337
00:28:28,660 --> 00:28:29,660
Experience matters.

338
00:28:29,660 --> 00:28:30,660
Yes, it does.

339
00:28:30,660 --> 00:28:33,319
Of course it does, but it has to be practically demonstrated.

340
00:28:33,319 --> 00:28:38,839
Beware anybody that opens up with the line, "I have 25 years of experience," and then

341
00:28:38,839 --> 00:28:40,279
demands respect.

342
00:28:40,279 --> 00:28:42,240
It doesn't work like that.

343
00:28:42,240 --> 00:28:49,299
The fundamental problem I have with situations like this is the promotion of the wrong people.

344
00:28:49,299 --> 00:28:53,559
If a company or a utility where nothing has gone wrong for many decades has a key player

345
00:28:53,559 --> 00:28:59,200
leave and that person has been a key reason why there have been no incidents and no issues

346
00:28:59,200 --> 00:29:04,480
for decades, and they could leave because they retire or they're downsized,

347
00:29:04,480 --> 00:29:08,640
a few years after that happens, that's when we start to see incidents occurring.

348
00:29:08,640 --> 00:29:09,640
Why is that?

349
00:29:09,640 --> 00:29:14,640
Some people call it a 'Brain Drain' or the 'Grey Drain', but it's more subtle than that.

350
00:29:14,640 --> 00:29:20,640
If you don't know enough about the detail of the technical content of the role you're hiring someone for,

351
00:29:20,640 --> 00:29:26,640
then you don't know what skills and technical knowledge that they need to have in order to function in that role.

352
00:29:26,640 --> 00:29:30,640
So you don't know what training they need either.

353
00:29:30,640 --> 00:29:34,640
So the next generation of people in that role then compound the problem

354
00:29:34,640 --> 00:29:38,640
by subsequently hiring more people that equally don't know what they need to know

355
00:29:38,640 --> 00:29:43,519
because their new manager doesn't know what they needed to know. The cycle then spirals out of

356
00:29:43,519 --> 00:29:49,480
control until you end up with an incompetent organization and incidents happen and people

357
00:29:49,480 --> 00:29:58,519
die. In succession promotions like this, familiarity with someone already in the organization can put

358
00:29:58,519 --> 00:30:02,000
someone into a role without anyone asking relevant questions about their capabilities

359
00:30:02,000 --> 00:30:06,839
like, "We know Bob. He's been here for years and he's awesome. Now, let's just let him run a nuclear

360
00:30:06,839 --> 00:30:11,240
reactor in manual for an hour, it'll be fine." Hmmm...

361
00:30:11,240 --> 00:30:16,140
There are more jobs out there than you might think where all it takes is one act of incompetence

362
00:30:16,140 --> 00:30:22,799
in the right alignment of events and someone will be injured or become sick or be killed.

363
00:30:22,799 --> 00:30:27,759
Now, if you're in a role and you're not sure about something, ask someone.

364
00:30:27,759 --> 00:30:30,180
Talk to people in similar roles like yours.

365
00:30:30,180 --> 00:30:31,839
Go to conferences if you can.

366
00:30:31,839 --> 00:30:33,200
Be curious.

367
00:30:33,200 --> 00:30:34,640
Why do we dose Hypo?

368
00:30:34,640 --> 00:30:35,759
Why does that matter?

369
00:30:35,759 --> 00:30:38,480
Why don't we run the generator at this frequency?

370
00:30:38,480 --> 00:30:41,700
Why do we need to sync to the grid before we close the circuit breaker?

371
00:30:41,700 --> 00:30:46,940
You might be surprised how people just asking simple questions can break through this kind

372
00:30:46,940 --> 00:30:49,960
of incompetency malaise.

373
00:30:49,960 --> 00:30:53,920
It's hard to see, in the case of Walkerton, what could have been done differently to prevent

374
00:30:53,920 --> 00:30:58,839
this incident without going back to that grandfathering clause.

375
00:30:58,839 --> 00:31:02,279
Maybe the way to look at it is like this.

376
00:31:02,279 --> 00:31:04,359
How are you certified?

377
00:31:04,359 --> 00:31:08,079
How are others you work with certified?

378
00:31:08,079 --> 00:31:13,400
And ask yourself, is experience alone enough?

379
00:31:13,400 --> 00:31:16,000
Spoiler alert, it isn't.

380
00:31:16,000 --> 00:31:19,759
If you're enjoying Causality and want to support the show, you can by supporting our sponsors

381
00:31:19,759 --> 00:31:22,200
or by becoming a Premium Subscriber.

382
00:31:22,200 --> 00:31:26,720
You can find details at https://engineered.network/causality with a thank you to all of our Patrons and

383
00:31:26,720 --> 00:31:31,759
Premium Subscribers and a special thank you to our Patreon Silver Producers Carsten Hansen,

384
00:31:31,759 --> 00:31:35,039
John Whitlow, Joseph Antonio, and Kevin Koch.

385
00:31:35,039 --> 00:31:40,380
And an extra special thank you to our Patreon Gold Producer known only as R. Premium subscribers

386
00:31:40,380 --> 00:31:44,339
and patrons have access to early release, high quality ad-free episodes, as well as

387
00:31:44,339 --> 00:31:47,920
bonus episodes and to Causality Explored.

388
00:31:47,920 --> 00:31:51,519
You can do this via Patreon, or if you prefer, via our website.

389
00:31:51,519 --> 00:31:56,079
Visit https://engineered.network/causality to learn how you can help this show continue to be

390
00:31:56,079 --> 00:31:57,079
made.

391
00:31:57,079 --> 00:32:00,519
Causality is heavily researched and links to all the materials used for the creation

392
00:32:00,519 --> 00:32:04,880
of this episode are contained in the show notes. You can find them in the text of the

393
00:32:04,880 --> 00:32:10,359
episode description of your podcast player or on our website. You can follow me on the

394
00:32:10,359 --> 00:32:18,519
Fediverse at chidgey@engineered.space, on Twitter @johnchidgey or the network on Twitter @Engineered_Net.

395
00:32:18,519 --> 00:32:22,160
This was Causality. I'm John Chidgey. Thanks so much for listening.

396
00:32:22,160 --> 00:33:38,680
[Music]

397
00:33:38,680 --> 00:33:44,759
Many thanks to listener John Paul from Ontario, Canada for writing in and requesting this topic and bringing it to my attention.

398
00:33:44,759 --> 00:33:47,599
Good luck in your studies, John. I hope they're going well.

