﻿1
00:00:00,000 --> 00:00:12,800
[Music]

2
00:00:12,800 --> 00:00:17,520
Chain of events, cause and effect. We analyse what went right, what went wrong, as we discover that

3
00:00:17,520 --> 00:00:22,800
many outcomes can be predicted, planned for, and even prevented. I'm John Chidgey and this is Causality.

4
00:00:22,800 --> 00:00:28,080
To celebrate the 50th episode of Causality, I'll be hosting three live Q&A sessions for current

5
00:00:28,080 --> 00:00:33,760
patrons in May 2023 to accommodate listeners' time zones all around the world. Details will be

6
00:00:33,760 --> 00:00:40,080
published on Patreon in coming weeks. A competition is now open where you could win your own Causality

7
00:00:40,080 --> 00:00:45,440
T-shirt. To enter, all you need to do is write a short or long post either on your own blog,

8
00:00:45,440 --> 00:00:51,360
the Fediverse, Twitter or Facebook, linking to and celebrating your favorite episode of Causality or

9
00:00:51,360 --> 00:00:57,760
just the show in general. Then, submit a link to your post via email to admin@engineered.network

10
00:00:57,760 --> 00:01:03,760
to enter. The competition closes on the 31st of May, 2023 and you can enter as many times as you

11
00:01:03,760 --> 00:01:09,360
like. The best post will be chosen and the winner published on the network blog the following week.

12
00:01:09,360 --> 00:01:14,400
If you don't want to wait, you can just buy your own from the TEN store, where T-shirts for this and

13
00:01:14,400 --> 00:01:20,320
other TEN shows, smartphone cases and more are all available now but for a limited time. The

14
00:01:20,320 --> 00:01:26,240
TEN store will be closing on the 14th of June, 2023 so get in while you can.

15
00:01:26,240 --> 00:01:31,440
Visit https://engineered.network/celebrate for details, and keep an eye on Patreon posts for all the

16
00:01:31,440 --> 00:01:32,440
details.

17
00:01:32,440 --> 00:01:35,480
Causality is entirely supported by you, our listeners.

18
00:01:35,480 --> 00:01:39,480
If you'd like to support us and keep the show ad-free, you can by becoming a Premium

19
00:01:39,480 --> 00:01:40,480
Supporter.

20
00:01:40,480 --> 00:01:44,560
Premium Supporters have access to high-quality versions of episodes, as well as bonus material

21
00:01:44,560 --> 00:01:47,760
from all of our shows not available anywhere else.

22
00:01:47,760 --> 00:01:52,160
Just visit https://engineered.network/causality to learn how you can help this show to continue

23
00:01:52,160 --> 00:01:53,640
to be made.

24
00:01:53,640 --> 00:01:54,640
Thank you.

25
00:01:54,640 --> 00:01:56,040
737 Max:

26
00:01:56,040 --> 00:02:01,420
Ethiopian Air. In Episode 33 of this show, we delved deeply

27
00:02:01,420 --> 00:02:06,480
into the history of the Boeing 737 MAX and specifically the incident relating to Lion

28
00:02:06,480 --> 00:02:09,040
Air Flight 610.

29
00:02:09,040 --> 00:02:13,940
Mention was made during that episode of a second incident relating to the 737 MAX regarding

30
00:02:13,940 --> 00:02:20,840
Ethiopian Air Flight 302 that occurred on 10 March 2019, less than 5 months following

31
00:02:20,840 --> 00:02:22,560
Lion Air 610.

32
00:02:22,560 --> 00:02:27,260
The draft investigation report was rather thin on detail at the time, shall we say,

33
00:02:27,260 --> 00:02:33,420
and when Episode 33 aired on the 31st of January 2020, insufficient detail was available to

34
00:02:33,420 --> 00:02:35,700
compare and contrast the two incidents.

35
00:02:35,700 --> 00:02:41,260
With the final report on Flight 302 eventually released on the 23rd of December 2022, we

36
00:02:41,260 --> 00:02:46,100
can now, finally, conclude regarding the Boeing 737 MAX.

37
00:02:46,100 --> 00:02:49,860
Only technical details that weren't already covered in Episode 33 will be covered here

38
00:02:49,860 --> 00:02:50,980
in this episode.

39
00:02:50,980 --> 00:02:55,820
For the full context of technical points relating to MCAS operation, please listen to episode

40
00:02:55,820 --> 00:02:57,220
33 first.

41
00:02:57,220 --> 00:03:01,860
Even if you've listened to it before, it's useful to re-listen to it as the reason MCAS

42
00:03:01,860 --> 00:03:06,760
exists and the other findings for the most part will apply here, plus a few others.

43
00:03:06,760 --> 00:03:09,980
With that all said, now let's talk about the incident.

44
00:03:09,980 --> 00:03:17,620
At 8.36am local time on Sunday 10 March 2019, Ethiopian Air Flight 302 lined up for takeoff

45
00:03:17,620 --> 00:03:24,180
on runway 07R at Addis Ababa Bole International Airport in Ethiopia.

46
00:03:24,180 --> 00:03:30,520
Aboard were 149 passengers, 5 cabin crew, 2 flight deck crew and an in-flight security

47
00:03:30,520 --> 00:03:36,000
officer, otherwise known as an Air Marshal, and they were heading to Nairobi, specifically

48
00:03:36,000 --> 00:03:41,380
to the Kenya Jomo Kenyatta International Airport, a trip that takes approximately 2

49
00:03:41,380 --> 00:03:42,380
hours.

50
00:03:42,380 --> 00:03:47,540
The captain was Yared Getachew. He was 29 years old and had been flying for nine years.

51
00:03:47,540 --> 00:03:53,860
Yared had 4,017 hours experience on the Boeing 737 Next Generation model with

52
00:03:53,860 --> 00:04:00,020
103 hours on the Boeing 737 MAX. In Ethiopia people are addressed by their

53
00:04:00,020 --> 00:04:04,020
given names only and moving forward therefore he'll be referred to only as

54
00:04:04,020 --> 00:04:10,400
Yared. The co-pilot was Ahmed Nur Mohammod. He was 25 years old and was a

55
00:04:10,400 --> 00:04:17,840
relatively recent graduate. He had 151 hours on the Boeing 737 Next Generation model with only 56

56
00:04:17,840 --> 00:04:24,640
hours on the Boeing 737 MAX. The aircraft itself was relatively new and had only 1330 hours of

57
00:04:24,640 --> 00:04:32,160
flight time for a total of 382 cycles at that time. At 8.37am and 36 seconds, air traffic control

58
00:04:32,160 --> 00:04:37,200
cleared the flight for takeoff and 15 seconds later the aircraft began its takeoff roll.

59
00:04:37,200 --> 00:04:43,680
At 8.38 and 43 seconds, VR was reached and the aircraft rotated or lifted its nose and started

60
00:04:43,680 --> 00:04:48,480
departing ground level. Within a second of the nose rising, the AOA sensors, that's angle of

61
00:04:48,480 --> 00:04:53,200
attack sensors, began reporting disagree, activating the left-hand stick shaker, and

62
00:04:53,200 --> 00:04:57,920
calculated airspeed on the left-hand side began showing erroneous values relative to the right-hand

63
00:04:57,920 --> 00:05:03,200
side. Five seconds after takeoff, both the master caution and anti-ice lamps turned on.

64
00:05:03,200 --> 00:05:07,920
The captain and first officer attempted to engage autopilot from 400 feet of altitude.

65
00:05:07,920 --> 00:05:14,240
However, it kept disengaging, a total of four times, with the longest autopilot active duration

66
00:05:14,240 --> 00:05:19,840
lasting only 32 seconds. At 8.39 and 59 seconds, the captain radioed the tower,

67
00:05:19,840 --> 00:05:23,920
indicating they were having flight control problems and requested to maintain their initial heading.

68
00:05:23,920 --> 00:05:30,640
During this radio call, the autopilot once again disengaged. The MCAS system activated its first

69
00:05:30,640 --> 00:05:37,680
nose down trim command lasting 9 seconds, leaving a 2.1 unit stabilizer position requiring 90lbs

70
00:05:37,680 --> 00:05:43,920
from the pilot to pitch up the nose of the plane at that point. Red-black stripes then

71
00:05:43,920 --> 00:05:49,760
presented across the speed tape on the left hand controls followed by a GPWS don't sink

72
00:05:49,760 --> 00:05:55,680
warning for 3 seconds and a pull up warning message on both flight displays for 14 seconds.

73
00:05:55,680 --> 00:06:01,360
At 8.40 and 14 seconds, the captain further trimmed nose up for 2 seconds using the electric

74
00:06:01,360 --> 00:06:07,920
trim switches on the control wheel, reaching 2.3 stabilizer units. At 8.40 and 22 seconds,

75
00:06:07,920 --> 00:06:13,360
MCAS applied its second nose down trim command for 7 seconds, though it would have run for 9

76
00:06:13,360 --> 00:06:18,080
seconds except the captain again applied a manual electric trim during its activation,

77
00:06:18,080 --> 00:06:23,440
which cancelled out the MCAS impact, returning to 2.3 stabilizer units.

78
00:06:23,440 --> 00:06:28,400
During this engagement, the GPWS don't sink once again sounded and the pull-up was once again

79
00:06:28,400 --> 00:06:34,640
displayed. At approximately 8.40 and 38 seconds, the first officer and captain agreed to apply the

80
00:06:34,640 --> 00:06:40,880
stabilizer trim cutout switches. These are guarded switches and disable the automatic electric trim

81
00:06:40,880 --> 00:06:47,360
which is found in all 737s. It was in fact the only way to stop MCAS from functioning and doing

82
00:06:47,360 --> 00:06:52,640
so meant they now had to trim the aircraft manually using the manual trim wheels. At this

83
00:06:52,640 --> 00:06:59,360
point in time the stabilizer was still at 2.3 stabilizer units. The aircraft was at 1500 feet

84
00:06:59,360 --> 00:07:05,600
above ground level, although the left-hand side reported 500 feet lower than this. Travelling at 332 knots,

85
00:07:05,600 --> 00:07:12,960
that's 615 kilometers an hour, pitch was 2.5 degrees climbing at 350 feet per minute. At 8.40

86
00:07:12,960 --> 00:07:18,960
and 43 seconds the MCAS attempted to operate for a third time however the cutout switches inhibited

87
00:07:18,960 --> 00:07:23,360
its operation. The plane's pitch was now effectively in the hands of the pilots manually,

88
00:07:23,360 --> 00:07:29,840
with one or both pulling up periodically leading to varying pitch values between +7° and -2°.

89
00:07:29,840 --> 00:07:38,160
At 8.40 and 50 seconds the aircraft had reached 9,500 feet (2900m) and air traffic control were

90
00:07:38,160 --> 00:07:45,440
advised by the first officer that the captain wanted to maintain 14,000 feet (4,300m) noting

91
00:07:45,440 --> 00:07:50,160
they still had a flight control problem. Over the next few minutes the captain and

92
00:07:50,160 --> 00:07:54,400
first officer applied considerable physical force to pitch up the aircraft in an attempt

93
00:07:54,400 --> 00:08:00,480
to reach 14,000 feet, with the overspeed alert now sounding. In the midst of the confusion,

94
00:08:00,480 --> 00:08:05,520
the captain and first officer had both failed to realize they were still at 94%

95
00:08:05,520 --> 00:08:11,680
N1 reference, with autothrottle still set to ARM mode. As an aside, N1 relates to the low

96
00:08:11,680 --> 00:08:16,480
pressure spool in a jet engine whereas N2 relates to the high pressure spool. It's referenced against

97
00:08:16,480 --> 00:08:21,040
maximum in percent and is roughly analogous to revs per minute in a reciprocating internal

98
00:08:21,040 --> 00:08:26,080
combustion engine. N2 is more important during initial engine startup however once operating

99
00:08:26,080 --> 00:08:32,000
N1 is generally referenced. Hence a 94% N1 reference is close to maximum thrust which is

100
00:08:32,000 --> 00:08:38,400
required for takeoff. Autothrottle maintains either a speed or thrust setting and was set to N1 which

101
00:08:38,400 --> 00:08:45,440
is a thrust mode. ARM mode on a Boeing 737 allows the pilot to set autothrottle without a set speed,

102
00:08:45,440 --> 00:08:51,360
which sounds odd but its intention is to allow for speed control but still have minimum speed

103
00:08:51,360 --> 00:08:57,440
protection. The use of ARM in the 737 flight crew training manual is described as follows.

104
00:08:57,440 --> 00:09:02,640
"The A/T ARM mode is not normally recommended because its function can be confusing.

105
00:09:02,640 --> 00:09:07,600
The primary feature the A/T ARM mode provides is minimum speed protection in the event the

106
00:09:07,600 --> 00:09:13,760
airplane slows to minimum maneuvering speed. Other features normally associated with the A/T

107
00:09:13,760 --> 00:09:19,680
are not provided." Now back to the incident. The captain asked the first officer to confirm that

108
00:09:19,680 --> 00:09:24,000
the trim was functioning properly in manual. However, the first officer concluded stating

109
00:09:24,000 --> 00:09:30,560
and I quote "it is not working." At 8:42 and 15 seconds the first officer requested a

110
00:09:30,560 --> 00:09:35,920
vector to return to the airport which was granted. They began a banking turn to return to the airport

111
00:09:35,920 --> 00:09:40,400
changing their heading from 102 degrees to a new heading of 262 degrees.

112
00:09:40,400 --> 00:09:47,120
At this point in time, the stabilizer was still at 2.3 units. The aircraft was at 6,200 feet or

113
00:09:47,120 --> 00:09:52,560
1,900 meters above ground level, although the left-hand side reported 1,250 feet lower than this.

114
00:09:52,560 --> 00:09:58,720
Travelling at 367 knots, that's 680 kilometers an hour, pitch was plus 1 degree, descending at

115
00:09:58,720 --> 00:10:05,440
125 feet per minute, banking 21 degrees right. In order to re-engage autopilot, the captain and

116
00:10:05,440 --> 00:10:10,440
first officer turned off the stabilizer trim cutout switches and began trim control once

117
00:10:10,440 --> 00:10:18,560
again using the electric trim switches. However, this action also re-engaged MCAS. At 8:43

118
00:10:18,560 --> 00:10:23,720
and 21 seconds, the MCAS operated for the fourth time, pitching down for 5 seconds,

119
00:10:23,720 --> 00:10:29,200
moving the stabilizer to 1 unit. During this activation, the captain and first officer decreased

120
00:10:29,200 --> 00:10:34,760
their average force pulling up from 100 pounds to 78 over three and a half seconds, during

121
00:10:34,760 --> 00:10:44,200
which time the pitch went from plus 0.5 degrees, nose up, to minus 7.8 degrees, now nose down.

122
00:10:44,200 --> 00:10:49,320
This increased the descent rate from minus 100 feet per minute to minus 5000 feet per

123
00:10:49,320 --> 00:10:54,440
minute. The combined force applied by the pilot and first officer registered at 180

124
00:10:54,440 --> 00:11:01,560
pounds and despite this they could not pull out. At 8:43 and 36 seconds the Enhanced Ground

125
00:11:01,560 --> 00:11:09,640
Proximity Warning System (EGPWS) alerted "Terrain, Terrain, Pull up, Pull up." Impact occurred at

126
00:11:09,640 --> 00:11:16,600
approximately 8:43 and 44 seconds. According to the onboard sensors the aircraft was traveling

127
00:11:16,600 --> 00:11:22,760
at approximately 500 knots, that's 926 kilometers per hour, at the time of impact.

128
00:11:22,760 --> 00:11:30,040
The plane had crashed in farmer's fields near the town of Bishoftu, 62km or 39mi

129
00:11:30,040 --> 00:11:36,040
southeast of the airport that it took off from, leaving an impact crater of 27 meters, that's 90

130
00:11:36,040 --> 00:11:42,440
feet wide, and 37 meters, that's 120 feet long, with wreckage found as much as 9 meters or 30

131
00:11:42,440 --> 00:11:48,760
feet deep in the ground. There were no survivors. Let's talk about the investigation and reports.

132
00:11:48,760 --> 00:11:53,880
The Federal Democratic Republic of Ethiopia, Ministry of Transport and Logistics,

133
00:11:53,880 --> 00:12:01,400
Aircraft Accident Investigation Bureau or Ethiopian Accident Investigation Bureau or EAIB

134
00:12:01,400 --> 00:12:08,120
for short, conducted the investigation on behalf of the Ethiopian Civil Aviation Authority (ECAA).

135
00:12:08,120 --> 00:12:15,160
They issued multiple reports. The first was a preliminary report on the 19th of April 2019.

136
00:12:15,160 --> 00:12:21,560
The next was an interim report on the 9th of March 2020. The first final draft for internal

137
00:12:21,560 --> 00:12:27,880
review was released on the 12th of January 2021. The second final draft for internal review was

138
00:12:27,880 --> 00:12:34,440
released on the 26th of May 2021. The third final draft for internal review was released on the 30th

139
00:12:34,440 --> 00:12:43,320
of March 2022 and as we said previously the final was released on the 23rd of December 2022. Now

140
00:12:43,320 --> 00:12:49,480
this may seem odd but there is a good reason for this. The International Civil Aviation Organization

141
00:12:49,480 --> 00:12:56,680
(ICAO) Annex 13, Aircraft Accident and Incident Investigation, requires that a preliminary

142
00:12:56,680 --> 00:13:02,520
report is released within 30 days of the incident. Specifically it requests and I quote "The state

143
00:13:02,520 --> 00:13:07,240
conducting the investigation should release the final report in the shortest possible time and,

144
00:13:07,240 --> 00:13:12,520
if possible, within 12 months of the date of the occurrence. If the report cannot be released within

145
00:13:12,520 --> 00:13:16,920
12 months, the State conducting the investigation should release an interim report on each

146
00:13:16,920 --> 00:13:22,360
anniversary of the occurrence detailing the progress of the investigation and any safety

147
00:13:22,360 --> 00:13:28,760
issues raised." Noting the timing and dates of the reports released, it's clear that these were

148
00:13:28,760 --> 00:13:33,080
intended to meet this expectation. What's interesting is whether they were intending

149
00:13:33,080 --> 00:13:38,520
these drafts to be an opportunity for feedback, and whether they were actually interested in taking it on.

150
00:13:38,520 --> 00:13:42,600
There are two external entities that were requested to assist in the investigation that

151
00:13:42,600 --> 00:13:48,360
are worthy of call out. The first was the Bureau d’Enquêtes et d’Analyses pour la Sécurité

152
00:13:48,360 --> 00:13:56,520
de l’Aviation Civile (BEA) from France. The second was a United States team comprising

153
00:13:56,520 --> 00:14:02,440
representatives from the National Transportation Safety Board (NTSB), Federal Aviation Administration

154
00:14:02,440 --> 00:14:08,200
(FAA), the aircraft manufacturer Boeing, and the engine manufacturer General Electric.

155
00:14:08,200 --> 00:14:13,560
In addition to this, the US team called Collins Aerospace in as a technical advisor to the US

156
00:14:13,560 --> 00:14:19,400
team in April 2019 after the EAIB requested assistance into the most likely failure modes

157
00:14:19,400 --> 00:14:25,480
of an AOA sensor. More about this in a minute. To quickly revisit the AOA sensors previously

158
00:14:25,480 --> 00:14:30,360
discussed in Episode 33 and how they relate to MCAS, that's the Maneuvering Characteristics

159
00:14:30,360 --> 00:14:36,600
Augmentation System: there are two AOA sensors, also sometimes called alpha vanes, fitted on every

160
00:14:36,600 --> 00:14:43,480
737 on either side of the nose directly beneath a respective pair of pitot tubes. The AOA sensors

161
00:14:43,480 --> 00:14:49,320
pivot around a central axis with a small reverse swept blade or fin, often referred to as the vane.

162
00:14:49,320 --> 00:14:55,320
Unlike an aerofoil, the fin operates just like a wind vane. It is blown backwards to a position

163
00:14:55,320 --> 00:14:59,240
where it has the least cross-sectional wind resistance, which is directly in the downstream

164
00:14:59,240 --> 00:15:04,440
direction of the airflow. The flight control computers receive their process values from

165
00:15:04,440 --> 00:15:11,240
sensors via multiple systems, including the Air Data Inertial Reference System, or ADIRS.

166
00:15:11,240 --> 00:15:18,520
The left ADIRU receives data from the left AOA sensor and the right ADIRU from the right AOA sensor.

167
00:15:18,520 --> 00:15:23,800
MCAS is a flight control law, executed in a single Flight Control Computer only,

168
00:15:23,800 --> 00:15:30,120
based on the angle of attack value from a single sensor. MCAS is only present on the 737 MAX range

169
00:15:30,120 --> 00:15:36,280
of aircraft and it becomes active during manual flight (meaning autopilot is not engaged) with flaps fully up

170
00:15:36,280 --> 00:15:42,360
in position 0 when the AOA value received by the Master Flight Control Computer exceeds a determined

171
00:15:42,360 --> 00:15:47,880
setpoint value. The intention of MCAS was to avoid the need for retraining of pilots that were used

172
00:15:47,880 --> 00:15:55,000
to the previous generation, the 737NG, compared to which

173
00:15:55,000 --> 00:16:01,480
the MAX presented a different pitch response due to design changes to the MAX. Upon customer release

174
00:16:01,480 --> 00:16:06,520
there was no mention of MCAS in any training materials and Boeing had advertised the newer

175
00:16:06,520 --> 00:16:14,280
plane as not requiring any additional retraining over the 737NG. So what went wrong? The investigators

176
00:16:14,280 --> 00:16:19,880
implicated MCAS as the primary cause of the incident, for much the same reasons as Lion

177
00:16:19,880 --> 00:16:25,480
Air 610. With an incorrect AOA sensor reading fed into the master flight control computer

178
00:16:25,480 --> 00:16:29,240
under manual electric trim control, MCAS had operated incorrectly.

179
00:16:29,240 --> 00:16:35,240
The pilot and first officer then began a series of trim corrections, cutting MCAS out and back

180
00:16:35,240 --> 00:16:40,920
in again until MCAS commanded a nose down into the ground. In essence, the pilots were fighting the

181
00:16:40,920 --> 00:16:47,320
control system due to an erroneous input and lost. Admittedly, the Lion Air 610 pilots managed to

182
00:16:47,320 --> 00:16:52,840
fight MCAS for longer, about twice as long at 12 minutes, but the issue here should be, why are we

183
00:16:52,840 --> 00:16:58,440
fighting MCAS at all? A more detailed deep dive into MCAS and its issues is in Episode 33 if you're

184
00:16:58,440 --> 00:17:03,960
interested. Given that the erroneous AOA sensor values triggered the MCAS behavior, the investigators

185
00:17:03,960 --> 00:17:09,800
found that, and I quote, "an AOA sensor malfunction most likely occurred as the result of a power

186
00:17:09,800 --> 00:17:16,120
quality problem that resulted in the loss of power to the left AOA sensor heater."

187
00:17:16,120 --> 00:17:19,360
So let's look at the AOA sensor heater just for a moment.

188
00:17:19,360 --> 00:17:25,720
The EAIB report states that the AOA sensors have an "embedded heater in a vane that

189
00:17:25,720 --> 00:17:31,800
thermally compensates to increase the vane surface temperature in high flow and for de-icing."

190
00:17:31,800 --> 00:17:37,880
It's potentially a bad translation, I tweaked it slightly, but in essence, the AOA vane,

191
00:17:37,880 --> 00:17:42,920
shaft and coupling must be kept free-moving at all times, otherwise it won't settle into the

192
00:17:42,920 --> 00:17:48,200
correct position when airflow travels over it. Seems simple enough. So if the heater had failed

193
00:17:48,200 --> 00:17:52,680
and ice had accumulated, then you might see readings like those encountered leading up to

194
00:17:52,680 --> 00:17:57,720
the incident. However, heaters are subject to the laws of thermodynamics and basic physics,

195
00:17:57,720 --> 00:18:03,320
of which we learn that thermal coefficients and hence thermal lag is a problem, both good and bad.

196
00:18:03,880 --> 00:18:09,160
Assuming a sensor is already iced up, when we turn on the power systems and activate the heater,

197
00:18:09,160 --> 00:18:12,680
it could take several minutes for the heating coils to transfer enough heat

198
00:18:12,680 --> 00:18:16,440
through the shaft and vane to start melting any ice that might be present.

199
00:18:16,440 --> 00:18:20,680
Conversely, if the sensor is not yet iced up and the heater turns off,

200
00:18:20,680 --> 00:18:25,000
there will be enough residual heat that it won't ice up for at least several minutes,

201
00:18:25,000 --> 00:18:28,520
and only then if external atmospheric conditions are right.

202
00:18:28,520 --> 00:18:32,040
So what was the weather like that morning? I'm glad you asked.

203
00:18:32,040 --> 00:18:36,920
The report indicated conditions at the time of the incident were approximately +13°C,

204
00:18:36,920 --> 00:18:44,360
that's 55°F, with a dew point of +11°C or 52°F, which aligns with the historical data

205
00:18:44,360 --> 00:18:48,840
from the local weather bureau. The aircraft had been on the ground for nearly three hours between

206
00:18:48,840 --> 00:18:54,600
flights, noting that the overnight low was +11°C and was far above freezing temperature.

207
00:18:54,600 --> 00:18:59,080
Its previous flight was an overnight from Johannesburg, a nearly six-hour flight that

208
00:18:59,080 --> 00:19:05,560
landed at 5:52am local time. The maintenance log showing that flight's arrival recorded no write-ups

209
00:19:05,560 --> 00:19:11,080
or rectification actions and there were no notes from the flight crew either. If there had been

210
00:19:11,080 --> 00:19:16,680
icing during the prior flight, an AOA disagree error should have indicated, but even if somehow

211
00:19:16,680 --> 00:19:21,880
it didn't, the heater did not show any errors until 6 seconds after the moment of rotation,

212
00:19:21,880 --> 00:19:27,480
which suggests the heater was still working fine up until that moment. I find it extremely

213
00:19:27,480 --> 00:19:32,440
implausible that the heater failure was the primary cause of the AOA sensor's incorrect

214
00:19:32,440 --> 00:19:37,560
readings. I have other reasons for thinking that too. A little more from the EAIB about the sensor

215
00:19:37,560 --> 00:19:44,280
though. The EAIB explored the maintenance log for the aircraft and noted that it had, and I quote,

216
00:19:44,280 --> 00:19:49,160
"suffered intermittent electrical electronic anomalies in addition to the flight control

217
00:19:49,160 --> 00:19:55,560
system malfunctions" and "three days before the crash the auxiliary power unit (APU) fault light

218
00:19:55,560 --> 00:20:01,720
illuminated and the APU had a protective shutdown. Where the onboard maintenance function computer

219
00:20:01,720 --> 00:20:08,440
message also indicated the start converter unit (SCU) showed the APU start system was inoperative."

220
00:20:08,440 --> 00:20:15,720
There was one other interesting note that they added and I quote "the captain's personal

221
00:20:15,720 --> 00:20:21,240
computer power outlet also had no power. They concluded the possibility of intermittent

222
00:20:21,240 --> 00:20:24,360
electrical-electronic system defects were an underlying issue."

223
00:20:24,360 --> 00:20:26,760
Okay then.

224
00:20:26,760 --> 00:20:34,440
This is supposed to be an engineering-based, fact-driven analysis that follows specific

225
00:20:34,440 --> 00:20:38,520
evidence that directly leads us to a specific root cause or causes.

226
00:20:38,520 --> 00:20:45,000
The above conclusion, to me at least, reads very much like it was "...an electrical glitch of some kind,

227
00:20:45,000 --> 00:20:45,480
probably...?"

228
00:20:45,480 --> 00:20:47,080
Somewhat flimsy?

229
00:20:47,960 --> 00:20:54,280
If this statement sounds harsh, maybe I should say it feels a little presumptive and inconclusive.

230
00:20:54,280 --> 00:20:58,680
I have no doubt that there were electrical glitches with this 737 MAX.

231
00:20:58,680 --> 00:21:05,080
The only important question is, did a specific electrical glitch specifically cause the MCAS

232
00:21:05,080 --> 00:21:10,520
to malfunction or not? Let's talk about the Collins Aerospace involvement. Collins Aerospace

233
00:21:10,520 --> 00:21:15,720
provided a report to the EAIB with their findings based on the flight data recorder information

234
00:21:15,720 --> 00:21:22,120
provided to them by the EAIB. One of the findings of the Collins report was the most likely cause

235
00:21:22,120 --> 00:21:27,160
for erroneous readings from the sensor was a bird strike. To search for evidence of a potential bird

236
00:21:27,160 --> 00:21:31,640
strike during or shortly following take-off, the investigators inspected the immediate take-off

237
00:21:31,640 --> 00:21:37,160
runway area for signs of debris to explain the damage to the AOA sensor or potential damage to

238
00:21:37,160 --> 00:21:42,360
the tail. They found no evidence that this was the case, responding in their final official report

239
00:21:42,360 --> 00:21:47,800
that, and I quote, "the investigation team cannot comment and verify on the conclusions noted in the

240
00:21:47,800 --> 00:21:53,160
Collins report." Now this is where things start to get a bit more interesting.

241
00:21:53,160 --> 00:21:59,320
The NTSB had provided comments and feedback to each of the internal drafts, the first within

242
00:21:59,320 --> 00:22:03,880
six weeks, the second within two weeks, and the third they took longer to submit but requested

243
00:22:03,880 --> 00:22:09,560
it be incorporated into the final report. It wasn't. Regarding the AOA sensor conclusion,

244
00:22:09,560 --> 00:22:14,920
the NTSB agreed with the Collins Aerospace Report which correlated sensor data with known failure

245
00:22:14,920 --> 00:22:20,440
modes of an AOA sensor. They had developed a detailed fault tree analysis that considered

246
00:22:20,440 --> 00:22:27,720
the following things. 1) Manufacturing defects; 2) Internal component failures; 3) Heater failures;

247
00:22:27,720 --> 00:22:34,520
4) Non-impact structural failures of the AOA vane attachment hardware; and finally 5) AOA vane

248
00:22:34,520 --> 00:22:41,560
impact failures. The NTSB called out the following findings. The AOA reading began deviating on the

249
00:22:41,560 --> 00:22:47,400
left-hand sensor at 44 seconds after the beginning of take-off roll. The left alpha vane fail

250
00:22:47,400 --> 00:22:52,520
annunciation on the probe heat panel, indicating vane heater current below the monitor threshold,

251
00:22:52,520 --> 00:22:57,720
6 seconds after the AOA deviations began, is consistent with a vane breaking at the hub

252
00:22:57,720 --> 00:23:03,560
and separating from the AOA sensor. These timings align with moments following rotation or nose

253
00:23:03,560 --> 00:23:09,960
lift-off and noting that a small to moderate bird weighing approximately 230 grams or 1/2lb

254
00:23:09,960 --> 00:23:16,600
impacting at 170 knots would be sufficient to cause damage of this suspected kind. The NTSB

255
00:23:16,600 --> 00:23:21,880
were also critical of the delay the EAIB team took in searching the runway for debris and

256
00:23:21,880 --> 00:23:27,240
bird activity which was eight days after the incident and subsequently a lack of search of

257
00:23:27,240 --> 00:23:34,440
the taxiway EA302 would have been directly above at the most likely time of bird impact, Taxiway D.

258
00:23:34,440 --> 00:23:40,600
Additionally, the EAIB had reported officially on an engine failure event that had occurred

259
00:23:40,600 --> 00:23:45,640
months before this incident due to a bird strike, making a recommendation that the Ethiopian

260
00:23:45,640 --> 00:23:52,680
Airlines Group Airport Authority (EAGAA) "take practical measures to minimize/eliminate

261
00:23:52,680 --> 00:23:57,440
bird hazards around the airport so that arriving and departing flights are conducted safely

262
00:23:57,440 --> 00:24:01,220
without any human and material loss."

263
00:24:01,220 --> 00:24:06,800
Given this recommendation occurred 8 months after the EA302 incident, it's clear no

264
00:24:06,800 --> 00:24:09,440
additional bird control measures had been put in place.

265
00:24:09,440 --> 00:24:14,520
However, with some investigator personnel overlap between investigations, it's unclear

266
00:24:14,520 --> 00:24:19,640
why this was dismissed as a potential root cause in the EA302 final report.

267
00:24:19,640 --> 00:24:22,400
Let's talk a bit about crew training.

268
00:24:22,400 --> 00:24:28,700
In the EAIB report, Finding 83 states the following and I quote, "The Emergency Airworthiness

269
00:24:28,700 --> 00:24:33,160
Directive (AD) pilot procedures were inadequate and unverified.

270
00:24:33,160 --> 00:24:40,080
AD 2018-23-51 does not mention the possibility of an auto throttle malfunction due to an

271
00:24:40,080 --> 00:24:42,400
erroneous AOA input."

272
00:24:42,400 --> 00:24:48,720
About that Airworthiness Directive, that's 2018-23-51, that was issued by the Federal

273
00:24:48,720 --> 00:24:54,880
Aviation Administration on the 12th of June 2018, long before the incident. The NTSB responded to this

274
00:24:54,880 --> 00:25:01,280
specifically stating, and I quote, "Even if such a reference document did *not* exist, the flight crew

275
00:25:01,280 --> 00:25:07,520
should have been trained on 737-8 MAX non-normal procedures. Non-normal procedures related to

276
00:25:07,520 --> 00:25:13,520
erroneous AOA inputs instruct the crew to disengage both the autopilot and autothrottle, thereby

277
00:25:13,520 --> 00:25:18,880
preventing the erroneous AOA inputs from affecting flight control and throttle movements."

278
00:25:18,880 --> 00:25:26,800
Interestingly, if you read the detail of AD 2018-23-51, it states the following for a

279
00:25:26,800 --> 00:25:33,920
runaway stabilizer condition. It says the pilot should "disengage autopilot and control airplane

280
00:25:33,920 --> 00:25:39,280
pitch attitude with control column and main electric trim as required. If relaxing the

281
00:25:39,280 --> 00:25:45,120
column causes the trim to move, set stabilizer trim switches to cut out. If runaway continues,

282
00:25:45,120 --> 00:25:51,120
hold the stabilizer trim wheel against rotation and trim the airplane manually." It goes on to say,

283
00:25:51,120 --> 00:25:56,560
"Initially, higher control forces may be needed to overcome any stabilizer nose-down trim already

284
00:25:56,560 --> 00:26:02,240
applied. Electric stabilizer trim can be used to neutralize control column pitch forces before

285
00:26:02,240 --> 00:26:08,480
moving the stab trim cutout switches to cut out. Manual stabilizer trim can be used before and

286
00:26:08,480 --> 00:26:16,000
after the stab trim cutout switches are moved." This was issued by Boeing to Ethiopian Air on the

287
00:26:16,000 --> 00:26:21,520
6th of November 2018 following the initial findings from the Lion Air 610 incident,

288
00:26:21,520 --> 00:26:28,160
issued as a Flight Crew Operations Manual Bulletin, reference number ETH-12. So the pilots

289
00:26:28,160 --> 00:26:33,840
actually did attempt some of those things, however they applied the trim cutout switches perhaps too

290
00:26:33,840 --> 00:26:39,680
early and then against the directive un-cut-out the stabiliser trim when they probably shouldn't have.

291
00:26:39,680 --> 00:26:44,480
The extreme forces they had to apply were in part due to the near full thrust condition of the

292
00:26:44,480 --> 00:26:49,040
engines at the time. Technically though, that specific directive doesn't mention autothrottle

293
00:26:49,040 --> 00:26:54,080
at all, that's true. But the issue of speed and force applied to the trim stabilisers is one of

294
00:26:54,080 --> 00:26:59,760
basic aerodynamics. The NTSB's comments and their feedback specifically about autothrottle are

295
00:26:59,760 --> 00:27:03,760
interesting. The following direct quote from their feedback is read verbatim:

296
00:27:03,760 --> 00:27:09,840
"Because the autothrottle remained engaged and responsive to the erroneous AOA inputs,

297
00:27:09,840 --> 00:27:15,040
the autothrottle did not transition to N1 mode and remained in ARM mode with take-off thrust.

298
00:27:15,040 --> 00:27:20,320
The expected crew response is to manually control thrust in this situation. However,

299
00:27:20,320 --> 00:27:24,160
the lack of manual control and the absence of flight crew conversation regarding the

300
00:27:24,160 --> 00:27:28,960
thrust settings indicate that the crew did not notice the autothrottle's failure to transition

301
00:27:28,960 --> 00:27:34,560
to N1, even when the aural overspeed warning triggered as the airplane accelerated beyond

302
00:27:34,560 --> 00:27:40,640
about 340 knots. As airspeed increased, the required control forces increased on both the

303
00:27:40,640 --> 00:27:47,680
control column and the manual trim wheel." But it wasn't just the NTSB that was saying this.

304
00:27:47,680 --> 00:27:52,080
The BEA also deconstructed the crew's actions in their feedback, stating the following,

305
00:27:52,080 --> 00:27:57,520
"In the case of the IAS disagree, the flight crew has to apply the airspeed

306
00:27:57,520 --> 00:28:03,440
unreliable non-normal checklist. This checklist states to first disengage the autopilot, then the

307
00:28:03,440 --> 00:28:09,440
autothrottle." So after the updated directive was given to Ethiopian Air four months before this

308
00:28:09,440 --> 00:28:14,640
incident, do we know that the pilots were made aware of it? Ethiopian Air uses a system called

309
00:28:14,640 --> 00:28:20,160
Logipad, which pilots are required to upload as standard procedure before going on a flight to

310
00:28:20,160 --> 00:28:24,960
grab the latest directives and bulletins. The company confirmed that at least every seven days

311
00:28:24,960 --> 00:28:30,720
this was done by the pilots involved. There is however no test for comprehension, no review or

312
00:28:30,720 --> 00:28:35,040
check that the uploaded documents have been read. The system can only confirm that they were

313
00:28:35,040 --> 00:28:39,840
uploaded to the pilot's device. The BEA stated the following regarding this system.

314
00:28:39,840 --> 00:28:45,680
A contributing factor in this incident was, and I quote, "The use of the LogiPad system by the

315
00:28:45,680 --> 00:28:50,960
airline as a sole means to disseminate information on new systems and/or procedures which doesn't

316
00:28:50,960 --> 00:28:56,080
allow the evaluation of the crew's understanding and knowledge acquisition on new systems and

317
00:28:56,080 --> 00:29:01,280
procedures. The system was used to disseminate information related to the MCAS system issued

318
00:29:01,280 --> 00:29:06,560
following the previous 737 MAX accident and did not allow the airline to ensure that the

319
00:29:06,560 --> 00:29:12,400
crews had read and correctly understood this information." So this all feels a bit odd.

320
00:29:12,400 --> 00:29:17,040
Something that's bugged me for a while with these incidents is how the relative experience of the

321
00:29:17,040 --> 00:29:23,840
pilots changed the duration of the lack of control to the ultimate crash. There were a lot of 737

322
00:29:23,840 --> 00:29:29,360
MAXs flying out there, so it's a numbers game with such an important instrument, the AOA sensor,

323
00:29:29,360 --> 00:29:34,720
now playing such a vital role in that mode of operation that surely we'd had a near miss before

324
00:29:34,720 --> 00:29:41,280
either of these incidents? It turns out there was, and whilst MCAS wasn't involved the investigators

325
00:29:41,280 --> 00:29:46,000
did find that the pilots were not following the Boeing training manual and this could or

326
00:29:46,000 --> 00:29:50,800
should have been a warning of how other pilots might react under the same alerts as observed

327
00:29:50,800 --> 00:29:54,560
on both Lion Air 610 and Ethiopian Air 302.

328
00:29:54,560 --> 00:30:02,720
The report I'm referring to is BEA 2018-0071 released on the 16th of November 2020 regarding

329
00:30:02,720 --> 00:30:08,000
the same aircraft with the same faulty AOA sensor over two flights on subsequent days,

330
00:30:08,000 --> 00:30:10,920
the 7th and 8th of February 2018.

331
00:30:10,920 --> 00:30:15,280
Whilst the incident did not involve a crash and no loss or damage occurred, the crew performance

332
00:30:15,280 --> 00:30:21,000
may have provided some key insights into how to address the issue with other 737 MAX flight

333
00:30:21,000 --> 00:30:22,000
crews.

334
00:30:22,000 --> 00:30:27,480
For both flights, an incorrectly reading AOA sensor triggered AOA Disagree and Alt Disagree

335
00:30:27,480 --> 00:30:32,560
alerts, with one flight pressing onto its destination and the other radioing "PAN-PAN"

336
00:30:32,560 --> 00:30:35,180
and returning to their originating airport.

337
00:30:35,180 --> 00:30:40,040
Both flight crews chose to follow the AOA Disagree and Alt Disagree checklists followed

338
00:30:40,040 --> 00:30:44,720
by the IAS Disagree checklist, and whilst they noted a brief reference to the airspeed

339
00:30:44,720 --> 00:30:51,360
unreliable checklist, the pilots did not follow it. The Boeing 737-800 flight manual airspeed

340
00:30:51,360 --> 00:30:58,880
unreliable procedure has the following key memory items. 1) Autopilot, if engaged, disengage;

341
00:30:58,880 --> 00:31:05,360
2) Autothrottle, if engaged, disengage; 3) Flight director switches, both set to off.

342
00:31:05,360 --> 00:31:11,680
The BEA in this report states and I quote "In these two incidents, the pilots did not immediately

343
00:31:11,680 --> 00:31:17,120
carry out the memory items. In both cases, they first tried to identify the side which was supplying

344
00:31:17,120 --> 00:31:22,080
the erroneous information and initially used this assessment to continue the flight with the

345
00:31:22,080 --> 00:31:28,320
automatic systems engaged." Had Boeing dug deeper into the order in which these checklists

346
00:31:28,320 --> 00:31:33,280
were executed, ensuring flight crews followed them with revised training, it's possible that

347
00:31:33,280 --> 00:31:39,840
neither EA302 nor LA610 would have happened. Of course, that's leaving a lot in the hands of

348
00:31:39,840 --> 00:31:45,040
procedures and training for a system, MCAS, that wasn't even named in any training manuals or

349
00:31:45,040 --> 00:31:50,080
checklists. Also, of course, that would have required the BEA report to have been completed,

350
00:31:50,080 --> 00:31:55,920
reviewed and published before LA610, not years after it had actually happened.

351
00:31:55,920 --> 00:32:01,840
So it's been a while since Episode 33. And Episode 33's discussion regarding the fallout

352
00:32:01,840 --> 00:32:07,040
from Lion Air 610 was somewhat limited at that time since that was released on the 31st of January

353
00:32:07,040 --> 00:32:13,680
2020. A few things have happened in the world since then, not just in the case of the 737 MAX.

354
00:32:13,680 --> 00:32:20,560
The SARS-CoV-2 (also known as COVID-19) pandemic had spread globally by March 2020,

355
00:32:20,560 --> 00:32:25,600
with many countries locking down all but essential air travel in and out. At its peak,

356
00:32:25,600 --> 00:32:30,720
the FAA reported in late April 2020 that air traffic in the United States had dropped by 96%.

357
00:32:30,720 --> 00:32:36,560
The 737 MAX fleet of aircraft were already grounded whilst investigations and rectifications

358
00:32:36,560 --> 00:32:43,360
continued from March 2019 following the EA302 incident. Whilst the FAA issued a Continued

359
00:32:43,360 --> 00:32:49,280
Airworthiness Notification to the International Community (CANIC) for the 737 MAX

360
00:32:49,280 --> 00:32:55,720
on the 18th of November 2020, many 737 MAXs took significant time to return to service

361
00:32:55,720 --> 00:33:04,280
due to poor demand during the COVID-19 pandemic. On the 7th of January 2021 Boeing were charged with 737

362
00:33:04,280 --> 00:33:08,560
Fraud Conspiracy and agreed to settle with the US Department of Justice for a

363
00:33:08,560 --> 00:33:14,000
total criminal amount of just over $2.5B USD. $0.5B

364
00:33:14,000 --> 00:33:18,320
of that figure goes to a crash victim beneficiaries fund for both the

365
00:33:18,320 --> 00:33:22,760
Lion Air and Ethiopian Air incidents. The statement from Acting Assistant

366
00:33:22,760 --> 00:33:26,880
Attorney General David P. Burns of the Justice Department's criminal division

367
00:33:26,880 --> 00:33:31,880
is worth reading and I quote, "The tragic crashes of Lion Air Flight 610 and

368
00:33:31,880 --> 00:33:37,720
Ethiopian Airlines Flight 302 exposed fraudulent and deceptive conduct by employees of one of the

369
00:33:37,720 --> 00:33:43,400
world's leading commercial airplane manufacturers. Boeing's employees chose the path of profit over

370
00:33:43,400 --> 00:33:49,080
candor by concealing material information from the FAA concerning the operation of its 737 MAX

371
00:33:49,080 --> 00:33:54,920
airplane and engaging in an effort to cover up their deception. This resolution holds Boeing

372
00:33:54,920 --> 00:34:00,120
accountable for its employees' criminal misconduct, addresses the financial impact to Boeing's airline

373
00:34:00,120 --> 00:34:05,000
customers and hopefully provides some measure of compensation to the crash victims' families and

374
00:34:05,000 --> 00:34:11,800
beneficiaries." Now for an incident of this sort of scale, as one might expect,

375
00:34:11,800 --> 00:34:18,440
it didn't end there for Boeing. But to briefly quote myself from Episode 33, "Certainly, Boeing,

376
00:34:18,440 --> 00:34:23,640
and to a lesser extent the FAA, for less than ideal oversight of Boeing's qualification of the

377
00:34:23,640 --> 00:34:28,680
737 MAX have to shoulder most of the responsibility for these events." Well then.

378
00:34:28,680 --> 00:34:35,080
On the 16th of September 2020, a 238-page congressional report by the House Committee

379
00:34:35,080 --> 00:34:39,720
on Transportation and Infrastructure was released, taking 18 months of investigation

380
00:34:39,720 --> 00:34:45,000
to produce, and it placed fault primarily with Boeing, with some resting also with the FAA.

381
00:34:45,000 --> 00:34:51,000
Nice to be validated there. The report described "disturbing cultural issues relating to employee

382
00:34:51,000 --> 00:34:55,800
surveys showing some employees had experienced undue pressure as Boeing pressed to complete the

383
00:34:55,800 --> 00:35:02,520
737 MAX ahead of other offerings released at the time by Airbus." Regarding the two 737 MAX crashes,

384
00:35:02,520 --> 00:35:07,480
the report stated and I quote, "They were the horrific culmination of a series of faulty

385
00:35:07,480 --> 00:35:12,120
technical assumptions by Boeing's engineers, a lack of transparency on the part of Boeing's

386
00:35:12,120 --> 00:35:17,960
management and grossly insufficient oversight by the FAA. The pernicious results of regulatory

387
00:35:17,960 --> 00:35:22,720
capture on the part of the FAA with respect to its responsibilities to perform robust

388
00:35:22,720 --> 00:35:27,360
oversight of Boeing and to ensure the safety of the flying public."

389
00:35:27,360 --> 00:35:32,760
On 2 September 2021, Boeing and Ethiopian Airlines reached an out-of-court settlement

390
00:35:32,760 --> 00:35:34,800
for an undisclosed amount.

391
00:35:34,800 --> 00:35:41,520
On 5 November 2021, Boeing's directors settled the shareholder lawsuit for $237.5M

392
00:35:41,520 --> 00:35:42,920
US dollars.

393
00:35:42,920 --> 00:35:47,280
The shareholder lawsuit claimed, and I quote, "Boeing's directors and officers failed

394
00:35:47,280 --> 00:35:51,920
them in overseeing mission-critical airplane safety to protect enterprise and stockholder

395
00:35:51,920 --> 00:35:54,040
value."

396
00:35:54,040 --> 00:35:59,840
On 11 November 2021, Boeing accepted liability for overseas family compensation claims relating

397
00:35:59,840 --> 00:36:03,920
to Ethiopian Air 302 to be submitted via the US court system.

398
00:36:03,920 --> 00:36:07,740
The final amount of compensation via this pathway remains unclear.

399
00:36:07,740 --> 00:36:13,540
And again, if I may quote myself, again from Episode 33, "I think that aviation authorities

400
00:36:13,540 --> 00:36:19,160
around the world need to reconsider what constitutes a genuine derivative design and when grandfathering

401
00:36:19,160 --> 00:36:24,360
provisions should and should not apply, as it encourages aircraft manufacturers to make

402
00:36:24,360 --> 00:36:29,160
incremental changes to an aircraft's design and avoid a full regression test of all of

403
00:36:29,160 --> 00:36:31,820
the impacted aspects of those changes."

404
00:36:31,820 --> 00:36:33,960
So about those regulations...

405
00:36:33,960 --> 00:36:39,880
The Aircraft Certification, Safety and Accountability Act was passed on 17 November 2020, which

406
00:36:39,880 --> 00:36:45,720
requires the FAA to do many things, but key items of interest to me are, require manufacturers

407
00:36:45,720 --> 00:36:50,920
to disclose to the FAA certain safety-critical information related to an aircraft, and revise

408
00:36:50,920 --> 00:36:56,200
and improve its process of issuing amended type certificates for modifying an aircraft.

409
00:36:56,200 --> 00:37:02,440
I should damn well hope so. On the 6th of March 2023, the FAA released another policy proposal

410
00:37:02,440 --> 00:37:08,600
that would require applicants who want to modify the original transport category aircraft designs

411
00:37:08,600 --> 00:37:14,360
to disclose all proposed changes in a single document at the beginning of the certification

412
00:37:14,360 --> 00:37:21,640
process. Again, yes...indeed, great idea! Finally!! In terms of the final costs to Boeing,

413
00:37:21,640 --> 00:37:27,960
the costs of the 737 MAX incidents to Boeing as a company are continuing even today. At the end of

414
00:37:27,960 --> 00:37:33,960
2020, over 800 737 MAX orders had been cancelled, and with the production shut down between the

415
00:37:33,960 --> 00:37:39,880
COVID-19 pandemic and the order reduction, it's difficult to separate the two. Best estimates are

416
00:37:39,880 --> 00:37:46,040
that by mid-2022 Boeing had lost approximately $20 billion between lost orders, re-compliance

417
00:37:46,040 --> 00:37:53,640
and compensation claims. So what did Boeing do to get the 737 MAX re-certified? On the 20th of

418
00:37:53,640 --> 00:38:02,200
November 2020, the FAA issued AD 2020-24-02 that superseded their previous airworthiness directive

419
00:38:02,200 --> 00:38:07,880
that was 2018-23-51, which we mentioned previously regarding the 737 MAX aircraft.

420
00:38:07,880 --> 00:38:13,240
Rather than dig into every detail, the summary of key points from the directive is as follows.

421
00:38:13,240 --> 00:38:17,720
Boeing to install new flight control computer software. This change is intended to prevent

422
00:38:17,720 --> 00:38:23,240
erroneous MCAS activation, among other safeguards. A direct extract states,

423
00:38:23,240 --> 00:38:29,480
"The new flight control laws now require inputs from both AOA sensors in order to activate MCAS."

424
00:38:29,480 --> 00:38:35,080
Boeing to install updated cockpit display system software to generate an AOA disagree alert.

425
00:38:35,080 --> 00:38:39,960
This will alert pilots that the airplane's two AOA sensors are disagreeing by a certain amount

426
00:38:39,960 --> 00:38:45,560
indicating a potential AOA sensor failure. Boeing to incorporate new and revised operating

427
00:38:45,560 --> 00:38:50,600
procedures to the airplane flight manual. This change is intended to ensure the flight crew has

428
00:38:50,600 --> 00:38:56,120
the means to recognize and respond to erroneous stabilizer movement and the effects of a potential

429
00:38:56,120 --> 00:39:01,640
AOA sensor failure. In addition to these design changes, FAA also will require operators to

430
00:39:01,640 --> 00:39:07,160
conduct an AOA sensor system test and perform an operational readiness flight prior to returning

431
00:39:07,160 --> 00:39:15,080
each airplane to service. Now Transport Canada and the European Union Aviation Safety Agency (EASA)

432
00:39:15,080 --> 00:39:20,200
they didn't fully accept this directive, suggesting that it didn't go quite far enough and thus they

433
00:39:20,200 --> 00:39:25,800
didn't adopt it at the time, with Transport Canada instead issuing its own directive on the 18th of

434
00:39:25,800 --> 00:39:32,040
January 2021. Some, but not all, of the additional requirements included the addition of coloured

435
00:39:32,040 --> 00:39:36,360
caps on the circuit breakers for the stick shaker to allow for easier identification,

436
00:39:36,360 --> 00:39:43,080
an enhanced flight deck procedure that provides the option for a pilot in command to disable a

437
00:39:43,080 --> 00:39:47,480
loud and intrusive warning system, in other words the stick shaker, when the system has

438
00:39:47,480 --> 00:39:53,480
been erroneously activated by a failure of an AOA sensor. The EASA requested similar additions to

439
00:39:53,480 --> 00:39:59,320
Transport Canada before finally accepting the 737 MAX was fit to return to service once those

440
00:39:59,320 --> 00:40:06,040
requirements were met and the EASA followed on the 27th of January 2021. So what do we learn from

441
00:40:06,040 --> 00:40:11,320
all this? There are a few things to take away from this incident beyond those that were already

442
00:40:11,320 --> 00:40:18,200
discussed in Lion Air 610 but perhaps not what you might think. Firstly let's talk about Root Cause

443
00:40:18,200 --> 00:40:24,560
analysis and opinions. Clearly the BEA, NTSB and Collins had a different opinion

444
00:40:24,560 --> 00:40:29,760
from the EAIB, but this shouldn't be about opinions, it should be about facts.

445
00:40:29,760 --> 00:40:35,160
Yes, the official report told its story, but the facts by equivalence disagree

446
00:40:35,160 --> 00:40:39,360
with the formal report. It's important to remember when you're reading these

447
00:40:39,360 --> 00:40:43,840
formal reports to keep some level of skepticism. Reports are written by people

448
00:40:43,840 --> 00:40:47,840
and people have opinions, but facts don't have opinions and that's what makes them

449
00:40:47,840 --> 00:40:52,100
facts. If you're ever investigating anything and doing a root cause analysis

450
00:40:52,100 --> 00:40:56,120
and you or your team or organization might be implicated by a finding, you

451
00:40:56,120 --> 00:40:59,920
have to be prepared to admit fault if that's the real root cause because

452
00:40:59,920 --> 00:41:04,400
otherwise you and others won't really learn anything. Ultimately though, we each

453
00:41:04,400 --> 00:41:08,840
have to sift through the opinions and facts that are jumbled together and take

454
00:41:08,840 --> 00:41:17,400
our own learnings as best we can. The other learning that's perhaps less clear is the adherence to the intent of the ICAO

455
00:41:17,400 --> 00:41:21,960
requirement to release a report with findings as soon as possible following an incident.

456
00:41:21,960 --> 00:41:27,080
For comparison, the time lag between the incident and the final report for Lion Air 610

457
00:41:27,080 --> 00:41:33,800
was 49 weeks, not even one year. The time lag for Ethiopian Air 302 was 197 weeks,

458
00:41:33,800 --> 00:41:39,720
that's just under 4 years or 4 times longer. During that time, multiple other organisations

459
00:41:39,720 --> 00:41:44,440
had completed their reviews of the aircraft and of Boeing as an organisation, the FAA had passed

460
00:41:44,440 --> 00:41:49,560
new regulations, Boeing had updated the flight control software and a pandemic came and went.

461
00:41:49,560 --> 00:41:55,560
Well...mostly went...and all the while, we were still waiting for a report that incorporated

462
00:41:55,560 --> 00:42:00,920
very little feedback and whose contents did not improve proportionally with the amount of

463
00:42:00,920 --> 00:42:05,320
additional time that had passed. The BEA report on the near miss could have been quicker as well,

464
00:42:05,320 --> 00:42:11,080
coming in at 144 weeks or just under three years. And that may have potentially prevented

465
00:42:11,080 --> 00:42:16,360
these incidents if Boeing and the FAA had it early enough but would have required a very

466
00:42:16,360 --> 00:42:21,560
short turnaround. To be clear, do your investigations with some hustle, listen to the

467
00:42:21,560 --> 00:42:26,680
experts, incorporate their findings into your own and be prepared to throw yourself under the bus

468
00:42:26,680 --> 00:42:33,080
if that's a true root cause. Ultimately, investigating is hard. For the record,

469
00:42:33,080 --> 00:42:39,240
I have no staff on this show. I never have. It's just me. Only me. I'm not a formal investigator

470
00:42:39,240 --> 00:42:43,320
outside of the company I work for, and even within the company, it's more of a role that

471
00:42:43,320 --> 00:42:47,000
Technical Authorities like myself are asked to undertake from time to time.

472
00:42:47,000 --> 00:42:52,440
I've done a lot of Root Cause Analyses, Fault Trees, HAZOPs, CHAZOPs, and 5-Whys in the nearly

473
00:42:52,440 --> 00:42:57,080
three decades of my professional engineering career around the world. I have never investigated

474
00:42:57,080 --> 00:43:00,520
an incident where one or more people lost their lives. I have never investigated an

475
00:43:00,520 --> 00:43:04,600
incident where the cost outcomes broke the $1M mark. I've never been in charge

476
00:43:04,600 --> 00:43:08,840
of a team of investigators because I didn't need a team to undertake the investigations at the

477
00:43:08,840 --> 00:43:14,280
scale that I was asked to. I have, however, organized and performed post-incident reviews,

478
00:43:14,280 --> 00:43:17,800
traced alarm logs, maintenance histories, operating procedures, organizational and

479
00:43:17,800 --> 00:43:22,680
structural changes, local weather conditions, and challenged biases, had my own biases

480
00:43:22,680 --> 00:43:28,520
challenged, and written ridiculously detailed reports with many, many findings. And my conclusion

481
00:43:28,520 --> 00:43:34,200
from all of that is simple. Investigating is hard. Or rather, it's very hard to do it well.

482
00:43:35,080 --> 00:43:39,880
Sometimes you just can't find the facts, and in my experience, it's not the facts that present

483
00:43:39,880 --> 00:43:46,200
the problem, it's the people. And people aren't evil. They're not good either. They're just people.

484
00:43:46,200 --> 00:43:51,880
Our recollections vary from day to day with the passage of time, and we're subject to influence,

485
00:43:51,880 --> 00:43:56,840
both real and perceived. And sometimes those subtleties can dramatically change the findings,

486
00:43:56,840 --> 00:44:02,280
even though they shouldn't. I have worked at companies with a no-blame mantra, where knowledge

487
00:44:02,280 --> 00:44:06,320
sharing and admitted fault is celebrated as an opportunity for mutual learning, and I

488
00:44:06,320 --> 00:44:10,260
think that's noble and I'm glad that environments like that exist.

489
00:44:10,260 --> 00:44:11,720
But don't kid yourself.

490
00:44:11,720 --> 00:44:15,500
If the stakes are real, the consequences need to be real too.

491
00:44:15,500 --> 00:44:21,060
I like to say to others, "It's not about blame, but it actually is..." as a caution to other

492
00:44:21,060 --> 00:44:25,480
people but also a reminder to myself that I'm not above judgment.

493
00:44:25,480 --> 00:44:28,920
When you're investigating an incident, no matter how many times people press for the

494
00:44:28,920 --> 00:44:33,720
true root cause for mutual learnings, the human element will always be there pushing back if it

495
00:44:33,720 --> 00:44:39,160
feels the facts portray them in a bad light, and they may be blamed, either in whole or in part.

496
00:44:39,160 --> 00:44:44,040
Beyond egotistical reasons, there's legal consequences, contractual consequences,

497
00:44:44,040 --> 00:44:48,120
and litigative consequences to admission of guilt...in certain circumstances.

498
00:44:48,120 --> 00:44:52,360
You have to be prepared to throw yourself under the bus you're driving,

499
00:44:52,360 --> 00:44:57,000
even if that's physically difficult (you know what I mean.) I rely on official reports produced

500
00:44:57,000 --> 00:45:02,760
by people whose job it was to overturn every rock, pull apart every thread, and dig into every minute

501
00:45:02,760 --> 00:45:07,720
detail. Sometimes, reading other investigation reports, you reach a point where you know there

502
00:45:07,720 --> 00:45:12,680
were issues with the report. And this is an example of that. It's clear to me that there's

503
00:45:12,680 --> 00:45:17,560
a fundamental disagreement with what set these events into motion with the failed AOA sensor.

504
00:45:17,560 --> 00:45:23,720
The EAIB blame Boeing and their bad wiring, and Collins and their AOA sensor. The NTSB and Collins

505
00:45:23,720 --> 00:45:27,600
blame an uninvited external third party: a bird.

506
00:45:27,600 --> 00:45:32,800
If we briefly suspend evidence for a moment, why would each party be debating this at all?

507
00:45:32,800 --> 00:45:36,280
MCAS was certainly the root cause, and no one's arguing about that.

508
00:45:36,280 --> 00:45:38,120
But what was the initiating event?

509
00:45:38,120 --> 00:45:44,400
Who really is to blame, beyond just the Boeing 737 MAX MCAS design at that time?

510
00:45:44,400 --> 00:45:48,880
If an Ethiopian authority blames poor bird controls at an airport in their country, there

511
00:45:48,880 --> 00:45:55,000
may be liability pressures from companies and families for compensation, bad publicity and reputational damage.

512
00:45:55,000 --> 00:45:59,080
If the NTSB and Collins demonstrate it was a third party event, then Collins protect

513
00:45:59,080 --> 00:46:02,980
their reputation and may avoid some liability concerns as well.

514
00:46:02,980 --> 00:46:06,360
But this show isn't about that kind of speculation.

515
00:46:06,360 --> 00:46:09,060
So when we're investigating something, what do we do?

516
00:46:09,060 --> 00:46:11,040
We follow the evidence as best we can.

517
00:46:11,040 --> 00:46:15,820
There were insufficient clues in the debris to conclude the initiating event, or even

518
00:46:15,820 --> 00:46:20,540
if the vane was connected at the moment of impact, based on debris at the crash site.

519
00:46:20,540 --> 00:46:25,740
In cases like this, investigators have to do the next best thing, replicate the circumstances

520
00:46:25,740 --> 00:46:31,340
with equivalent equipment and attempt to inject the same suspected failure modes to mimic

521
00:46:31,340 --> 00:46:33,080
and get the same results.

522
00:46:33,080 --> 00:46:38,220
This is exactly what Collins did and their conclusion was that a bird strike fit the failure

523
00:46:38,220 --> 00:46:40,860
they observed in the lead up to the incident.

524
00:46:40,860 --> 00:46:45,180
If you have to weigh on the one hand, some wiring problem with the heater, which should

525
00:46:45,180 --> 00:46:50,360
not have an immediate impact on the sensor, against failure testing under equivalent circumstances that

526
00:46:50,360 --> 00:46:54,460
led to the same behaviour, then the choice should be obvious.

527
00:46:54,460 --> 00:46:56,900
So far as crew training goes, that's yet another issue.

528
00:46:56,900 --> 00:47:01,240
But then, it's not like the MCAS system was obvious, nor was it mentioned, nor was

529
00:47:01,240 --> 00:47:07,380
there specific training mentioning it as part of the 737-NG to 737-MAX migration training,

530
00:47:07,380 --> 00:47:09,140
as previously discussed in Episode 33.

531
00:47:09,140 --> 00:47:12,500
I could go on about it, but I'm not going to again.

532
00:47:12,500 --> 00:47:17,640
Between the two incidents, 346 people died because of the MCAS system.

533
00:47:17,640 --> 00:47:21,060
Both reports into each incident agree on that point.

534
00:47:21,060 --> 00:47:25,380
The measures taken by the United States government regulatory improvements, if applied correctly

535
00:47:25,380 --> 00:47:29,860
by the FAA and Boeing specifically, should prevent incidents like this from occurring

536
00:47:29,860 --> 00:47:31,140
in the future.

537
00:47:31,140 --> 00:47:36,060
The updates Boeing made to the 737 MAX address the deficiencies that MCAS had and it's

538
00:47:36,060 --> 00:47:38,660
now as safe a plane as most out there.

539
00:47:38,660 --> 00:47:42,100
I'd happily fly on a retrofitted one at this point.

540
00:47:42,100 --> 00:47:47,900
So perhaps in the four and a half years since Lion Air 610, the right outcomes have eventuated.

541
00:47:47,900 --> 00:47:50,940
The correct course corrections have been made.

542
00:47:50,940 --> 00:47:52,900
And that's a good thing.

543
00:47:52,900 --> 00:47:58,220
But I can't shake my concerns about Boeing as a company and how they performed engineering

544
00:47:58,220 --> 00:48:00,940
in their designs in recent times.

545
00:48:00,940 --> 00:48:04,620
Boeing was a well-established business with a solid engineering reputation.

546
00:48:04,620 --> 00:48:09,120
But as we discussed in Episode 33, the push internally was to get the design out the door

547
00:48:09,120 --> 00:48:13,520
and minimise cost for the end customer, and risk probabilities were downrated when they

548
00:48:13,520 --> 00:48:15,080
should not have been.

549
00:48:15,080 --> 00:48:18,960
Younger engineers look at the history of the solid, reliable designs that went through

550
00:48:18,960 --> 00:48:23,400
detailed rigour and risk assessments, but because they haven't seen the failures, they

551
00:48:23,400 --> 00:48:28,520
haven't lived the consequences of cost-centric decision making, they make the mistake of

552
00:48:28,520 --> 00:48:30,760
putting cost ahead of risk.

553
00:48:30,760 --> 00:48:33,240
This isn't a problem that's unique to Boeing either.

554
00:48:33,240 --> 00:48:39,080
As engineers, we need to be constantly vigilant to ensure that those that make decisions understand

555
00:48:39,080 --> 00:48:44,680
those risks before the decisions are made. We need to ensure that the engineering processes

556
00:48:44,680 --> 00:48:50,760
are followed. We need to ensure that risks are fairly assessed and to stop the job if we need to,

557
00:48:50,760 --> 00:48:56,440
because if we don't, someday our inaction will lead to a failure and someone may be injured or

558
00:48:56,440 --> 00:49:03,560
may die. Finally though, on a self-reflecting note, and perhaps fittingly, after 50 episodes of this

559
00:49:03,560 --> 00:49:08,920
show, that's a show that places formal investigative reports in very high regard as the source of

560
00:49:08,920 --> 00:49:15,480
facts. Not all reports are created equally. We should read each of them carefully. Be mindful

561
00:49:15,480 --> 00:49:20,280
of the biases in play, including your own, and be balanced in the learnings that we take from

562
00:49:20,280 --> 00:49:26,040
every incident. Be sure within yourself that you've taken away the best things to avoid future

563
00:49:26,040 --> 00:49:33,480
incidents from occurring, and not just blindly trusting a one-page summary. I think it's good

564
00:49:33,480 --> 00:49:37,640
to be a little skeptical. I think it's good to keep an open mind.

565
00:49:37,640 --> 00:49:42,600
Forever a skeptic? I guess I am. I think that's okay.

566
00:49:42,600 --> 00:49:48,880
To celebrate the 50th episode of Causality, I'll be hosting 3 live Q&A sessions for current

567
00:49:48,880 --> 00:49:53,720
patrons in May 2023 to accommodate listeners' time zones all around the world. Details will

568
00:49:53,720 --> 00:49:58,160
be published on Patreon in coming weeks. A competition is now open where you could win

569
00:49:58,160 --> 00:50:03,240
your own Causality T-shirt. To enter, all you need to do is write a short or long post

570
00:50:03,240 --> 00:50:08,440
either on your own blog, the Fediverse, Twitter or Facebook, linking to and celebrating your

571
00:50:08,440 --> 00:50:13,120
favorite episode of Causality, or just the show in general. Then submit a link to your

572
00:50:13,120 --> 00:50:17,400
post via email to admin@engineered.network to enter.

573
00:50:17,400 --> 00:50:22,000
The competition closes on the 31st of May 2023, and you can enter as many times as you

574
00:50:22,000 --> 00:50:26,640
like. The best post will be chosen and the winner published on the network blog the following

575
00:50:26,640 --> 00:50:27,640
week.

576
00:50:27,640 --> 00:50:31,440
If you don't want to wait, you can just buy your own from the TEN store, with T-shirts

577
00:50:31,440 --> 00:50:37,400
for this and other TEN shows, smartphone cases and more, all available now, but for a limited

578
00:50:37,400 --> 00:50:43,640
time. The TEN store will be closing on the 14th of June 2023, so get in while you can.

579
00:50:43,640 --> 00:50:48,280
Visit https://engineered.network/celebrate for details and keep an eye on Patreon posts for all the

580
00:50:48,280 --> 00:50:49,280
details.

581
00:50:49,280 --> 00:50:52,600
If you're enjoying Causality and you'd like to support us and keep the show ad free,

582
00:50:52,600 --> 00:50:55,160
you can by becoming a premium supporter.

583
00:50:55,160 --> 00:50:59,240
Just visit https://engineered.network/causality to learn how you can help this show to continue to

584
00:50:59,240 --> 00:51:00,240
be made.

585
00:51:00,240 --> 00:51:01,240
Thank you.

586
00:51:01,240 --> 00:51:03,040
A big thank you to all of our supporters.

587
00:51:03,040 --> 00:51:07,720
A special thank you to our Silver Producers Mitch Bilger, Lesley, Shane O'Neill, Jared

588
00:51:07,720 --> 00:51:13,960
Roman, Joel Maher, Katerina Will, Chad Juehring, Dave Jones and Kellen Frodelius-Fujimoto.

589
00:51:13,960 --> 00:51:18,640
And an extra special thank you to both of our Gold Producers, Stephen Bridle and our

590
00:51:18,640 --> 00:51:23,720
Gold Producer known only as "R". Causality is heavily researched and links to all materials

591
00:51:23,720 --> 00:51:26,880
used for the creation of this episode are contained in the show notes.

592
00:51:26,880 --> 00:51:30,800
You can find them in the text of the episode description of your podcast player or on our

593
00:51:30,800 --> 00:51:31,800
website. Causality

594
00:51:31,800 --> 00:51:37,480
is a Podcasting 2.0 enhanced show and with the right podcast player you'll have episode locations,

595
00:51:37,480 --> 00:51:42,600
enhanced chapters, and real-time subtitles on selected episodes. And you can also stream

596
00:51:42,600 --> 00:51:47,560
Satoshis and Boost with a message if you like. There's details on how, along with a Boostagram

597
00:51:47,560 --> 00:51:53,240
leaderboard on our website. You can follow me on the Fediverse @chidgey@engineered.space or

598
00:51:53,240 --> 00:52:08,440
the network @engnet@engineered.space. This was Causality. I'm John Chidgey. Thanks so much for listening.

