Good enough

It’s not a universal rule. What is? There are a million and one ways that both good and bad things can happen in life. A million is a vast underestimate, of course. Slight changes in the decisions we make can send us off in a completely different direction. So much fiction is built on this reality.

Yes, I have watched “Everything Everywhere All at Once”[1]. I’m in two minds about my reaction. There’s no doubt that it has an original take on the theory of multiple universes and how they might interact. It surprised me just how much comedy forms the core of the film. There are moments when the pace of the story left me wondering where on earth it was going. Overall, it is an enjoyable movie and it’s great to see such originality and imagination.

This strange notion of multiple universes, numbered beyond count, has an appeal but it’s more than a headful. What I mean is that trying to imagine what it looks like, if such a thing is possible, is almost hopeless. What I liked about the movie is that small differences are more probable and large differences are far less probable. So, to get to worlds that are radically different from where you are, it’s necessary to do something extremely improbable.

Anyway, that’s not what I’m writing about this morning. I’ve just been reading a bit about Sir Robert Alexander Watson-Watt, the man credited with giving us radar technology.

“Perfect is the enemy of good” is a dictum that has several attributions. It keeps coming up. Some people celebrate those who strive for perfection. However, in human affairs, perfection is an extremely improbable outcome in most situations. There’s a lot of talent and perspiration needed to jump from average to perfect in any walk of life.

What the dictum shorthands is that throwing massive amounts of effort at a problem can prevent a good outcome. Striving for perfection, given our human condition, can be a negative.

That sits well with me. My experience of research, design and development taught me the value of incremental improvement, rather than waiting for perfect answers to emerge from ever more work. It’s the problem with research funding. Every paper calls for more research to be done.

In aviation safety work the Pareto principle is invaluable. It can be explained by a ghastly Americanism: let’s address the “low-hanging fruit” first. In other words, let’s make the easiest improvements that produce the biggest differences first.

I’m right on board with Robert Watson-Watt and his “cult of the imperfect”. He’s quoted as saying: “Give them the third best to go on with; the second best comes too late, the best never comes”. That’s to say, do enough of what works now without agonising over all the other possibly better ways. Don’t procrastinate (too much).


[1] https://www.imdb.com/title/tt6710474/

Safety in numbers. Part 1

It’s a common misconception that the more you have of something the better it is. Well, I say misconception, but in simple cases it isn’t one. For safety’s sake, it’s common to have more than one of something. In a classic everyday aircraft that might be two engines, two flight controls, two electrical generators, two pilots, and so on.

It seems the most common-sense of common-sense conclusions: if one thing fails, or doesn’t do what it should, we have another one to replace it. It’s not always the case that both things work together all the time, with one doing the whole job when the other goes. Like two aircraft engines, the normal situation is both working together in parallel. In other arrangements one system carries the full load while another sits there, keeping an eye on what’s happening, ready to take over if needed.

This week, as in many weeks, thinkers and politicians have been saying we need more people with a STEM education (Science, Technology, Engineering, and Mathematics). This often seems common sense and goes little questioned. However, it’s not always clear that people mean the same things when talking about STEM. In particular, it’s not always clear what they consider the Mathematics part to be.

To misquote the famous author H. G. Wells: statistical thinking may, one day, be as necessary as the ability to read and write. His full quote was a bit more impenetrable, but the overall meaning is captured in my shortened version.

To understand how a combination of things works together, or not, some statistical thinking is certainly needed. The maths associated with probabilities can scare people off, so ways of keeping our reasoning simple do help.

The sums for dual aircraft systems are not so difficult, provided we know that the something we are talking about is reliable in the first place. If it’s not reliable then the story is a different one. For the sake of argument, and bearing in mind practical reality, let’s say that the thing we are talking about fails only once every 1000 hours.

What’s that in human terms? It’s a lot less than a year’s worth of daylight hours. That being roughly half of 24 hours x 7 days x 52 weeks = 8736 hours, so around 4368 hours (putting aside location and leap years). In a year, in good health, our bodies operate continuously for all of that time. For the engineered systems under discussion that may not be the case. We switch them on, and we switch them off, possibly many times in a year.

That’s why we need to consider the amount of time something is exposed to the possibility of failure. We can now use the word “probability” instead of possibility. Chance and likelihood work too. When numerically expressed, probabilities range from 0 to 1, with zero meaning something will never happen and one meaning it will always happen.

So, let’s think about any one hour of operation of an engineered system, and use the reliability number from our simple argument. We can liken that, with the assumption that a failure is equally likely in any hour, to a probability of P = 1/1000, or 1 x 10^-3, per hour. That gives us a round number representing the likelihood of failure in any one hour of operation of one system.
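
To make that concrete, here’s a minimal sketch in Python (my illustration, not anything from the safety literature), assuming the 1-in-1000 figure above and that each hour of exposure is independent of the last:

    # Sketch: chance of at least one failure during an exposure period,
    # assuming a constant, independent 1-in-1000 failure probability per hour.
    P_HOURLY = 1e-3  # illustrative figure from the text

    def prob_of_failure(hours: int, p_hourly: float = P_HOURLY) -> float:
        """Probability of at least one failure in `hours` of operation."""
        # Surviving each hour has probability (1 - p); surviving them all
        # multiplies those together, and failing is the complement.
        return 1 - (1 - p_hourly) ** hours

    print(prob_of_failure(1))     # 0.001 for a single hour
    print(prob_of_failure(1000))  # about 0.63 across a full 1000 hours

That 0.63 figure is why exposure time matters: a 1-in-1000-per-hour system is more likely than not to fail somewhere in a full 1000 hours of operation.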

Now, back to the start. We have two systems. Maybe two engines. That is two systems that can work independently of each other. It’s true that there are some cases where they may not work independently of each other but let’s park those cases for the moment.

As soon as we have more than one thing we need to talk of combinations. Here the simple question is how many combinations exist for two working systems?

Let’s give them the names A and B. In our simplified world each of A and B can either work or not work when needed. That’s failed or not failed, said another way. There are four combinations that can exist. Displayed in a table this looks like:

A ok      B ok
A fails   B ok
A ok      B fails
A fails   B fails
Table 1

This is all binary. We are not considering any near failure, or other anomalous behaviour that can happen in the real world. We are not considering any operator intervention that switches on or switches off our system. We are looking at the probability of a failure happening in a period of operation of both systems together.
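
For anyone who likes to see where the rows come from, a tiny illustrative Python sketch (the names and layout are mine) enumerates every ok/fails combination:

    # Sketch: enumerate the ok/fails combinations for two systems, A and B.
    from itertools import product

    for a, b in product(("ok", "fails"), repeat=2):
        print(f"A {a}   B {b}")
    # Prints four rows, matching Table 1; n systems would give 2**n rows.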

Now, let’s say that the systems A and B each have a known probability of failure.

Thus, the last line of the table is the event where A fails AND B fails.

That is, in any given hour of operation, the chance of both A and B failing together is the product of their probabilities, assuming the failures are random and independent of each other.

Calculating the last line of the table: P4 = PA x PB
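
As a quick sketch with the illustrative 1/1000 figures (an assumption carried over from earlier, not a real-world number):

    # Sketch: both independent systems failing in the same hour.
    P_A = P_B = 1e-3       # illustrative 1-in-1000 per-hour figures
    P4 = P_A * P_B         # the "A fails AND B fails" line
    print(P4)              # about 1e-06, one in a million hours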

In the first line of the table, we have the case of perfection. Simultaneous operation is not interrupted, even though we know both A and B have a likelihood of failure in any one hour of operation.

Thus, the first line becomes: P1 = (1 – PA) x (1 – PB)

Which nicely approximates to P1 ≈ 1, given that 1/1000 is tiny by comparison with 1.

The cases where either A or B fails are in the middle of the table.

P2 = PA x (1 – PB) together with P3 = (1 – PA) x PB

Thus, adding these together, the probability of A or B failing is PA + PB – 2 x PA x PB which, because the product term is vanishingly small, is near enough PA + PB.
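
Putting the four lines together, with the same illustrative numbers and the same independence assumption, a short sketch confirms that the lines cover every possibility and that the approximation holds:

    # Sketch: all four lines of Table 1, which together cover every case.
    P_A = P_B = 1e-3
    P1 = (1 - P_A) * (1 - P_B)   # A ok, B ok
    P2 = P_A * (1 - P_B)         # A fails, B ok
    P3 = (1 - P_A) * P_B         # A ok, B fails
    P4 = P_A * P_B               # A fails, B fails
    print(P1 + P2 + P3 + P4)     # 1.0: no other outcomes exist
    print(P2 + P3)               # 0.001998, near enough P_A + P_B = 0.002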

It gets even simpler if we consider the two systems to be identical, namely that the probabilities PA and PB are equal. Call that common value P.

A double failure occurs with probability P^2

A single failure occurs with probability 2P (strictly 2P x (1 – P), but the correction is negligible)

So, with two systems operating in parallel there’s a decreased likelihood of a double failure but an increased likelihood of a single failure. This can be taken beyond an arrangement with two systems. For an arrangement with four systems, there’s a massively decreased likelihood of a total failure (P^4) but roughly four times the likelihood of a single failure (about 4P). Hence my remark at the beginning.
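
Finally, a hedged sketch of that general pattern, assuming n identical, independent systems with the illustrative P = 1/1000:

    # Sketch: n identical, independent systems with per-hour failure
    # probability P. Total failure needs all n to fail together; a single
    # failure only needs one of them.
    P = 1e-3

    for n in (2, 4):
        p_total = P ** n               # all n fail in the same hour
        p_any = 1 - (1 - P) ** n       # at least one fails, roughly n x P
        print(f"n={n}: total failure ~{p_total:.0e}, any failure ~{p_any:.1e}")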

[Please let me know if this is in error or there’s a better way of saying it]