Bayes’ theorem is a widely used concept in both statistics and probability theory.
As per the Wikipedia page: “[It] describes the probability of an event, based on prior knowledge of conditions that might be related to the event”.
In other words, it calculates the probability of an event within a defined scenario (or scenarios), where the scenario itself has its own likelihood of happening.
Another common equivalent definition is that Bayes’ theorem deals with the conditional probability of events.
How it works
Bayes’ theorem works with probabilities estimated from actual, recorded events.
A first simplistic and intuitive example is:
you see a guy getting out of his Volvo S60, and you are asked to guess if his salary is (say) north or south of 30,000€.
I bet most of us would guess that it is higher than that mark.
(In this case we don’t have the precise data, but it is not far-fetched to assume that, for example, 90% of Volvo S60 owners are above, or well above, the 30,000€ salary mark.)
Given our assumption above (and not forgetting that Bayes’ theorem deals with actual, factual probabilities), we just made an educated guess to minimize the chance of being wrong.
Largely, Bayes is based on this logic (though it definitely applies it with better statistical and numerical precision).
The statistics behind it
The formula of Bayes’ theorem is the one below:
P(A|B) = [P(B|A) * P(A)] / P(B)
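Where A and B are two events, and P(B) is greater than 0 so that the division is defined. The theorem is informative when A and B can happen simultaneously (otherwise said: are not mutually exclusive); if they were mutually exclusive, the probability of A given B would simply be 0.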
The formula above reads as follows:
– The probability of A happening, given that B has happened
is equal to
– the probability of B happening, given that A has happened, multiplied by the probability of A happening, and divided by the probability of B happening.
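As a minimal sketch of the sentence above, the formula can be written as a small Python function. The function name `bayes` and the numbers passed to it are our own illustrative assumptions, not data from the example that follows:

```python
def bayes(p_b_given_a: float, p_a: float, p_b: float) -> float:
    """Return P(A|B), given P(B|A), P(A) and P(B)."""
    if p_b <= 0:
        # The division is only defined when B can actually happen.
        raise ValueError("P(B) must be greater than 0")
    return p_b_given_a * p_a / p_b


# Illustrative (made-up) inputs: P(B|A) = 0.5, P(A) = 0.2, P(B) = 0.25
posterior = bayes(0.5, 0.2, 0.25)
print(f"P(A|B) = {posterior:.3f}")
```

With these made-up inputs the result is 0.5 * 0.2 / 0.25 = 0.4.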
A simple example (taken from our past Statistics module) might help to clarify this statement.
You have an automatic monitoring system, created to detect intruders, and it does so with a probability of 90%.
The system automatically records the weather, and in a series of controlled tests it has shown that, when the intruder was successfully detected:
– 75% of the time the weather was clear
– 20% of the time the weather was cloudy
– 5% of the time the weather was rainy
When instead the system failed to detect the intruder:
– 60% of the time the weather was clear
– 30% of the time the weather was cloudy
– 10% of the time the weather was rainy
Find the probability of detecting the intruder, given that the weather is rainy (assuming an intruder actually entered the plant).
Defining D as the event that the intruder is detected
(and DC as its complementary event, that the intruder is NOT detected):
P(D) = 0.9
P(Clear|D) = 0.75
P(Cloudy|D) = 0.20
P(Rainy|D) = 0.05
P(DC) = 0.1
P(Clear|DC) = 0.60
P(Cloudy|DC) = 0.30
P(Rainy|DC) = 0.10
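The values just listed can be stored and sanity-checked in a few lines of Python. This is only a sketch: the variable names (`p_d`, `weather_given_d`, and so on) are our own choice, and the numbers are the ones given in the exercise. It checks that each set of weather probabilities sums to 1, and multiplies each by P(D) or P(DC) to obtain the joint probabilities of detection outcome and weather:

```python
import math

# Probabilities given in the exercise
p_d = 0.9            # P(D): intruder detected
p_dc = 1 - p_d       # P(DC): intruder NOT detected
weather_given_d = {"Clear": 0.75, "Cloudy": 0.20, "Rainy": 0.05}
weather_given_dc = {"Clear": 0.60, "Cloudy": 0.30, "Rainy": 0.10}

# Each conditional distribution must sum to 1
assert math.isclose(sum(weather_given_d.values()), 1.0)
assert math.isclose(sum(weather_given_dc.values()), 1.0)

# Joint probabilities, e.g. P(D and Rainy) = P(D) * P(Rainy|D)
joint_d = {w: p_d * p for w, p in weather_given_d.items()}
joint_dc = {w: p_dc * p for w, p in weather_given_dc.items()}

for w in weather_given_d:
    print(f"{w}: P(D and {w}) = {joint_d[w]:.3f}, "
          f"P(DC and {w}) = {joint_dc[w]:.3f}")
```

The six joint probabilities cover every possible outcome, so together they should add up to 1.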
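One way to look at the problem (which also helps us understand the logic behind the theorem) is by using the following tree:
Intruder
├── D (0.9)
│   ├── Clear (0.75)
│   ├── Cloudy (0.20)
│   └── Rainy (0.05)
└── DC (0.1)
    ├── Clear (0.60)
    ├── Cloudy (0.30)
    └── Rainy (0.10)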
Realizing then that the denominator of the previously shown formula can be broken down as follows (AC being the complementary event of A):
P(B) = P(B|A) * P(A) + P(B|AC) * P(AC)
We can proceed by calculating (remembering that D and DC are mutually exclusive and exhaustive events):
P(D|Rainy) = [P(D) * P(Rainy|D)] / [P(D) * P(Rainy|D) + P(DC) * P(Rainy|DC)]
P(D|Rainy) = (0.9)(0.05) / [(0.9)(0.05) + (0.10)(0.10)] = 0.045 / 0.055 = 0.818 = 81.8%
Under rainy conditions, the system can detect an intruder with a probability of 0.818 (a value lower than the designed probability of 0.9).
We hope that these definitions and this example can help as a first approach to the nature and purposes of Bayes’ theorem.
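The same calculation can be repeated for every weather condition with a short Python sketch. The helper name `posterior_detection` and the variable names are our own assumptions; the numbers are the ones given in the exercise:

```python
# Probabilities given in the exercise
p_d = 0.9    # P(D): intruder detected
p_dc = 0.1   # P(DC): intruder NOT detected
weather_given_d = {"Clear": 0.75, "Cloudy": 0.20, "Rainy": 0.05}
weather_given_dc = {"Clear": 0.60, "Cloudy": 0.30, "Rainy": 0.10}


def posterior_detection(weather: str) -> float:
    """Return P(D|weather) using Bayes' theorem."""
    numerator = p_d * weather_given_d[weather]
    # Law of total probability: P(weather) over D and DC
    evidence = numerator + p_dc * weather_given_dc[weather]
    return numerator / evidence


for weather in weather_given_d:
    print(f"P(D|{weather}) = {posterior_detection(weather):.3f}")
```

For rainy weather this reproduces the 0.818 found by hand. With the same inputs, clear weather gives about 0.918 and cloudy weather about 0.857.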