From Correlation to Causation through stories and math

Correlation and causation are two concepts that people often mix up. I must admit that I have been guilty of this myself, and it is unlikely that I will ever entirely grow out of it, as it is wired deeply into our psychology. Let me use this article to briefly explain what correlation and causation mean, share some interesting stories that have emerged from people misunderstanding these concepts, and describe an algorithm that attempts to find causal relationships using correlation information. Here is a story that I heard a professor of mine, Prof. Dr. Ernst-Jan Camiel Wit, tell during a lecture. A school was involved in a study on providing free mid-day meals to students, who could choose whether or not to subscribe to the programme. At the end of the study, both the students who subscribed and those who did not were tested for different health indicators. It was observed that the students who chose to have meals from the programme had poorer health

Interactive Monty Hall Problem Implementation

I may have watched too many videos about the Monty Hall problem on YouTube. Many vloggers have tried to explain the solution in their own ways. I did not check whether someone has published this before, and it is likely that at least a few people have, but I am going to build an interactive version of the problem in this blog post.

Some information from Wikipedia: The problem was originally posed (and solved) in a letter by Steve Selvin to The American Statistician in 1975. There was a game show named Let's Make a Deal, created and produced by Stefan Hatos and Monty Hall, the latter serving as its host for nearly 30 years. I have not watched the show, and I am not sure whether the entire show was about this problem.

The problem statement is as follows:
Suppose you're on a game show, and you're given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what's behind the doors, opens another door, say No. 3, which has a goat. He then says to you, "Do you want to pick door No. 2?" Is it to your advantage to switch your choice?
The first time I was asked this question, I thought that doors No. 1 and No. 2 would have equal probability and hence that switching wouldn't matter. However, using some math, we arrive at a different answer.

Initially, when you were choosing, the probability that the car was behind the door you chose was 1/3. The probability that the car was behind either Door 2 or Door 3 was 2/3. Then it was revealed to you that Door 3 had a goat. Because the host knows where the car is and always opens a door with a goat, this reveal does not change the 1/3 chance for your own door; the whole 2/3 now rests on Door 2. Therefore, switching is the logical decision to make.
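The argument above can also be checked by brute force. Here is a minimal sketch, separate from the interactive code in this post, that walks through every car position and every door the host is allowed to open, and adds up the exact probabilities. The function name `exactThreeDoorOdds` and the assumption that the host picks uniformly when he has two goat doors to choose from are mine.

```javascript
// Exact odds for the 3-door game, for a given first pick (0, 1 or 2).
// Assumption: when the host has two goat doors available, he opens
// either one with equal probability.
function exactThreeDoorOdds(firstPick) {
    var stay = 0;
    var swap = 0;
    for (var car = 0; car < 3; car++) {
        // Doors the host may open: not the player's pick and not the car.
        var hostOptions = [0, 1, 2].filter(function (door) {
            return door !== firstPick && door !== car;
        });
        hostOptions.forEach(function (opened) {
            // Each car position has probability 1/3; the host then
            // chooses uniformly among his legal doors.
            var weight = (1 / 3) * (1 / hostOptions.length);
            // Door indices sum to 3, so the door offered for switching
            // is the one that is neither picked nor opened.
            var offered = 3 - firstPick - opened;
            if (car === firstPick) stay += weight;
            if (car === offered) swap += weight;
        });
    }
    return { stay: stay, swap: swap };
}

var odds = exactThreeDoorOdds(0);
console.log("P(win if you stay)   =", odds.stay.toFixed(4));
console.log("P(win if you switch) =", odds.swap.toFixed(4));
```

Staying comes out at one third and switching at two thirds, whichever door you pick first.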

I am pretty sure that a lot of people, including some of the most logical programmers, cannot accept this. Therefore, an online simulation of the problem, with the source code right in front of you, may convince you.

What you see below is interactive. Give it a shot. Play a (few) dozen times and see if the theory holds up.

Times When Switching was advantageous:
Times When Staying was advantageous:
Times When you won the car:
Times When you lost the car:

Here is the HTML part of the code:
<div style="width: 100%">
    <div style="margin: auto">
        <div onclick="choiceMade(0)" id="door0" style="display:inline-block; background: #ccc; border: #fff solid 2px; width: 100px; height: 150px;">&nbsp;</div>
        <div onclick="choiceMade(1)" id="door1" style="display:inline-block; background: #ccc; border: #fff solid 2px; width: 100px; height: 150px;">&nbsp;</div>
        <div onclick="choiceMade(2)" id="door2" style="display:inline-block; background: #ccc; border: #fff solid 2px; width: 100px; height: 150px;">&nbsp;</div>
        <input id="start" type="button" value="Start" onclick="start()"><br>
        <input id="switch" type="button" value="Switch"  onclick="switchChoice()" style="display:none"><br>
        <input id="stay" type="button" value="Stay"  onclick="stayWithChoice()" style="display:none"><br>
        <div id="textMessage"></div>
        <div id="swichtingStat"><b>Times When Switching was advantageous: </b></div>
        <div id="stayingStat"><b>Times When Staying was advantageous: </b></div>
        <div id="timesWonStat"><b>Times When you won the car: </b></div>
        <div id="timesLostStat"><b>Times When you lost the car: </b></div>
    </div>
</div>
And here is the JavaScript part of the code:
var switching = 0;   // rounds in which switching would have won
var staying = 0;     // rounds in which staying would have won
var timesWon = 0;
var timesLost = 0;
var whereTheCarIs;   // index (0 to 2) of the door hiding the car
var currentChoice;   // index of the door the player picked first
var startButton = document.getElementById("start");
var switchButton = document.getElementById("switch");
var stayButton = document.getElementById("stay");
var swichtingStat = document.getElementById("swichtingStat");
var stayingStat = document.getElementById("stayingStat");
var timesWonStat = document.getElementById("timesWonStat");
var timesLostStat = document.getElementById("timesLostStat");
var textMessage = document.getElementById("textMessage");
var choiceRunning = false; // true while waiting for the player's first pick

function start(){
    clearAll();
    // Random position between 0 and 2:
    whereTheCarIs = Math.floor(Math.random() * 3);
    startButton.style.display = "none";
    textMessage.innerText = "Choose a block";
    choiceRunning = true;
}

function choiceMade(choice){
    // Ignore clicks on the doors unless a round is waiting for the first pick
    if(choiceRunning){
        var returnedGoat;
        if(choice === whereTheCarIs){
            // Randomly return one of the other two doors
            returnedGoat = Math.floor(Math.random() * 2);
            if(returnedGoat === whereTheCarIs)
                returnedGoat = 2;
        } else {
            // Return the only door that is neither the car nor the player's choice
            for(var i = 0; i < 3; i++){
                if(i !== whereTheCarIs && i !== choice)
                    returnedGoat = i;
            }
        }
        showGoat(returnedGoat);
        currentChoice = choice;
        choiceRunning = false;
    }
}

function showGoat(goat){
    document.getElementById("door" + goat).innerText = "GOAT";
    textMessage.innerText = "Would you like to switch or stay?";
    switchButton.style.display = "block";
    stayButton.style.display = "block";
}

function switchChoice(){
    if(currentChoice === whereTheCarIs){
        // The first pick was the car, so switching loses it
        textMessage.innerText = "You lost :(";
        timesLost++;
        staying++;
    } else {
        textMessage.innerText = "You won!";
        timesWon++;
        switching++;
    }
    updateStats();
}

function stayWithChoice(){
    if(currentChoice === whereTheCarIs){
        textMessage.innerText = "You won!";
        timesWon++;
        staying++;
    } else {
        // The car was behind the other closed door, so switching would have won
        textMessage.innerText = "You lost :(";
        timesLost++;
        switching++;
    }
    updateStats();
}

function updateStats(){
    // Reveal the car and refresh the counters
    document.getElementById("door" + whereTheCarIs).innerText = "CAR";
    swichtingStat.innerHTML = "<b>Times When Switching was advantageous: </b>" + switching;
    stayingStat.innerHTML = "<b>Times When Staying was advantageous: </b>" + staying;
    timesWonStat.innerHTML = "<b>Times When you won the car: </b>" + timesWon;
    timesLostStat.innerHTML = "<b>Times When you lost the car: </b>" + timesLost;
    // Offer a new round and hide the switch/stay buttons
    startButton.style.display = "block";
    switchButton.style.display = "none";
    stayButton.style.display = "none";
    startButton.value = "Start Again";
}

function clearAll(){
    // Close all doors and clear the message
    for(var i = 0; i < 3; i++){
        document.getElementById("door" + i).innerHTML = "&nbsp;";
    }
    textMessage.innerText = "";
}
Now, you must try as many times as you can before judging whether the mathematics is right or wrong. Also, it is not far-fetched to think that my code above has bugs. Feel free to bash it.

When I shared this post on my Facebook wall, a friend of mine asked me a few questions about the explanation given in the post. He came up with the following comment:
The probability that the car was under either Doors 2 or Doors 3 was 2/3. As well as, the probability that the car was under either Doors 1 or Doors 3 was 2/3. Since it is not in 3 Probability of Car behind 2 is 2/3 and 1 is 2/3. isnt it ?!

When I told him that he may have overlooked the fact that the host already knew the car's position, and that this changes things, he replied that this fact should not have anything to do with the question. He also told me that he had seen this question before and read different explanations, but that he was not quite convinced by those either.
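One way to see that the host's knowledge matters is to compare with a host who does not know where the car is. The sketch below is my own hypothetical variant and not part of the original game: the host opens one of the two unpicked doors at random, and we only keep the rounds in which a goat happens to show up. The function name `ignorantHostOdds` is made up for this illustration.

```javascript
// Exact odds for a hypothetical host who opens one of the two unpicked
// doors at random. Rounds in which he reveals the car are discarded,
// so the result is conditional on a goat being shown.
function ignorantHostOdds(firstPick) {
    var stay = 0;
    var swap = 0;
    var goatShown = 0;
    for (var car = 0; car < 3; car++) {
        for (var opened = 0; opened < 3; opened++) {
            if (opened === firstPick) continue; // host never opens the player's door
            if (opened === car) continue;       // car revealed: round is discarded
            // P(car position) * P(host's random choice between two doors)
            var weight = (1 / 3) * (1 / 2);
            // The door offered for switching is neither picked nor opened.
            var offered = 3 - firstPick - opened;
            goatShown += weight;
            if (car === firstPick) stay += weight;
            if (car === offered) swap += weight;
        }
    }
    // Normalise by the probability that a goat was shown at all.
    return { stay: stay / goatShown, swap: swap / goatShown };
}

var ignorant = ignorantHostOdds(0);
console.log("P(win if you stay)   =", ignorant.stay.toFixed(4));
console.log("P(win if you switch) =", ignorant.swap.toFixed(4));
```

With this random host, staying and switching both win half of the time. The 2/3 advantage for switching appears only because the real host knowingly avoids the car.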

Therefore, I am back to attempt to explain this in a different manner, which was the method that helped me understand the solution in the first place.

Let us assume that we have 5 doors instead of 3. There is a car behind one of the doors and a goat behind each of the other doors. I ask you to choose one door at random. What is the probability that you have a car behind that door? The answer would be 1/5.

Now, I know where I placed the car and the goats. I slowly start opening doors one by one. I open the first door, and you see a goat. Does that mean that the probability that you have the car rose to 1/4? No, it simply means that I, knowing what is behind each door, chose to open a door that has a goat.

I keep on opening doors with goats behind them until just 2 doors are left. Now, do you think you have a 50-50 chance between the 2 doors? No, there is an 80% chance that the car is behind the other door. I, being the host, simply chose not to open that door.

You could verify this as well by tweaking the code that I wrote above to have 5 doors instead of 3.
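For those who would rather not click through hundreds of rounds, here is a rough non-interactive sketch of the same check. It is not the interactive code from this post; the function name `simulateDoors` and the trial count are my own choices, and since it uses `Math.random()` the numbers will differ slightly on each run.

```javascript
// Monte Carlo estimate for a game with numDoors doors in which the host,
// who knows where the car is, opens every door except the player's pick
// and one other door, never revealing the car.
function simulateDoors(numDoors, trials) {
    var stayWins = 0;
    var switchWins = 0;
    for (var t = 0; t < trials; t++) {
        var car = Math.floor(Math.random() * numDoors);
        var pick = Math.floor(Math.random() * numDoors);
        var other; // the one door the host leaves closed besides the pick
        if (pick === car) {
            // Any of the remaining doors can be left closed; choose one at random.
            other = Math.floor(Math.random() * (numDoors - 1));
            if (other >= pick) other++; // skip over the player's own door
        } else {
            // The host cannot open the car's door, so it stays closed.
            other = car;
        }
        if (pick === car) stayWins++;
        if (other === car) switchWins++;
    }
    return { stay: stayWins / trials, swap: switchWins / trials };
}

var fiveDoors = simulateDoors(5, 200000);
console.log("5 doors, stay:  ", fiveDoors.stay.toFixed(3));
console.log("5 doors, switch:", fiveDoors.swap.toFixed(3));
```

With 5 doors the estimates should land near 0.2 for staying and 0.8 for switching; calling `simulateDoors(3, 200000)` should give roughly 1/3 and 2/3.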

