From Correlation to Causation through stories and math

Correlation and causation are two concepts that people often mix up in their minds. I must admit that I have been guilty of this myself, and it is unlikely that I will ever entirely grow out of it, as it is wired deeply into our psychology. Let me use this article to briefly explain what correlation and causation mean, share some interesting stories that have emerged from people misunderstanding these concepts, and describe an algorithm that attempts to find causal relationships using correlation information.

Here is a story that I heard a professor of mine, Prof. Dr. Ernst-Jan Camiel Wit, tell us during a lecture. A school took part in a study on whether providing free mid-day meals to students improves their health; students could choose whether or not to subscribe to the programme. At the end of the study, both the students who subscribed and those who did not were tested for various health indicators. It was observed that the students who chose to have meals from the programme had poorer health than those who did not.

Creating a Neural Network to Predict Periodic Data

Two years ago, Arijit Mondal, a professor of mine who was teaching us a course on Deep Learning, asked how one could make a neural network that predicts periodic data. The students shouted out their own answers, most of which involved some form of recurrent neural network. I had a different answer.

I was not a very good student during the initial semesters of my bachelor's. I missed several classes for lack of motivation, compounded by the change in environment from what I had been used to growing up. But I could recollect some of the things discussed in a math course about Fourier series: just by adding up a bunch of sine waves, the series can approximate many functions to excellent accuracy. How is this any different from the Universal Approximation Theorem that was taught during the initial lectures of this course?
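To make that recollection concrete (the notation here is mine, reconstructed from memory rather than taken from that course): the sawtooth wave f(x) = (x mod P)/P, the same shape used as training data later in this article, can be written entirely as a sum of sines,

$$f(x) = \frac{1}{2} - \frac{1}{\pi} \sum_{n=1}^{\infty} \frac{1}{n}\, \sin\!\left(\frac{2\pi n x}{P}\right),$$

and truncating the sum after even a handful of terms already gives a decent approximation.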

It struck me: the easiest way a neural network can learn periodic data is if the network itself has some kind of periodic activation function. I know I would not have been the first person in the world to notice this, but plenty of people have not noticed it, so I can write an article in the hope that someone comes across it in the future and gives the idea a thought.
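Before the full network, here is a quick numpy sketch (the function name is mine, purely for illustration) of the property I wanted to exploit: |sin(x)| repeats with period pi, so a layer built on it keeps repeating its outputs as the input slides along.

import numpy as np

def periodic_activation(z):
    # |sin(z + pi)| = |-sin(z)| = |sin(z)|, so the function has period pi
    return np.abs(np.sin(z))

z = np.linspace(0.0, 4.0 * np.pi, 100)
print(np.allclose(periodic_activation(z), periodic_activation(z + np.pi)))  # prints True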

I went ahead and implemented this. I did not generate the weights in a reproducible fashion, so if you try to run this at your end, you may not get the same results as I did. I am sharing the code anyway.
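If you do want run-to-run reproducibility, seeding the relevant random generators before building the model would look something like this (a sketch assuming TensorFlow 2; this was not part of my original run):

# Hypothetical seeding step for reproducible weights
import random
import numpy as np
import tensorflow as tf

random.seed(42)
np.random.seed(42)
tf.random.set_seed(42)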

# Importing the libraries
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras import backend as K
from keras.utils import plot_model
import matplotlib.pyplot as plt

# Generating training data: a sawtooth wave with period 10, scaled to [0, 1)
def data_fn_njammale(variable):
    return (variable % 10) / 10

x = np.array([float(i) / 10.0 for i in range(0, 1000)]).reshape(-1, 1)
y = np.array([data_fn_njammale(v) for v in x.ravel()]).reshape(-1, 1)

plt.plot(x, y)
plt.show()

# Custom periodic activation: |sin(x)|, as described below
def custom_activation(tensor):
    return K.abs(K.sin(tensor))

# Neural Network Model
model = Sequential()
model.add(Dense(5, input_dim=1, activation=custom_activation))
model.add(Dense(5))
model.add(Dense(1, activation=custom_activation))
model.summary()
plot_model(model, to_file='model.png')
# Training (accuracy is not meaningful for regression, so only the MSE loss is tracked)
model.compile(loss='mse', optimizer='adam')
model.fit(x, y, epochs=1000, batch_size=32, verbose=2, validation_data=(x, y))

# Prediction on inputs far outside the training range
x_predict = np.array([float(i) / 10.0 for i in range(4000, 5000)]).reshape(-1, 1)
y_act = np.array([data_fn_njammale(v) for v in x_predict.ravel()]).reshape(-1, 1)

predict = model.predict(x_predict)
plt.plot(x_predict, predict, 'r')  # network output in red
plt.plot(x_predict, y_act, 'g')    # true sawtooth in green
plt.show()

As you can see, the training data was generated as y = ((x % 10) / 10) for every x from 0 to 100, incremented by 0.1 at a time, which is a sawtooth wave. |sin(x)| was used as the custom activation function, including at the output layer. The network was then tested against input values ranging from 400 to 500, far outside the training range. Here is the output plot:

[Output plot: the network's prediction in red against the actual sawtooth wave in green, for inputs between 400 and 500.]
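If you would rather have a number than a plot, the extrapolation error can be quantified with sklearn's mean_squared_error (a quick check, assuming the predict and y_act arrays from the code above):

from sklearn.metrics import mean_squared_error

# Average squared error of the extrapolation over inputs in [400, 500)
mse = mean_squared_error(y_act, predict)
print("Out-of-range MSE:", mse)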
Considering that I wrote this network from scratch and trained it on my personal computer just to prove a point, I was pretty happy with my results. I hope this inspires someone to build something more substantial on the idea some day.
