Today I am talking about one of the key concepts in machine learning: the cost function. The cost function measures how well a model's predictions match the actual data. Here, the cost function for linear regression is discussed.

The most basic machine learning model is linear regression. For today's discussion, and for simplicity, only linear regression with a single input variable is considered. More complicated cases will be discussed in later articles.

Let's consider housing prices in Houston, Texas. A training set of housing prices looks like this:

| Size in feet² (X) | Price ($) in 1000's (Y) |
|-------------------|-------------------------|
| 2420              | 520                     |
| 955               | 230                     |
| 1710              | 320                     |
| 2100              | 410                     |

Here, the number of training examples (m) = 4

X = input variable / feature (In this case, Size)

Y = output variable / target variable (In this case, Price)
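As a quick sketch, the training set above can be represented in code (using NumPy arrays; the names X, y, and m simply mirror the notation used in this article):

```python
import numpy as np

# The training set from the table above:
# house sizes in square feet (input X) and prices in $1000s (target y).
X = np.array([2420, 955, 1710, 2100])  # input variable / feature (Size)
y = np.array([520, 230, 320, 410])     # output variable / target (Price)

m = len(X)  # number of training examples
print(m)    # 4
```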

Each row of this table is a training example. What happens when the training examples pass through the learning algorithm? The learning algorithm outputs a function, usually called the hypothesis h. This function takes the input variable x (the size of a house in this example) and outputs y (the estimated price of that house).

Linear regression with one variable, or univariate linear regression, is represented as a linear equation like this:

h(x) = θ₀ + θ₁x

Here h(x) is the hypothesis function that, given the value x (the size of a house), calculates the predicted price of that house. So h(x) actually calculates ŷ, the value predicted by the model for that particular training example; θ₀ and θ₁ are the parameters the learning algorithm must choose.
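A minimal sketch of the hypothesis in code (the parameter values θ₀ = 50 and θ₁ = 0.2 are hypothetical, chosen only to illustrate a prediction, not fitted to the data):

```python
def h(x, theta0, theta1):
    # Univariate linear regression hypothesis: h(x) = theta0 + theta1 * x
    return theta0 + theta1 * x

# With the illustrative parameters theta0 = 50, theta1 = 0.2,
# a 2000 ft^2 house is predicted at 50 + 0.2 * 2000 = 450, i.e. $450,000.
print(h(2000, 50, 0.2))  # 450.0
```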

Now, let's see how to calculate the cost function for this housing price table, or for any other linear regression with one variable. In machine learning, the cost function is usually denoted J(θ₀, θ₁). Here is the equation for the cost function:

J(θ₀, θ₁) = (1 / 2m) Σᵢ₌₁ᵐ (h(xᵢ) − yᵢ)²

That is, J is the average of the squared differences between the predicted prices h(xᵢ) and the actual prices yᵢ, halved for mathematical convenience.
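The cost function can be sketched in code for the housing table above (the parameter values θ₀ = 50 and θ₁ = 0.2 are again hypothetical; a learning algorithm would search for the θ values that make J as small as possible):

```python
import numpy as np

def compute_cost(X, y, theta0, theta1):
    # J(theta0, theta1) = (1 / 2m) * sum over i of (h(x_i) - y_i)^2
    m = len(X)
    predictions = theta0 + theta1 * X  # h(x) for every training example at once
    return np.sum((predictions - y) ** 2) / (2 * m)

# Housing training set from the table above.
X = np.array([2420, 955, 1710, 2100])
y = np.array([520, 230, 320, 410])

# Cost with the illustrative (not fitted) parameters theta0 = 50, theta1 = 0.2.
print(compute_cost(X, y, 50.0, 0.2))  # 1137.625
```

A smaller value of J means the hypothesis line sits closer to the training points; a perfect fit would give J = 0.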