January 30, 2014

ml-305: Univariate Linear Regression

In the last posts here and here, we looked in depth at the "gradient descent" minimization algorithm. Today we will fit that last piece into the jigsaw of the "univariate linear regression" ML algorithm that we have been studying in this 30x series.

To do that, let's recap what we have learnt so far:

  • x => input variable/feature (in "univariate", we have only 1 feature)
  • y => output variable/target (this is the value that we need to predict)
  • m => #training examples
  • h(x) => hypothesis function, defined as

hΘ(x) = Θ0 + Θ1 x

  • J(Θ0, Θ1) = (1/(2m)) * Σ(i=1 to m) (hΘ(x(i)) - y(i))^2 => the cost function (an Octave sketch of it appears right after this recap)
  • Goal is to minimize the cost function, to find the optimal values of Θ0 and Θ1 (to be substituted into the hypothesis function). This is achieved using the gradient descent algorithm, defined as

repeat until convergence {

Θj = Θj - α * ∂/∂Θj J(Θ0, Θ1) (for j = 0 and j = 1)

}

where,

  • α: is called the "learning rate"
  • ∂/∂Θj J(Θ0, Θ1): is the partial derivative of the cost function J(Θ0, Θ1) with respect to Θj
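
To make the recap concrete, here is a minimal Octave sketch of the hypothesis and the cost function. The function name compute_cost and the column-vector arguments are my own illustrative choices, not part of the algorithm itself:

function J = compute_cost(x, y, theta0, theta1)
  % x and y are column vectors holding the m training examples
  m = length(y);
  h = theta0 + theta1 * x;            % hypothesis h(x), for all examples at once
  J = sum((h - y) .^ 2) / (2 * m);    % squared-error cost
end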

The only thing left is to substitute the real value of the partial derivative term inside the gradient descent algorithm. To do that we need to know the partial derivatives of J(Θ0, Θ1) with respect to Θ0 and Θ1. Without going into the derivation of the partial derivatives (because

  • it is too complex and mathematically involved
  • I don't know it ;) and
  • knowing the derivation is not needed to correctly implement gradient descent

), the gradient descent steps (for the cost function of our univariate linear regression) are defined as:

repeat until convergence {

Θ0 = Θ0 - α * (1/m) * Σ(i=1 to m) (hΘ(x(i)) - y(i)) (for j = 0)

Θ1 = Θ1 - α * (1/m) * Σ(i=1 to m) (hΘ(x(i)) - y(i)) * x(i) (for j = 1)

}

That is, we have inserted the actual values of the partial derivatives of J(Θ0, Θ1), with respect to Θ0 and Θ1 respectively, into the gradient descent algorithm. As mentioned previously, while implementing this step in Octave (or any other software), it is extremely important to make both updates simultaneously: compute the new values of both Θ0 and Θ1 from the old values first, and only then overwrite them, so that the Θ1 update does not see an already-updated Θ0. The sketch below shows one way to do this.
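
Here is a minimal Octave sketch of the full loop; the function name gradient_descent, the zero initialization of the thetas, and the fixed iteration count (standing in for "repeat until convergence") are all my own illustrative choices:

function [theta0, theta1] = gradient_descent(x, y, alpha, num_iters)
  % x and y are column vectors holding the m training examples;
  % alpha is the learning rate
  m = length(y);
  theta0 = 0;
  theta1 = 0;
  for iter = 1:num_iters
    h = theta0 + theta1 * x;                             % hypothesis for all examples
    % compute both new values from the OLD thetas first ...
    temp0 = theta0 - alpha * (1/m) * sum(h - y);
    temp1 = theta1 - alpha * (1/m) * sum((h - y) .* x);
    % ... and only then assign them: the simultaneous update
    theta0 = temp0;
    theta1 = temp1;
  end
end

On a toy data set where y = 2x, this should recover values close to Θ0 = 0 and Θ1 = 2:

x = [1; 2; 3; 4];
y = [2; 4; 6; 8];
[theta0, theta1] = gradient_descent(x, y, 0.05, 1000);   % theta0 ≈ 0, theta1 ≈ 2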

With this we have completed learning our first ML algorithm, viz. the "univariate linear regression". I hope you have had as much fun learning it as I have had teaching it. For me, a lot of concepts have become clearer while explaining things here in so much detail! That alone has made the whole effort worthwhile, and it has given me the motivation to continue this series. So in the upcoming posts I will try to explain other, more advanced ML algorithms.

Do leave comments below to share your experience here, especially (constructive) criticism about how I can make these posts better. I am eagerly looking forward to hearing from you :)