January 24, 2014

ml-201: octave -- part - 1

Hi again! In the previous posts here, here and here, we went through a fairly theoretical and very basic introduction. But starting now (notice the 201 in the title of the article?), we will dig deeper into specific algorithms: how they work, the intuition behind them, and their implementation. But before we go there, we need to get familiar with some background setup.

Firstly, you should have at least some basic knowledge of the following disciplines of mathematics:

Don't worry, you don't need to be a guru; basic knowledge is enough to understand the intuition behind ML algorithms and to implement them correctly.

Furthermore, for this series I will use the Octave programming language because:

  • that is what was used in the Coursera course that I recently completed (which is what prompted this series)
  • it is extremely good for rapid prototyping (most algorithms can be implemented in just a few lines of code)
  • the smartest thing that you can do when implementing an ML algorithm is "rapid prototyping", to avoid getting trapped in a black hole of wrong effort that does not lead to an effective solution but simply wastes time

So without further ado, let's start by getting a little familiar with the Octave programming language:
  • Octave is GNU software, and installers are available for OS X, Windows, and Linux here
  • It is a high-level language intended primarily for numerical computation, which makes it a good fit for ML, and it has good library support
  • However, it is interpreted, and loop-heavy code in particular can run slowly, so it might not be suitable for production programs
  • You can execute Octave commands either from the command line or from a GUI
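To give a rough feel for the "few lines of code" point, here is a minimal sketch (not taken from the course) that computes the mean squared error between two small vectors, first with an explicit loop and then as a single vectorized expression. The variable names and the numbers are made up purely for illustration.

```octave
% Illustrative data (made up for this sketch)
predicted = [2.5; 0.0; 2.0; 8.0];
actual    = [3.0; -0.5; 2.0; 7.0];

% Loop version: accumulate the squared differences one element at a time
total = 0;
for i = 1:length(actual)
  total = total + (predicted(i) - actual(i))^2;
end
mse_loop = total / length(actual);

% Vectorized version: the same computation as one expression
mse_vec = mean((predicted - actual) .^ 2);

printf("loop: %f, vectorized: %f\n", mse_loop, mse_vec);
```

Both versions give the same number; the vectorized form is simply shorter to write, which is what makes quick experiments convenient.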

Octave is very well documented and the complete documentation is available here, but we will go through some of the most commonly used commands (and more during the actual algorithm discussion). I have only used the command line version during the course, and that is what I will explain here. Maybe the GUI would be much easier for you, YMMV.

Basic: The most frequently used command in Octave (for us beginners) is, obviously, the "help" command ;). The basic format of the help command is

help <command-name>

For example, "help plot" shows the help section for the "plot" command.

Some other basic commands are:

  • the '%' character starts a comment. If it comes in the middle of a line, then the part after the '%' is the comment, e.g.
% this whole line is a comment
a = 5; % a is assigned the value 5
  • a ';' at the end of a statement suppresses its output, e.g.
a = 5 % this will print 'a = 5'
a = 5; % this will not print anything
  • inside a matrix definition, the ';' character ends the current row and starts the next one, e.g.
A = [1 2 3; 2 3 4; 3 4 5]

is a 3x3 matrix of the form

1 2 3
2 3 4
3 4 5

  • Matrices (and vectors, which are single-dimension matrices) are "the" way of representing data and performing computations on it when implementing ML algorithms in Octave. So obviously, Octave has great support for matrix creation and manipulation. Some of the frequently used operations are:
    • A = zeros(2, 3) % will create a 2x3 matrix filled with the element zero
    • B = ones(3, 2) % will create a 3x2 matrix filled with the element one
    • B2 = 2 * ones(3, 2) % will create a 3x2 matrix filled with the element two
    • C = rand(3, 3) % will create a 3x3 matrix filled with random numbers drawn uniformly from between 0 and 1
    • D = eye(5) % will create a 5x5 identity matrix
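The constructors above can be tried together in one short session. This is a minimal sketch: the zero matrix is named Z here (instead of A) only so that it does not overwrite the 3x3 example matrix, and "size" is used just to confirm the dimensions.

```octave
A  = [1 2 3; 2 3 4; 3 4 5];   % ';' inside the brackets starts a new row
Z  = zeros(2, 3);             % 2x3 matrix of zeros
B  = ones(3, 2);              % 3x2 matrix of ones
B2 = 2 * ones(3, 2);          % 3x2 matrix of twos
C  = rand(3, 3);              % 3x3 matrix of uniform random numbers between 0 and 1
D  = eye(5);                  % 5x5 identity matrix

size(A)   % no trailing ';', so the result (3 rows, 3 columns) is printed
size(Z)   % printed as well: 2 rows, 3 columns
```

Dropping the trailing ';' from any of the first six lines would print the full matrix, which is handy for small examples like these.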

Loading, saving and plotting data: Of course, the next most common operations that you would perform while implementing an ML algorithm are loading, saving and plotting data. Octave supports these operations via:

  • pwd: shows the current working directory
  • cd <directory>: change to another directory (where data and/or code exists, so that you can load them)
  • data = load(<file-name>): load data from a file into a variable named "data"
  • who: shows what variables are present in the memory in the current scope
  • whos: shows what variables are present in the memory, and shows their size (and other information)
  • clear [<variable-name>+]: remove variable(s) from memory
  • save <file-name> <variable-name>: save the value of a variable (may be a vector or a matrix) into the named file
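Put together, a save-and-reload round trip might look like the sketch below. The file name example_data.txt and the matrix X are hypothetical, made up for this example. The "-ascii" option is my addition: it writes plain numbers without a header, so that "load" hands back a plain matrix rather than a structure.

```octave
pwd                                 % print the current working directory
X = [1 2; 3 4; 5 6];                % some data to save (made up for this sketch)
save -ascii example_data.txt X      % write X as plain text (hypothetical file name)
clear X                             % remove X from memory
data = load('example_data.txt');    % read the numbers back into "data"
whos                                % list the variables in memory with their sizes
```

After this runs, "whos" should list "data" as a 3x2 matrix, and X should no longer appear.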

I think that is a big enough bite for today. We will continue with some more Octave in the next post, so keep watching this space :)