Learning From Data - Homework 2 - A solution in LIONoso

Courtesy of Giovanni Pellegrini

Among the exercises of the second week, we provide a solution to the Linear Regression exercises (exercises 5, 6 and 7). The script is written in Python and uses the numpy module.
Before proceeding, make sure that Python and numpy are installed on your computer.

Connecting the Linear Regression Python script to LIONoso

You can download the Linear Regression Algorithm script, which contains our solution, here.
Please see the notes for Windows users below if the script doesn't work on Windows.

To load the script, drag a Parametric table element into the workbench and specify the filename of your script.

In the above figure we just loaded the script.
The left panel lists the parameters that the script exposes, and you can set their values there. In our case these are the number of experiments to perform (default 1000), the size of the training set for Linear Regression (default 100) and the size of the training set for the Perceptron (default 10).

Clicking the "Compute" button launches the script and produces a table with the results of each experiment. Each row of the output has 4 columns: table row number, in-sample error (Ein), out-of-sample error (Eout), and the number of iterations the perceptron needs to converge.
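To make the meaning of the output columns concrete, here is a minimal sketch of what one experiment might look like. It is not the downloadable script: the function names, the random-line target on [-1, 1]^2, the 1000 fresh test points used for the out-of-sample error, and the choice to start the perceptron from linear regression weights are all assumptions made for illustration.

```python
import numpy as np


def random_target(rng):
    """Weights of a random line through two points in [-1, 1]^2 (assumed target)."""
    p, q = rng.uniform(-1, 1, size=(2, 2))
    # In homogeneous coordinates (1, x1, x2) the line through p and q is their cross product.
    return np.cross([1.0, p[0], p[1]], [1.0, q[0], q[1]])


def sample(rng, n, w_target):
    """Draw n uniform points and label them with the sign of the target."""
    X = np.column_stack([np.ones(n), rng.uniform(-1, 1, size=(n, 2))])
    return X, np.sign(X @ w_target)


def misclassified_fraction(w, X, y):
    """Fraction of points whose predicted sign differs from the label."""
    return float(np.mean(np.sign(X @ w) != y))


def run_experiment(rng, n_lr=100, n_pla=10, n_test=1000, max_iter=10000):
    w_target = random_target(rng)

    # Linear regression via least squares, then in-sample and out-of-sample error.
    X, y = sample(rng, n_lr, w_target)
    w_lr = np.linalg.lstsq(X, y, rcond=None)[0]
    e_in = misclassified_fraction(w_lr, X, y)
    e_out = misclassified_fraction(w_lr, *sample(rng, n_test, w_target))

    # Perceptron on a smaller set, started from linear regression weights (assumption).
    Xp, yp = sample(rng, n_pla, w_target)
    w = np.linalg.lstsq(Xp, yp, rcond=None)[0]
    iterations = 0
    while iterations < max_iter:
        wrong = np.flatnonzero(np.sign(Xp @ w) != yp)
        if wrong.size == 0:
            break
        i = rng.choice(wrong)      # pick a random misclassified point
        w = w + yp[i] * Xp[i]      # standard perceptron update
        iterations += 1
    return e_in, e_out, iterations


rng = np.random.default_rng(0)
# 200 experiments keep the sketch quick; the tutorial's default is 1000.
results = np.array([run_experiment(rng) for _ in range(200)])
e_in_avg, e_out_avg, iter_avg = results.mean(axis=0)
print(e_in_avg, e_out_avg, iter_avg)
```

Each row of `results` corresponds to one row of the output table (without the row number), so the column means play the role of the averages discussed in the rest of this page.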

To compute the average of each output column, you can proceed as follows:
Open a Bubblechart from the output table (right-click on the generated table and select "New panel"->"Bubble"), then drag the "Ein" column onto the y axis. Next, select the "Advanced properties" tab in the left panel and select "Show polynomial fit" with degree 0. A red line will appear on the plot showing the average value.

Now repeat the same steps for the other two columns: drag each one onto the y axis and fit it.
To zoom in on a specific part of the chart, just select the region.
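The degree-0 fit works as an average because the least-squares constant that best fits a set of values is their mean. The following small check uses numpy's `polyfit` to illustrate this; the five error values are made up for the example and are not results from the script.

```python
import numpy as np

# Made-up in-sample errors, one per table row (illustrative values only).
ein = np.array([0.03, 0.05, 0.04, 0.02, 0.06])
rows = np.arange(len(ein))

# A polynomial of degree 0 is a single constant fitted by least squares.
(constant,) = np.polyfit(rows, ein, 0)

print(constant, ein.mean())
```

The two printed numbers should agree up to floating-point rounding, which is why the red line in the chart can be read as the column average.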


Results

The results of the tests (1000 experiments, 100 points for Linear Regression training, 10 points for Perceptron training) are:

Ein:
Average: 0.039

Eout:
Average: 0.048

Iterations of the perceptron:
Average: 5

Notes for Windows users

While on most UNIX-based systems (such as Linux and Mac OS X) it is possible to declare the script interpreter in the top line of the script, Windows bases the choice of the interpreter on the filename extension. There can be two types of problems:

  1. The interpreter is installed, but it did not register the file extension (as happens, e.g., with R).
  2. A specialized application “stole” the file extension and is executed in place of the interpreter (as happens, e.g., with Canopy, which appropriates the .py extension of Python).
In these cases, you can execute the script from within LIONoso by providing a “wrapper shell script”. In the Python case, use a text editor (e.g., Notepad) to create the file Exercise5-Regression-b.bat containing the following text:
        @echo off
        C:\Python27\python.exe Exercise5-Regression.py %*
where C:\Python27\python.exe must be replaced by the path of the python.exe executable on your system. Next, import this file into the Parametric table.
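If you are not sure where python.exe is located, one way to find out is to ask the interpreter itself. The snippet below is a small sketch that assumes you can start the Python installation you want LIONoso to use (for example from its own shortcut or console); the variable name `interpreter_path` is only illustrative.

```python
import sys

# Absolute path of the interpreter that is running this snippet.
interpreter_path = sys.executable
print(interpreter_path)
```

The printed path is the one to put in place of C:\Python27\python.exe in the wrapper file.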