Using R in LIONoso

Courtesy of: Matteo Presutto

This page explains how to use R scripts in LIONoso. R is a language for statistical computing, similar to the S language developed at Bell Laboratories (formerly AT&T, now Lucent Technologies) by John Chambers and colleagues. Driving R scripts with inputs from LIONoso, or feeding the results of R calculations back into LIONoso, makes for powerful combinations.
This procedure is based on the parametric table tool, which lets you generate and manipulate data directly in LIONoso using your favorite language. Please get familiar with parametric tables before proceeding.

To use R scripts, LIONoso needs to know where the Rscript executable lies on your computer, so that it knows how to interpret the R script.
You can specify the location by writing "#! interpreter_location" (without the quotation marks) on the first line of your script.
For instance, if you are a Linux or Mac user, the executable is by default at /usr/bin/Rscript. If you are using Windows, you may need to write a wrapper script: see the notes for Windows users below.

To catch the command line inputs, R gives you the following intuitive syntax:

    args <- commandArgs(TRUE);
    argc <- length(args);
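
Note that the arguments arrive as character strings, so numeric parameters need to be converted before use. A minimal sketch, assuming two numeric parameters; the hand-written vector stands in for commandArgs(TRUE), and the names n_tests and n_points are hypothetical:

```r
# Stand-in for commandArgs(TRUE): arguments always arrive as strings.
args <- c("1000", "10")
argc <- length(args)

# Convert each parameter to the type the computation expects.
n_tests  <- as.integer(args[1])   # hypothetical name for the first parameter
n_points <- as.integer(args[2])   # hypothetical name for the second parameter
```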

But LIONoso still doesn't know how to use your script. It requires the script to print to standard output (a simple cat() is enough) a string giving the number of command line parameters, followed by their names and default values.
Print the number of parameters on the first line, then one line per parameter in the form parameter_name + tab character + default_value, as in the following example:

    if ( argc < 2 ) {
        # Called without enough arguments: describe the parameters to LIONoso.
        cat ("2\nNumber of tests\t1000\nNumber of training points\t10\n");
    }

which produces:

    $ ./sample.R
    2
    Number of tests 1000
    Number of training points       10

The lines are printed when the number of arguments is less than 2 (argc was computed earlier), because LIONoso pre-analyzes your script by calling it without arguments and reading its standard output.
LIONoso then parses the output. In our case, Number of tests corresponds to args[1] and Number of training points to args[2].
You can check that everything is correct by looking at the bottom-left panel after loading your script in LIONoso with the parametric table tool. If all is correct, you should see an input box for each parameter.
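
The description string can also be built from a table of defaults instead of being typed by hand. The following is only a sketch: describe_parameters is a hypothetical helper, not part of LIONoso or R, and the defaults are kept as strings so they are printed exactly as written.

```r
# Hypothetical helper: build the parameter description in the format
# described above (count on the first line, then name<TAB>default per line).
describe_parameters <- function(defaults) {
    lines <- paste(names(defaults), defaults, sep = "\t")
    paste0(length(defaults), "\n", paste(lines, collapse = "\n"), "\n")
}

# Named character vector: parameter name = default value.
defaults <- c("Number of tests" = "1000", "Number of training points" = "10")
description <- describe_parameters(defaults)

# In a script, this string would be printed with cat(description)
# when too few arguments are given.
```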

If you cannot see the parameter input boxes, then something is not working. Try loading this file to see a correct example.

Now, LIONoso gets the output of your script through a file.
In particular, it assumes that the last command line argument is the output filename; that is, args[length(args)] must always be interpreted as a filename in your script. LIONoso must not read this file before it is complete. To ensure this, the script can work on a temporary file during the computation and then, at the end, rename it to the output filename given in the arguments. An example follows:

    out_filename <- args[argc];
    tmp_filename <- paste (out_filename, "_", sep = "");
    # ...insert your code here, writing the results to tmp_filename...
    invisible (file.rename(tmp_filename, out_filename));

invisible() is there because LIONoso interprets the standard output, and some R functions print their return value. Be sure to wrap every call that prints something in invisible(), or your script may not work!
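
The write-then-rename pattern can be wrapped in a small function. This is a sketch under stated assumptions: write_then_rename is a hypothetical helper, the placeholder line stands for real results, and tempfile() stands in for the filename that LIONoso passes as the last argument.

```r
# Hypothetical helper: let `writer` fill a temporary file, then rename it
# so that the output file only appears once it is complete.
write_then_rename <- function(out_filename, writer) {
    tmp_filename <- paste(out_filename, "_", sep = "")
    writer(tmp_filename)
    invisible(file.rename(tmp_filename, out_filename))
}

out_filename <- tempfile(fileext = ".csv")   # stand-in for args[argc]
write_then_rename(out_filename, function(path) {
    write("placeholder content", file = path)
})
```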

Last but not least, we need to know what to write in the file. Remember that the parametric table tool parses your table, so you need to specify the name and type of each column.
You specify the type of a column by printing name::type on the first line of the file. The column specifications must be separated by commas. An example:

    header <- "Test number::label,Number of iterations::number,Disagreement::number";
    write (header, file = tmp_filename, append = TRUE);

Here we have three columns (Test number, Number of iterations, Disagreement), each followed by its type. To print the actual values, write one table row per line, separating the column values with commas, below a header like this example:

    Test number::label,Number of iterations::number,Disagreement::number
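
A minimal sketch of writing the header followed by a few rows. All values here are made up for illustration, the "test i" labels are an assumption, and tempfile() stands in for the temporary output file:

```r
tmp_filename <- tempfile()   # stand-in for the temporary output file

header <- "Test number::label,Number of iterations::number,Disagreement::number"
write(header, file = tmp_filename)

# Made-up values, for illustration only.
iterations   <- c(12, 15, 9)
disagreement <- c(0.25, 0.125, 0.5)

# One line per row, column values separated by commas.
for (i in seq_along(iterations)) {
    row <- paste(paste("test", i), iterations[i], disagreement[i], sep = ",")
    write(row, file = tmp_filename, append = TRUE)
}
```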

That's all you need to know to create your first R script parametric table in LIONoso. Here you can download a template file summarizing the process. This other example shows you how to create all results in an internal data.frame structure and save it as a CSV file.
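
The data.frame approach might look like the sketch below. It is not necessarily how the linked example does it: the values are made up, and tempfile() stands in for the output filename passed by LIONoso. check.names = FALSE keeps the name::type column names unchanged, and quote = FALSE prevents R from wrapping strings in quotation marks.

```r
# Collect all results in a data.frame whose column names carry the types.
results <- data.frame(
    "Test number::label"           = c("test 1", "test 2"),
    "Number of iterations::number" = c(12, 15),
    "Disagreement::number"         = c(0.25, 0.125),
    check.names = FALSE,
    stringsAsFactors = FALSE
)

out_filename <- tempfile(fileext = ".csv")   # stand-in for args[argc]
tmp_filename <- paste(out_filename, "_", sep = "")

# Write header and rows in one call, then rename the finished file.
write.table(results, file = tmp_filename, sep = ",",
            quote = FALSE, row.names = FALSE)
invisible(file.rename(tmp_filename, out_filename))
```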

Notes for Windows users

While on most UNIX-based systems (such as Linux and Mac OS X) it is possible to declare the script interpreter in the top line of the script, Windows bases the choice of the interpreter on the filename extension. There can be two types of problems:

  1. The interpreter is installed, but it did not register the file extension (as it happens, e.g., with R)
  2. A specialized application “stole” the file extension and is executed in place of the interpreter (as it happens, e.g., with Canopy, which appropriates the .py extension of Python)

In these cases, it is possible to execute the script from within LIONoso by providing a “wrapper shell script”. In the R case, use a text editor (e.g., Notepad) to create the file R-sample.bat containing the following text:

    @echo off
    "C:\Program Files\R\R-3.1.1\bin\Rscript.exe" sample.R %*

where C:\Program Files\R\R-3.1.1\bin\Rscript.exe must be replaced by the path of the Rscript.exe executable on your system. Next, import this file into the parametric table tool.