Recurrent neural networks for regression

First, the data have to be preprocessed:

  R> laser <- snnsData$laser_1000.pat
  R> inputs <- laser[,inputColumns(laser)]
  R> targets <- laser[,outputColumns(laser)]
  R> patterns <- splitForTrainingAndTest(inputs, targets, ratio = 0.15)

Then, an Elman network can be trained on the laser dataset:

  R> model <- elman(patterns$inputsTrain, patterns$targetsTrain, 
  +    size = c(8, 8), learnFuncParams = c(0.1), maxit = 500, 
  +    inputsTest = patterns$inputsTest, targetsTest = patterns$targetsTest, 
  +    linOut = FALSE)
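
Like the other models created with the high-level interface of the package, the trained network can be applied to new data with the generic predict method; a minimal sketch, here simply re-applying it to the test inputs that were already passed to elman above:

  R> predTest <- predict(model, patterns$inputsTest)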

The following figure shows the data and the resulting fitted values:

  R> plot(inputs, type = "l") 
  R> plot(targets[1:100], type = "l") 
  R> lines(model$fitted.values[1:100], col = "green")

 



Figure 1: (a) The laser example time series. (b) The first 100 values of the series (black), and the corresponding fits (green).

 


The following shows some plots that can be used to analyze the results:

  R> plotIterativeError(model) 
  R> plotRegressionError(patterns$targetsTrain, model$fitted.values) 
  R> plotRegressionError(patterns$targetsTest, model$fittedTestValues) 
  R> hist(model$fitted.values - patterns$targetsTrain)

 



Figure 2: (a) The iterative error plot of both training (black) and test (red) error. (b) Regression plot for the training data, showing a linear fit for the optimal case (black) and for the current data (red). (c) Regression plot for the test data. (d) An error histogram of the training error.
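
Besides the plots, the training and test errors can also be computed numerically, for example as root mean squared errors; a minimal sketch using base R, with model$fittedTestValues holding the fits for the test set as above:

  R> sqrt(mean((model$fitted.values - patterns$targetsTrain)^2)) 
  R> sqrt(mean((model$fittedTestValues - patterns$targetsTest)^2))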

 


Multi-layer perceptron for classification

The following shows classification of the iris dataset with a multi-layer perceptron. The data have to be loaded, shuffled, and normalized:

 

  R> data("iris") 
  R> iris <- iris[sample(1:nrow(iris), length(1:nrow(iris))), 1:ncol(iris)] 
  R> irisValues <-  iris[,1:4] 
  R> irisTargets <- iris[,5] 
  R> irisDecTargets <-  decodeClassLabels(irisTargets) 
  R> iris <-  splitForTrainingAndTest(irisValues, irisDecTargets, ratio = 0.15) 
  R> iris <-  normTrainingAndTestSet(iris)
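
The call to decodeClassLabels above turns the factor of species labels into a binary matrix with one column per class, which is the target format the network expects; the first rows can be inspected with a minimal sketch:

  R> head(irisDecTargets)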

The model is then built with: 

  R> model <-  mlp(iris$inputsTrain, iris$targetsTrain, size = 5, 
  +      learnFuncParams = c(0.1), maxit = 60, inputsTest = iris$inputsTest, 
  +      targetsTest = iris$targetsTest) 
  R> predictions <-  predict(model, iris$inputsTest)

Some error plots can be shown:

  R> plotIterativeError(model) 
  R> plotRegressionError(predictions[,2], iris$targetsTest[,2], pch = 3) 
  R> plotROC(fitted.values(model)[,2], iris$targetsTrain[,2]) 
  R> plotROC(predictions[,2], iris$targetsTest[,2])

 



Figure 3: A multi-layer perceptron trained with the iris dataset. (a) The iterative error plot of both training (black) and test (red) error. (b) The regression plot for the test data. As a classification is performed, ideally only the points (0,0) and (1,1) would be populated. (c) ROC plot for the second class against all other classes, on the training set. (d) Same as (c), but for the test data.

 


Confusion matrices can be obtained by:

  R> confusionMatrix(iris$targetsTrain, fitted.values(model))
         predictions 
  targets  1  2  3 
        1 42  0  0 
        2  0 40  3 
        3  0  1 41
  R> confusionMatrix(iris$targetsTest, predictions)
         predictions 
  targets 1 2 3 
        1 8 0 0 
        2 0 7 0 
        3 0 0 8
  R> confusionMatrix(iris$targetsTrain, 
  +                  encodeClassLabels(fitted.values(model), 
  +                  method = "402040", l = 0.4, h = 0.6))
         predictions 
  targets  0  1  2  3 
        1  0 42  0  0 
        2  5  0 38  0 
        3  3  0  0 39
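
With the 402040 method of encodeClassLabels, a pattern is only assigned to a class if the corresponding output is above h and all other outputs are below l; otherwise it is counted as unclassified (class 0). From any of these confusion matrices, the overall accuracy can be computed with base R; a minimal sketch:

  R> cm <- confusionMatrix(iris$targetsTrain, fitted.values(model)) 
  R> sum(diag(cm)) / sum(cm)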

Multi-layer perceptron with tuning

This example is the same as the previous one, but adds a grid search over the hyper-parameters nHidden (the number of hidden units) and learnRate (the learning rate). We construct a grid from all combinations of the hyper-parameters and train an MLP for every configuration:

 

  R> parameterGrid <-  expand.grid(c(3,5,9,15), c(0.00316, 0.0147, 0.1)) 
  R> colnames(parameterGrid) <- c("nHidden", "learnRate") 
  R> rownames(parameterGrid) <- paste("nnet-", apply(parameterGrid, 1, function(x) {paste(x, sep="", collapse="-")}), sep="") 
  R> models <- apply(parameterGrid, 1, function(p) { 
  + 
  +        mlp(iris$inputsTrain, iris$targetsTrain, size=p[1], learnFunc="Std_Backpropagation", 
  +            learnFuncParams=c(p[2], 0.1), maxit=200, inputsTest=iris$inputsTest, 
  +            targetsTest=iris$targetsTest) 
  +      })

We can show iterative errors for all of these configurations:

 


  R> par(mfrow=c(4,3)) 
  R> for(modInd in 1:length(models)) { 
  +    plotIterativeError(models[[modInd]], main=names(models)[modInd]) 
  +  }


Figure 4: Iterative training and test error plots for the different hyper-parameter configurations.

 


We can then compute the training and test errors (here, the square root of the summed squared errors) and determine which models performed best:

  R> trainErrors <-  data.frame(lapply(models, function(mod) { 
  +            error <- sqrt(sum((mod$fitted.values - iris$targetsTrain)^2)) 
  +            error 
  +          })) 
  R> testErrors <-  data.frame(lapply(models, function(mod) { 
  +        pred <-  predict(mod,iris$inputsTest) 
  +        error <-  sqrt(sum((pred - iris$targetsTest)^2)) 
  +        error 
  +      })) 
  R> t(trainErrors)
                      [,1] 
  nnet.3.0.00316  7.275746 
  nnet.5.0.00316  6.917106 
  nnet.9.0.00316  6.719024 
  nnet.15.0.00316 6.336768 
  nnet.3.0.0147   5.168279 
  nnet.5.0.0147   4.522160 
  nnet.9.0.0147   4.537915 
  nnet.15.0.0147  4.896709 
  nnet.3.0.1      2.389132 
  nnet.5.0.1      2.317903 
  nnet.9.0.1      2.266031 
  nnet.15.0.1     2.261701
  R> t(testErrors)
                       [,1] 
  nnet.3.0.00316  3.0872282 
  nnet.5.0.00316  2.9157225 
  nnet.9.0.00316  2.8615482 
  nnet.15.0.00316 2.7507271 
  nnet.3.0.0147   2.2629653 
  nnet.5.0.0147   2.0537917 
  nnet.9.0.0147   2.1022305 
  nnet.15.0.0147  2.2874399 
  nnet.3.0.1      0.6418449 
  nnet.5.0.1      0.6029194 
  nnet.9.0.1      0.5445354 
  nnet.15.0.1     0.5785385
  R> trainErrors[which(min(trainErrors) == trainErrors)]
    nnet.15.0.1 
  1    2.261701
  R> testErrors[which(min(testErrors) == testErrors)]
    nnet.9.0.1 
  1  0.5445354
  R> model <- models[[which(min(testErrors) == testErrors)]]
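
The values shown above are square roots of summed squared errors; dividing by the square root of the number of entries in the corresponding target matrix converts them to root mean squared errors, which are comparable across the differently sized training and test sets. A minimal sketch:

  R> t(trainErrors) / sqrt(length(iris$targetsTrain)) 
  R> t(testErrors) / sqrt(length(iris$targetsTest))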

Finally, we show some characteristics of the model:

  R> model
  Class: mlp->rsnns 
  Number of inputs: 4 
  Number of outputs: 3 
  Maximal iterations: 200 
  Initialization function: Randomize_Weights 
  Initialization function parameters: -0.3 0.3 
  Learning function: Std_Backpropagation 
  Learning function parameters: 0.1 0.1 
  Update function:Topological_Order 
  Update function parameters: 0 
  Patterns are shuffled internally: TRUE 
  Compute error in every iteration: TRUE 
  Architecture Parameters: 
  $size 
  nHidden 
        9 
   
  All members of model: 
   [1] "nInputs"               "maxit" 
   [3] "initFunc"              "initFuncParams" 
   [5] "learnFunc"             "learnFuncParams" 
   [7] "updateFunc"            "updateFuncParams" 
   [9] "shufflePatterns"       "computeIterativeError" 
  [11] "snnsObject"            "archParams" 
  [13] "IterativeFitError"     "IterativeTestError" 
  [15] "fitted.values"         "fittedTestValues" 
  [17] "nOutputs"

  R> summary(model)
  SNNS network definition file V1.4-3D 
  generated at Tue Jun 14 18:29:56 2011 
   
  network name : RSNNS_untitled 
  source files : 
  no. of units : 16 
  no. of connections : 63 
  no. of unit types : 0 
  no. of site types : 0 
   
   
  learning function : Std_Backpropagation 
  update function   : Topological_Order 
   
   
  unit default section : 
   
  act      | bias     | st | subnet | layer | act func     | out func 
  ---------|----------|----|--------|-------|--------------|------------- 
   0.00000 |  0.00000 | i  |      0 |     1 | Act_Logistic | Out_Identity 
  ---------|----------|----|--------|-------|--------------|------------- 
   
   
  unit definition section : 
   
  no. | typeName | unitName   | act      | bias     | st | position | act func     | out func | sites 
  ----|----------|------------|----------|----------|----|----------|--------------|----------|------- 
    1 |          | Input_1    | -0.79506 |  0.03683 | i  | 1,0,0    | Act_Identity |          | 
    2 |          | Input_2    | -0.78576 | -0.16501 | i  | 2,0,0    | Act_Identity |
    3 |          | Input_3    |  0.08197 | -0.06414 | i  | 3,0,0    | Act_Identity |          | 
    4 |          | Input_4    |  0.27179 | -0.03364 | i  | 4,0,0    | Act_Identity |          | 
    5 |          | Hidden_2_1 |  0.09709 | -4.00547 | h  | 1,2,0    |              |          | 
    6 |          | Hidden_2_2 |  0.53819 |  0.08860 | h  | 2,2,0    ||| 
    7 |          | Hidden_2_3 |  0.79931 |  0.73592 | h  | 3,2,0    ||| 
    8 |          | Hidden_2_4 |  0.51168 | -0.07974 | h  | 4,2,0    ||| 
    9 |          | Hidden_2_5 |  0.15274 | -1.41743 | h  | 5,2,0    ||| 
   10 |          | Hidden_2_6 |  0.81100 |  0.84060 | h  | 6,2,0    ||| 
   11 |          | Hidden_2_7 |  0.07041 | -1.76934 | h  | 7,2,0    ||| 
   12 |          | Hidden_2_8 |  0.86656 |  3.65122 | h  | 8,2,0    ||| 
   13 |          | Hidden_2_9 |  0.30518 | -0.22167 | h  | 9,2,0    ||| 
   14 |          | Output_1   |  0.04055 | -0.98037 | o  | 1,4,0    ||| 
   15 |          | Output_2   |  0.93907 | -0.47483 | o  | 2,4,0    ||| 
   16 |          | Output_3   |  0.02019 | -0.31134 | o  | 3,4,0    ||| 
  ----|----------|------------|----------|----------|----|----------|--------------|----------|------- 
   
   
  connection definition section : 
   
  target | site | source:weight 
  -------|------|--------------------------------------------------------------------------------------------------------------------- 
       5 |      |  4: 2.78608,  3: 3.35089,  2:-0.67858,  1:-0.26467 
       6 |      |  4:-1.06826,  3:-0.63044,  2:-0.63331,  1: 0.11463 
       7 |      |  4: 1.34646,  3: 0.91378,  2:-0.59768,  1: 0.33257 
       8 |      |  4:-0.40551,  3:-0.14695,  2:-0.44060,  1: 0.12259 
       9 |      |  4:-1.19147,  3:-1.54435,  2: 0.72196,  1:-0.90792 
      10 |      |  4: 1.10047,  3: 1.21173,  2:-0.63680,  1: 0.35580 
      11 |      |  4:-1.85703,  3:-1.92190,  2: 1.06411,  1:-0.86448 
      12 |      |  4:-2.74663,  3:-3.10505,  2: 0.57721,  1: 0.40978 
      13 |      |  4:-0.14264,  3:-1.01625,  2: 0.64862,  1:-0.03857 
      14 |      | 13: 0.46342, 12: 0.86761, 11: 2.31822, 10:-1.77170,  9: 1.05294,  8:-0.45102,  7:-1.59758,  6:-0.57797,  5:-1.49443 
      15 |      | 13:-0.88071, 12: 3.73700, 11:-3.23268, 10: 0.72368,  9:-2.13537,  8: 0.00212,  7: 0.69394,  6: 0.22842,  5:-4.85466 
      16 |      | 13:-0.69516, 12:-4.26080, 11:-2.05875, 10: 0.57374,  9:-1.15755,  8:-0.41955,  7: 0.90243,  6:-1.24199,  5: 3.62159 
  -------|------|---------------------------------------------------------------------------------------------------------------------
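
The connection weights listed in the summary can also be extracted programmatically; a minimal sketch, assuming the weightMatrix extractor function of RSNNS, which returns the weights between all units of the network:

  R> weights <- weightMatrix(model) 
  R> dim(weights)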

Self-organizing map for clustering

We build a self-organizing map using the iris data; the targets are not used for training, but only to label the resulting map:

 

  R> model <- som(irisValues, mapX = 16, mapY = 16, maxit = 500, 
  +                  targets = irisTargets)
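
The component model$map holds, for each unit, the number of training patterns for which that unit won, so a quick sanity check is possible before plotting; a minimal sketch using base R:

  R> sum(model$map)      # one winning unit per pattern, so this equals nrow(irisValues) 
  R> sum(model$map > 0)  # number of units that won at least one pattern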

  R> plotActMap(model$map, col = rev(heat.colors(12))) 

  R> plotActMap(log(model$map + 1), col = rev(heat.colors(12))) 
  R> persp(1:model$archParams$mapX, 1:model$archParams$mapY, log(model$map + 1), 
  +        theta = 30, phi = 30, expand = 0.5, col = "lightblue") 
  R> plotActMap(model$labeledMap)

 



Figure 5: A SOM trained with the iris dataset. (a) A heat map showing for each unit the number of patterns for which it won, from no patterns (white) to many patterns (red). (b) Same as (a), but on a logarithmic scale. (c) Same as (b), but as a perspective plot instead of a heat map. (d) Labeled map, showing for each unit the class to which the majority of the patterns for which it won belong.

 


  R> for(i in 1:ncol(irisValues)) plotActMap(model$componentMaps[[i]], 
  +                           col = rev(topo.colors(12)))

 



Figure 6: (a)-(d) Component maps for the SOM trained with the iris data. As the iris dataset has four inputs, there are four component maps, each showing where in the map the corresponding input leads to high activation.

 


An ART2 network

We build an ART2 model for the corner data of a tetrahedron in the following way:

 

  R> patterns <-  snnsData$art2_tetra_med.pat 
  R> model <- art2(patterns, f2Units = 5, 
  +                   learnFuncParams = c(0.99, 20, 20, 0.1, 0), 
  +                   updateFuncParams = c(0.99, 20, 20, 0.1, 0))

To visualize this example, we use the scatterplot3d package for three-dimensional scatter plots:

 

  R> library("scatterplot3d") 
  R> scatterplot3d(patterns, pch=encodeClassLabels(model$fitted.values))

 



Figure 7: An ART2 network trained with noisy input data that represent the corners of a three-dimensional tetrahedron. (a) Data with medium noise level; the method correctly assumes four different clusters and clusters the points correctly. (b) Data with high noise level; the method generates three clusters. In particular, the points of two of the corners end up in one cluster even though they are not spatially near each other. As ART2 uses normalized vectors, it can only take the direction of the vectors into account, which yields this unintuitive result.
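
The cluster sizes can also be inspected by tabulating the encoded winner units, using the same encoding as for the plot symbols above; a minimal sketch using base R:

  R> table(encodeClassLabels(model$fitted.values))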

 


Associative memory for letters

This example trains an associative memory on the letters data of SNNS and shows unit activation maps:

 


  R> patterns <-  snnsData$art1_letters.pat 
  R> model <-  assoz(patterns, dimX=7, dimY=5) 
  R> actMaps <-  matrixToActMapList(model$fitted.values, nrow=7) 
  R> par(mfrow=c(3,3)) 
  R> for (i in 1:9) plotActMap(actMaps[[i]])


Figure 8: Activation maps of the letters A to I in the associative memory.
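
To illustrate the associative recall, a stored pattern can be perturbed and presented to the network again. The following is only a sketch: it assumes that the generic predict method can be applied to the assoz model in the same way as to the other models of the package, and the choice of pattern and flipped pixels is arbitrary:

  R> noisy <- patterns[1, , drop = FALSE]  # take the first letter 
  R> noisy[1, 1:3] <- 1 - noisy[1, 1:3]    # flip three of its pixels 
  R> recalled <- predict(model, noisy) 
  R> plotActMap(matrix(recalled, nrow = 7))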

 


Radial basis function network for function fitting

This example trains an RBF network on a noisy version of the sine function. First, we generate the data:

 

  R> inputs <-  as.matrix(seq(0,10,0.1)) 
  R> outputs <-  as.matrix(sin(inputs) + runif(inputs*0.2)) 
  R> outputs <-  normalizeData(outputs, "0_1")

Then, we train the network:

 

  R> model <-  rbf(inputs, outputs, size=40, maxit=1000, initFuncParams=c(0, 1, 0, 0.01, 0.01), 
  +      learnFuncParams=c(1e-8, 0, 1e-8, 0.1, 0.8), linOut=TRUE)

And finally, we show the results:

 


  R> par(mfrow=c(2,1)) 
  R> plotIterativeError(model) 
  R> plot(inputs, outputs) 
  R> lines(inputs, fitted(model), col="green")


Figure 9: The results of fitting an RBF network to a noisy sine function.
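
The quality of the fit can also be quantified numerically, for example as the root mean squared error on the training data; a minimal sketch using base R:

  R> sqrt(mean((fitted(model) - outputs)^2))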