We start with another classification problem. The GunPoint dataset comes from the video surveillance domain. The description by Chotirat Ann Ratanamahatana and Eamonn Keogh in "Everything you know about Dynamic Time Warping is Wrong" is as follows:

“…The two classes are:

- Gun-Draw: The actors have their hands by their sides. They draw a replicate gun from a hip-mounted holster, point it at a target for approximately one second, then return the gun to the holster, and their hands to their sides.
- Point: The actors have their hands by their sides. They point with their index fingers to a target for approximately one second, and then return their hands to their sides.

For both classes, we tracked the centroid of the actor's right hands in both X- and Y-axes, which appear to be highly correlated; therefore, in this experiment, we only consider the X-axis for simplicity…". The following figure shows a video sequence of the GunPoint problem taken from "Everything you know about Dynamic Time Warping is Wrong".

The GunPoint dataset is available in the **LPStimeSeries** R package. We install the package and load the dataset with the following code:

```r
install.packages("LPStimeSeries")
library("LPStimeSeries")
data(GunPoint)
```

GunPoint is a list containing a training dataset and a testing dataset, provided as separate matrices. First, we obtain a single dataset as the union of the testing and training partitions. Then we perform a partition that simulates the semi-supervised context.

```r
library("ssc")

x <- rbind(GunPoint$trainseries, GunPoint$testseries) # instances
y <- c(GunPoint$trainclass, GunPoint$testclass)       # classes

set.seed(1) # set seed

tra.idx <- sample(x = length(y), size = ceiling(length(y) * 0.5))
xtrain <- x[tra.idx, ] # training instances
ytrain <- y[tra.idx]   # related classes

tra.na.idx <- sample(x = length(tra.idx),
                     size = ceiling(length(tra.idx) * 0.7))
ytrain[tra.na.idx] <- NA # remove classes from 70% of instances

xttest <- xtrain[tra.na.idx, ]   # unlabeled training instances
yttest <- y[tra.idx][tra.na.idx] # their true classes

tst.idx <- setdiff(1:length(y), tra.idx)
xitest <- x[tst.idx, ] # test instances
yitest <- y[tst.idx]   # associated classes
```

Now we compute the distance matrices needed for the training phase:

```r
library(proxy) # load package

dtrain <- as.matrix(dist(x = xtrain, method = "euclidean",
                         by_rows = TRUE))
ditest <- as.matrix(dist(x = xitest, y = xtrain,
                         method = "euclidean", by_rows = TRUE))
dttest <- as.matrix(dist(x = xttest, y = xtrain,
                         method = "euclidean", by_rows = TRUE))
```

Using these distance matrices, we train five semi-supervised models available in the **ssc** package.

```r
m.selft  <- selfTraining(x = dtrain, y = ytrain)
m.setred <- setred(x = dtrain, y = ytrain)
m.snnrce <- snnrce(x = dtrain, y = ytrain)
m.trit   <- triTraining(x = dtrain, y = ytrain)
m.cobc   <- coBC(x = dtrain, y = ytrain)
```

To determine the most accurate classifier, we compare the classification results obtained with each semi-supervised model. All statistics are collected in a matrix and plotted as a barplot for easy comparison. We use the **statistics** function available in the ssc package.

```r
matrix.stat <- matrix(nrow = 3, ncol = 5)

d <- ditest[, m.selft$included.insts]
p.selft <- predict(m.selft, d) # classify with selfTraining
matrix.stat[, 1] <- unlist(statistics(p.selft, yitest))

d <- ditest[, m.setred$included.insts]
p.setred <- predict(m.setred, d) # classify with setred
matrix.stat[, 2] <- unlist(statistics(p.setred, yitest))

d <- ditest[, m.snnrce$included.insts]
p.snnrce <- predict(m.snnrce, d) # classify with snnrce
matrix.stat[, 3] <- unlist(statistics(p.snnrce, yitest))

d <- ditest[, m.trit$included.insts]
p.trit <- predict(m.trit, d) # classify with triTraining
matrix.stat[, 4] <- unlist(statistics(p.trit, yitest))

d <- ditest[, m.cobc$included.insts]
p.cobc <- predict(m.cobc, d) # classify with coBC
matrix.stat[, 5] <- unlist(statistics(p.cobc, yitest))

barplot(matrix.stat, beside = TRUE,
        names.arg = c("SelfT", "SETRED", "SNNRCE", "TriT", "coBC"),
        ylim = c(0.6, 1),
        density = c(10, 30, 40), angle = c(45, 135, 60),
        col = c("cadetblue2", "cadetblue3", "cadetblue"),
        ylab = "Classification statistics", xpd = FALSE,
        main = "Classification of GunPoint dataset",
        legend = c("kappa", "Accuracy", "F-Measure"),
        args.legend = list(x = "top", ncol = 3))
```

In the specific case of the **triTraining** and **coBC** functions, it is possible to obtain different results across runs of the training functions. This is caused by the randomness involved in the bagging process performed at the initial stage of training.
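To make such runs reproducible, the random number generator seed can be fixed immediately before each stochastic call, just as we did with `set.seed(1)` when partitioning the data. The following base-R sketch (a hypothetical illustration, not part of the ssc workflow) shows that a fixed seed yields identical bootstrap draws, which is exactly what bagging relies on:

```r
# Identical seeds produce identical bootstrap samples, so a
# bagging-based trainer invoked right after set.seed() is deterministic.
set.seed(1)
bag1 <- sample(150, size = 150, replace = TRUE) # one bootstrap draw
set.seed(1)
bag2 <- sample(150, size = 150, replace = TRUE) # the same draw again
identical(bag1, bag2) # TRUE
```

Calling `set.seed` with a fixed value right before `triTraining` or `coBC` therefore pins down the bagging resamples and makes the trained models repeatable across runs.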