### MATLAB >> k fold cross validation algorithm

by **Greg Heath** » Tue, 05 May 2009 18:28:00 GMT

You will need a validation set if you are using Early Stopping and/or need to make unbiased estimates of hyperparameters (e.g., learning rate, momentum coefficient, classification threshold, etc.). Use the partition trn/val/tst: 8/1/1. Make sure that each val-tst pair is unique, i.e., don't use subsets p and q for val and test and then use them again for test and val. Try the indexing (10,1),(1,2),...,(9,10).
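A minimal sketch of that rotation (0-based subset indices; the helper name is mine, not from the post): fold k tests on subset k and validates on the subset just before it, so all ten (val, tst) pairs are distinct.

```python
# Hypothetical helper illustrating the (10,1),(1,2),...,(9,10) rotation,
# using 0-based subset indices.
def kfold_partitions(n_folds=10):
    partitions = []
    for k in range(n_folds):
        tst = k
        val = (k - 1) % n_folds              # subset just before the test subset
        trn = [s for s in range(n_folds) if s not in (val, tst)]
        partitions.append((trn, val, tst))   # 8/1/1 split of the subsets
    return partitions
```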

Yes! A brand new random weight initialization every time. If you use the weights from fold j-1 to initialize the net at fold j, you violate the independent unbiased holdout status of the test set: it was part of the design (training or validation) set at all previous folds.
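A minimal sketch of the fold loop with a fresh draw of initial weights at every fold (`init_weights` and the weight range are placeholders, not from the post):

```python
import random

# Placeholder initializer -- the point is only that it is called anew
# inside the fold loop, so fold j never inherits fold j-1's weights.
def init_weights(n_weights, rng):
    return [rng.uniform(-0.5, 0.5) for _ in range(n_weights)]

rng = random.Random(0)
fold_inits = [init_weights(4, rng) for fold in range(10)]  # 10 fresh initializations
```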

Confusion matrix implies a classifier. Are there only two classes? Are you using one output with logsig output activation and unipolar binary {0,1} targets? Are you using a threshold of T = 0.5 to construct the confusion matrix?

If so, I would postpone this step, because one of the purposes of the ROC analysis is to

1. Use the validation set to find the optimal value of T to optimize some objective (e.g., equal error rate, fixed error rate for false positives, or minimum cost-weighted risk).
2. Use that value of T and the test set to construct the confusion matrix.

There will be a different value of T and a corresponding test set confusion matrix for each fold.

It is wise to keep a complete record of every fold, just in case you want to modify your criterion for choosing T:
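A minimal sketch of step 2 (the function name and the strict y > T convention are my assumptions):

```python
# Hypothetical 2x2 confusion matrix for one fold's test set, classifying
# outputs y > T as "Positive" (the {0,1}-target, logsig setup above).
def confusion_matrix(pos_y, neg_y, T):
    tp = sum(y > T for y in pos_y)           # true positives
    fp = sum(y > T for y in neg_y)           # false positives
    return [[tp, len(pos_y) - tp],           # [TP, FN]
            [fp, len(neg_y) - fp]]           # [FP, TN]
```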

For each input keep track of

1. Fold
2. Class membership: "Positive" or "Negative"
3. Subset membership: "Training", "Validation" or "Test"
4. Output value (0 < y < 1 for logsig).

For the training, validation and test sets:

1. Sort y for each class.
2. Obtain CDF vs y for positives.
3. Obtain 1-CDF vs y for negatives.
4. Combine to obtain ROC coordinates.
5. Determine T from the validation ROC.
6. Obtain the confusion matrix from the test ROC.
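The sort/CDF recipe above can be sketched as follows (function names and the equal-error-rate criterion are illustrative; the post lists several possible objectives):

```python
from bisect import bisect_right

def roc_points(pos_y, neg_y):
    """(FPR, TPR, T) for candidate thresholds T taken from the outputs,
    classifying y > T as positive: TPR = 1 - CDF_pos(T), FPR = 1 - CDF_neg(T)."""
    pos, neg = sorted(pos_y), sorted(neg_y)
    points = []
    for t in sorted(set(pos + neg)):
        tpr = 1 - bisect_right(pos, t) / len(pos)   # 1 - CDF of positives at t
        fpr = 1 - bisect_right(neg, t) / len(neg)   # 1 - CDF of negatives at t
        points.append((fpr, tpr, t))
    return points

def threshold_at_eer(points):
    """Pick T where FPR is closest to the miss rate 1 - TPR (equal error rate)."""
    return min(points, key=lambda p: abs(p[0] - (1 - p[1])))[2]
```

In practice you would call `roc_points` on the validation outputs to choose T, then apply that T to the test outputs.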

It is interesting to demonstrate the importance of independent holdout validation and testing by comparing the values at the optimum points on the three ROC curves.

> From adding the 10 test set confusion matrices

Since you have a complete record of every fold, you can obtain 2 ROCs:

1. Training: For each class, sort all of the training set outputs.
2. Nontraining (validation + test): For each class, sort the mixture of validation and test set outputs.
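Pooling the ten test-set confusion matrices is just an element-wise sum (the per-fold numbers below are made up for illustration):

```python
# Two made-up per-fold test confusion matrices in [[TP, FN], [FP, TN]] form.
fold_matrices = [[[8, 2], [1, 9]],
                 [[7, 3], [2, 8]]]

# Element-wise sum over folds gives the pooled test-set confusion matrix.
pooled = [[sum(m[i][j] for m in fold_matrices) for j in range(2)]
          for i in range(2)]
```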

Hope this helps.

Greg