MLCC - Laboratory 2 - Regularization networks
This lab is about Regularized Least Squares (linear and non-linear).
Follow the instructions below. Think hard before you call the instructors!
Download: the zipfile (unzip it in a local folder)
1. Warm up - data generation
We start again from data generation, using the function MixGauss:
- 1.A Generate a 2-class training set where the classes are centered on (0,0) and (1,1) and have variance 0.25 and 0.35 respectively (100 points per class): input X, output Y. Adjust the output labels so as to assign them values in {-1,1}: Y=(Y==2)+(Y==1)*(-1);
- 1.B Generate the corresponding test set with 500 points per class: input XT, output YT
- 1.C Add some noise to the previously generated data with the function flipLabels (type "help flipLabels" for some guidance). You will obtain new training and test output vectors Yn, YTn. A sketch of steps 1.A-1.C follows this list.
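A minimal sketch of steps 1.A-1.C, under assumed signatures: here MixGauss is assumed to take the class centers as columns, the per-class standard deviations and the number of points per class, and flipLabels a label vector plus the percentage of labels to flip (check "help MixGauss" and "help flipLabels" to confirm):

% centers (0,0) and (1,1) as columns; if MixGauss expects standard
% deviations, pass the square roots of the variances 0.25 and 0.35
% (if it expects variances directly, pass them as they are)
[X, Y]   = MixGauss([0 1; 0 1], [sqrt(0.25), sqrt(0.35)], 100);  % training set
[XT, YT] = MixGauss([0 1; 0 1], [sqrt(0.25), sqrt(0.35)], 500);  % test set
Y  = (Y==2)  + (Y==1)*(-1);     % labels in {-1,1}
YT = (YT==2) + (YT==1)*(-1);
Yn  = flipLabels(Y, 10);        % assumed: flip 10% of the training labels
YTn = flipLabels(YT, 10);       % assumed: flip 10% of the test labels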
Plot the various datasets with the function scatter, e.g.:
figure;
hold on
scatter(X(Y==1,1), X(Y==1,2), '.b');
scatter(X(Y==-1,1), X(Y==-1,2), '.y');
title('training set')
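The same recipe works for the other datasets; for instance, the test set with the noisy labels from 1.C:

figure;
hold on
scatter(XT(YTn==1,1), XT(YTn==1,2), '.b');
scatter(XT(YTn==-1,1), XT(YTn==-1,2), '.y');
title('test set with flipped labels')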
2. Linear RLS
- 2.A Have a look at the code of functions
regularizedLSTrain and regularizedLSTest, and see if they look the way
you expected them (compare them with what we discussed in class)
- 2.B Try the functions on the 2-class data previously generated in section 1.A. Pick a "reasonable" lambda. (A sketch covering 2.B-2.D follows this list.)
- 2.C As you did yesterday, perform parameter selection with the function holdoutCV, looking for a good lambda in the range {exp(-10), ..., exp(0)}:
[l, s, Vm, Vs, Tm, Ts] = holdoutCV(@regularizedLSTrain, @regularizedLSTest, '', X, Y, 0.35, 20, exp(0:-0.5:-10), [])
Have a look at the selected lambda. Plot the training and validation mean errors, as you did yesterday.
- 2.D Use the chosen value of lambda to compute the separating function (function separatingF). Superimpose both the training data (X,Y) and the test data (XT,YT) on the function, in two separate plots, to analyze the generalization properties of your solution.
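A sketch of 2.B-2.D, again under assumed signatures: regularizedLSTrain(X, Y, lambda) is assumed to return the weight vector w of the linear model f(x) = x'*w (no offset) and regularizedLSTest(w, X) its predictions; the contour drawn by hand at the end only suggests what separatingF should produce (check the function headers to confirm all of this):

lambdas = exp(0:-0.5:-10);
w = regularizedLSTrain(X, Y, l);           % l selected by holdoutCV in 2.C
Ypred = regularizedLSTest(w, XT);
fprintf('test error: %g\n', mean(sign(Ypred) ~= YT));

% training vs validation mean error over the whole lambda range (2.C)
figure;
semilogx(lambdas, Tm, 'b', lambdas, Vm, 'r');
xlabel('lambda'); legend('training error', 'validation error');

% separating function by hand: zero level set of f(x) = x'*w on a grid
[xg, yg] = meshgrid(linspace(-1.5, 2.5, 100));
Z = reshape([xg(:) yg(:)] * w, size(xg));  % assumes w is a 2x1 vector
figure; hold on
scatter(X(Y==1,1), X(Y==1,2), '.b');
scatter(X(Y==-1,1), X(Y==-1,2), '.y');
contour(xg, yg, Z, [0 0], 'k');            % decision boundary f(x) = 0
title('training set and separating function')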
3. Non-linear RLS
- 3.A Same as section 2, but this time you need to use regularizedKernLSTrain and regularizedKernLSTest. You may adopt a Gaussian kernel, in which case you will have to choose two parameters (lambda, sigma). Have a look at the optimal lambda and sigma, and compare the separating function with both training and test data (as in 2.D). A sketch covering 3.A and 3.C follows this list.
- 3.B Now let's have a look at the effects of overfitting and oversmoothing: repeat the previous experiments, but this time, instead of the optimal sigma, choose a small one (e.g. sigma=0.05) or a large one (e.g. sigma=10).
- 3.C A rule of thumb for choosing a reasonable sigma: instead of exploring many possible values, you may initialize sigma by computing the average distance between close points. Have a look at the function autosigma; you may give it a try and see what kind of value it selects: s=autosigma(X,5);
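A sketch of 3.A and 3.C, mirroring the holdoutCV call of section 2.C: it is assumed here that the third argument names the kernel ('gaussian') and that the last one is the list of candidate kernel parameters (the sigmas), with the selected pair returned as l and s; check "help holdoutCV" and the headers of regularizedKernLSTrain/regularizedKernLSTest to confirm.

% parameter selection over (lambda, sigma); the sigma range is a guess
[l, s, Vm, Vs, Tm, Ts] = holdoutCV(@regularizedKernLSTrain, ...
    @regularizedKernLSTest, 'gaussian', X, Y, 0.35, 20, ...
    exp(0:-0.5:-10), 0.1:0.1:2);

% rule of thumb of 3.C: initialize sigma from the average distance
% between close points (here the 5 nearest neighbors are assumed)
s0 = autosigma(X, 5);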
4. If you have time - More experiments
- 4.A Repeat all the experiments in sections 2 and 3, this time using a training set (X,Yn) and a test set (XT,YTn) obtained by flipping a given percentage of labels (function flipLabels). You may try 5%, 10%, 20%, ... flipped points to see what happens. Pay special attention to the training and validation errors. (See the sketch at the end of this section.)
- 4.B Repeat the experiments varying the size of the training set (as long as MATLAB supports you!)
- 4.C You may also generate more complex datasets (with closer Gaussians, or more than one Gaussian defining the support of a given class), or use the ones you find in the zipfile (ADDITIONAL_DATA/).
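For 4.A, a sketch of the noise loop, assuming flipLabels(Y, p) flips p percent of the labels (if it expects a fraction, use 0.05, 0.1, 0.2 instead):

for p = [5 10 20]
    Yn  = flipLabels(Y, p);                % noisy training labels
    YTn = flipLabels(YT, p);               % noisy test labels
    [l, s, Vm, Vs, Tm, Ts] = holdoutCV(@regularizedLSTrain, ...
        @regularizedLSTest, '', X, Yn, 0.35, 20, exp(0:-0.5:-10), []);
    fprintf('%d%% flipped: lambda = %g, min validation error = %g\n', ...
        p, l, min(Vm));
end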