1.A Generate a 2-class training set and a corresponding test set by using the following code:
[Xtr, Ytr] = MixGauss([[0;0],[1;1]],[0.5,0.3],100);
[Xts, Yts] = MixGauss([[0;0],[1;1]],[0.5,0.3],100);
Ytr(Ytr==2) = -1;
Yts(Yts==2) = -1;
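As a quick, optional sanity check you can visualize the training set you just generated; the color of each point encodes its class:
% scatter plot of the training set, colored by class label
figure;
scatter(Xtr(:,1), Xtr(:,2), 50, Ytr, 'filled');
title('Training set');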
1.B Have a look at the code of the functions regularizedKernLSTrain and regularizedKernLSTest.
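For reference, a minimal usage sketch of the two functions; the argument order below is only an assumption, so compare it with the actual code before running:
lambda = 0.01; sigma = 1;   % example values, not tuned
% fit the kernel RLS coefficients on the training set (assumed signature)
c = regularizedKernLSTrain(Xtr, Ytr, 'gaussian', sigma, lambda);
% evaluate the estimated function on the test set (assumed signature)
Ypred = regularizedKernLSTest(c, Xtr, 'gaussian', sigma, Xts);
errTs = mean(sign(Ypred) ~= Yts)   % test classification error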
1.C Check how the separating function changes with respect to lambda and sigma. Use the Gaussian kernel (kernel='gaussian') and try several regularization parameters, with sigma in [0.1, 10] and lambda in [1e-10, 10]. To visualize the separating function (and thus get a more general view of which areas are associated with each class) you may use the routine separatingFKernRLS (type "help separatingFKernRLS" in the MATLAB shell; if you still have doubts about how to use it, have a look at the code).
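One possible way to organize the exploration is a double loop over sigma and lambda, plotting the separating function for each pair; the argument order of separatingFKernRLS below is an assumption, so check help separatingFKernRLS first:
for sigma = [0.1, 1, 10]
    for lambda = [1e-10, 1e-3, 1, 10]
        % train with the current (sigma, lambda) pair (assumed signatures)
        c = regularizedKernLSTrain(Xtr, Ytr, 'gaussian', sigma, lambda);
        % visualize the areas assigned to each class
        separatingFKernRLS(c, Xtr, 'gaussian', sigma, Xts);
        title(sprintf('sigma = %g, lambda = %g', sigma, lambda));
    end
end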
1.D Perform the same experiment by using flipped labels (Ytrn = flipLabels(Ytr, p)) with p equal to 0.05 and 0.1. Check how the separating function changes with respect to lambda.
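A possible sketch of this experiment, assuming flipLabels(Y, p) flips a fraction p of the labels in Y and keeping sigma fixed at an example value:
sigma = 1;   % example value, not tuned
for p = [0.05, 0.1]
    Ytrn = flipLabels(Ytr, p);   % noisy training labels
    for lambda = [1e-10, 1e-3, 1, 10]
        c = regularizedKernLSTrain(Xtr, Ytrn, 'gaussian', sigma, lambda);
        separatingFKernRLS(c, Xtr, 'gaussian', sigma, Xts);
        title(sprintf('p = %g, lambda = %g', p, lambda));
    end
end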
1.E Load the Two moons dataset by using the command [Xtr, Ytr, Xts, Yts] = two_moons(npoints, pflipped), where npoints is the number of points in the dataset (between 1 and 100) and pflipped is the fraction of flipped labels. Then visualize the training and the test set with the following lines.
scatter(Xtr(:,1), Xtr(:,2), 50, Ytr, 'filled')
figure;
scatter(Xts(:,1), Xts(:,2), 50, Yts, 'filled')
1.F Perform the exercises 1.C and 1.D on this dataset.
2.A By using the dataset in 1.E with 100 points and 0.05 flipped labels, select a suitable lambda by using holdoutCVKernRLS (see help holdoutCVKernRLS for more information) and the sequence
intLambda = [5, 2, 1, 0.7, 0.5, 0.3, 0.2, 0.1, 0.05, 0.02, 0.01, 0.005, 0.002, 0.001, 0.0005, 0.0002, 0.0001,0.00001,0.000001];
nrip = 51;
perc = 0.5;
Then plot the training error and the validation error with respect to the choice of lambda, using the following code (the x-axis has a logarithmic scale).
figure;
semilogx(intLambda, Tm, 'r');
hold on;
semilogx(intLambda, Vm, 'b');
hold off;
legend('Training','Validation');
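Tm and Vm in the plot above are the training and validation error curves returned by the cross-validation routine; a sketch of the call that produces them, where both the output and the argument order are assumptions to be checked against help holdoutCVKernRLS:
sigma = 1;   % example kernel parameter, kept fixed while selecting lambda
[l, s, Vm, Vs, Tm, Ts] = holdoutCVKernRLS(Xtr, Ytr, 'gaussian', perc, nrip, sigma, intLambda);
% l: selected lambda; Vm, Tm: median validation and training errors over the nrip repetitions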
2.B Perform the same experiment for different fractions of flipped labels (0.0, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5). Check how the training and validation errors change with different p.
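For instance, one can wrap the selection of 2.A in a loop over the noise level (same assumed holdoutCVKernRLS interface as above):
for p = [0.0, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5]
    [Xtr, Ytr, Xts, Yts] = two_moons(100, p);
    [l, s, Vm, Vs, Tm, Ts] = holdoutCVKernRLS(Xtr, Ytr, 'gaussian', perc, nrip, 1, intLambda);
    figure;
    semilogx(intLambda, Tm, 'r'); hold on;
    semilogx(intLambda, Vm, 'b'); hold off;
    legend('Training', 'Validation');
    title(sprintf('p = %g', p));
end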
2.C Now select a suitable sigma, by using holdoutCVKernRLS on the following collection, as in exercise 2.A. Then plot the training and validation errors.
intKerPar = [10, 7, 5, 4, 3, 2.5, 2.0, 1.5, 1.0, 0.7, 0.5, 0.3, 0.2, 0.1, 0.05, 0.03, 0.02, 0.01];
intLambda = 0.00001;
nrip = 51;
perc = 0.5;
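With these values the selection of sigma mirrors that of lambda in 2.A, except that the error curves are now plotted against intKerPar (again, the holdoutCVKernRLS interface is assumed):
[l, s, Vm, Vs, Tm, Ts] = holdoutCVKernRLS(Xtr, Ytr, 'gaussian', perc, nrip, intKerPar, intLambda);
figure;
semilogx(intKerPar, Tm, 'r'); hold on;
semilogx(intKerPar, Vm, 'b'); hold off;
legend('Training', 'Validation');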
2.D Perform the same experiment for different fractions of flipped labels (0.0, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5). Check how the training and validation errors change with different p.
2.E Now select the best lambda and sigma, by using holdoutCVKernRLS, on the following collection, as in exercise 2.A.
intKerPar = [10, 7, 5, 4, 3, 2.5, 2.0, 1.5, 1.0, 0.7, 0.5, 0.3, 0.2, 0.1, 0.05, 0.03, 0.02, 0.01];
intLambda = [5, 2, 1, 0.7, 0.5, 0.3, 0.2, 0.1, 0.05, 0.02, 0.01, 0.005, 0.002, 0.001, 0.0005, 0.0002, 0.0001,0.00001,0.000001];
nrip = 7;
perc = 0.5;
Then plot the separating function computed with the best lambda and sigma you have found (use separatingFKernRLS).
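A sketch of the final step, reusing the assumed signatures from the previous exercises:
[l, s, Vm, Vs, Tm, Ts] = holdoutCVKernRLS(Xtr, Ytr, 'gaussian', perc, nrip, intKerPar, intLambda);
% retrain with the selected lambda (l) and sigma (s), then plot the separating function
c = regularizedKernLSTrain(Xtr, Ytr, 'gaussian', s, l);
separatingFKernRLS(c, Xtr, 'gaussian', s, Xts);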
2.F Compute the best lambda and sigma, and plot the related separating functions, with 0%, 5%, 10%, 20%, 30%, 40% of flipped labels. How do the selected parameters and the resulting curves differ?
3.A Repeat the experiment in section 2 with fewer points (70, 50, 30, 20) and 5% of flipped labels. How do the parameters vary with respect to the number of points?
3.B Repeat the experiment in section 1 with the polynomial kernel (kernel = 'polynomial') and with parameters lambda in the interval [0, 10] and p, the exponent of the polynomial kernel, in {10, 9, ..., 1}.
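The experiment is analogous to 1.C, with the exponent p playing the role of sigma; a sketch with a few example values of lambda (same assumed signatures as before):
for p = 10:-1:1            % exponent of the polynomial kernel
    for lambda = [10, 1, 0.1, 0.01, 0.001]
        c = regularizedKernLSTrain(Xtr, Ytr, 'polynomial', p, lambda);
        separatingFKernRLS(c, Xtr, 'polynomial', p, Xts);
        title(sprintf('exponent = %d, lambda = %g', p, lambda));
    end
end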
3.C Perform the Exercise 2.F with the polynomial kernel and the following range of parameters.
intKerPar = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
intLambda = [5, 2, 1, 0.7, 0.5, 0.3, 0.2, 0.1, 0.05, 0.02, 0.01, 0.005, 0.002, 0.001, 0.0005, 0.0002, 0.0001,0.00001,0.000001];
What is the best exponent for the polynomial kernel on this problem? Why?
3.D Analyze the eigenvalues of the Gram matrix for the polynomial kernel with different values of p (plot them by using semilogy). What happens with different p? Why?
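A minimal sketch of this analysis, building the polynomial Gram matrix directly; the kernel form (x'*y + 1)^p is an assumption, so replace it with the kernel function provided by the lab code if it differs:
figure;
expList = [1, 2, 3, 5, 10];
for i = 1:numel(expList)
    K = (Xtr * Xtr' + 1).^expList(i);   % assumed polynomial Gram matrix on the training set
    K = (K + K') / 2;                   % symmetrize to avoid complex round-off eigenvalues
    ev = sort(eig(K), 'descend');       % eigenvalues in decreasing order
    semilogy(max(ev, eps));             % clip round-off values so the log plot is well defined
    hold on;
end
hold off;
legend('p = 1', 'p = 2', 'p = 3', 'p = 5', 'p = 10');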