MLCC - Laboratory 1 - Local methods


This lab is about local methods for binary classification on synthetic data. The goal of the lab is to get familiar with the kNN algorithm and to get a practical grasp of what we have discussed in class. Follow the instructions below. Think hard before you call the instructors!

1. Warm up - data generation

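Open the MATLAB file MixGauss.m. Then generate and plot a first dataset:

[X1, Y1] = MixGauss([[0;0],[1;1]],[0.5,0.25],1000); % class means are the columns [0;0] and [1;1]; the trailing ';' suppresses printing of X1 and Y1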
figure(1); scatter(X1(:,1),X1(:,2),25,Y1); %type "help scatter" to see what the parameters mean
title('dataset 1')
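
MixGauss.m itself is not reproduced here, so the following is only a minimal sketch of what a generator with this calling convention might do. It assumes isotropic Gaussian classes, that the second argument holds one standard deviation per class, that n is the number of points per class, and that labels are the class indices 1..p. All of these are assumptions made for illustration; check the actual file for the real behavior.

```matlab
means  = [[0;0],[1;1]];   % one column per class mean (same values as the call above)
sigmas = [0.5, 0.25];     % assumed: one standard deviation per class
n      = 1000;            % assumed: number of points per class

[d, p] = size(means);     % d = input dimension, p = number of classes
X = zeros(n*p, d);
Y = zeros(n*p, 1);
for c = 1:p
    rows = (c-1)*n + (1:n);
    % isotropic Gaussian: scale standard normal samples, then shift by the class mean
    X(rows,:) = sigmas(c) * randn(n, d) + repmat(means(:,c)', n, 1);
    Y(rows)   = c;        % assumed label coding: class index 1..p
end
```

The resulting X has one point per row, so it can be plotted with the same scatter call used for X1.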

2. Core - kNN classifier

The k-Nearest Neighbors algorithm (kNN) assigns to a test point the most frequent label of its k closest examples in the training set.
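
The rule above can be sketched in a few lines. This is a minimal illustration on a tiny hand-made dataset, not the implementation provided with the lab: the variable names (Xtr, Ytr, Xte), the toy values, the Euclidean distance and the tie-breaking behavior of mode are all assumptions chosen for the example.

```matlab
% Toy training set (hypothetical values, for illustration only)
Xtr = [0 0; 0 1; 1 0; 5 5; 5 6; 6 5];
Ytr = [1; 1; 1; 2; 2; 2];
Xte = [0.5 0.5; 5.5 5.5];   % two test points
k   = 3;

nTe  = size(Xte, 1);
Yest = zeros(nTe, 1);
for i = 1:nTe
    % squared Euclidean distance from test point i to every training point
    diffs = Xtr - repmat(Xte(i,:), size(Xtr,1), 1);
    d2 = sum(diffs.^2, 2);
    [~, idx] = sort(d2, 'ascend');
    % majority vote among the k nearest labels (mode returns the smallest label on ties)
    Yest(i) = mode(Ytr(idx(1:k)));
end
disp(Yest')
```

With two classes, choosing an odd k avoids ties in the vote.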

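To visualize the result, plot the test points Xt with their true labels Yt and their estimated labels Yest:

figure;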
scatter(Xt(:,1),Xt(:,2),25,Yt,'filled'); %plot test points (filled circles) associating a different color to each "true" label
hold on
scatter(Xt(:,1),Xt(:,2),25,Yest); % plot test points (empty circles) associating a different color to each estimated label

To compute the classification error, i.e. the fraction of misclassified test points:

err = sum(Yest~=Yt)./Nt % Nt is the number of test points
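
As a quick sanity check of the error formula, here is a small worked example with hand-made label vectors (the values are hypothetical and chosen only so the result is easy to verify by eye):

```matlab
Yt   = [1; 1; 2; 2; 2];       % true labels of five test points
Yest = [1; 2; 2; 2; 1];       % estimated labels: points 2 and 5 are wrong
Nt   = numel(Yt);             % number of test points
err  = sum(Yest ~= Yt) / Nt;  % 2 mistakes out of 5 points
```

Here err is 0.4, since two of the five test points are misclassified.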

3. Parameter selection - What is a good value for k?

So far we considered an arbitrary k...
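
One common way to choose k is hold-out validation: set aside part of the training data, measure the error of kNN on it for several values of k, and keep the value with the lowest error. The sketch below is one possible version of this idea, not necessarily the procedure intended by the lab. The synthetic data, the 70/30 split, the candidate list kList and the inline kNN loop are all assumptions made for illustration. Results change from run to run because the data and the split are random.

```matlab
% Synthetic two-class data (hypothetical stand-in for the lab dataset)
n = 100;
X = [0.5*randn(n,2); 0.5*randn(n,2) + 1];
Y = [ones(n,1); 2*ones(n,1)];

% Random split: 70% for training, 30% held out for validation (assumed fraction)
perm = randperm(2*n);
nTr  = round(0.7 * 2*n);
Xtr = X(perm(1:nTr),:);      Ytr = Y(perm(1:nTr));
Xva = X(perm(nTr+1:end),:);  Yva = Y(perm(nTr+1:end));

kList  = [1 3 5 7 9 15];     % candidate values of k (odd, to avoid ties)
valErr = zeros(size(kList));
for j = 1:numel(kList)
    k = kList(j);
    Yest = zeros(size(Xva,1), 1);
    for i = 1:size(Xva,1)
        % squared distances from validation point i to all training points
        d2 = sum((Xtr - repmat(Xva(i,:), nTr, 1)).^2, 2);
        [~, idx] = sort(d2);
        Yest(i) = mode(Ytr(idx(1:k)));   % majority vote among the k nearest
    end
    valErr(j) = mean(Yest ~= Yva);       % validation error for this k
end
[~, best] = min(valErr);
bestK = kList(best);
fprintf('best k = %d, validation error = %.3f\n', bestK, valErr(best));
```

Repeating the split several times and averaging valErr gives a more stable estimate than a single split.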

4. If you have time - More experiments