CBMM Summer School, Day 2: Machine Learning

Lab 1: Local Methods & Bias-Variance Trade-Off


This lab is about local methods for binary classification on synthetic data. The goal of the lab is to get familiar with the kNN algorithm and to get a practical understanding of what we discussed in class. Follow the instructions below. Think hard before you call the instructors!

Extract the zip file into a folder and add that folder to the MATLAB path.

1. Generate Classification Data

2. Core - kNN classifier

The k-Nearest Neighbors algorithm (kNN) assigns to a test point the most frequent label of its k closest examples in the training set.
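The rule above can be sketched in a few lines of MATLAB. This is only a minimal illustration, not the function provided with the lab: the toy data and the variable names (XtrToy, YtrToy, XteToy, YpredToy) are made up for the example, labels are assumed to be in {-1, +1}, and distances are assumed to be Euclidean.

```matlab
% Hypothetical toy data: two training points per class, labels in {-1,+1}
XtrToy = [0 0; 0 1; 3 3; 3 4];
YtrToy = [-1; -1; 1; 1];
XteToy = [0.2 0.5; 2.8 3.5];
k = 3;  % an odd k avoids ties when there are only two labels

nTest = size(XteToy, 1);
YpredToy = zeros(nTest, 1);
for i = 1:nTest
    % squared Euclidean distance from test point i to every training point
    diffs = XtrToy - repmat(XteToy(i,:), size(XtrToy, 1), 1);
    d = sum(diffs.^2, 2);
    % indices of the training points, closest first
    [~, idx] = sort(d, 'ascend');
    % majority vote: with +/-1 labels the sign of the sum is the most frequent label
    YpredToy(i) = sign(sum(YtrToy(idx(1:k))));
end
disp(YpredToy');  % the first test point is labeled -1, the second +1
```

With k = 3, each test point has two neighbors from its own cluster and one from the other, so the vote follows the nearby cluster.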

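% Ypred holds the labels predicted by kNN for the test points Xte
figure; hold on;
scatter(Xte(:,1), Xte(:,2), 25, Yte); % color each point based on the "true" label
sel = (Ypred ~= Yte); % logical mask of the misclassified test points
scatter(Xte(sel, 1), Xte(sel, 2), 25, Yte(sel), 'x'); % mark wrongly predicted test points with crosses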

3. Parameter selection - What is a good value for k?

So far we considered an arbitrary choice for k.
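One common way to make this choice less arbitrary is hold-out validation: set part of the training data aside, measure the classification error on it for several values of k, and keep the value with the lowest error. The sketch below illustrates the idea under stated assumptions. The data set, the split (every third point held out), the candidate list kList, and all variable names are hypothetical and are not part of the lab code; labels are again assumed to be in {-1, +1}.

```matlab
% Hypothetical toy data: labels given by the sign of the first coordinate
[g1, g2] = meshgrid(-2.25:0.5:2.25, -2.25:0.5:2.25);
X = [g1(:), g2(:)];            % 100 grid points, none on the boundary x1 = 0
Y = sign(X(:,1));
Y(7:13:end) = -Y(7:13:end);    % flip a few labels to mimic noise

% Hold out every third point for validation
isVal = false(size(X, 1), 1);
isVal(3:3:end) = true;
XtrH = X(~isVal,:);  YtrH = Y(~isVal);
XvaH = X(isVal,:);   YvaH = Y(isVal);

kList = [1 3 5 7 9];           % odd candidates, so votes cannot tie
valErr = zeros(size(kList));
for j = 1:numel(kList)
    k = kList(j);
    YpredH = zeros(size(XvaH, 1), 1);
    for i = 1:size(XvaH, 1)
        % squared distances from validation point i to all training points
        d = sum((XtrH - repmat(XvaH(i,:), size(XtrH, 1), 1)).^2, 2);
        [~, idx] = sort(d, 'ascend');
        YpredH(i) = sign(sum(YtrH(idx(1:k))));   % majority vote
    end
    valErr(j) = mean(YpredH ~= YvaH);            % fraction misclassified
end

[~, jBest] = min(valErr);      % the first minimum wins if several k tie
bestK = kList(jBest);
fprintf('validation errors: %s, selected k = %d\n', mat2str(valErr, 3), bestK);
```

A single split is the simplest design; repeating it over several splits and averaging the errors gives a less noisy estimate, at the cost of more computation.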

4. (Optional) - Additional experiments