+3 votes
12.1k views
asked in Machine Learning by (660 points)  
I am wondering what happens as $k$ increases in the KNN algorithm. It seems that as $k$ increases, the new point $p$ tends to move closer to the middle of the decision boundary?

Any thoughts?
  

2 Answers

+3 votes
answered by (1.1k points)  

First of all, let's talk about the effect of a small $k$ versus a large $k$. A small value of $k$ increases the effect of noise, while a large value makes the algorithm more computationally expensive. Practitioners usually choose $k$ as an odd number when there are two classes; another simple heuristic is to set $k=\sqrt{n}$, where $n$ is the number of training samples.

Small values of $k$ not only make the classifier sensitive to noise but may also lead to overfitting, while large values of $k$ may lead to underfitting. So $k=\sqrt{n}$ seems a reasonable starting point, and cross-validation should then be used to find a suitable value of $k$.
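Here is a minimal sketch of that selection procedure, assuming scikit-learn; the toy dataset and the parameter grid are illustrative choices, not from the question.

```python
# Sketch: start from the k = sqrt(n) heuristic, then pick k by cross-validation.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical two-class dataset standing in for real data.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

n = len(X)
sqrt_n = int(np.sqrt(n))  # the k = sqrt(n) starting heuristic

# Search odd values of k around sqrt(n) with 5-fold cross-validation.
param_grid = {"n_neighbors": list(range(1, 2 * sqrt_n, 2))}
search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5)
search.fit(X, y)

print("sqrt(n) heuristic:", sqrt_n)
print("best k by cross-validation:", search.best_params_["n_neighbors"])
```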

Where the new data point falls relative to the decision boundary depends on the arrangement of the training points and on the location of the new point among them. Suppose I have 100 data points, two classes, and I choose $k = 100$. In this special situation the decision boundary is irrelevant to the location of the new point: every query is classified as the majority class, so that class's region covers the whole space and the new data point can be anywhere in it. Therefore, I think we cannot make a general statement about it.
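A small sketch of that extreme case, assuming scikit-learn and an invented 60/40 class split, shows that with $k = n$ every query gets the majority label regardless of where it lies:

```python
# Sketch: when k equals the training-set size, KNN always predicts the majority class.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))        # 100 training points
y = np.array([0] * 60 + [1] * 40)    # class 0 is the majority

clf = KNeighborsClassifier(n_neighbors=100).fit(X, y)  # k equals n

# No matter where the new point lies, the prediction is the majority class.
queries = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 5.0]])
print(clf.predict(queries))          # -> [0 0 0]
```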

+2 votes
answered by (1.4k points)  
edited by
I am assuming that the KNN algorithm was implemented in Python. It depends on whether the radius parameter of the estimator was set; the default is 1.0. Changing that parameter, together with the value of $k$, controls which points closest to $p$ are selected, among other settings.
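If the answer is referring to scikit-learn, the radius=1.0 default appears on sklearn.neighbors.NearestNeighbors; this is an assumption on my part, and the sketch below just contrasts a k-nearest query with a radius query for a point $p$:

```python
# Sketch (assumes scikit-learn): k-nearest query vs. radius query around p.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
X = rng.uniform(size=(50, 2))          # hypothetical training points
p = np.array([[0.5, 0.5]])             # the new point "p"

nn = NearestNeighbors(n_neighbors=5, radius=1.0).fit(X)

dist_k, idx_k = nn.kneighbors(p)       # the k closest points to p
dist_r, idx_r = nn.radius_neighbors(p) # all points within the radius

print("k-nearest indices:", idx_k[0])
print("points within radius 1.0:", len(idx_r[0]))
```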
...