Choice of distance metric for KNN

by J_Laberga

So, I want to classify observations based on a set of variables using k-NN. Some of these variables are categorical and some can be regarded as continuous. The continuous variables are scaled to [0,1], and each categorical variable is split into individual binary indicator variables that take the values 0 or 1.
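To make the setup concrete, here is a minimal sketch of the preprocessing I mean, in plain Python. The variable names and values (ages, colours) are made up for illustration and are not my real data, and the helper functions are hypothetical, not taken from any library.

```python
def min_max_scale(values):
    """Rescale a list of numbers linearly onto [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(values):
    """Turn a categorical column into one 0/1 indicator column per level."""
    levels = sorted(set(values))
    encoded = [[1 if v == level else 0 for level in levels] for v in values]
    return encoded, levels


# Hypothetical example data: one continuous and one categorical variable.
ages = [20, 30, 40]
colours = ["red", "blue", "red"]

scaled_ages = min_max_scale(ages)
encoded, levels = one_hot(colours)

# Each row is one observation: [scaled age, is_blue, is_red]
rows = [[a] + e for a, e in zip(scaled_ages, encoded)]
print(levels)
print(rows)
```

Every column of rows now lies in [0,1], which is the representation I feed to the distance calculation.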

However, my problem lies in the choice of distance metric for the k-NN method. How do I ensure that all variables remain equally important when I calculate the distances? For example, what do I do when some of the continuous variables only vary within a narrow range, say [0.1,0.2], and thus get "run over" by the binary variables, which always swing between 0 and 1? Which metric would be able to downplay the effect of the binary variables? Are there other approaches too?
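A small sketch of what I mean by "run over", assuming plain Euclidean distance and two made-up observations that differ both in category and by the full observed range of the narrow continuous variable. The weights argument is a hypothetical knob I added only to show what "downplaying" could look like; the weight values are arbitrary and not a recommendation.

```python
import math


def squared_contributions(a, b, weights=None):
    """Per-variable terms w * (a_i - b_i)^2 of a (weighted) squared Euclidean distance."""
    if weights is None:
        weights = [1.0] * len(a)
    return [w * (x - y) ** 2 for x, y, w in zip(a, b, weights)]


# Columns: [continuous variable confined to [0.1, 0.2], is_blue, is_red]
p = [0.1, 1, 0]
q = [0.2, 0, 1]

# Unweighted: the category mismatch flips two indicator columns.
contrib = squared_contributions(p, q)
distance = math.sqrt(sum(contrib))
continuous_share = contrib[0] / sum(contrib)
print(f"distance = {distance:.4f}, continuous share = {continuous_share:.2%}")

# Arbitrary illustrative weights that shrink the indicator columns.
weighted = squared_contributions(p, q, weights=[1.0, 0.005, 0.005])
weighted_share = weighted[0] / sum(weighted)
print(f"continuous share with weights = {weighted_share:.2%}")
```

Without weights the continuous variable accounts for well under 1% of the squared distance, even though the two observations sit at opposite ends of its observed range. That imbalance is what I would like the metric to correct.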

Also, does anybody know where I can find comparisons of different metrics and their performance?

I hope that somebody might be able to help me or, at least, point me in the right direction.
Thanks!