Outline the k-means clustering algorithm for a set of data


(a) The k-nearest neighbour (KNN) algorithm uses a distance metric to order the training data in relation to a given test example. Given a problem with data in the form (x1, ..., xn, y), where x1, ..., xn are the independent variables and y is the dependent variable to be predicted, describe and explain an approach to weighting the k nearest neighbours so that nearer neighbours are more important when producing the final predicted y value for a test example.
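
One common way to approach part (a) is inverse-distance weighting: each of the k nearest neighbours contributes its y value in proportion to 1/distance, so closer neighbours dominate the weighted average. The sketch below is only an illustration under stated assumptions: Euclidean distance, a numeric y, and the hypothetical names weighted_knn_predict and eps (a small constant that avoids division by zero when a neighbour coincides with the test example). None of these come from the question itself.

```python
import math

def weighted_knn_predict(train, query, k=3, eps=1e-9):
    """Predict y for `query` from `train`, a list of (features, y) pairs.

    Hypothetical helper: uses Euclidean distance and inverse-distance weights.
    """
    # Order the training examples by their distance to the test example.
    by_distance = sorted((math.dist(x, query), y) for x, y in train)
    nearest = by_distance[:k]
    # Nearer neighbours get larger weights; eps guards against distance 0.
    weights = [1.0 / (d + eps) for d, _ in nearest]
    # The prediction is the weighted average of the neighbours' y values.
    weighted_sum = sum(w * y for w, (_, y) in zip(weights, nearest))
    return weighted_sum / sum(weights)

# Toy data: one independent variable, y = 10 * x.
train = [([0.0], 0.0), ([1.0], 10.0), ([10.0], 100.0)]
prediction = weighted_knn_predict(train, [0.25], k=2)
```

With k=2 the two neighbours have y values 0 and 10; an unweighted mean would give 5, while the weighted prediction is pulled towards the nearer neighbour at x = 0. Other weighting schemes (for example a Gaussian kernel of the distance) follow the same pattern with a different weight formula.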

(b) Outline the k-means clustering algorithm for a set of data defined as vectors xi. Include a diagram to support your algorithm description.
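
As a companion to part (b), the usual iterative form of k-means can be sketched as follows: choose k initial centroids, assign every vector xi to its nearest centroid, move each centroid to the mean of its assigned vectors, and repeat until the assignments stop changing. This is a minimal sketch, not a reference implementation. It assumes Euclidean distance and initial centroids drawn at random from the data, and the names kmeans, max_iter and seed are hypothetical.

```python
import math
import random

def kmeans(points, k, max_iter=100, seed=0):
    """Cluster `points` (a list of equal-length tuples) into k groups."""
    rng = random.Random(seed)
    # Step 1: pick k distinct data points as the initial centroids.
    centroids = [list(p) for p in rng.sample(points, k)]
    assignments = None
    for _ in range(max_iter):
        # Step 2: assign each point to the index of its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Stop once no point changes cluster.
        if new_assignments == assignments:
            break
        assignments = new_assignments
        # Step 3: move each centroid to the mean of its assigned points.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return centroids, assignments

# Two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, assignments = kmeans(points, k=2)
```

On this toy data the two centroids settle near the centre of each group, and the three points in each group share a cluster label.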

(c) Explain why the k-means clustering algorithm is not guaranteed to find the optimal cluster locations on any given run. Given that the resulting clustering may be non-optimal, what does this imply about how k-means should be used in practice to ensure a good clustering?
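
The point behind part (c) can be illustrated with a small example: k-means only improves the clustering it starts from, so different initial centroids can converge to different final clusterings, some of them poor. The sketch below runs the algorithm from two hand-picked starting positions on four points at the corners of a wide rectangle and keeps the run with the lowest sum of squared distances (SSE). The function name lloyd, the data, and the starting positions are all illustrative assumptions. In practice the starts would normally be several random initialisations rather than hand-picked ones.

```python
import math

def lloyd(points, centroids, max_iter=100):
    """Run k-means from the given initial centroids; return (centroids, sse)."""
    centroids = [list(c) for c in centroids]
    for _ in range(max_iter):
        # Assign each point to the index of its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, a in zip(points, labels) if a == j]
            # An empty cluster keeps its previous centroid.
            updated.append(
                [sum(c) / len(members) for c in zip(*members)] if members else old
            )
        # Converged: the centroids no longer move.
        if updated == centroids:
            break
        centroids = updated
    # Sum of squared distances from each point to its assigned centroid.
    sse = sum(math.dist(p, centroids[a]) ** 2 for p, a in zip(points, labels))
    return centroids, sse

# Corners of a rectangle that is 4 wide and 1 tall.
points = [(0, 0), (0, 1), (4, 0), (4, 1)]
starts = [
    [(2, 0), (2, 1)],  # converges to a top/bottom split
    [(0, 0), (4, 1)],  # converges to a left/right split
]
results = [lloyd(points, start) for start in starts]
best_centroids, best_sse = min(results, key=lambda r: r[1])
```

Both runs converge, but the first stays at the top/bottom split with an SSE of 16, while the second reaches the left/right split with an SSE of 1. Keeping the lowest-SSE result across several starts is the usual practical safeguard.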
