Build a very straightforward and fully functional machine


KNN Assignment:

Build a very straightforward and fully functional machine learning classifier with the K-Nearest Neighbor (KNN) algorithm. The KNN model will read a set of data specified by the user, decide the appropriate class of the new instance, and finally output the performance metrics of the model based on test data set.

Requirements:

  • The implementation should  be written in Java.
  • The distance metric used  should be Euclidean distance.
  • The model should use the  training data set as the existing data set to classify the data in the  test set.
  • The performance metrics  output needs to include the confusion matrix and accuracy  (1-misclassification rate).
  • The code must be commented  so that each implemented component of the KNN algorithm is properly  labeled.
  • Use object-orien ted techniques (do not make the whole program one monolithic procedu ral implementation).
    • Separate the distance  calculations, sorting, data structure, and other components so that each  of them can be developed and reviewed individually.
  • Implement the classifier  as a standalone program.

Assignment Deliverable:

This is what I need for my output

1) All source code

2) A copy of the output when K=3

3) A copy of the output when K=5

4) A copy of the output when K=7

5) Confusion matrix and accuracy

A few notes:

- Input/output streams

- Distance calculation

- KNN algorithm

- Confusion matrix and accuracy

- Object-oriented and commented codes

2) You must take the input file path from the command line. We want students to learn how to input dataset from file into memory. In the real world, we will run multiple datasets via multiple files to analyze the results. Coding the file name in your program is not practical.

3) Make sure you test all the rows and use all the other data points as your training data. For example, when you are testing row 1, you need to use rows 2 to 150 as the training data for the calculation. When you are testing row 2, you need to use rows 1 and rows 3 to 150 for the calculation.

4) You should not use a K with an even number. After locating the K nearest neighbors, we need to take the majority for the predicted value. An even number K will potentially have many ties when you are trying to select the majority.

5) I may choose to run a subset of the input data. Please make sure your program works for any input data with the same format.

Request for Solution File

Ask an Expert for Answer!!
JAVA Programming: Build a very straightforward and fully functional machine
Reference No:- TGS0945941

Expected delivery within 24 Hours