What will be the behavior of DBSCAN on the uniform data set?


Assignment: Data Mining

Problem I

Suppose that you are employed as a data mining consultant for an Internet search engine company. Describe how data mining can help the company by giving specific examples of how techniques such as clustering, classification, association rule mining, and anomaly detection can be applied.

Problem II

Identify at least two advantages and two disadvantages of using color to visually represent information.

Problem III

Consider a group of documents that has been selected from a much larger set of diverse documents so that the selected documents are as dissimilar from one another as possible. If we consider documents that are not highly related (connected, similar) to one another as being anomalous, then all of the documents that we have selected might be classified as anomalies. Is it possible for a data set to consist only of anomalous objects or is this an abuse of the terminology?

Problem IV

o Is there a difference between the two sets of points? Please explain.
o If so, which set of points will typically have a smaller SSE for K=10 clusters?
o What will be the behavior of DBSCAN on the uniform data set?
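
The DBSCAN question can be explored empirically. The following is a minimal sketch, not a reference solution. It assumes the two sets are 100 points on an evenly spaced grid over the unit square and 100 points drawn at random from a uniform distribution (the problem text here does not define them). The names `dbscan`, `region_query`, and `summarize`, the seed, and the `eps` and `min_pts` values are illustrative choices, and the DBSCAN below is a plain textbook-style implementation rather than any particular library's.

```python
import math
import random

NOISE = -1  # label used for points that belong to no cluster


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    px, py = points[i]
    return [j for j, (qx, qy) in enumerate(points)
            if math.hypot(px - qx, py - qy) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # j is a core point: keep expanding
                queue.extend(j_neighbors)
        cluster += 1
    return labels


def summarize(labels):
    """Return (number of clusters, number of noise points)."""
    clusters = {c for c in labels if c != NOISE}
    return len(clusters), labels.count(NOISE)


# Assumed data: a 10 x 10 grid with spacing 0.1, and 100 random points.
grid = [((i + 0.5) / 10, (j + 0.5) / 10) for i in range(10) for j in range(10)]
rng = random.Random(0)
scattered = [(rng.random(), rng.random()) for _ in range(100)]

for eps in (0.09, 0.11):  # just below and just above the grid spacing
    print(eps, "grid:", summarize(dbscan(grid, eps, 4)),
          "random:", summarize(dbscan(scattered, eps, 4)))
```

In this sketch the evenly spaced set has the same density everywhere, so it flips from all noise to a single cluster as `eps` crosses the grid spacing. The random set has local variations in density, so it will typically give a mixture of clusters and noise that depends on the seed and the parameters.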

Problem V

Give an example of a data set consisting of three natural clusters, for which (almost always) K-means would likely find the correct clusters, but bisecting K-means would not.

Format your assignment according to the following formatting requirements:

o The answer should be typed, using Times New Roman font (size 12), double spaced, with one-inch margins on all sides.

o The response should also include a cover page containing the title of the assignment, the student's name, the course title, and the date. The cover page is not included in the required page length.

o Also include a reference page. Citations and references must follow APA format. The reference page is not included in the required page length.

Attachment: Data-Mining.rar
