The similarity between any two files in our collection, Basic Computer Science

The basic task is to measure the similarity between any two files in our collection. To do this, we need a suitable universe of words. The universe consists of all words in the collection that (a) are more than four letters long, (b) occur no more than 20 times overall, and (c) occur in no more than 7 files in the collection. We then construct a vector (in the mathematical sense) for each file. The vector has one coordinate for each word in the universe. If a word occurs in the file, the corresponding coordinate is 1; otherwise it is 0.
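The universe and the 0/1 vectors described above could be built roughly as follows. This is only a sketch: the names `tokenize`, `build_universe`, and `indicator_vector` are hypothetical, the collection is assumed to be a dict mapping file names to their text, and a "word" is assumed to be a case-insensitive run of ASCII letters (the problem statement does not define one).

```python
import re
from collections import Counter


def tokenize(text):
    # Assumption: a word is a run of ASCII letters, compared case-insensitively.
    return re.findall(r"[a-z]+", text.lower())


def build_universe(docs):
    """Return the sorted list of words satisfying conditions (a), (b), (c)."""
    total = Counter()       # occurrences of each word over the whole collection
    file_count = Counter()  # number of files each word appears in
    for text in docs.values():
        words = tokenize(text)
        total.update(words)
        file_count.update(set(words))
    return sorted(
        w for w in total
        if len(w) > 4             # (a) more than four letters long
        and total[w] <= 20        # (b) no more than 20 occurrences overall
        and file_count[w] <= 7    # (c) in no more than 7 files
    )


def indicator_vector(text, universe):
    """One coordinate per universe word: 1 if the word occurs in text, else 0."""
    present = set(tokenize(text))
    return [1 if w in present else 0 for w in universe]
```

Sorting the universe is a design choice made here so that coordinate order is the same for every file; any fixed order would do.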

Let us give an example: suppose the universe consists of the five words: apple, grapes, banana, doctor, program. Suppose file1 contains: apple, banana, program. Then the vector for file1 is (1,0,1,0,1).

We need to normalize each of the vectors so that it has unit length. So each coordinate in the above vector gets divided by the square root of 3.
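As a small illustration of this step, the sketch below divides each coordinate by the vector's Euclidean length. The function name `normalize` is hypothetical, and leaving an all-zero vector unchanged is an assumption, since the problem statement does not say what to do with a file that contains no universe words.

```python
import math


def normalize(vec):
    """Scale vec to unit length by dividing by its Euclidean norm."""
    length = math.sqrt(sum(x * x for x in vec))
    if length == 0:
        # Assumption: a file with no universe words keeps its zero vector.
        return list(vec)
    return [x / length for x in vec]


# The file1 vector from the example: each nonzero coordinate becomes 1/sqrt(3).
unit = normalize([1, 0, 1, 0, 1])
```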

The similarity of two files is defined to be the scalar product of the corresponding two vectors. The scalar product of two vectors is obtained by multiplying corresponding components and adding the results. For example, the scalar product of (2,1,3) and (0,5,6) is 2 * 0 + 1 * 5 + 3 * 6 = 23.

Your task is to write a program that prints the names of the two files with the highest similarity among the files in the collection, and the names of the two files with the lowest similarity.
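The scalar product can be written in a couple of lines. The function name `scalar_product` is a hypothetical choice, and the length check is an added safeguard rather than part of the problem statement.

```python
def scalar_product(u, v):
    """Multiply corresponding components of u and v and add the results."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of coordinates")
    return sum(a * b for a, b in zip(u, v))


# The example from the text: 2*0 + 1*5 + 3*6
result = scalar_product((2, 1, 3), (0, 5, 6))
```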
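One possible way to put the pieces together is sketched below. It is not a reference solution: the function names are hypothetical, the collection is assumed to be a dict mapping file names to text (reading the files from disk is left out), and words are assumed to be case-insensitive runs of ASCII letters. Because every coordinate is 0 or 1, the scalar product of two normalized vectors equals the number of shared universe words divided by the square root of the product of the two files' universe-word counts, so the sketch works with sets instead of explicit vectors.

```python
import math
import re
from collections import Counter
from itertools import combinations


def words_in(text):
    # Assumption: a word is a run of ASCII letters, compared case-insensitively.
    return re.findall(r"[a-z]+", text.lower())


def similarity_extremes(docs):
    """Return (most similar pair, least similar pair) of file names."""
    tokens = {name: words_in(text) for name, text in docs.items()}
    total, in_files = Counter(), Counter()
    for words in tokens.values():
        total.update(words)
        in_files.update(set(words))
    universe = {
        w for w in total
        if len(w) > 4 and total[w] <= 20 and in_files[w] <= 7
    }
    # Each file's set of universe words stands in for its 0/1 vector.
    present = {name: set(words) & universe for name, words in tokens.items()}

    def similarity(a, b):
        na, nb = len(present[a]), len(present[b])
        if na == 0 or nb == 0:
            return 0.0  # assumption: a zero vector has similarity 0 with anything
        return len(present[a] & present[b]) / math.sqrt(na * nb)

    pairs = list(combinations(sorted(docs), 2))
    if not pairs:
        raise ValueError("need at least two files")
    best = max(pairs, key=lambda p: similarity(*p))
    worst = min(pairs, key=lambda p: similarity(*p))
    return best, worst


docs = {
    "a.txt": "apple banana program",
    "b.txt": "apple banana doctor",
    "c.txt": "grapes melon",
}
best, worst = similarity_extremes(docs)
print("highest similarity:", *best)
print("lowest similarity:", *worst)
```

When several pairs tie, this sketch reports the first one in sorted file-name order; the problem statement does not say how ties should be broken.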


Reference No:- TGS0143977
