What is the MAP value for each system? What is F1 for each?


Assignment

1. "Instead of returning nothing for a query, a search engine should return some results even if they are incorrect." Do you agree or disagree? Explain.

2. What are the differences between static and dynamic summaries? Describe a scenario where each type of summary would be the best solution to a search query.

3. Consider an information need for which there are 6 relevant documents in the collection. Contrast two systems run on this collection. Their top 10 results are judged for relevance as follows:

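a. Complete this table with numerical values:

        System 1                        System 2
Rank    Rel.   Recall   Precision       Rel.   Recall   Precision
1       R      ____     ____            N      ____     ____
2       N      ____     ____            R      ____     ____
3       R      ____     ____            N      ____     ____
4       N      ____     ____            N      ____     ____
5       N      ____     ____            R      ____     ____
6       R      ____     ____            R      ____     ____
7       N      ____     ____            R      ____     ____
8       N      ____     ____            N      ____     ____
9       R      ____     ____            N      ____     ____
10      R      ____     ____            N      ____     ____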

b. What is the MAP value for each system? (equation okay)

c. What is F1 (the F measure with β = 1) for each system with 10 documents returned?
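
As a rough sketch of how the quantities in question 3 can be computed, the Python below derives recall and precision at each rank, average precision, and the F measure from a list of R/N judgments. The function names are illustrative and not part of the assignment. It assumes average precision is divided by the total number of relevant documents in the collection (6 here), so relevant documents that are never retrieved contribute zero. With a single information need, MAP reduces to this average precision.

```python
def recall_precision_by_rank(judgments, total_relevant):
    """Return (recall, precision) after each rank for a list of 'R'/'N' labels."""
    rows, hits = [], 0
    for rank, label in enumerate(judgments, start=1):
        if label == "R":
            hits += 1
        rows.append((hits / total_relevant, hits / rank))
    return rows


def average_precision(judgments, total_relevant):
    """Sum precision at each relevant rank, divided by total_relevant (assumed convention)."""
    hits, total = 0, 0.0
    for rank, label in enumerate(judgments, start=1):
        if label == "R":
            hits += 1
            total += hits / rank
    return total / total_relevant


def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall; beta=1 gives F1."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Judgments copied from the assignment's table; 6 relevant documents exist.
system1 = list("RNRNNRNNRR")
system2 = list("NRNNRRRNNN")

for name, run in (("System 1", system1), ("System 2", system2)):
    rows = recall_precision_by_rank(run, total_relevant=6)
    recall_at_10, precision_at_10 = rows[-1]
    print(name,
          "AP=%.3f" % average_precision(run, 6),
          "F1=%.3f" % f_measure(precision_at_10, recall_at_10))
```

Some texts divide by the number of relevant documents actually retrieved instead, which gives a different value whenever a relevant document is missing from the ranking.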

4. True/False

a) _____ The tf-idf weight increases with the number of occurrences of a term within a document.
b) _____ The tf-idf weight increases with the rarity of the term in the collection.
c) _____ The summary information displayed by a search engine must come from the "description" meta tag in an HTML file.
d) _____ Hard clustering is more common and easier than soft clustering.
e) _____ Pseudo relevance feedback is the same as indirect relevance feedback.
f) _____ Query expansion means doubling the number of results shown to the user.
g) _____ Hierarchical agglomerative clustering is a "top down" clustering technique.
h) _____ Linear classifiers partition the data space into overlapping regions.
i) _____ K-means clustering is an example of unsupervised learning.
j) _____ Hub sites should be scored higher than authoritative sites.

5. The following probabilities have been determined from a training set of 1000.

            Long    Sweet   Green
Cucumber    0.80    0.70    0.90
Jalapeno    0.00    0.50    0.10
Other       0.50    0.75    0.25

Given a sample that is long, sweet, and green, what is the probability that it would be classified as a Cucumber using a naïve Bayes classifier? Show your work.
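
The calculation asked for in question 5 follows the usual naïve Bayes pattern: multiply each class prior by the per-feature likelihoods, then normalize across classes. The sketch below shows that pattern in Python. The function name, the class labels "A" and "B", and every number in the example are hypothetical and chosen only for illustration; they are not the assignment's values, and the class priors are a required input.

```python
from math import prod


def naive_bayes_posteriors(priors, likelihoods, observed):
    """Return P(class | observed features) under the naive independence assumption.

    priors:      {class: P(class)}
    likelihoods: {class: {feature: P(feature | class)}}
    observed:    iterable of feature names present in the sample
    """
    scores = {
        c: priors[c] * prod(likelihoods[c][f] for f in observed)
        for c in priors
    }
    evidence = sum(scores.values())
    if evidence == 0:
        raise ValueError("all classes have zero probability for this sample")
    return {c: s / evidence for c, s in scores.items()}


# Hypothetical numbers for illustration only (not the assignment's table).
priors = {"A": 0.5, "B": 0.5}
likelihoods = {
    "A": {"long": 0.8, "sweet": 0.5},
    "B": {"long": 0.2, "sweet": 0.5},
}
posteriors = naive_bayes_posteriors(priors, likelihoods, ["long", "sweet"])
print(posteriors)
```

A class with a zero likelihood for any observed feature gets a zero score, so it drops out of the normalized result.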

6. Apply the KNN algorithm to classify the data item (14) with this known set of data and classes: { (10,1), (11,1), (15,2), (12,1), (18,2), (9,1), (20,2), (17,2) }. Show your work for K = 3 and K = 5.
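
A minimal sketch of the KNN vote for question 6, using absolute difference as the distance on the single numeric feature. Note that 11 and 17 are both at distance 3 from 14, and 10 and 18 are both at distance 4, so the last neighbor admitted is tied for both K = 3 and K = 5. This sketch breaks such ties by input order, which is an arbitrary assumption; a different tie-breaking rule can change the result. The function name is illustrative.

```python
from collections import Counter


def knn_classify(x, labeled_points, k):
    """Classify scalar x by majority vote among its k nearest labeled points."""
    # sorted() is stable, so points at equal distance keep their input order.
    # That tie-breaking rule is an arbitrary choice made for this sketch.
    neighbors = sorted(labeled_points, key=lambda p: abs(p[0] - x))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0], neighbors


data = [(10, 1), (11, 1), (15, 2), (12, 1), (18, 2), (9, 1), (20, 2), (17, 2)]
for k in (3, 5):
    label, neighbors = knn_classify(14, data, k)
    print("K=%d neighbors=%s class=%s" % (k, neighbors, label))
```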

7. Given this portion of a web graph:

[Figure: web graph]

a) Show the node adjacency matrix

b) Convert the adjacency matrix to a transition probability matrix (i.e. Markov chain) for PageRank.
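
The conversion in part (b) can be sketched as row normalization of the adjacency matrix, optionally mixed with a uniform teleport step. The 3-node graph below is hypothetical and stands in for the figure; the function name, the `teleport` parameter, and the handling of nodes without out-links (a uniform jump to any node) are assumptions of this sketch.

```python
def transition_matrix(adjacency, teleport=0.0):
    """Row-normalize a 0/1 adjacency matrix into a Markov transition matrix.

    teleport is the probability of jumping to a uniformly random node.
    Rows with no out-links are treated as uniform jumps (an assumed convention).
    """
    n = len(adjacency)
    matrix = []
    for row in adjacency:
        out_degree = sum(row)
        if out_degree == 0:
            matrix.append([1.0 / n] * n)
        else:
            matrix.append([
                (1 - teleport) * cell / out_degree + teleport / n
                for cell in row
            ])
    return matrix


# Hypothetical 3-node graph (not the assignment's figure):
# node 0 links to 1 and 2, node 1 links to 2, node 2 links to 0.
adjacency = [
    [0, 1, 1],
    [0, 0, 1],
    [1, 0, 0],
]
for row in transition_matrix(adjacency):
    print(row)
```

Every row of the result sums to 1, which is the property that makes the matrix usable as a Markov chain.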

9. Complete the table below so that the cosine similarity for the query "brown cat" against document three is 1.0 (SMART nnc.nnc).

Term    q    q→     d1   d2   d3    d→1    d→2    d→3    q→•d→1   q→•d→2   q→•d→3
Brown   1    .707   2    1    ___   .632   .277   ___    .447     .196     ___
Cat     1    .707   1    1    ___   .316   .277   ___    .223     .196     ___
how     0    0      1    1    ___   .316   .277   ___    0        0        ___
meow    0    0      0    3    ___   0      .832   ___    0        0        ___
now     0    0      2    1    ___   .632   .277   ___    0        0        ___
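
In SMART notation, nnc means natural term frequency, no idf weighting, and cosine (unit-length) normalization, applied here to both the document and the query. The sketch below reproduces the normalized columns and the similarity scores for d1 and d2 from the raw counts in the table; the function names are illustrative.

```python
from math import sqrt


def cosine_normalize(vector):
    """Scale a raw term-frequency vector to unit length (the 'c' in nnc)."""
    length = sqrt(sum(w * w for w in vector))
    return [w / length for w in vector] if length else list(vector)


def cosine_similarity(query, document):
    """Dot product of the two length-normalized vectors."""
    q, d = cosine_normalize(query), cosine_normalize(document)
    return sum(a * b for a, b in zip(q, d))


# Term order: brown, cat, how, meow, now (raw counts from the table).
q = [1, 1, 0, 0, 0]
d1 = [2, 1, 1, 0, 2]
d2 = [1, 1, 1, 3, 1]

print([round(w, 3) for w in cosine_normalize(d1)])
print(round(cosine_similarity(q, d1), 3), round(cosine_similarity(q, d2), 3))
```

Because both vectors are scaled to unit length, the similarity reaches 1.0 only when the document's raw counts are a positive multiple of the query's.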

 
