You are going to write a python program that will index a


Assignment Description:

You are going to write a Python program that indexes a set of documents, and build a function that searches those documents using the index.

How to Start?

1. Use the files that are provided for this assignment. There are 20 of them.

2. Indexing of the files:

a. Read each of the provided text files and convert every word in the file to lowercase. [see folder: "PyAssignment1 txt files"]

b. Create a list of the words from each text file.

c. Remove stop words from each list to get the final list of words for each text file. [A list of stop words is provided. See: stopwords.txt]

d. Build a dictionary for each word, with the KEY being the document ID (file name) and the VALUE being the frequency (the number of times the word appears in that particular file).
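
Steps 2a to 2d can be sketched roughly as follows. This is only one possible reading of the brief, not a reference solution. The helper names `load_stopwords` and `tokenize`, the rule that a word is a run of letters or digits, and the assumption that stopwords.txt sits outside the document folder (otherwise it would be indexed as a document) are all illustrative choices that the assignment does not specify.

```python
import os
import re
from collections import Counter, defaultdict


def load_stopwords(path):
    """Read whitespace-separated stop words from a file, lowercased (hypothetical helper)."""
    with open(path, encoding="utf-8") as handle:
        return set(handle.read().lower().split())


def tokenize(text):
    """Lowercase the text and split it into runs of letters or digits (assumed rule)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def index(folder, stopwords):
    """Return {word: {file_name: frequency}} for every .txt file in folder."""
    inverted = defaultdict(dict)
    for file_name in sorted(os.listdir(folder)):
        if not file_name.endswith(".txt"):
            continue  # skip anything that is not a text document
        path = os.path.join(folder, file_name)
        with open(path, encoding="utf-8", errors="ignore") as handle:
            # Steps a-c: lowercase, split into words, drop stop words
            words = [w for w in tokenize(handle.read()) if w not in stopwords]
        # Step d: record how often each remaining word occurs in this file
        for word, count in Counter(words).items():
            inverted[word][file_name] = count
    return dict(inverted)
```

With this layout, looking up a word gives a small dictionary of the files that contain it and how often, which is the shape step 2d describes.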

3. Fire a Query:

a. Take a set of words as input.

b. Create a list with words from the query words.

c. Remove stop words.

d. Score each document by summing the frequencies, in that document, of the words in the input.

e. Print each document ID and score pair whose score is greater than zero, in descending order of score.
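
The query steps 3a to 3e might look like the sketch below. It assumes the index has the shape {word: {file_name: frequency}}, that the query arrives as a single string, that a query word repeated in the input is counted once per occurrence, and that ties are broken alphabetically by document ID. None of these details are fixed by the brief, and the parameter names are hypothetical.

```python
import re
from collections import Counter


def search(query, inverted, stopwords):
    """Return [(doc_id, score), ...] for scores above zero, highest score first."""
    # Steps a-c: lowercase the query, split it into words, drop stop words
    terms = [w for w in re.findall(r"[a-z0-9]+", query.lower()) if w not in stopwords]
    # Step d: add up the frequency of every query word in every document
    scores = Counter()
    for term in terms:
        for doc_id, freq in inverted.get(term, {}).items():
            scores[doc_id] += freq
    # Step e: keep positive scores, sort by descending score, then by document ID
    positive = [(doc_id, score) for doc_id, score in scores.items() if score > 0]
    return sorted(positive, key=lambda pair: (-pair[1], pair[0]))


def print_results(results):
    """Print one 'document ID, score' pair per line."""
    for doc_id, score in results:
        print(doc_id, score)
```

For a toy index `{"cat": {"a.txt": 2}, "sat": {"a.txt": 1, "b.txt": 1}}` and the stop word set `{"the"}`, the query "the cat sat" would score a.txt at 3 and b.txt at 1, so a.txt is listed first.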

Deliverables - Your deliverables for this assignment should include the following:

1. A Python file called "studentid_search.py" (please prefix the filename with your student ID)

2. studentid_search.py should have at least two functions - "index" and "search"

3. Submit studentid_search.py by mailing it to the TA.

This is an individual assignment; you may not work in groups or collaborate.

Attachment:- Assignment.zip
