CSCI424 - Reasoning and Learning - University of Wollongong


Question 1

A convicted criminal who reoffends after release is known as a recidivist. The table below lists a dataset describing prisoners released on parole, and whether they reoffended within two years of release.

This dataset lists six instances where prisoners were granted parole. Each instance is described in terms of three binary descriptive features (GOOD BEHAVIOUR, AGE < 30, DRUG DEPENDENT) and a binary target feature, RECIDIVIST. The GOOD BEHAVIOUR feature has a value of TRUE if the prisoner had not committed any infringements during incarceration, the AGE < 30 feature has a value of TRUE if the prisoner was under 30 years of age when granted parole, and the DRUG DEPENDENT feature is TRUE if the prisoner had a drug addiction at the time of parole. The target feature, RECIDIVIST, has a value of TRUE if the prisoner was arrested within two years of being released; otherwise it has a value of FALSE.

Table 1: LIST OF PRISONERS RELEASED ON PAROLE

 ID   GOOD BEHAVIOUR   AGE < 30   DRUG DEPENDENT   RECIDIVIST
  1   FALSE            TRUE       FALSE            TRUE
  2   FALSE            FALSE      FALSE            FALSE
  3   FALSE            TRUE       FALSE            TRUE
  4   TRUE             FALSE      FALSE            FALSE
  5   TRUE             FALSE      TRUE             TRUE
  6   TRUE             FALSE      FALSE            FALSE

a. Using this dataset, construct the decision tree that would be generated by the ID3 algorithm using entropy-based information gain.

b. What prediction will the decision tree generated in part (a) of this question return for the following query?

GOOD BEHAVIOUR = FALSE; AGE < 30 = FALSE; DRUG DEPENDENT = TRUE

c. What prediction will the decision tree generated in part (a) of this question return for the following query?

GOOD BEHAVIOUR = TRUE; AGE < 30 = TRUE; DRUG DEPENDENT = FALSE
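
Note (not part of the original assignment brief): the information gain ID3 uses at each split is the entropy of the target in the current partition minus the weighted entropy of the partitions a feature produces. The Python sketch below is a minimal illustration of that computation on Table 1; the dataset encoding, function names, and printed output format are assumptions for illustration, not code supplied with the assignment.

    from collections import Counter
    from math import log2

    # Table 1 encoded as (GOOD BEHAVIOUR, AGE < 30, DRUG DEPENDENT, RECIDIVIST).
    DATA = [
        (False, True,  False, True),
        (False, False, False, False),
        (False, True,  False, True),
        (True,  False, False, False),
        (True,  False, True,  True),
        (True,  False, False, False),
    ]
    FEATURES = ["GOOD BEHAVIOUR", "AGE < 30", "DRUG DEPENDENT"]

    def entropy(labels):
        """Shannon entropy (base 2) of a list of target labels."""
        counts = Counter(labels)
        total = len(labels)
        return -sum((c / total) * log2(c / total) for c in counts.values())

    def information_gain(rows, feature_index):
        """Entropy of the target minus the weighted entropy after splitting on one feature."""
        target = [row[-1] for row in rows]
        remainder = 0.0
        for value in {row[feature_index] for row in rows}:
            subset = [row[-1] for row in rows if row[feature_index] == value]
            remainder += len(subset) / len(rows) * entropy(subset)
        return entropy(target) - remainder

    if __name__ == "__main__":
        for index, name in enumerate(FEATURES):
            print(f"IG({name}) = {information_gain(DATA, index):.4f}")

The feature with the largest printed gain is the one ID3 would place at the root; repeating the same computation on each resulting partition grows the remaining levels of the tree, which is what parts (a) to (c) ask you to trace by hand.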

Question 2

Implement the value iteration algorithm for MDPs and use it to compute the solution to the situation shown in Figure 1 below. You have been given two sample Python codes from the book by Russell and Norvig (specifically Chapter 17). Study these codes carefully and use them as the basis for your solution. Your submitted code (presumably based on the code provided with this assignment) should not require any additional data files during run-time, and should not expect any user inputs. For each value of k, your program is to print (to the screen) the reward vector J. Your program is to terminate when convergence is observed (use epsilon = 0.0001). For each time step k, also print the optimal policy. Use the discount factor λ = 0.9.

As an additional experiment, make a change to the state diagram so that action A in S1 has its probabilities swapped. Run your program again for this modified state diagram. Your name and student number should be in the comment header of the source code file.
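
Note (not part of the original assignment brief): your submission should be based on the Russell and Norvig Chapter 17 code supplied with the assignment. The sketch below only illustrates the shape of a value-iteration loop with the required discount (0.9) and convergence threshold (0.0001); because Figure 1 is not reproduced in this text, every state, action, transition probability and reward in it is an invented placeholder that would have to be replaced with the model from Figure 1.

    DISCOUNT = 0.9     # the assignment's discount factor (lambda = 0.9)
    EPSILON = 0.0001   # convergence threshold required by the assignment

    # Placeholder transition model, NOT Figure 1:
    # T[state][action] -> list of (probability, next_state) pairs.
    T = {
        "S1": {"A": [(0.8, "S2"), (0.2, "S1")], "B": [(1.0, "S1")]},
        "S2": {"A": [(1.0, "S3")], "B": [(0.5, "S1"), (0.5, "S3")]},
        "S3": {},                              # terminal state in this placeholder
    }
    R = {"S1": 0.0, "S2": 0.0, "S3": 1.0}      # placeholder rewards

    def value_iteration(T, R, discount=DISCOUNT, epsilon=EPSILON):
        """Repeat the Bellman update until successive J vectors differ by less than epsilon."""
        J = {s: 0.0 for s in T}                # the reward (value) vector, initialised to zero
        k = 0
        while True:
            k += 1
            J_new, policy = {}, {}
            for s in T:
                if not T[s]:                   # terminal state: its value is just its reward
                    J_new[s], policy[s] = R[s], None
                    continue
                # expected value of each available action under the current J
                q = {a: sum(p * J[s2] for p, s2 in outcomes)
                     for a, outcomes in T[s].items()}
                best = max(q, key=q.get)
                J_new[s] = R[s] + discount * q[best]
                policy[s] = best
            print(f"k={k}  J={J_new}  policy={policy}")
            if max(abs(J_new[s] - J[s]) for s in T) < epsilon:
                return J_new, policy
            J = J_new

    if __name__ == "__main__":
        value_iteration(T, R)

For the additional experiment, swapping the probabilities of action A in S1 only means editing the corresponding entries of the transition table (here, turning 0.8/0.2 into 0.2/0.8) before running the loop again.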

What to submit and how.

1. A PDF file containing your solution to Question 1.

2. A file named question2.ipynb containing your code. Ensure that the comment section or header contains your name and student number.

3. Create a Zip file that contains the two files named above. Name your Zip file as follows: student_name_student_number.zip.
For example: a_good_student_1234568.zip

4. Submit the Zip file in the Assignment 2 DropBox in Moodle by the due date.

5. Any assignment submitted via e-mail will be IGNORED and deemed not to have been submitted.

Attachment: prob.rar
