Master of Management of Information Systems: MMIS 643 Data


Part 1:

Consider the Boston Housing Data file. (The schema of the data file is given on page 33 in Table 2.2 of the textbook.)

a. Study the Neural Networks Prediction

b. Use XLMiner's neural network routine under the Predict menu to fit a model with XLMiner's default neural network parameters, using the predictors CRIM, ZN, INDUS, CHAS, NOX, RM, AGE, DIS, RAD, TAX, PTRATIO, B, and LSTAT to predict the outcome variable MEDV.

i. Record the RMS errors for the training data and the validation data, and observe the lift charts. Repeat the process, changing the number of epochs to 300, 3,000, 10,000, and 20,000.

ii. What happens to RMS error for the training data set as the number of epochs increases?

iii. What happens to RMS error for the validation data set as the number of epochs increases?

iv. Comment on the appropriate number of epochs for the model.

Note: Please use the Prediction option of the neural network in order to get the RMS error.
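The RMS error recorded in step i. is the square root of the mean squared difference between actual and predicted MEDV values. The following is a minimal Python sketch of that calculation, plus a helper that picks the epoch setting with the lowest validation RMS error. The function names `rms_error` and `best_epoch_setting` are hypothetical and are not part of XLMiner; choosing by lowest validation error is one common convention, assumed here.

```python
import math


def rms_error(actual, predicted):
    """Root-mean-squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def best_epoch_setting(results):
    """Return the epoch count whose validation RMS error is lowest.

    `results` maps epochs -> (training_rmse, validation_rmse),
    filled in by hand from the recorded XLMiner output.
    """
    return min(results, key=lambda epochs: results[epochs][1])
```

To use it, record one `(training_rmse, validation_rmse)` pair per run (300, 3,000, 10,000, and 20,000 epochs) and pass the resulting dictionary to `best_epoch_setting`.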

c. Please submit your execution results and answers in an MS Excel file.

Note:

1. The file BostonHousing.xls is posted along with Written Assignment #3B, and descriptions of the columns are given in the file.
2. The cloud-based XLMiner
3. For the Windows-based XLMiner, please check the XLMiner download instructions posted in Discussion in Blackboard.

Part 2:

QUESTION 1

Which of the following expressions is used for the Naive Bayes classifier?

a.


b.


c.


d.


QUESTION 2

For the given classification tree, please match each rule with the number of its corresponding branch.

IF age = "<=30" AND student = "no" THEN buys_computer = "no"

IF age = ">40" AND credit_rating = "fair" THEN buys_computer = "yes"

IF age = ">40" AND credit_rating = "excellent" THEN buys_computer = "no"

IF age = "<=30" AND student = "yes" THEN buys_computer = "yes"

IF age = "31...40" THEN buys_computer = "yes"

A. 1
B. 5
C. 4
D. 2
E. 3
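The five IF-THEN rules above can be read as a small lookup procedure. The sketch below encodes them as data and applies the first rule whose conditions all match. The names `RULES` and `buys_computer` are hypothetical, and the rule order simply follows the listing above, not the branch numbers in the tree.

```python
# Each rule: (conditions that must all hold, predicted buys_computer value).
RULES = [
    ({"age": "<=30", "student": "no"}, "no"),
    ({"age": ">40", "credit_rating": "fair"}, "yes"),
    ({"age": ">40", "credit_rating": "excellent"}, "no"),
    ({"age": "<=30", "student": "yes"}, "yes"),
    ({"age": "31...40"}, "yes"),
]


def buys_computer(record):
    """Return the outcome of the first matching rule, or None if none match."""
    for conditions, outcome in RULES:
        if all(record.get(key) == value for key, value in conditions.items()):
            return outcome
    return None
```

Because the rules come from the leaves of one tree, at most one rule can match a fully specified record, so the order of `RULES` does not affect the result.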

QUESTION 3

For the given table below,

Prior Legal Trouble (input) | Company Size (input) | Status (decision)
Y                           | Small                | Truthful
N                           | Small                | Truthful
N                           | Large                | Truthful
N                           | Large                | Truthful
N                           | Small                | Truthful
N                           | Small                | Truthful
Y                           | Small                | Fraud
Y                           | Large                | Fraud
N                           | Large                | Fraud
Y                           | Large                | Fraud

Please give the prior probability P(Prior Legal Trouble = 'No') in decimal format.
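A probability of this kind is estimated by counting: the number of rows with the requested value divided by the total number of rows. The sketch below shows the counting step in Python; the names `ROWS`, `COLUMNS`, and `probability` are hypothetical, and the rows are copied from the table in this question.

```python
from fractions import Fraction

# Rows copied from the table: (prior_legal_trouble, company_size, status).
ROWS = [
    ("Y", "Small", "Truthful"), ("N", "Small", "Truthful"),
    ("N", "Large", "Truthful"), ("N", "Large", "Truthful"),
    ("N", "Small", "Truthful"), ("N", "Small", "Truthful"),
    ("Y", "Small", "Fraud"), ("Y", "Large", "Fraud"),
    ("N", "Large", "Fraud"), ("Y", "Large", "Fraud"),
]
COLUMNS = {"prior_legal_trouble": 0, "company_size": 1, "status": 2}


def probability(column, value, rows=ROWS):
    """Fraction of rows whose `column` equals `value`."""
    index = COLUMNS[column]
    matches = sum(1 for row in rows if row[index] == value)
    return Fraction(matches, len(rows))
```

Calling `float(probability("prior_legal_trouble", "N"))` converts the exact fraction to the decimal format the question asks for.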


QUESTION 4

For the given table below,

Prior Legal Trouble (input) | Company Size (input) | Status (decision)
Y                           | Small                | Truthful
N                           | Small                | Truthful
N                           | Large                | Truthful
N                           | Large                | Truthful
N                           | Small                | Truthful
N                           | Small                | Truthful
Y                           | Small                | Fraud
Y                           | Large                | Fraud
N                           | Large                | Fraud
Y                           | Large                | Fraud

Please give the conditional probability P(x2 = Large | C1) = P(Size = Large | Fraudulent) in decimal format.
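A class-conditional probability is estimated by restricting the count to rows of the given class: first keep only the rows with that Status, then take the fraction of those rows that have the requested input value. A minimal sketch follows; the names `ROWS`, `COLUMNS`, and `conditional_probability` are hypothetical, and the rows are copied from the table in this question.

```python
from fractions import Fraction

# Rows copied from the table: (prior_legal_trouble, company_size, status).
ROWS = [
    ("Y", "Small", "Truthful"), ("N", "Small", "Truthful"),
    ("N", "Large", "Truthful"), ("N", "Large", "Truthful"),
    ("N", "Small", "Truthful"), ("N", "Small", "Truthful"),
    ("Y", "Small", "Fraud"), ("Y", "Large", "Fraud"),
    ("N", "Large", "Fraud"), ("Y", "Large", "Fraud"),
]
COLUMNS = {"prior_legal_trouble": 0, "company_size": 1}


def conditional_probability(column, value, status, rows=ROWS):
    """Estimate P(column = value | Status = status) by counting."""
    in_class = [row for row in rows if row[2] == status]
    if not in_class:
        raise ValueError(f"no rows with status {status!r}")
    index = COLUMNS[column]
    matches = sum(1 for row in in_class if row[index] == value)
    return Fraction(matches, len(in_class))
```

For questions that ask for three digits after the decimal point, `round(float(...), 3)` applied to the returned fraction gives that format.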

QUESTION 5

For the given table below,

Prior Legal Trouble (input) | Company Size (input) | Status (decision)
Y                           | Small                | Truthful
N                           | Small                | Truthful
N                           | Large                | Truthful
N                           | Large                | Truthful
N                           | Small                | Truthful
N                           | Small                | Truthful
Y                           | Small                | Fraud
Y                           | Large                | Fraud
N                           | Large                | Fraud
Y                           | Large                | Fraud

Please give the conditional probability P(x2 = Small | C2) = P(Company Size = Small | Truthful) in decimal format.
(Please keep 3 digits after the decimal point)

QUESTION 6

Which of the following statement(s) is(are) correct?

a. The Naive Bayes method is a supervised learning method.

b. Naive Bayes can be used only for classification, not for prediction.

c. The Naive Bayes method is a data-driven method.

d. Naive Bayes applies a cutoff value to the calculated posterior probability to determine the class label of a given test sample.

QUESTION 7

For the given table below,

Prior Legal Trouble (input) | Company Size (input) | Status (decision)
Y                           | Small                | Truthful
N                           | Small                | Truthful
N                           | Large                | Truthful
N                           | Large                | Truthful
N                           | Small                | Truthful
N                           | Small                | Truthful
Y                           | Small                | Fraud
Y                           | Large                | Fraud
N                           | Large                | Fraud
Y                           | Large                | Fraud

For the instance with inputs Prior Legal Trouble = Yes and Company Size = Large, please determine whether the company is truthful.
(If it is truthful, select True; otherwise, select False.)

True
False
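Naive Bayes scores each class by multiplying its prior probability by the class-conditional probability of each input value, then normalizes the scores so they sum to 1. The sketch below applies that procedure to the table in this question. The names `ROWS` and `naive_bayes_posteriors` are hypothetical, and labeling a company by the larger posterior (equivalent to a 0.5 cutoff with two classes) is an assumption, not a setting taken from the assignment.

```python
from fractions import Fraction

# Rows copied from the table: (prior_legal_trouble, company_size, status).
ROWS = [
    ("Y", "Small", "Truthful"), ("N", "Small", "Truthful"),
    ("N", "Large", "Truthful"), ("N", "Large", "Truthful"),
    ("N", "Small", "Truthful"), ("N", "Small", "Truthful"),
    ("Y", "Small", "Fraud"), ("Y", "Large", "Fraud"),
    ("N", "Large", "Fraud"), ("Y", "Large", "Fraud"),
]


def naive_bayes_posteriors(trouble, size, rows=ROWS):
    """Return {status: posterior probability} for one (trouble, size) input."""
    scores = {}
    for status in sorted({row[2] for row in rows}):
        in_class = [row for row in rows if row[2] == status]
        prior = Fraction(len(in_class), len(rows))
        p_trouble = Fraction(sum(r[0] == trouble for r in in_class), len(in_class))
        p_size = Fraction(sum(r[1] == size for r in in_class), len(in_class))
        # Naive assumption: inputs are independent given the class.
        scores[status] = prior * p_trouble * p_size
    total = sum(scores.values())
    if total == 0:
        raise ValueError("input combination has zero probability in every class")
    return {status: score / total for status, score in scores.items()}
```

Calling `naive_bayes_posteriors("Y", "Large")` returns the two posteriors for this question; the class with the larger value is the predicted label.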

QUESTION 8

For the given table below,

Prior Legal Trouble (input) | Company Size (input) | Status (decision)
Y                           | Small                | Truthful
N                           | Small                | Truthful
N                           | Large                | Truthful
N                           | Large                | Truthful
N                           | Small                | Truthful
N                           | Small                | Truthful
Y                           | Small                | Fraud
Y                           | Large                | Fraud
N                           | Large                | Fraud
Y                           | Large                | Fraud

Please give the conditional probability P(x1 = No | C2) = P(Prior Legal Trouble = No | Truthful) in decimal format.
(Please keep 3 digits after the decimal point)

QUESTION 9

Which of the following statement(s) is(are) correct?

a. A neural network model can be used for classification.

b. A neural network model can be used for prediction.

c. Both a. and b.

d. Neither a. nor b.

QUESTION 10

Which of the following statement(s) is(are) correct?

a. A fully grown classification tree may lead to an overfitting problem.

b. An overly pruned classification tree may lead to an underfitting problem.

c. Both a. and b.

d. Neither a. nor b.

QUESTION 11

For the given table below,

Prior Legal Trouble (input) | Company Size (input) | Status (decision)
Y                           | Small                | Truthful
N                           | Small                | Truthful
N                           | Large                | Truthful
N                           | Large                | Truthful
N                           | Small                | Truthful
N                           | Small                | Truthful
Y                           | Small                | Fraud
Y                           | Large                | Fraud
N                           | Large                | Fraud
Y                           | Large                | Fraud

Please give the prior probability P(Company Size = 'Small') in decimal format.

QUESTION 12

The difference(s) between the basic K-Nearest Neighbor classifier and the Naive Bayes classifier is(are)

a. The basic K-Nearest Neighbor classifier uses the majority voting (prior probability value) and the posterior probability to determine the class label of a given testing sample; and Naive Bayes classifier uses only prior probability to determine the class label of a given testing sample.

b. The basic K-Nearest Neighbor classifier uses the majority voting (prior probability value) to determine the class label of a given testing sample; and Naive Bayes classifier uses not only the prior probability, but also the posterior probability to determine the class label of a given testing sample.

c. The basic K-Nearest Neighbor classifier uses the majority voting (prior probability value) to determine the class label of a given testing sample; and Naive Bayes classifier uses only the posterior probability to determine the class label of a given testing sample.

d. The basic K-Nearest Neighbor classifier uses the posterior probability to determine the class label of a given testing sample; and Naive Bayes classifier uses only the prior probability to determine the class label of a given testing sample.

QUESTION 13

What is(are) the ingredient(s) by which the neural net evolves to produce a more accurate prediction?

a. weight updates

b. learning rate

c. learning algorithm

d. momentum

QUESTION 14

In general, CART does have to impute values or delete observations with missing values in order to handle missing data.
True
False

QUESTION 15

A CART consists of

a. the root node

b. internal nodes and leaf nodes

c. edges connecting the nodes

d. All of a., b., and c.

QUESTION 16

Which of the following defines the confidence of an association rule?

a.


b.


c.


d.

QUESTION 17

For the given table below,

Prior Legal Trouble (input) | Company Size (input) | Status (decision)
Y                           | Small                | Truthful
N                           | Small                | Truthful
N                           | Large                | Truthful
N                           | Large                | Truthful
N                           | Small                | Truthful
N                           | Small                | Truthful
Y                           | Small                | Fraud
Y                           | Large                | Fraud
N                           | Large                | Fraud
Y                           | Large                | Fraud

Please give the conditional probability P(x2 = Large | C2) = P(Company Size = Large | Truthful) in decimal format.
(Please keep 3 digits after the decimal point)

QUESTION 18

For the given table below,

Prior Legal Trouble (input) | Company Size (input) | Status (decision)
Y                           | Small                | Truthful
N                           | Small                | Truthful
N                           | Large                | Truthful
N                           | Large                | Truthful
N                           | Small                | Truthful
N                           | Small                | Truthful
Y                           | Small                | Fraud
Y                           | Large                | Fraud
N                           | Large                | Fraud
Y                           | Large                | Fraud

For the instance with inputs Prior Legal Trouble = Yes and Company Size = Small, please determine whether the company is truthful.

(If it is truthful, select True, otherwise, select False.)

True
False

QUESTION 19

Which of the following defines the support of an association rule?

a.


b.


c.


d.

QUESTION 20

For the given table below,

Prior Legal Trouble (input) | Company Size (input) | Status (decision)
Y                           | Small                | Truthful
N                           | Small                | Truthful
N                           | Large                | Truthful
N                           | Large                | Truthful
N                           | Small                | Truthful
N                           | Small                | Truthful
Y                           | Small                | Fraud
Y                           | Large                | Fraud
N                           | Large                | Fraud
Y                           | Large                | Fraud

Please give the conditional probability P(x2 = Small | C1) = P(Size = Small | Fraudulent) in decimal format.

QUESTION 21

For the given table below,

Prior Legal Trouble (input) | Company Size (input) | Status (decision)
Y                           | Small                | Truthful
N                           | Small                | Truthful
N                           | Large                | Truthful
N                           | Large                | Truthful
N                           | Small                | Truthful
N                           | Small                | Truthful
Y                           | Small                | Fraud
Y                           | Large                | Fraud
N                           | Large                | Fraud
Y                           | Large                | Fraud

For the instance with inputs Prior Legal Trouble = No and Company Size = Small, please determine whether the company is truthful.

(If it is truthful, select True, otherwise, select False.)

True
False

QUESTION 22

For the given table below,

Prior Legal Trouble (input) | Company Size (input) | Status (decision)
Y                           | Small                | Truthful
N                           | Small                | Truthful
N                           | Large                | Truthful
N                           | Large                | Truthful
N                           | Small                | Truthful
N                           | Small                | Truthful
Y                           | Small                | Fraud
Y                           | Large                | Fraud
N                           | Large                | Fraud
Y                           | Large                | Fraud

For the instance with inputs Prior Legal Trouble = No and Company Size = Large, please determine whether the company is truthful.

(If it is truthful, select True, otherwise, select False.)

True
False

QUESTION 23

For the given table below,

Prior Legal Trouble (input) | Company Size (input) | Status (decision)
Y                           | Small                | Truthful
N                           | Small                | Truthful
N                           | Large                | Truthful
N                           | Large                | Truthful
N                           | Small                | Truthful
N                           | Small                | Truthful
Y                           | Small                | Fraud
Y                           | Large                | Fraud
N                           | Large                | Fraud
Y                           | Large                | Fraud

Please give the conditional probability P(x1 = Yes | C1) = P(Prior Legal Trouble = Yes | Fraudulent) in decimal format.

QUESTION 24

In general, CART is not sensitive to outliers.
True
False

QUESTION 25

Which of the following statement(s) is(are) correct?

a. There is only one root node in each CART.

b. Each node in CART has only one direct parent node.

c. Each leaf node has no child node(s).

d. All of a., b., and c.

QUESTION 26

Which of the following defines the benchmark confidence of an association rule?

a.


b.


c.


d.
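The answer options for the association-rule questions did not survive in this copy. As a reminder of the standard definitions, for a rule "antecedent → consequent": support is the share of transactions containing both item sets, confidence is the number containing both divided by the number containing the antecedent, and benchmark confidence is the share of transactions containing the consequent. The sketch below computes all three; `rule_metrics` is a hypothetical name, and support is expressed as a fraction here although some texts report it as a raw count.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence, benchmark confidence, and lift of a rule."""
    antecedent, consequent = set(antecedent), set(consequent)
    n = len(transactions)
    n_antecedent = sum(1 for t in transactions if antecedent <= set(t))
    n_consequent = sum(1 for t in transactions if consequent <= set(t))
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= set(t))
    if n == 0 or n_antecedent == 0 or n_consequent == 0:
        raise ValueError("rule items must each appear in at least one transaction")
    confidence = n_both / n_antecedent
    benchmark = n_consequent / n
    return {
        "support": n_both / n,
        "confidence": confidence,
        "benchmark_confidence": benchmark,
        "lift": confidence / benchmark,  # lift ratio = confidence / benchmark
    }
```

A lift above 1 indicates that the antecedent and consequent occur together more often than they would if they were independent.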


QUESTION 27

Which of the following statement(s) is(are) correct?

a. Each node in a classification tree corresponds to a column in a data table.

b. Each node in a classification tree corresponds to a dimension of the multidimensional data space.

c. Each node in a classification tree defines a decision boundary (or split condition) along its corresponding dimension.

d. All of a., b., and c.

QUESTION 28

For the given table below,

Prior Legal Trouble (input) | Company Size (input) | Status (decision)
Y                           | Small                | Truthful
N                           | Small                | Truthful
N                           | Large                | Truthful
N                           | Large                | Truthful
N                           | Small                | Truthful
N                           | Small                | Truthful
Y                           | Small                | Fraud
Y                           | Large                | Fraud
N                           | Large                | Fraud
Y                           | Large                | Fraud

Please give the prior probability P(Company Size = 'Large') in decimal format.

QUESTION 29

The CART can be used for the purpose(s) of

a. Classification

b. Prediction

c. Either a. or b.

d. Both a. and b.

QUESTION 30

For the given table below,

Prior Legal Trouble (input) | Company Size (input) | Status (decision)
Y                           | Small                | Truthful
N                           | Small                | Truthful
N                           | Large                | Truthful
N                           | Large                | Truthful
N                           | Small                | Truthful
N                           | Small                | Truthful
Y                           | Small                | Fraud
Y                           | Large                | Fraud
N                           | Large                | Fraud
Y                           | Large                | Fraud

Please give the conditional probability P(x1 = Yes | C2) = P(Prior Legal Trouble = Yes | Truthful) in decimal format.
(Please keep 3 digits after the decimal point)

QUESTION 31

Which of the following statement(s) is(are) correct?

a. In XLMiner, the Naive Bayes classifier can take only categorical variables as input to generate the categorical response or class label.

b. In general, the Naive Bayes classifier can take not only categorical variables but also continuous variables as input to generate the categorical response or class label.

c. Both a. and b.

d. Neither a. nor b.

QUESTION 32

The momentum added in weight update during neural network training process

a. can keep weights changing in the same direction as they did in the preceding iteration.

b. makes the network reluctant to learn from data that would change the direction of the weights when the momentum value is set high.

c. can help avoid getting stuck in a local optimum.

d. can help the neural network learning process converge to the optimum.
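One common way to write a weight update with momentum is: new change = -(learning rate × gradient) + (momentum × previous change). The sketch below implements that form for a single weight. It illustrates the general idea only; the function name `momentum_step` and the default values are hypothetical, and XLMiner's exact update rule may differ.

```python
def momentum_step(weight, gradient, previous_change,
                  learning_rate=0.1, momentum=0.9):
    """Apply one gradient-descent update with a momentum term.

    Returns (new_weight, change) so the change can be fed into the next step.
    """
    change = -learning_rate * gradient + momentum * previous_change
    return weight + change, change


def minimize_square(start, steps, learning_rate=0.1, momentum=0.9):
    """Minimize f(w) = w**2 (gradient 2*w) as a toy illustration."""
    weight, change = start, 0.0
    for _ in range(steps):
        weight, change = momentum_step(weight, 2.0 * weight, change,
                                       learning_rate, momentum)
    return weight
```

With `momentum=0.0` this reduces to plain gradient descent. With a high momentum value, the previous change dominates, so a single contradictory gradient has little effect on the direction of the update.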

QUESTION 33

For the given table below,

Prior Legal Trouble (input) | Company Size (input) | Status (decision)
Y                           | Small                | Truthful
N                           | Small                | Truthful
N                           | Large                | Truthful
N                           | Large                | Truthful
N                           | Small                | Truthful
N                           | Small                | Truthful
Y                           | Small                | Fraud
Y                           | Large                | Fraud
N                           | Large                | Fraud
Y                           | Large                | Fraud

Please give the prior probability P(C2) = P(Truthful) in decimal format.

QUESTION 34

What is the meaning of CART in this data mining textbook?

a. Classification, Assertion, Regression, and Translation.

b. Categorization, Assertion, Regression, and Translation.

c. Category and Regression Trees

d. Classification and Regression Trees

QUESTION 35

In CART, it is necessary to normalize the data to the unit range 0 to 1.
True
False

QUESTION 36

Which of the following statement(s) is(are) correct about the CART?

a. For classification, the path from the root node to the leaf node represents a specific decision rule condition, and the majority vote at the leaf node is used to determine the class label designated by the path.

b. For prediction, the path from the root node to the leaf node represents a specific decision rule condition, and the average value of the outcome variable at the leaf node is used to predict its value.

c. Both a. and b.

d. Neither a. nor b.
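The two leaf-node procedures described in options a. and b. can be sketched in a few lines of Python: a majority vote over the class labels of the training records that reach a leaf, and an average over their numeric outcome values. The function names `classify_leaf` and `predict_leaf` are hypothetical, and the sketch assumes the records reaching the leaf have already been collected.

```python
from collections import Counter
from statistics import mean


def classify_leaf(labels):
    """Classification tree: return the most frequent class label at a leaf.

    On a tie, Counter.most_common keeps the label that was seen first.
    """
    return Counter(labels).most_common(1)[0][0]


def predict_leaf(values):
    """Regression tree: return the mean outcome value at a leaf."""
    return mean(values)
```

For example, `classify_leaf(["yes", "no", "yes"])` returns `"yes"`, and `predict_leaf` on a leaf's MEDV values returns their average.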

QUESTION 37

For the given table below,

Prior Legal Trouble (input) | Company Size (input) | Status (decision)
Y                           | Small                | Truthful
N                           | Small                | Truthful
N                           | Large                | Truthful
N                           | Large                | Truthful
N                           | Small                | Truthful
N                           | Small                | Truthful
Y                           | Small                | Fraud
Y                           | Large                | Fraud
N                           | Large                | Fraud
Y                           | Large                | Fraud

Please give the conditional probability P(x1 = No | C1) = P(Prior Legal Trouble = No | Fraudulent) in decimal format.

QUESTION 38

To build a good classifier, the inductive learning algorithm or classification tree construction algorithm requires a large data set.
True
False

QUESTION 39

For the given table below,

Prior Legal Trouble (input) | Company Size (input) | Status (decision)
Y                           | Small                | Truthful
N                           | Small                | Truthful
N                           | Large                | Truthful
N                           | Large                | Truthful
N                           | Small                | Truthful
N                           | Small                | Truthful
Y                           | Small                | Fraud
Y                           | Large                | Fraud
N                           | Large                | Fraud
Y                           | Large                | Fraud

Please give the prior probability P(C1) = P(Fraudulent) in decimal format.

QUESTION 40

For the given table below,

Prior Legal Trouble (input) | Company Size (input) | Status (decision)
Y                           | Small                | Truthful
N                           | Small                | Truthful
N                           | Large                | Truthful
N                           | Large                | Truthful
N                           | Small                | Truthful
N                           | Small                | Truthful
Y                           | Small                | Fraud
Y                           | Large                | Fraud
N                           | Large                | Fraud
Y                           | Large                | Fraud

Please give the prior probability P(Prior Legal Trouble = 'Yes') in decimal format.

QUESTION 41

A multilayer feedforward neural network consists of

a. Input layer

b. Hidden layer(s)

c. Output layer

d. All of a., b., and c.
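To make the layer structure concrete, the sketch below passes one input record through a single hidden layer and an output layer. It assumes a sigmoid activation in the hidden layer and a linear output node (a common choice for numeric prediction); the function names and the weight layout are hypothetical and do not reflect XLMiner's internals.

```python
import math


def sigmoid(x):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_output(inputs, weights, biases, activation):
    """Each node computes activation(weighted sum of its inputs + its bias)."""
    return [
        activation(sum(w * x for w, x in zip(node_weights, inputs)) + bias)
        for node_weights, bias in zip(weights, biases)
    ]


def forward(inputs, hidden_w, hidden_b, output_w, output_b):
    """Input layer -> hidden layer (sigmoid) -> output layer (linear)."""
    hidden = layer_output(inputs, hidden_w, hidden_b, sigmoid)
    return layer_output(hidden, output_w, output_b, lambda z: z)
```

Here `hidden_w` holds one list of weights per hidden node, and `output_w` holds one list per output node; information flows in one direction only, from inputs to outputs.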

Attachment:- BostonHousing.xls
