1. This exercise revisits the Hitters data set.

(a) The glmnet() function, by default, internally scales the predictor variables so that they will have standard deviation 1, before solving the ridge regression or lasso problems. This is a result of its default setting standardize=TRUE. Explain why such scaling is appropriate for this application.

(b) Verify that, for a very small value of λ, both the ridge regression and lasso estimates are very close to the least squares estimates. Also verify that, for a very large value of λ, both the ridge regression and lasso estimates approach 0 in all components (except the intercept, which is not penalized by default).
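One way to carry out this check is sketched below. It assumes x and y are the Hitters model matrix and response as set up in the notes; the exact lambda values used here are illustrative.

```r
# Sketch for part (b); assumes x (model matrix) and y (Salary) from the notes.
library(glmnet)

# Least squares coefficients for reference
ls.coef <- coef(lm(y ~ x))

# Ridge (alpha = 0) and lasso (alpha = 1) at a tiny lambda:
# these should be very close to ls.coef
ridge.small <- coef(glmnet(x, y, alpha = 0, lambda = 1e-8))
lasso.small <- coef(glmnet(x, y, alpha = 1, lambda = 1e-8))

# The same fits at a huge lambda: all coefficients should be
# (essentially) zero, except the unpenalized intercept
ridge.big <- coef(glmnet(x, y, alpha = 0, lambda = 1e10))
lasso.big <- coef(glmnet(x, y, alpha = 1, lambda = 1e10))
```

Comparing ridge.small and lasso.small against ls.coef, and inspecting ridge.big and lasso.big, verifies both claims.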

(c) An alternative method for selecting the tuning parameter λ is to use the one-standard-error rule. Under this rule, instead of choosing λ to minimize test MSE, the largest value of λ for which the test MSE is within one standard error of the minimum is chosen. Provide a rationale for the one-standard-error rule.

(d) For each of the ridge regression and lasso models corresponding to the grid of λ values defined in the notes, perform 5-fold cross-validation to determine the best value of λ. Report the results from both the usual minimum MSE rule and the one-standard-error rule for choosing λ. Note that cv.glmnet() returns the value of λ selected using the one-standard-error rule under the name lambda.1se.
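A minimal sketch of this step, again assuming x, y, and the λ grid (here called grid) from the notes:

```r
# Sketch for part (d): 5-fold CV for ridge and lasso over a fixed lambda grid.
library(glmnet)
set.seed(1)  # CV fold assignment is random, so fix the seed for reproducibility

cv.ridge <- cv.glmnet(x, y, alpha = 0, nfolds = 5, lambda = grid)
cv.lasso <- cv.glmnet(x, y, alpha = 1, nfolds = 5, lambda = grid)

cv.ridge$lambda.min  # usual minimum-MSE rule
cv.ridge$lambda.1se  # one-standard-error rule
cv.lasso$lambda.min
cv.lasso$lambda.1se
```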

(e) From the last part, you should have computed 4 values of the tuning parameter:

λ_min^ridge, λ_1se^ridge, λ_min^lasso, λ_1se^lasso

These are the results of running 5-fold cross-validation on each of the ridge and lasso models, and using the usual rule (min) or the one-standard-error rule (1se) to select λ. Now, using the predict() function with type="coef", report the coefficient estimates at the appropriate values of λ. That is, you will report two coefficient vectors coming from ridge regression, with λ = λ_min^ridge and λ = λ_1se^ridge, and likewise for the lasso. How do the coefficient estimates from the usual rule compare to those from the one-standard-error rule? How do the ridge estimates compare to those from the lasso?
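The coefficient extraction can be sketched as follows (assuming x, y, and the λ grid from the notes; the fit and CV object names are illustrative):

```r
# Sketch for part (e): coefficients at the CV-selected lambdas (ridge shown).
library(glmnet)
set.seed(1)

ridge.fit <- glmnet(x, y, alpha = 0, lambda = grid)
cv.ridge  <- cv.glmnet(x, y, alpha = 0, nfolds = 5, lambda = grid)

# Coefficient vectors at the two selected lambdas
predict(ridge.fit, s = cv.ridge$lambda.min, type = "coef")
predict(ridge.fit, s = cv.ridge$lambda.1se, type = "coef")

# Repeat with alpha = 1 for the lasso.
```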

(f) Suppose that you were coaching a young baseball player who wanted to strike it rich in the major leagues. What handful of attributes would you tell this player to focus on?

2. Predict the number of applications received (Apps) using the other variables in the College data set, which is available in the ISLR package.

(a) Use ?College to access information about the data set and answer the following questions. Note that you may also find the summary() function useful.

i. Not including Apps, how many variables are in the data set? In other words, what is p?

ii. Are there any missing values in the data set? If so, remove them.

iii. What is the sample size (once missing values have been removed, if necessary)? In other words, what is N?

iv. Are there any qualitative variables in the data set? If so, list them.

(b) Split the data set into a training set and a test set.

(c) Fit a linear model using least squares on the training set and report the test error obtained.

(d) Fit a ridge regression model on the training set, with λ chosen by cross-validation. Report the test error obtained.

(e) Fit a lasso model on the training set, with λ chosen by cross-validation. Report the test error obtained, along with the number of non-zero coefficient estimates.

(f) Comment on the results obtained. How accurately can we predict the number of college applications received? Is there much difference among the test errors resulting from these three approaches?
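Parts (b) through (e) can be sketched as below; the seed, the half/half train-test split, and the choice of lambda.min (rather than lambda.1se) are illustrative choices, not requirements of the exercise.

```r
# Sketch for exercise 2, parts (b)-(e).
library(ISLR)
library(glmnet)
set.seed(1)

College <- na.omit(College)                   # (a)(ii): drop missing values, if any
x <- model.matrix(Apps ~ ., College)[, -1]    # dummy-codes Private, drops intercept column
y <- College$Apps

# (b) split into training and test sets
train <- sample(1:nrow(x), nrow(x) / 2)
test  <- setdiff(1:nrow(x), train)

# (c) least squares on the training set; test MSE
lm.fit  <- lm(Apps ~ ., data = College[train, ])
lm.pred <- predict(lm.fit, College[test, ])
mean((lm.pred - y[test])^2)

# (d) ridge regression, lambda chosen by CV on the training set; test MSE
cv.ridge   <- cv.glmnet(x[train, ], y[train], alpha = 0)
ridge.pred <- predict(cv.ridge, s = "lambda.min", newx = x[test, ])
mean((ridge.pred - y[test])^2)

# (e) lasso: test MSE and number of non-zero coefficients
cv.lasso   <- cv.glmnet(x[train, ], y[train], alpha = 1)
lasso.pred <- predict(cv.lasso, s = "lambda.min", newx = x[test, ])
mean((lasso.pred - y[test])^2)
lasso.coef <- predict(cv.lasso, s = "lambda.min", type = "coefficients")
sum(lasso.coef != 0) - 1                      # exclude the intercept
```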
