Create a scatter plot of Cook's distance


Advanced Biostats SPSS Assignment: Multiple Logistic Regression in Action

Multiple logistic regression models the likelihood that an outcome occurs as a function of several predictor variables.

For this Assignment, you use multiple logistic regression to analyze a dataset. You identify assumptions required by multiple logistic regression and evaluate whether they have been met by the data. Finally, you interpret your results and evaluate the use of multiple logistic regression.

The Assignment

1. Variables and variable selection

1. Use a table to list the variables Sex, Age in Years, Serum Cholesterol, Obese, and Hypertension, along with each variable's level of measurement.

2. Create new variables Age_Cat and Chole_Cat:

- Age_Cat: Convert Age in Years into a categorical variable with 2 categories: Less than 40, and 40 and greater

- Chole_Cat: Convert Serum Cholesterol into a categorical variable with 3 categories: Under 200, 200-299, and 300 and greater

Add the new variables to each record by coding the responses to the original variable using the assigned categories. Be sure that the variable view in SPSS has the correct information on the 2 new variables.
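The recoding rules above can be sketched outside SPSS as a quick cross-check. This is a minimal Python sketch, not the SPSS procedure itself. The numeric codes (0/1 for Age_Cat, 1/2/3 for Chole_Cat), the function names, and the sample records are assumptions made for illustration. The sketch also assumes that any value of at least 200 and below 300 belongs to the 200-299 category.

```python
def age_cat(age_years):
    """Return 0 for 'Less than 40' and 1 for '40 and greater' (codes are assumed)."""
    return 0 if age_years < 40 else 1


def chole_cat(serum_cholesterol):
    """Return 1 for 'Under 200', 2 for '200-299', 3 for '300 and greater' (codes are assumed)."""
    if serum_cholesterol < 200:
        return 1
    if serum_cholesterol < 300:
        return 2
    return 3


# Hypothetical records, invented only to exercise the category boundaries.
records = [
    {"age": 39, "chol": 199},
    {"age": 40, "chol": 200},
    {"age": 62, "chol": 300},
]

# Add the two new variables to each record.
for record in records:
    record["Age_Cat"] = age_cat(record["age"])
    record["Chole_Cat"] = chole_cat(record["chol"])
```

Checking the boundary values (39 versus 40, 199 versus 200, 299 versus 300) against the recoded SPSS columns is a simple way to confirm that the cut points were entered as intended.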

2. Simple Binary Logistic Regression

1. Use Hypertension as the dependent variable and Chole_Cat as the independent variable in the first model. Report the Odds Ratio and significance of the Odds Ratio for the relationship between the dependent and independent variables.

2. Use Hypertension as the dependent variable and Serum Cholesterol (the original variable) as the independent variable in the second model. Report the Odds Ratio and significance of the Odds Ratio for the relationship between the dependent and independent variables.

3. How does the level of measurement for the independent variable affect the outcome (include the OR and its significance in your response)? How does the level of measurement of the independent variable change your interpretation of the Odds Ratio?
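To see why the level of measurement changes the reading of the Odds Ratio, it may help to fit a one-predictor logistic model by hand. The sketch below is a plain Newton-Raphson fit in Python, offered as an illustration rather than as SPSS's own routine. The helper names (`solve`, `fit_logistic`, `or_for_increase`) and the 2x2 counts are hypothetical and are not taken from the assignment dataset. For a binary predictor, exp(B) matches the cross-product odds ratio from the 2x2 table. For a continuous predictor, exp(B) is the odds ratio per one-unit increase, so a k-unit increase multiplies the odds by exp(k * B).

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_logistic(X, y, iterations=25):
    """Fit logistic regression by Newton-Raphson; each row of X starts with 1 for the intercept."""
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(iterations):
        probs = [1 / (1 + math.exp(-sum(b * x for b, x in zip(beta, row)))) for row in X]
        # Gradient of the log-likelihood and the information matrix.
        grad = [sum((yi - pi) * row[j] for row, yi, pi in zip(X, y, probs)) for j in range(p)]
        info = [[sum(pi * (1 - pi) * row[j] * row[k] for row, pi in zip(X, probs))
                 for k in range(p)] for j in range(p)]
        step = solve(info, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta


def or_for_increase(b, units):
    """Odds ratio implied by coefficient b for an increase of `units` in a continuous predictor."""
    return math.exp(b * units)


# Hypothetical counts, invented for illustration only:
# exposed group: 30 with the outcome, 70 without; unexposed group: 15 with, 85 without.
X = [[1.0, 1.0]] * 100 + [[1.0, 0.0]] * 100
y = [1] * 30 + [0] * 70 + [1] * 15 + [0] * 85

beta = fit_logistic(X, y)
odds_ratio = math.exp(beta[1])          # Exp(B) for the binary predictor
cross_product = (30 * 85) / (70 * 15)   # the same odds ratio from the 2x2 table
```

With a three-level predictor such as Chole_Cat, the model would instead carry one indicator column per non-reference category, and each exp(B) compares that category with the reference category rather than describing a per-unit change.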

3. Multivariate Logistic Regression

1. Run a multivariate binary logistic regression model using SPSS and Hypertension as the dependent variable, Chole_Cat, Age_Cat, Obese, and Sex as the Covariates. Include the output in your submission.

2. Identify the Odds Ratio and the significance of the Odds Ratio for each of the covariates. How has the relationship between Chole_Cat and Hypertension changed with the addition of the other variables (compare to the output from #2a, the simple model that uses Chole_Cat as the only independent variable)?

3. Test the assumption that the model fits the data using the Hosmer-Lemeshow Goodness of Fit test. Interpret the Chi Square statistic given in the output of this test and state what it means in terms of the assumptions needed to use logistic regression with these data.

4. Rerun the logistic regression model from #3a and use the save function to create the following new variables: Predicted Probabilities, Deviance Residuals, and Cook's Distance. Evaluate the model using these saved variables and the following Scatter Plots.

- Create a Scatter Plot of the Deviance Residuals (DEV) and the variable ID: Are there any outliers? What does this mean when evaluating your model?

- Create a Scatter Plot of Cook's Distance (COO) and the variable ID: Are there any influential cases? What does this mean when evaluating your model?

- Create a Scatter Plot of the Deviance Residuals (DEV) and the Predicted Probabilities (PRE). Discuss whether anything in this scatterplot raises concerns about your model.
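The goodness-of-fit test and the residual checks above can be reproduced in outline from the saved variables. The Python sketch below is an illustration under stated assumptions, not SPSS's implementation. All function names and the sample values are hypothetical. The Hosmer-Lemeshow grouping uses simple equal-sized groups sorted by predicted probability, which may differ from how SPSS forms its deciles of risk. The p-value helper uses a closed form that holds only for even degrees of freedom (the usual 10 groups give 8). The cutoff of 2 for deviance residuals is a common rule of thumb, as is a cutoff of about 1 for Cook's Distance; neither is a requirement stated in the assignment.

```python
import math


def hosmer_lemeshow(y, p, groups=10):
    """Hosmer-Lemeshow chi-square from outcomes y (0/1) and predicted probabilities p.

    Assumes at least one case per group and a mean probability strictly between 0 and 1.
    """
    pairs = sorted(zip(p, y))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        size = len(chunk)
        expected = sum(pi for pi, _ in chunk)
        observed = sum(yi for _, yi in chunk)
        mean_p = expected / size
        stat += (observed - expected) ** 2 / (size * mean_p * (1 - mean_p))
    return stat


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; this closed form is valid only for even df."""
    if df % 2 != 0:
        raise ValueError("this closed form needs an even number of degrees of freedom")
    half = x / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))


def deviance_residual(yi, pi):
    """Signed square root of a case's contribution to the model deviance."""
    loglik = math.log(pi) if yi == 1 else math.log(1 - pi)
    sign = 1 if yi == 1 else -1
    return sign * math.sqrt(-2 * loglik)


def flag_cases(ids, values, cutoff):
    """Return the IDs whose absolute value exceeds the chosen cutoff."""
    return [case_id for case_id, v in zip(ids, values) if abs(v) > cutoff]


# Hypothetical saved values, invented for illustration only.
ids = [1, 2, 3]
outcomes = [1, 0, 1]
predicted = [0.9, 0.2, 0.05]

residuals = [deviance_residual(yi, pi) for yi, pi in zip(outcomes, predicted)]
flagged = flag_cases(ids, residuals, cutoff=2)  # case 3: outcome occurred despite a low predicted probability

# Perfectly calibrated toy data: the statistic is essentially zero and the p-value is near 1.
toy_p = [0.2] * 10 + [0.8] * 10
toy_y = [1, 1] + [0] * 8 + [1] * 8 + [0, 0]
toy_stat = hosmer_lemeshow(toy_y, toy_p, groups=2)
```

A large Hosmer-Lemeshow p-value (conventionally above .05) indicates no evidence of poor fit, while a small one suggests that predicted and observed event rates disagree across the risk groups. The same `flag_cases` helper can be applied to the saved Cook's Distance values with a different cutoff.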

Format your assignment according to the following formatting requirements:

1. The answer should be typed, double spaced, using Times New Roman font (size 12), with one-inch margins on all sides.

2. Include a cover page containing the title of the assignment, the student's name, the course title, and the date. The cover page is not included in the required page length.

3. Also include a reference page. Citations and references should follow APA format. The reference page is not included in the required page length.
