Retrospective sample of males in a heart-disease


All work must be done independently.

A retrospective sample of males in a heart-disease high-risk region of the Western Cape, South Africa. There are roughly two controls per case of CHD. Many of the CHD-positive men have undergone blood pressure reduction treatment and other programs to reduce their risk factors after their CHD event. In some cases the measurements were made after these treatments. These data are taken from a larger dataset, described in Rousseauw et al., 1983, South African Medical Journal.

There are 462 observations in the dataset (the row labels run to 463 because one index is skipped). The variables in the dataset are:

sbp - systolic blood pressure
tobacco - cumulative tobacco (kg)
ldl - low density lipoprotein cholesterol
adiposity
famhist - family history of heart disease (Present, Absent)
typea - type-A behavior
obesity
alcohol - current alcohol consumption
age - age at onset
chd - response, coronary heart disease

The data can be found and read into R by the following command:

read.table("https://www-stat.stanford.edu/~tibs/ElemStatLearn/datasets/SAheart.data", sep=",", header=TRUE, row.names=1)

If you would prefer to analyze these data using some other statistical package, you will need to export the data from R using something like a write.table command (or some variation thereof).
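
As a rough illustration of the export step, the sketch below writes a data frame to a comma-separated file and reads it back to check the round trip. The helper name export_data, the file name, and the toy data frame are all assumptions made for this example; the toy values are made up and are not rows from the dataset.

```r
# Sketch only: export a data frame from R so another package can open it.
# `export_data` and the output file name are hypothetical, for illustration.
export_data <- function(df, path) {
  # Comma-separated, with a header row; R's row names are not written out
  write.table(df, file = path, sep = ",", row.names = FALSE,
              col.names = TRUE, quote = TRUE)
  invisible(path)
}

# Toy stand-in with made-up values (NOT rows from SAheart), used only so the
# example runs without network access.
toy <- data.frame(sbp = c(120, 140),
                  famhist = c("Present", "Absent"),
                  chd = c(1, 0))

out_file <- file.path(tempdir(), "toy_export.csv")
export_data(toy, out_file)

# Read the file back to confirm that it round-trips
check <- read.csv(out_file, stringsAsFactors = FALSE)
```

With the real data, the same helper could be applied to the object returned by the read.table command given earlier. Because that command uses row.names=1, the first column of the file becomes the row labels, and these are dropped on export when row.names = FALSE.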

The following questions are of practical interest:

1. What are the significant predictors of CHD? What would a final model look like, and can you provide an estimate of its predictive accuracy (i.e., do model selection and then evaluate predictive accuracy)? What functional forms are most appropriate for the various predictors in your final model?

2. Since high ldl often precedes a diagnosis of CHD, will a two-stage model, which first uses ldl as a response in stage 1 and then CHD as a response in stage 2, provide more accurate predictions of CHD than the model built in question 1 above?

3. There are often situations where finding just one obviously best sub-model is difficult. There may be many good competing sub-models. However, you might decide to bring together multiple models to improve predictive performance. Develop a strategy for doing this on this dataset, being careful to clearly compare and contrast (to the single model approach) predictive performance. Also, make sure to clearly motivate your strategy, giving enough intuition so that I can follow things easily.

Please provide complete justifications for why you chose a particular modeling strategy, including the underlying assumptions you are making. Analyze the data and provide some overall inferences with regards to the questions being posed. Write a (maximum) 5 page report (tables and figures inclusive) that details your analysis. Computer output may be attached as supplementary material.
