

Question 1

Here are the advertised horsepower ratings and expected gas mileage, in miles per gallon (mpg), for several 2010 vehicles.
Car                  hp    mpg
Audi A4              211   30
BMW 3 series         230   28
Buick LaCrosse       182   30
Chevy Cobalt         155   37
Chevy Suburban       320   21
Ford Expedition      310   20
GMC Yukon            320   21
Honda Civic          140   34
Honda Accord         177   31
Hyundai Elantra      138   35
Lexus IS 350         306   25
Lincoln Navigator    310   20
Mazda Tribute        171   28
Toyota Camry         169   33
Volkswagen Beetle    150   28

a) Which of these variables is the explanatory variable and which is the response variable?
b) Using an appropriate tool, make a scatterplot of these data.
c) Describe the direction, form, and strength of the association.
d) Find the correlation between horsepower and miles per gallon.
e) Write a few sentences telling what the plot says about fuel economy.
f) Convert the mileage to litres per 100 km (= 235.2 ÷ mpg) and repeat parts (b) to (e). Did the correlation change? Which scale for fuel consumption do you prefer, and why?
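
As a starting point for parts (d) and (f), the sketch below computes the Pearson correlation by hand for the table above, first with mpg and then with litres per 100 km. The function name `pearson_r` and the variable names are illustrative choices, not part of the assignment, and any statistics package would serve equally well.

```python
import math

# Data copied from the table in Question 1.
hp = [211, 230, 182, 155, 320, 310, 320, 140, 177, 138, 306, 310, 171, 169, 150]
mpg = [30, 28, 30, 37, 21, 20, 21, 34, 31, 35, 25, 20, 28, 33, 28]


def pearson_r(x, y):
    """Pearson correlation: sum of cross-deviations over the root of the
    product of the sums of squared deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Conversion given in part (f): litres per 100 km = 235.2 / mpg.
l_per_100km = [235.2 / m for m in mpg]

r_mpg = pearson_r(hp, mpg)
r_l100 = pearson_r(hp, l_per_100km)

print(f"r(hp, mpg)      = {r_mpg:.3f}")
print(f"r(hp, L/100 km) = {r_l100:.3f}")
```

Because the conversion is a reciprocal rather than a linear rescaling, the two correlations differ in sign and need not have the same magnitude, which is worth keeping in mind when answering part (f).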

Question 2
For this question you will be examining the data-set labelled "Election_2011" which you can find in the assignment folder. In the dataset you will find the results from the 2011 Canadian federal election for the Maritime provinces, showing the riding name, winning candidate, and percentage of rejected ballots.

a) Using an appropriate tool, make a histogram of the percentages of rejected ballots.
b) Find the mean and standard deviation.
c) Report the 5-number summary.
d) Why do the mean and median differ here?
e) Which of (b) and (c) above does a better job of summarizing the distribution of the percentage of rejected ballots? Why?
f) Suppose the true percentage for Egmont was 1.8% and not 0.8%. How would you expect the mean, median, standard deviation, and IQR to change? Explain your expectations for each (no computations, please).
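
For parts (b) and (c), one possible way to get the summaries with Python's standard `statistics` module is sketched below. The "Election_2011" file is not reproduced here, so the list `rejected_pct` holds made-up placeholder values that should be replaced with the real column. The quartile convention (`method="inclusive"`) is also an assumption, and other tools may report slightly different quartiles.

```python
import statistics

# HYPOTHETICAL placeholder values, NOT the Election_2011 data.
# Replace with the percentages of rejected ballots from the dataset.
rejected_pct = [0.4, 0.5, 0.5, 0.6, 0.6, 0.7, 0.8, 0.9, 1.8]

mean = statistics.mean(rejected_pct)
sd = statistics.stdev(rejected_pct)  # sample standard deviation (n - 1)

# Quartiles under the "inclusive" convention (an assumption; conventions vary).
q1, median, q3 = statistics.quantiles(rejected_pct, n=4, method="inclusive")
five_number = (min(rejected_pct), q1, median, q3, max(rejected_pct))

print(f"mean = {mean:.3f}, sd = {sd:.3f}")
print("five-number summary:", five_number)
print(f"IQR = {q3 - q1:.3f}")
```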

Question 3
For this question you will be examining the data-set labelled "Homicides_2011", which you can find in the assignment folder. The rate provided in the dataset is the average homicide rate per 100,000 population over the 11 years from 2001 to 2011. Cities were classified into three size categories. I also classified them by geographic region.
a) Use appropriate graphs to examine the relationship between homicide rate and city size (category). Describe any patterns you see.
b) What type of variable is city size, if determined precisely by a count of residents? What type is it in our analysis in part (a)?
c) If you wanted to use the actual city sizes (populations), can you think of another type of graph to use to explore the relationship between crime rate and city size? If so, produce this graph.
d) Produce an appropriate graph to examine the relationship between homicide rate and geographical region. Describe any geographical pattern you see.

Question 4
For this question you will be examining the data-set labelled "Blood_Pressure", which is provided in the assignment folder. The dataset contains data from a blood pressure screening clinic for employees at a certain company.
a) Using a pivot table in Excel, create a contingency table that summarizes the number of cases in the dataset by age group and blood pressure level. (You can also use Minitab or other software to create this contingency table.)
b) Find the marginal distribution of blood pressure level.
c) Find the conditional distribution of blood pressure level within each age group.
d) Create a segmented bar graph for your result in part (c) to compare the conditional distributions of blood pressure level within each age group. (You can create it simply in Excel.)
e) Write a brief description of the association between age and blood pressure among these employees.
f) Do you think that this proves that people's blood pressure increases as they age? Explain.
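
If you prefer code to a pivot table, the logic behind parts (a) to (c) can be sketched as follows. The "Blood_Pressure" file is not reproduced here, so the records, the age-group labels, and the blood pressure labels below are all hypothetical placeholders; only the counting and dividing steps carry over to the real data.

```python
from collections import Counter

# HYPOTHETICAL records of (age group, blood pressure level).
# The labels and counts are placeholders, NOT the Blood_Pressure data.
records = (
    [("Under 30", "Low")] * 3 + [("Under 30", "Normal")] * 5 + [("Under 30", "High")] * 2
    + [("30-49", "Low")] * 2 + [("30-49", "Normal")] * 6 + [("30-49", "High")] * 4
    + [("Over 50", "Low")] * 1 + [("Over 50", "Normal")] * 4 + [("Over 50", "High")] * 5
)
age_groups = ["Under 30", "30-49", "Over 50"]
bp_levels = ["Low", "Normal", "High"]

# (a) Contingency table: count of cases in each (age group, level) cell.
cell = Counter(records)

# (b) Marginal distribution of blood pressure level: column totals / grand total.
n = len(records)
marginal = {bp: sum(cell[(age, bp)] for age in age_groups) / n for bp in bp_levels}

# (c) Conditional distribution of level within each age group: cell / row total.
conditional = {}
for age in age_groups:
    row_total = sum(cell[(age, bp)] for bp in bp_levels)
    conditional[age] = {bp: cell[(age, bp)] / row_total for bp in bp_levels}

print("marginal:", marginal)
for age in age_groups:
    print(age, conditional[age])
```

Each row of `conditional` sums to 1, and those rows are the segment heights of the bars in a segmented bar graph.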

Question 5
The following data represent the square footage of 10 three-bedroom condos for sale in Hilton Head, South Carolina.
1,559 1,625 1,167 1,264 1,676 1,300 2,058 1,126 1,858 1,321
Determine the interquartile range, upper limit and lower limit for this sample. Are there any outliers in this data set? (Show your solution steps)
In order to be consistent, please use the method for finding percentiles (which also applies to quartiles).
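
The question does not spell out which percentile method is meant, so the sketch below assumes one common textbook rule: compute the index i = (P/100)·n on the sorted data; if i is not a whole number, round up and take that value; if i is whole, average the i-th and (i+1)-th values. It also assumes the usual 1.5 × IQR rule for the limits. If your course uses a different rule, the quartiles and the limits may differ.

```python
# Data from Question 5 (square footage of the 10 condos).
sqft = [1559, 1625, 1167, 1264, 1676, 1300, 2058, 1126, 1858, 1321]


def percentile(values, p):
    """P-th percentile by the index rule (ASSUMED method, valid for 0 < p < 100):
    i = p/100 * n; round up if fractional, otherwise average positions i and i+1."""
    data = sorted(values)
    whole, remainder = divmod(p * len(data), 100)  # integer arithmetic avoids float error
    if remainder == 0:
        # i is whole: midpoint of the i-th and (i+1)-th values (1-based positions).
        return (data[whole - 1] + data[whole]) / 2
    # i is fractional: round up to position whole + 1 (1-based), i.e. index `whole`.
    return data[whole]


q1 = percentile(sqft, 25)
q3 = percentile(sqft, 75)
iqr = q3 - q1

# ASSUMED outlier rule: limits at 1.5 * IQR beyond the quartiles.
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr
outliers = [x for x in sqft if x < lower_limit or x > upper_limit]

print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr}")
print(f"lower limit = {lower_limit}, upper limit = {upper_limit}")
print("outliers:", outliers)
```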

Question 6
The following data show the number of minutes that eight customers waited for a table at a particular restaurant.
1 17 26 10 5 22 19 8
Manually calculate the coefficient of variation (CV) for these data. What does it mean? (Be sure to show your solution steps.)
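
To check a hand calculation, the steps can be mirrored in a few lines of code. This sketch uses the eight values listed above and assumes the sample standard deviation (dividing by n − 1), with CV = (s / mean) × 100%. If your course treats the data as a population, divide by n instead.

```python
import math

# Waiting times (minutes) as listed in Question 6.
wait = [1, 17, 26, 10, 5, 22, 19, 8]

n = len(wait)
mean = sum(wait) / n                          # step 1: sample mean
sq_dev = [(x - mean) ** 2 for x in wait]      # step 2: squared deviations
variance = sum(sq_dev) / (n - 1)              # step 3: sample variance (ASSUMES n - 1)
s = math.sqrt(variance)                       # step 4: sample standard deviation
cv = s / mean * 100                           # step 5: CV as a percentage of the mean

print(f"mean = {mean}, s = {s:.3f}, CV = {cv:.2f}%")
```

The CV expresses the standard deviation as a percentage of the mean, so it measures spread relative to the typical waiting time rather than in minutes.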
