Draw a graph that displays the distribution of price


Assignment:

Problem Description:

A local used car dealer in Berlin has asked us to evaluate the price of premium vehicles. We have collected data on 906 Mercedes vehicles listed on a used car website in 2016. The data include the price of each vehicle and characteristics such as age, kilometres driven, fuel type and body style.

You will use descriptive statistics, inferential statistics and your knowledge of multiple linear regression to complete this task.

Price (Dependent Variable) and several characteristics (Independent Variables) are given in the Excel file: MonTuesWed.xlsx. You can find the data that we will use in the project in the "Processed" tab with the definitions of the variables in the "Dictionary" tab. For reference, the full dataset is included in the "Original" tab.

Required:

A. Calculate the descriptive statistics from the data and display them in a table. Be sure to comment on the central tendency (mean, median and mode), variability (interquartile range, standard deviation) and shape (whether left or right skewed) for all the variables excluding Year, Name and Model. Include information regarding the quartiles for Price, Kilometres and PowerKW. How would you interpret the mean of dummy variables such as Automatic or Petrol?

B. Draw a graph that displays the distribution of Price. Be sure to comment on the distribution. Does it appear normally distributed?

C. Create a box-and-whisker plot for the distribution of Age and describe the shape. Is there evidence of outliers in the data?

D. What is the probability that we could randomly select a vehicle that is a convertible? What is the likelihood that the age of a convertible exceeds 25 years? Is the age of a vehicle statistically independent of whether it is a convertible? Use a Contingency Table or Pivot Table to show the relative frequencies of these events.

E. Estimate the 95% confidence interval for the population mean price of Hatchbacks. How does this compare to the 95% confidence interval for the population mean price of Coupes?

F. It is traditionally believed that most hatchbacks in Germany have a manual transmission. Test the claim that the population proportion of hatchbacks with a manual transmission exceeds 50% at the 5% level of significance.

G. Run a multiple linear regression using the data and show the output from Excel. Important: Exclude the dummy variable Coupe from the regression, as well as "Year", "Name" and "Model".

H. Is the coefficient estimate for Age statistically different from zero at the 5% level of significance? Set up the correct hypothesis test using the results found in the table in Part (G), using both the critical value and p-value approaches. Interpret the coefficient estimate of the slope.

I. Interpret the remaining slope coefficient estimates. Discuss whether the signs are what you are expecting and explain your reasoning.

J. Interpret the value of the Adjusted R². Is there a large difference between the R² and the Adjusted R²? If so, what might explain the difference?

K. Is the overall model statistically significant at the 5% level of significance? Use the p-value approach.

L. Based on the results of the regression, what other factors may have influenced the sale price of the used vehicles? Provide a couple of possible examples and indicate their predicted relationship with the sale price if they were included.

M. Predict the average price of a vehicle that is 5 years old, has an automatic transmission, has 75,000 kilometres, uses Petrol, has no damage, has a 110 kW engine, and has a sedan body style. Discuss whether it is appropriate to predict under these conditions. Show the predicted regression equation.

N. Do the results suggest that the data satisfy the assumptions of a linear regression: Linearity, Normality of the Errors, and Homoscedasticity of Errors? Show using scatter diagrams, normal probability plots and/or histograms, and explain.

O. Does this data indicate the true population distribution of vehicle prices of Mercedes in Berlin? Explain and if not, describe a sampling procedure that could lead to more accurate results. Would you expect these results to hold for BMW as well?

P. The car dealer wants to display a random selection of 5 "high performance" vehicles on their website. They define "high performance" as having an engine exceeding 200 kW. The dealer would generally like a mixture of body styles to show up on the website. What is the probability that, of those selected, all 5 vehicles would be sedans? What is the probability that none would be sedans? Show this using a Binomial Table and construct a bar chart to show the probability distribution of the number of vehicles that are sedans.
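
The binomial table asked for in Part P can be sketched in a few lines of Python as a cross-check on the Excel work. This is only an illustration under stated assumptions: the sedan share `p_sedan = 0.4` is a made-up placeholder (the real value is the relative frequency of sedans among vehicles above 200 kW in the "Processed" tab), the names `binomial_table`, `p_sedan` and `n_selected` are hypothetical, and the binomial model itself assumes each of the 5 picks is independent with the same sedan probability. Drawing 5 distinct vehicles from a small pool would strictly follow a hypergeometric distribution instead.

```python
from math import comb


def binomial_table(n, p):
    """Return [P(X = 0), ..., P(X = n)] for X ~ Binomial(n, p)."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


# Placeholder value, NOT taken from the data: replace with the observed
# share of sedans among vehicles whose engine exceeds 200 kW.
p_sedan = 0.4
n_selected = 5  # number of vehicles displayed on the website

table = binomial_table(n_selected, p_sedan)

# Print the table with a rough text bar chart (one '#' per 2% probability).
for k, prob in enumerate(table):
    bar = "#" * round(prob * 50)
    print(f"{k} sedans: {prob:.4f} {bar}")

p_all_sedans = table[n_selected]  # equals p_sedan ** 5
p_no_sedans = table[0]            # equals (1 - p_sedan) ** 5
```

The last and first entries of the table answer the two probability questions directly, and the six probabilities together are the values to plot in the bar chart.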

Reference No:- TGS01997128
