Executive summary of the findings from data analysis, Basic Statistics

Case: Major League Baseball Salaries: “Why They Make What They Make”

Background:

Using the 1986 payrolls and season records, one can easily assess how productive the players were relative to their pay. For example, Table shows a small dataset of all the teams that year. One variable is the number of games won and the other is the team's average salary. I ran a quick simple linear regression with wins as the dependent variable and average salary as the independent variable; the fitted model is shown in Figure. The fitted values and residuals in the table show which teams did well and which did not, given the money they paid. If we ask whether player salaries are a good indicator of games won, this data does not support that hypothesis. First consider the Fitted Line Plot. The relationship is somewhat linear, but the R-squared is very low, meaning that average salary explains very little of the variation in the number of wins. Also note the hourglass shape of the scatterplot around the regression line. Analysts far and wide are trying to figure out what is causing that variation.
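The regression described above can be sketched in a few lines of Python. This is only an illustration: the team figures below are made up (they are not the 1986 values from the case), and NumPy stands in for Minitab. It fits wins on average salary, then computes the fitted values, the residuals (actual minus predicted) and R-squared.

```python
import numpy as np

# Hypothetical team data, NOT the 1986 figures from the case.
avg_salary = np.array([250.0, 300.0, 350.0, 400.0, 450.0, 500.0])  # $ thousands
wins = np.array([70.0, 86.0, 75.0, 92.0, 78.0, 95.0])

# Ordinary least squares for: wins = intercept + slope * avg_salary.
# np.polyfit returns coefficients from the highest degree down.
slope, intercept = np.polyfit(avg_salary, wins, deg=1)

fitted = intercept + slope * avg_salary
residuals = wins - fitted  # actual minus predicted

# R-squared: share of the variation in wins explained by average salary.
ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((wins - wins.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

print(f"wins = {intercept:.2f} + {slope:.4f} * avg_salary, R-sq = {r_squared:.3f}")
for w, f, r in zip(wins, fitted, residuals):
    print(f"actual={w:5.1f}  fitted={f:6.2f}  residual={r:+6.2f}")
```

A team with a positive residual won more games than its payroll alone would predict; a negative residual means it won fewer.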

In Table, it is very easy to see who did well (in green) and who did poorly (in yellow) that season just by looking at the residuals. (Note: the residual is the actual value minus the predicted value, so a negative residual means that the fitted, or predicted, value is greater than the actual value. Likewise, a positive residual means that the fitted value is less than the actual value.) In the National League, based on the amount of money the NY Mets spent, our model predicts that they would win only 84 games. However, they won 108 games, which is 24 more than we would have predicted based only on how much they spent on salaries. In yellow you will see the NL team that did the worst. Our model predicted that Chicago would win 84.4 games, but in reality they won only 70, thus the residual of -14.8. Based on this data alone, team salaries are clearly not a good indicator of how many games a team should win. However, it is relatively easy to see what drives individual salaries by looking at players' performance variables. These include “At bats”, “# of Home Runs”, and “Years in the League” for hitters, and “# of saves” and “ERA” for pitchers.

[Table: Small dataset of all the teams]

[Figure: Regression model with variables]

For Case we are going to use the 1986 salary dataset to build a model that predicts salaries. This dataset contains 171 pitchers. (Note: I have trimmed the dataset a bit to remove some unusual observations.) Find the Excel and Minitab files (Pitchers for Case ) in Canvas.

Case Deliverables:

I. Please submit an Executive Summary of the findings from your data analysis. In the executive summary, also provide the general manager with any advice you can based on the data you analyzed, such as your model and the significant variables. Again, don’t base your advice on your feelings, but purely on the suggestions and conclusions drawn from the data. This section should be very concise and to the point.

II. Create a predictive model using the 5-step methodology used in class and in the text. Step through the process and paste your Minitab results into your report. Throughout your methodology, explain your thought process and why you make the decisions you make. Give me a narrative of why you are doing what you are doing. I am more interested in your process for model building than in the final model that you arrive at.

Although we are interested in using the model for the prediction of baseball salaries, we also want to be able to interpret the contribution of each independent variable in the model. Make sure you provide an interpretation of each coefficient that is in your final model. If you find any weaknesses in your final model after you do your residual analysis in Step 4, be sure to discuss these and perhaps suggest ways to make the model stronger.

Create this model with the 17 original variables provided. Do not do Step 5. Step 5 is “Validation” and to do this you would have to go out and collect more data. There is not a single correct answer/model for this assignment, so I do not expect any two teams to arrive at the same model. If you happen to arrive at the same model, then I wouldn’t expect you to take the same path to that specific model.

Advice: If I recall correctly, I don’t think the “Team” variable is significant in this dataset, so don’t worry about creating a dummy variable for each team. However, the “Position” variable could be interesting. Note that I added a dummy variable to indicate whether the player was a reliever or a starter.

I’m not so concerned about the R-squared value of your final model, but I do think it will be fairly easy for you to get an R-squared value in the 70% to 80% range.
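To illustrate how a reliever/starter dummy variable enters a regression and how its coefficient is read, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the six pitchers are hypothetical, the variable names are invented, and the salaries are generated from a known formula with no noise so that the recovered coefficients are easy to check. It is not the case dataset and not a substitute for the Minitab steps.

```python
import numpy as np

# Hypothetical pitchers; salaries are built from a known formula (no noise)
# so the fitted coefficients can be verified by eye.
roles = ["starter", "reliever", "starter", "reliever", "starter", "reliever"]
years = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])  # years in the league

# Dummy coding: 1 for a starter, 0 for a reliever (reliever is the baseline).
is_starter = np.array([1.0 if r == "starter" else 0.0 for r in roles])

salary = 100.0 + 50.0 * years + 200.0 * is_starter  # $ thousands

# Design matrix: intercept column, years, dummy.
X = np.column_stack([np.ones_like(years), years, is_starter])
coef, *_ = np.linalg.lstsq(X, salary, rcond=None)
b0, b_years, b_starter = coef

print(f"intercept={b0:.1f}, years={b_years:.1f}, starter={b_starter:.1f}")
```

Read this way, the coefficient on years is the expected change in salary for one more year in the league with role held fixed, and the coefficient on the dummy is the expected salary difference between a starter and a reliever with the same number of years.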

III. Discuss any variable you might like to add to the above model that is not included in the dataset.

IV. If you were the analyst for a team and your general manager wanted to bring in a new reliever (from this dataset), based on your model, who is the best value and who is the worst value? (Hint: you will need to store the Fits and Residuals for your final model.)

Use any of the descriptive statistics tools that you wish from the first portion of the class, such as dot plots, interval plots, histograms, scatter plots, etc., to make your case. Provide accurate interpretations of your findings and try to explain them in simple terms.
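One way to use stored fits and residuals for item IV is sketched below in Python. The reliever names and dollar figures are hypothetical, and the reading of "best value" as the player paid the least relative to the model's prediction (the most negative residual, with salary as the dependent variable) is an assumption about what the question intends.

```python
import numpy as np

# Hypothetical relievers with actual salaries and model fits ($ thousands).
names = ["Reliever A", "Reliever B", "Reliever C", "Reliever D"]
actual = np.array([400.0, 650.0, 300.0, 500.0])
fitted = np.array([520.0, 480.0, 310.0, 505.0])

# Residual = actual salary minus predicted salary.
residuals = actual - fitted

# Most negative residual: paid well below what performance predicts.
best_value = names[int(np.argmin(residuals))]
# Most positive residual: paid well above what performance predicts.
worst_value = names[int(np.argmax(residuals))]

print("best value: ", best_value)
print("worst value:", worst_value)
```

In Minitab the same comparison can be made by sorting the stored residual column for the rows where the reliever dummy applies.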


Reference No:- TGS01238254
