Perform a test of hypothesis to determine whether the


Overview

In this module, many powerful data analysis strategies will be presented. In many situations, decision makers need to conduct a hypothesis test that consists of more than two samples. In this case, a single or two-sample hypothesis test is not adequate for this investigation. ANOVA, short for analysis of variance, is a technique that allows a researcher to analyze means of two or more samples at a time. Though a series of two-sample hypothesis could be conducted in situations where there are more than two samples, ANOVA is better suited because it reduces the likelihood of a researcher making a Type I error. In fact, ANOVA allows researchers to conduct their analysis more accurately by accounting for other sources of variation.

Another extremely useful data analysis technique that will be presented is regression analysis. For this analysis, causal relationships between dependent and independent attributes can be investigated for their significance. With this tool, decision makers can make forecasts and determine if independent attributes are significant in explaining a dependent attribute that is being modeled. For example, monthly sales volume (i.e. demand) might be forecasted with various sources of advertising (i.e. TV, Internet, Newspaper, etc.). In this example, a decision maker could model demand and decide if a particular type of advertising was statistically significant in increasing demand.

Finally, the module concludes by reviewing techniques that decision makers use when events that might influence their decision are mostly unknown. For example, perhaps you are making a decision on whether or not to expand your current facility or build a new one. This decision might be related to the anticipated demand for a certain product over the next few years, which can be estimated within reason, but is certainly not know. Techniques will be presented to help decision makers with problems where events, or states of nature, are unknown based on a preference of being optimistic, pessimistic, or realist.

Data Analysis ToolPak

The Data Analysis ToolPak is a Microsoft Office Excel add-in program that is available when you install Microsoft Office or Excel. The learning activities in this module will utilize the Data Analysis ToolPak, so it is necessary to have this add-in installed. To install the Data Analysis ToolPak:

Click the File tab, click Options, and then click the Add-Ins category.

In the Manage box, select Excel Add-ins and then click Go.

In the Add-Ins available box, select the Analysis ToolPak check box, and then click OK.

NOTE: There are two options for the Data Analysis ToolPak. One provides extra support for Visual Basic for Application (VBA) and the other one does not. You can select both options if you'd like, but the only option you need for this course is the Data Analysis ToolPak (without VBA).

Learning Outcomes

Upon completion of this module, you will be able to:

ANOVA

  • Perform a test of hypothesis to determine whether the variances of two populations are equal.
  • Describe the ANOVA approach for testing difference in sample means.
  • Conduct a test of hypothesis among three or more treatment means and describe the results.
  • Carry out a test of hypothesis among treatment means using a blocking variable and understand the results.

Correlation and Linear Regression

  • Define the terms dependent and independent variable.
  • Calculate, test, and interpret the relationship between two variables using the correlation coefficient.
  • Apply regression analysis to estimate the linear relationship between two variables
  • Interpret the regression analysis.
  • Evaluate a regression equation to predict the dependent variable.

Multiple Regression Analysis

  • Describe the relationship between several independent variables and a dependent variable using multiple regression analysis.

Decision Making

  • Compute and interpret the expected values for a payoff table.
  • Explain and interpret opportunity loss.
  • Describe three strategies for decision making.
  • Organize possible outcomes into a decision tree and interpret the result

Content Delivery

Start with the Excel workbook (spreadsheet) Content Delivery - start.xlsx.

Debates have been ongoing if a traditional classroom settings provide a more effective way to deliver educational content over more modern alternatives such as online or hybrid online course offerings. In order to shed light on this debate an experiment was conducted over a fifteen week period. In total, 90 students were randomly separated into 3 different courses. In other words, 30 students were in a traditional classroom for a semester, 30 were asked to take the course in a hybrid online model, and the remaining 30 were asked to take the course online. The experiment was controlled as much as possible. For example, the course content was created by a single Professor (i.e. lectures, recorded videos, etc.). At the end of the 15 week investigation, all students were asked to take the same final exam which contained 50 questions. The scores (i.e. number of questions answered correctly) from the 90 students are shown below in the table. Using an alpha value of 0.05, is it reasonable to conclude that there are differences in student scores based on the content delivery method used?

Classroom

Hybrid

Online

21

30

40

26

22

33

22

43

33

28

34

42

30

28

38

28

33

39

42

30

17

32

28

38

21

17

17

45

44

25

36

30

26

38

28

34

21

34

17

32

22

35

18

47

31

39

34

46

50

48

37

27

23

42

37

18

30

26

36

17

30

18

22

21

47

23

21

40

23

33

26

46

21

44

33

31

28

50

49

23

45

50

44

17

43

27

38

23

42

45

a. What is the null hypothesis statement for this problem?

b. What is the alternative hypothesis statement for this problem?

c. What is alpha for this analysis?

d. What is the most appropriate test for this problem? (choose one of the following)

• F-Test Two-Sample for Variance
• Anova: Single Factor
• Anova: Two-Factor with Replication
• Anova: Two-Factor without Replication

e. What is the value of the test statistic for the most appropriate analysis?

f. What is the value of the critical value for the most appropriate analysis?

g. Is it reasonable to conclude that there is a difference in the mean scores based on the content delivery method used in this experiment? (choose one of the following)

• Yes
• No

h. What is the p-value for this analysis?

Real Estate Tax

The city of Tucson, Arizona, employs people to assess the value of homes for the purpose of establishing real estate tax. The city manager sends each assessor to the same five homes and ten compares the results. The information is given below, in thousands of dollars. Can we conclude that there is a difference in the assessors, at a significance level of 0.05?

Home

Smith

Norman

Thomas

Holiday

A

53

55

49

45

B

50

51

52

53

C

48

52

47

53

D

70

68

65

64

E

84

89

92

86

Part A: Assessor
a. What is the null hypothesis statement for this problem?
b. What is the alternative hypothesis statement for this problem?
c. What is alpha for this analysis?
d. What is the most appropriate test for this problem? (choose one of the following)
- F-Test Two-Sample for Variance
- Anova: Single Factor
- Anova: Two-Factor with Replication
- Anova: Two-Factor without Replication
e. What is the value of the test statistic for the most appropriate analysis?
f. What is the value of the critical value for the most appropriate analysis?
g. Is it reasonable to conclude that there is a difference in the mean home value based on the assessor? (choose one of the following)
• Yes
• No
h. What is the p-value for this analysis?

Part B: Home

a. What is the null hypothesis statement for this problem?
b. What is the alternative hypothesis statement for this problem?
c. What is alpha for this analysis?
d. What is the most appropriate test for this problem? (choose one of the following)
• F-Test Two-Sample for Variance
• Anova: Single Factor
• Anova: Two-Factor with Replication
• Anova: Two-Factor without Replication
e. What is the value of the test statistic for the most appropriate analysis?
f. What is the value of the critical value for the most appropriate analysis?
g. Is it reasonable to conclude that there is a difference in the value based on the home? (choose one of the following)
• Yes
• No
h. What is the p-value for this analysis?

NOTE: In order to receive credit for this problem, you must show your work and you must use the appropriate formulas in Excel to answer the questions. Format all percentages as a percentage with 2 decimal places (i.e. 78.29%). Show all other numbers with 3 decimal places.

Stadium Instructions

A professional baseball team is in a bad slump (i.e. losing streak). Ticket sales are plummeting and the Director of Sports Operations is looking for ways to generate more revenue at the gate. Data from the last 30 weeks has been collected in terms of the number of fans in attendance (i.e. gate receipts). In addition to this attendance, data has been collected in terms of various sources of advertising (i.e. television, radio, newspapers, magazines, mail, and internet) and other forms promotion.

 

 

 

Baseball Data

 

 

 

Television
Advertising

Radio
Advertising

Newspaper
Advertising

Magazine
Advertising

Mail Postal
Advertising

Internet
Advertising

Sales
Promotions

Gate
Receipts

60,000

30,000

68,000

60,000

30,000

20,000

60,000

620,000

75,000

35,000

99,000

70,000

10,000

30,000

70,000

750,000

98,000

87,000

98,000

90,000

40,000

80,000

90,000

950,000

80,000

80,000

90,000

80,000

30,000

70 000

70,000

780,000

90,000

100,000

99,000

90,000

40,000

88,000

90,000

950,000

90,000

90,000

97,000

90,000

10,000

89 000

90,000

910,000

70,000

75,000

79,000

70,000

40,000

80 000

70,000

720,000

90,000

98,000

95,000

90,000

50,000

100,000

90,000

950,000

90,000

90,000

85,000

90,000

55 000

100,000

80,000

880,000

100,000

95,000

82,000

90,000

50,000

100,000

90,000

970,000

50,000

50,000

81,000

50,000

56,000

60 000

40,000

490,000

71,000

75,000

87,000

70,000

60,000

70 000

70,000

700,000

75,000

80,000

92,000

70,000

58,000

80 000

75,000

750,000

80,000

85,000

89,000

70,000

50,000

80 000

70,000

780,000

90,000

88,000

97,000

90,000

80,000

85,000

80,000

890,000

88,000

90,000

81,000

90,000

82,000

90,000

80,000

870,000

80,000

85,000

91,000

80,000

75,000

60,000

70,000

790,000

78,000

85,000

97,000

85,000

76,000

70 000

80,000

850,000

60,000

60,000

46,000

60,000

77,000

89,000

50,000

570,000

80,000

80,000

80,000

80,000

80,000

75,000

80,000

810,000

75,000

80,000

80,000

76,000

85 000

80 000

70,000

760,000

88,000

90,000

89,000

90,000

86,000

90,000

80,000

880,000

80,000

85,000

94,000

80,000

90,000

90,000

80,000

830,000

100,000

95,000

71,000

100,000

88,000

100,000

90,000

970,000

58,000

65,000

97,000

60,000

95,000

70,000

60,000

600,000

86,000

80,000

85,000

80,000

102,000

80 000

80,000

840,000

91,000

90,000

50,000

90,000

100,000

90 000

80,000

870,000

99,000

90,000

90,000

100,000

90,000

90,000

90,000

950,000

70,000

70,000

94,000

60,000

99 000

70,000

60,000

640,000

79,000

75,000

81,000

80,000

95 000

85,000

80,000

800,000

 

The director would like to know how the forms of advertising influences gate receipts. Use the most appropriate analysis to answer the following questions.

Part A: Test
a. What is the most appropriate test to answer the following questions? (choose one below)
• Anova: Single Factor
• Anova: Two-Factor with Replication
• Anova: Two-Factor Without Replication
• Correlation
• F-Test Two-Sample for Variance
• Regression
• t-Test: Paired Two Sample for Means
• t-Test: Two Sample Assuming Equal Variances
• t-Test: Two-Sample Assuming Unequal Variances
• z-Test: Two Sample for Means
b. What is the dependent attribute for this question? (choose one below)
• Television Advertising
• Radio Advertising
• Newspapers Advertising
• Magazines Advertising
• Mail Postal Advertising
• Internet Advertising
• Sales Promotion
• Gate Receipts

Part B: Global Test
Carry out a global test of hypothesis to verify if any of the regression coefficients are different than zero (i.e. significant).
c. If the null hypothesis is that all of the regression coefficients are equal to zero and the alternative hypothesis is that at least one regression coefficients is not equal to one, should the null hypothesis be rejected?
• Yes
• No

Part C: Individual Test
Carry out an individual test of hypothesis to verify if any of the regression coefficients are different than zero. From the individual test of the regression coefficients:
d. Is the Intercept statistically significant (i.e. not zero)?
e. Is Television Advertising statistically significant (i.e. not zero)?
f. Is Radio Advertising statistically significant (i.e. not zero)?
g. Is Newspaper Advertising statistically significant (i.e. not zero)?
h. Is Magazine Advertising statistically significant (i.e. not zero)?
i. Is Mail Postal Advertising statistically significant (i.e. not zero)?
j. Is Internet Advertising statistically significant (i.e. not zero)?
k. Is Sales Promotions statistically significant (i.e. not zero)?

Part D: Estimate

Based on your initial analysis, re-run the analysis if any regression coefficient was determined to be insignificant (i.e. equal to zero) in the previous part. In other words, remove all insignificant coefficients from your study (i.e. all of them at once, not individual as in step-wise regression). If you need to rearrange or modify the original dataset to conduct further analysis, please make a copy of the data and paste the data to the right of the original data. Do not disturb the locations of the "answer cells." If all coefficients were found to be statistically significant, there is no need to re-run the analysis.

Television
Advertising

Radio
Advertising

Newspaper
Advertising

Magazine
Advertising

Mail Postal
Advertising

Internet
Advertising

Sales
Promotion

$92,000

$42,000

$80,000

$73,000

$27,000

$78,000

$60,000

$61,000

$90,000

$83,000

$88,000

$91,000

$56,000

$51,000

$74,000

$70,000

$60,000

$61,000

$15,000

$85,000

$53,000

$68,000

$44,000

$55,000

$96,000

$73,000

$78,000

$84,000

1. Based on the attribute values in cells C24:124, make a forecast of Gate Receipts in cell J24.

2. Based on the attribute values in cells C25:125, make a forecast of Gate Receipts in cell J25.

3. Based on the attribute values in cells C26:126, make a forecast of Gate Receipts in cell J26.

4. Based on the attribute values in cells C27:127, make a forecast of Gate Receipts in cell J27.

NOTE: In order to receive credit for this problem, you must show your work and you must use the appropriate formulas in Excel to answer the questions. Format all percentages as a percentage with 2 decimal places (i.e. 78.29%). Show all other numbers with 3 decimal places.

The Fish House instructions

The Fish House ("TFH") sells fresh fish and seafood. TFH receives daily shipments of farm-raised trout from a local supplier. Each trout cost $2.45, which are sold for $3.95. To maintain its reputation for freshness, at the end of each day, TFH sells any leftover trout to a local pet food manufacturer for $1.25 each. Historically, the daily demand is summarized in the table below:

Demand

10

11

12

13

14

15

16

17

18

19

20

Probabilities

2%

6%

9%

11%

13%

15%

18%

11%

7%

5%

3%

The owner of TFH is, of course, interested in his margin, i.e., the sales to customers + sales to pet food manufacturer - purchase cost. Use a Payoff matrix to answer the following questions.

a. How many trout should be ordered each day by maximin?

b. How many trout should be ordered each day by maximax?

c. How many trout should be ordered each day by EMV?

d. How many trout should be ordered each day by minimax regret?

e. How many trout should be ordered each day by EOL?

f. What is the maximum TFH should be willing to pay for a forecast (i.e. EVPI)?

HINT: You will need to calculate a payoff matrix with 11 states of nature and 11 alternatives (it doesn't make sense to order fewer than 10 or more than 20). USE A FORMULA TO CALCULATE THESE 121 PAYOFF AMOUNTS (YOU WILL NEED THE MIN AND MAX FUNCTIONS). NOTE THAT SIX VALUES ARE PROVIDED TO CHECK. The number sold to customers paying $3.95 is the minimum of (1) the quantity ordered and (2) the demand. The number sold to the pet-food manufacturer is the maximum of (1) the quantity ordered minus the demand and (2) zero.

Facility Constr. Instructions

A company needs to decide whether their next manufacturing facility is to be SMALL, MEDIUM, or LARGE. Demand for the products manufactured at this new plant may subsequently turn out to be LOW, MODERATE, or HIGH. The estimated probabilities and the profit (in $10,000) for the various combinations of plant size and demand are given in the following tables:

DEMAND

                                                LOW                            MODERATE                                 HIGH

Probabilities                        0.20                          0.40                                0.40

 

 

PROFIT

 

PLANT SIZE

LOW

MODERATE

HIGH

SMALL

150

160

170

MEDIUM

100

200

220

LARGE

-60

80

320

Once demand is known, the company may attempt to improve profitability by advertising in situations where the size of the plant is larger than appropriate for demand. The parameters describing the various combinations possible are given in the following table:

Plant Size                     Demand                      Possible Results

Probability

Profit


 

 

Poor

0.40

90


Medium

Low


Good

0.60

130


 

 


 

 

Poor

0.20

-90


Large

Low

Good

0.50

60


 

 

Excellent

0.30

90


Large

Moderate

Poor

0.40

40


Good

0.60

100


 

 


Solve this problem by using a decision tree provided in the starting file.

HINT: On the spreadsheet you will find the structure of the decision tree provided. DO NOT RESIZE ROWS OR COLUMNS. Begin by labeling (place text in an appropriate cell) all branches with alternative names and event names and probabilities. Then specify the payoffs at the end of each final branch by entering the correct value in the cell to which to arrow is pointing. Then solve the decision tree by working backwards and labeling each alternative and event node with the correct values and indicate alternative branches not taken with double arrows (There are several double arrow objects in columns x, y and z which you can click and drag to appropriate location). Be sure to label the initial alternative node with the correct value.

Attachment:- HW-2.xlsx

Request for Solution File

Ask an Expert for Answer!!
Applied Statistics: Perform a test of hypothesis to determine whether the
Reference No:- TGS01280470

Expected delivery within 24 Hours