Compute the power of the acceptance region - a randomly


Question 1: The null hypothesis in that setup is no endowment effect; that is, H0: p = 0.5.

(a) Suppose you decided, in advance of seeing the data, to set the acceptance region to [0.45, 0.55]. Also, you've decided to use the sample proportion, p̂ = Ȳ = (Y1 + ... + Yn)/n, as your estimator. Calculate the probability of making a type I error.

(b) A type I error is like convicting an innocent person. The more you lean toward "innocent until proven guilty" the more you prefer a lower probability of making type I errors. Suppose you take that view with the endowment effect: The idea that there's no endowment effect (that's the null) is held to be true (innocent) until the evidence is overwhelmingly against it. How can you change the acceptance region so that you're less likely to make a type I error (convict an innocent hypothesis)? Give the lower and upper bounds of the acceptance region that gives a type I error probability of 1 percent.

(c) Suppose that, unknown to you, the real value of p is 0.4. This means the null hypothesis is false. Calculate the probability of making a type II error.

(d) A type II error is like setting free a guilty person. The more averse to that you are, the more you prefer a small type II error probability. How can you change the acceptance region in (a) so that you're less likely to make a type II error (set free a guilty hypothesis)? If you keep the acceptance region symmetric around the hypothesized mean, and without any "holes" in it, what's the smallest you can make the type II error probability? What would the type I error probability be in this case?

(e) Compute the power of the acceptance region in (a).

(f) Another way, besides changing the acceptance region, to lower the chance of a type I error is to increase the sample size. How large would the sample size have to be (the minimum) for the acceptance region in (a) to have a type I error probability of no more than 1 percent? Using the alternative p = 0.4 from (c), what is the probability of a type II error for this new sample size and using the acceptance region from (a)?
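
For Question 1, one way to check the error probabilities numerically is the normal approximation to the sample proportion. The sketch below is an illustration, not an official solution. The sample size n = 100 is a hypothetical placeholder (the setup that fixes n is not reproduced here), the helper name accept_prob is made up for this example, and Python is used only for convenience.

```python
from math import ceil, sqrt
from statistics import NormalDist


def accept_prob(p_true, n, lo=0.45, hi=0.55):
    """Approximate Pr(lo <= p_hat <= hi) when the true proportion is p_true."""
    se = sqrt(p_true * (1 - p_true) / n)    # standard deviation of p_hat
    dist = NormalDist(mu=p_true, sigma=se)  # normal (CLT) approximation to p_hat
    return dist.cdf(hi) - dist.cdf(lo)


n = 100  # HYPOTHETICAL sample size; replace with the n from the actual setup

type_1 = 1 - accept_prob(0.5, n)  # (a): reject H0 although p = 0.5
type_2 = accept_prob(0.4, n)      # (c): accept H0 although p = 0.4
power = 1 - type_2                # (e): power against p = 0.4

# (f): smallest n with type I error <= 1% for the region [0.45, 0.55].
# Under H0 the half-width 0.05 must cover z standard deviations of p_hat.
z = NormalDist().inv_cdf(0.995)
n_min = ceil((z * 0.5 / 0.05) ** 2)

print(f"type I = {type_1:.4f}, type II = {type_2:.4f}, power = {power:.4f}")
print(f"minimum n for (f) under the normal approximation: {n_min}")
```

Only the last step is independent of the placeholder n. An exact binomial calculation could give slightly different numbers than this approximation.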

Question 2. How large should your sample be? There is an event in Utah for which it is important to know what proportion of people attending the event are from other states or countries. Call this proportion p. There are too many attendees, and the venue too dispersed, to "sample" all the attendees. In fact, for this event, it's particularly costly to survey people. Instead, you will arrange to ask a random sample of attendees whether they're from outside Utah (1 = yes, 0 = no). The event organizers want the margin of error to be no more than 3 percent, with 95 percent confidence. What that means is that they want: Pr(-0.03 ≤ p̂ - p ≤ 0.03) ≥ 0.95. Figure out the smallest sample size, n, that will meet this requirement. Show that this minimum n depends on p: For some values of p the requirement could be met with a smaller sample size than for others. The trouble, of course, is that you don't know p. Still, you can give a lower bound. Do that: give the smallest sample size that guarantees you'll meet or exceed the requirement no matter the true p.
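
The dependence of the minimum n on p in Question 2 can be sketched numerically. The example below assumes the normal approximation to the sample proportion, so the requirement becomes z * sqrt(p(1 - p)/n) ≤ 0.03 with z the 97.5th percentile of the standard normal. The function name min_sample_size is hypothetical, and this is a sketch under that assumption rather than the official solution.

```python
from math import ceil
from statistics import NormalDist


def min_sample_size(p, margin=0.03, confidence=0.95):
    """Smallest n with z * sqrt(p(1-p)/n) <= margin (normal approximation)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95 percent
    return ceil(z ** 2 * p * (1 - p) / margin ** 2)


# p(1 - p) peaks at p = 0.5, so that value gives the worst case.
for p in (0.1, 0.3, 0.5):
    print(p, min_sample_size(p))
```

Because p(1 - p) is largest at p = 0.5, the sample size computed there is the one that covers every possible p.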

Question 3. A case of discrimination? To investigate the possibility of gender discrimination in a firm, a sample of 100 men and 64 women with similar job descriptions are selected at random. A summary of the resulting monthly salaries follows:

 

         Average Salary (Ȳ)   Standard Deviation (sY)   n
Men      $3100                $200                      100
Women    $2900                $320                      64

(a) What do these data suggest about wage differences in the firm? Do they represent statistically significant evidence that average wages of men and women are different? Statistical significance does not imply that the effect is meaningfully large. An effect can be statistically significant but tiny. If there is a statistically significant difference in salaries, does this difference seem large to you? (To answer this question, first state the null and alternative hypotheses; second, compute the relevant t-statistic; third, compute the p-value associated with the t-statistic; and finally, use the p-value to answer the question.)

(b) Do these data suggest to you that the firm is guilty of gender discrimination in its compensation policies? Explain. If there is something lacking, what is it? If there is not, why is this evidence sufficient?
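
The steps listed in Question 3(a) can be sketched as follows. The example uses only the summary statistics from the table and assumes a two-sided test of equal population means with a large-sample normal approximation for the p-value. The variable names are illustrative. It does not address part (b), which is a question of interpretation.

```python
from math import sqrt
from statistics import NormalDist

# Summary statistics from the table above.
mean_m, sd_m, n_m = 3100.0, 200.0, 100  # men
mean_w, sd_w, n_w = 2900.0, 320.0, 64   # women

diff = mean_m - mean_w                        # estimated difference in means
se = sqrt(sd_m ** 2 / n_m + sd_w ** 2 / n_w)  # standard error, unequal variances
t_stat = diff / se                            # t-statistic for H0: no difference

# Two-sided p-value from the standard normal (large-sample approximation).
p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))

# 95 percent confidence interval for the difference in means.
z = NormalDist().inv_cdf(0.975)
ci = (diff - z * se, diff + z * se)

print(f"diff = {diff:.0f}, SE = {se:.2f}, t = {t_stat:.2f}, p = {p_value:.2g}")
print(f"95% CI: ({ci[0]:.1f}, {ci[1]:.1f})")
```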

Question 4. A professor decides to run an experiment to measure the effect of time pressure on final exam scores. He gives each of the 400 students in his course the same final exam, but some students have 90 minutes to complete the exam while others have 120 minutes. Each student is randomly assigned one of the exam times, based on the flip of a coin. Let Yi denote the number of points scored on the exam by the ith student (0 ≤ Yi ≤ 100), let Xi denote the amount of time that the student has to complete the exam (Xi = 90 or 120), and consider the regression model Yi = β0 + β1Xi + ui.

(a) Explain what the term ui means. Why will different students have different values of ui ?

(b) Explain why E(ui|Xi) = 0 for this regression model.

(c) Are the other least-squares assumptions (see Key Concept 4.3) satisfied? Explain.

(d) The estimated regression is Ŷi = 49 + 0.24Xi.

i. Compute the estimated regression's prediction for the average score of students given 90 minutes to complete the exam. Repeat for 120 minutes and 150 minutes.

ii. Compute the estimated gain in score for a student who is given an additional 10 minutes on the exam.
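
The arithmetic in Question 4(d) amounts to plugging values into the fitted line. A minimal sketch, using only the coefficients stated in (d), with a hypothetical helper named predict:

```python
def predict(minutes):
    """Predicted exam score from the fitted line Y_hat = 49 + 0.24 * X."""
    return 49 + 0.24 * minutes


for minutes in (90, 120, 150):
    print(minutes, round(predict(minutes), 1))

# The gain from 10 extra minutes is the slope times 10, whatever the start.
gain_10 = predict(100) - predict(90)
print(round(gain_10, 1))
```

Note that 150 minutes lies outside the exam times used in the experiment (90 and 120), so that prediction is an extrapolation of the fitted line.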

Question 5. (a) Show that the first least-squares assumption, E(ui|Xi) = 0, implies E(ui) = 0.

(b) Show that E(ui|Xi) = 0 implies that E(Yi|Xi) = β0 + β1Xi.

Question 6. Earnings and Height. In this exercise you will investigate the relationship between a person's earnings and their height. The data you will use, Earnings_and_Height.csv, contains data on earnings, height, and other characteristics of a random sample of U.S. workers. This dataset was used in the research behind the peer-reviewed publication "Stature and Status: Height, Ability, and Labor Market Outcomes," by Anne Case and Christina Paxson (Journal of Political Economy, 2008, 116(3): 499-532). Both the data and data description files are on Canvas.

(a) What is the median value of height in the sample?

(b) i. Estimate average earnings for workers whose height is at most 67 inches.

ii. Estimate average annual earnings for workers whose height is greater than 67 inches.

iii. On average, do taller workers earn more than shorter workers? How much more? What is a 95% confidence interval for the difference in average earnings?

(c) Construct a scatter plot of annual earnings (earnings) on height (height). Notice that the points on the plot fall along horizontal lines (there are only 23 distinct values of earnings). Why? (Hint: Carefully read the detailed data description.)

(d) Run a regression of earnings on height.

i. What is the estimated slope?

ii. Use the estimated regression to predict earnings for a worker who is 67 inches tall, for a worker who is 70 inches tall, and for a worker who is 65 inches tall.

(e) Suppose height were measured in centimeters instead of inches. Answer the following questions about the earnings on height regression.

i. What is the estimated slope of the regression?

ii. What is the estimated intercept?

iii. What is the R²?

iv. What is the standard error of the regression?
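
The effect of a change of units in Question 6(e) can be explored on toy data before running the real regression. The sketch below fits ordinary least squares by hand on a small HYPOTHETICAL dataset (the numbers are invented for illustration and are not from Earnings_and_Height.csv), first with height in inches and then in centimeters (1 inch = 2.54 cm). The helper name ols is also made up.

```python
from math import sqrt


def ols(x, y):
    """Return (slope, intercept, R^2, standard error of the regression)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    ssr = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    tss = sum((yi - ybar) ** 2 for yi in y)
    return slope, intercept, 1 - ssr / tss, sqrt(ssr / (n - 2))


# HYPOTHETICAL toy data, not the course dataset.
height_in = [60, 63, 65, 67, 70, 72, 74]
earnings = [30000, 41000, 38000, 47000, 52000, 49000, 61000]
height_cm = [2.54 * h for h in height_in]

fit_in = ols(height_in, earnings)
fit_cm = ols(height_cm, earnings)

for name, a, b in zip(("slope", "intercept", "R^2", "SER"), fit_in, fit_cm):
    print(f"{name:9s} inches: {a:12.4f}   cm: {b:12.4f}")
```

Comparing the two columns shows which quantities rescale with the units of height and which do not.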

(f) Run a regression of earnings on height, using data for female workers only.

i. What is the estimated slope?

ii. A randomly selected woman is 1 inch taller than the average woman in the sample. Would you predict her earnings to be higher or lower than the average earnings for women in the sample? By how much?

(g) Repeat (f) for male workers.

(h) Do you think that height is uncorrelated with other factors that cause earnings? That is, do you think that the regression error term, say ui , has a conditional mean of zero, given height (Xi )? (We will investigate this important issue in more depth later.)

Question 7. Wage earnings and educational attainment. Use R to answer the following questions. Submit your answers on paper, and submit the R code you used to get those answers to Canvas. You might find it helpful to look at the sidebar "The Gender Gap of Earnings of College Graduates in the United States" on pages 86-87.

The file cps92_12.csv contains data from the Current Population Survey, a monthly survey administered by the U.S. Census Bureau and the U.S. Bureau of Labor Statistics. It contains data for years 1992 and 2012, for workers age 25-34, and for workers with either a high school diploma or B.A./B.S. as their highest degree. A more detailed description of the data is given in the file CPS92_12_Description.pdf on Canvas. Use these data to answer the following questions.

(a)

i. Compute the sample mean for average hourly earnings (ahe) in 1992 and 2012.

ii. Compute the sample standard deviation for ahe in 1992 and 2012.

iii. Construct a 95% confidence interval for the population means of ahe in 1992 and 2012.

iv. Construct a 95% confidence interval for the change in the population mean of ahe between 1992 and 2012.

(b) In 2012, the value of the Consumer Price Index (CPI) was 229.6. In 1992, the value of the CPI was 140.3. Repeat (a) but use ahe measured in real 2012 dollars ($2012); that is, adjust the 1992 data for the price inflation that occurred between 1992 and 2012.
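
The inflation adjustment in (b) is a rescaling by the ratio of the two CPI values. A minimal sketch of that arithmetic follows. It is written in Python for illustration only (the assignment asks for R, where the same multiplication applies), and the $10.00 wage is a HYPOTHETICAL example value, not a number from cps92_12.csv.

```python
CPI_1992 = 140.3
CPI_2012 = 229.6


def to_2012_dollars(ahe_1992):
    """Convert a 1992 hourly wage to 2012 dollars using the CPI ratio."""
    return ahe_1992 * CPI_2012 / CPI_1992


factor = CPI_2012 / CPI_1992
print(f"adjustment factor: {factor:.4f}")

# HYPOTHETICAL wage, used only to show the conversion.
print(f"$10.00 in 1992 is about ${to_2012_dollars(10.0):.2f} in 2012 dollars")
```

Because every 1992 observation is multiplied by the same constant, the 1992 sample mean, standard deviation, and confidence interval endpoints all scale by that factor.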

(c) If you were interested in the change in workers' purchasing power from 1992 to 2012, would you use the results from (a) or (b)? Explain.

(d) Using the data for 2012:

i. Construct a 95% confidence interval for the mean of ahe for high school graduates.

ii. Construct a 95% confidence interval for the mean of ahe for workers with a college degree.

iii. Construct a 95% confidence interval for the difference between the two means.

(e) Repeat (d) using the 1992 data expressed in $2012.

(f) Using the appropriate estimates, confidence intervals, and test statistics, answer the following questions:

i. Did real (inflation-adjusted) wages of high school graduates increase from 1992 to 2012?

ii. Did real wages of college graduates increase?

iii. Did the gap between earnings of college and high school graduates increase? Explain.
