1. Let $Y_1, Y_2, \ldots, Y_n$ be $n$ i.i.d. random variables with common mean $\mu$ and common variance $\sigma^2$. Let $\bar{Y}$ denote the sample average.

(a) Define the class of linear estimators of $\mu$ by

$W_a = a_1 Y_1 + a_2 Y_2 + \cdots + a_n Y_n$

where the $a_i$ are constants. What restriction on the $a_i$ is needed for $W_a$ to be an unbiased estimator of $\mu$?

(b) Find $\mathrm{Var}(W_a)$.

(c) For any numbers $a_1, a_2, \ldots, a_n$ the following inequality holds: $(a_1 + a_2 + \cdots + a_n)^2 / n \le a_1^2 + a_2^2 + \cdots + a_n^2$. Use this, along with parts (a) and (b), to show that $\mathrm{Var}(W_a) \ge \mathrm{Var}(\bar{Y})$ whenever $W_a$ is unbiased, so that $\bar{Y}$ is the best linear unbiased estimator. [Hint: What does the inequality become when the $a_i$ satisfy the restriction from part (a)?]
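
One way to organize the argument in parts (a) to (c) is sketched below. It is an outline under the i.i.d. assumptions stated in the problem, not a complete written solution; each line corresponds to one part.

```latex
\begin{align*}
E(W_a) &= \sum_{i=1}^{n} a_i E(Y_i) = \mu \sum_{i=1}^{n} a_i
  \quad\Rightarrow\quad W_a \text{ is unbiased for every } \mu \iff \sum_{i=1}^{n} a_i = 1, \\
\mathrm{Var}(W_a) &= \sum_{i=1}^{n} a_i^2 \,\mathrm{Var}(Y_i) = \sigma^2 \sum_{i=1}^{n} a_i^2
  \quad \text{(independence removes the covariance terms)}, \\
\sum_{i=1}^{n} a_i = 1 &\;\Rightarrow\; \frac{1}{n} \le \sum_{i=1}^{n} a_i^2
  \;\Rightarrow\; \mathrm{Var}(W_a) = \sigma^2 \sum_{i=1}^{n} a_i^2 \ge \frac{\sigma^2}{n} = \mathrm{Var}(\bar{Y}).
\end{align*}
```

Equality holds when every $a_i = 1/n$, which is the sample average itself.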

2. For positive random variables X and Y, suppose the expected value of Y given X is E(Y|X) = θX. The unknown parameter θ shows how the expected value of Y changes with X.

(a) Define the random variable Z = Y/X. Show that E(Z) = θ. [Hint: Use the law of iterated expectations. In particular, first show that E(Z|X) = θ and then use the law of iterated expectations].

(b) Use part (a) to prove that the estimator $W_1 = n^{-1} \sum_{i=1}^{n} (Y_i/X_i)$ is unbiased for θ, where $\{(X_i, Y_i): i = 1, 2, \ldots, n\}$ is a random sample.

(c) Explain why the estimator $W_2 = \bar{Y}/\bar{X}$, where the overbars denote sample averages, is not the same as $W_1$. Nevertheless, show that $W_2$ is also unbiased for θ.

(d) The following table contains data on corn yields for several counties in Iowa. The USDA predicts the number of hectares of corn in each county based on satellite photos. Researchers count the number of "pixels" of corn in the satellite picture (as opposed to, for example, the number of pixels of soybeans or of uncultivated land) and use these to predict the actual number of hectares. To develop a prediction equation to be used for counties in general, the USDA surveyed farmers in selected counties to obtain corn yields in hectares. Let $Y_i$ equal corn yield in county $i$ and let $X_i$ equal the number of corn pixels in the satellite picture for county $i$. There are $n = 17$ observations for eight counties. Use this sample and Stata to compute the estimates of θ devised in parts (b) and (c). To input the data into Stata, start with an empty dataset and use the edit command. To estimate $W_1$, it would be helpful to generate a new variable that equals Yield/Pixels for each observation.

Plot    Corn Yield    Corn Pixels
1       165.76        374
2       96.32         209
3       76.08         253
4       185.35        432
5       116.43        367
6       162.08        361
7       152.04        288
8       161.75        369
9       92.88         206
10      149.94        316
11      64.75         145
12      127.07        355
13      133.55        295
14      77.70         223
15      206.39        459
16      108.33        290
17      118.17        307

(e) Compute the variance of $\text{Yield}_i/\text{Pixels}_i$ "manually", by the following steps:

i. Compute $(\text{Yield}_i/\text{Pixels}_i - \overline{\text{Yield}/\text{Pixels}})^2$ for each observation, where the overbar denotes the sample average of the ratio.

ii. Sum these squared deviations and divide by $n-1$. (Hint: to calculate the total of a variable, you can calculate the mean and multiply by the number of observations.)

(f) Compute the variance using the summarize command and confirm they are the same.
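
The computations in parts (d) through (f) can be cross-checked outside Stata. The sketch below uses Python rather than the Stata commands the assignment asks for, so it is only a sanity check on the numbers; the variable names (`data`, `ratios`, `w1`, `w2`) are illustrative assumptions, and the data pairs are copied from the table above.

```python
import statistics

# (corn yield, corn pixels) for the 17 plots in the table above
data = [
    (165.76, 374), (96.32, 209), (76.08, 253), (185.35, 432),
    (116.43, 367), (162.08, 361), (152.04, 288), (161.75, 369),
    (92.88, 206), (149.94, 316), (64.75, 145), (127.07, 355),
    (133.55, 295), (77.70, 223), (206.39, 459), (108.33, 290),
    (118.17, 307),
]

n = len(data)
ratios = [y / x for y, x in data]  # Yield_i / Pixels_i for each observation

# W1: average of the ratios
w1 = sum(ratios) / n

# W2: ratio of the averages (the 1/n factors cancel, leaving a ratio of sums)
w2 = sum(y for y, _ in data) / sum(x for _, x in data)

# Manual sample variance of the ratios: sum of squared deviations over n - 1
squared_deviations = [(r - w1) ** 2 for r in ratios]
var_manual = sum(squared_deviations) / (n - 1)

# Library cross-check; statistics.variance also divides by n - 1
var_library = statistics.variance(ratios)

print(f"W1 = {w1:.4f}, W2 = {w2:.4f}")
print(f"manual variance = {var_manual:.6f}, library variance = {var_library:.6f}")
```

Stata's summarize command also reports the sample standard deviation with the $n-1$ divisor, so its square should agree with both variance figures up to rounding.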

3. Let $Y$ denote a Bernoulli(θ) random variable with 0 < θ < 1. Suppose we are interested in estimating the odds ratio, γ = θ/(1 - θ), which is the probability of success over the probability of failure. Given a random sample $\{Y_1, \ldots, Y_n\}$, we know that an unbiased and consistent estimator of θ is $\bar{Y}$, the proportion of successes in $n$ trials. A natural estimator of γ is $G = \bar{Y}/(1 - \bar{Y})$, the proportion of successes over the proportion of failures in the sample.

(a) Why is G not an unbiased estimator of γ?

(b) Use the properties of plim to show that G is a consistent estimator of γ.
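
A small numerical illustration of both parts, offered as a sketch rather than a proof: the function below computes the expected value of G exactly from the binomial distribution. Because G is undefined when every trial is a success, the calculation conditions on at least one failure; that conditioning, the function name `expected_g`, and the choice θ = 0.5 are assumptions made only for this example.

```python
from math import comb


def expected_g(n, theta):
    """Exact E[G | at least one failure], where G = k / (n - k) and
    k ~ Binomial(n, theta) is the number of successes in n trials."""
    weighted_sum = 0.0
    total_prob = 0.0
    for k in range(n):  # k = n is excluded because G would divide by zero
        prob = comb(n, k) * theta**k * (1 - theta) ** (n - k)
        weighted_sum += prob * (k / (n - k))
        total_prob += prob
    return weighted_sum / total_prob


theta = 0.5
gamma = theta / (1 - theta)  # true odds, equal to 1 here

for n in (10, 100, 1000):
    print(n, expected_g(n, theta), gamma)
```

For small n the expected value sits above γ, consistent with G being a nonlinear (convex) function of the sample proportion, and the gap shrinks as n grows, which is the behavior the plim argument in part (b) formalizes.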

4. You are hired by the governor to study whether a tax on liquor has decreased average liquor consumption in your state. You are able to obtain, for a sample of individuals selected at random, the difference in liquor consumption (in ounces) for the years before and after the tax. For person $i$ who is sampled randomly from the population, $Y_i$ denotes the change in liquor consumption. Treat these as a random sample from a $\mathrm{Normal}(\mu, \sigma^2)$ distribution.

(a) The null hypothesis is that there was no change in average liquor consumption. State this formally in terms of µ.

(b) The alternative is that there was a decline in liquor consumption; state the alternative in terms of µ.

(c) Now, suppose your sample size is $n = 900$ and you obtain the estimates $\bar{y} = -32.8$ and $s = 466.4$. Calculate the t statistic for testing $H_0$ against $H_1$; obtain the p-value for the test (use the standard normal distribution table attached below). Do you reject $H_0$ at the 5% level? At the 1% level?

(d) Would you say that the estimated fall in consumption is large in magnitude? Comment on the practical versus statistical significance of this estimate.

(e) What has been implicitly assumed in your analysis about other determinants of liquor consumption over the two-year period in order to infer causality from the tax change to liquor consumption?
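
The arithmetic in part (c) can be checked with a few lines of Python. This is a sketch that takes the null value of µ to be zero (no change in average consumption) and uses the standard normal approximation the problem suggests; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n = 900        # sample size given in part (c)
ybar = -32.8   # sample mean change in consumption (ounces)
s = 466.4      # sample standard deviation

standard_error = s / sqrt(n)
t_stat = (ybar - 0) / standard_error  # null value of mu assumed to be 0

# One-sided test against a decline: the p-value is the lower-tail probability
p_value = NormalDist().cdf(t_stat)

reject_at_5 = p_value < 0.05
reject_at_1 = p_value < 0.01

print(f"t = {t_stat:.3f}, one-sided p-value = {p_value:.4f}")
print(f"reject at 5%: {reject_at_5}, reject at 1%: {reject_at_1}")
```

The same t statistic follows by hand from $\bar{y} / (s/\sqrt{n})$, since $\sqrt{900} = 30$.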

5. (Based on an old exam question) You are asked to study the relationship between maternal smoking and low birthweight. You have a Stata dataset of babies' birth weights and whether the mother smoked during pregnancy. Let $Y_i$ be a binary variable that equals 1 if a baby is born with low birthweight. Unless otherwise indicated, assume that $\{Y_1, Y_2, \ldots, Y_n\}$ are independent and identically distributed. Use the dataset bwght2.dta for this question.

(a) Use Stata to compute the mean of $Y_i$ for mothers who didn't smoke. Using only the mean and number of observations, show how you can compute the sample standard deviation.

(b) Your estimate for the proportion of babies with low birthweight is $\bar{Y} = .014$. Provide an estimate for the variance of $\bar{Y}$.

(c) Suppose you want to test the null hypothesis that the proportion of babies born with low birthweight in this population is greater than or equal to .02. Conduct a one-sided test at the 5% significance level manually by constructing the following:

i. The test statistic

ii. The distribution of the test statistic under the null. Explain why you do not need to know the distribution of $\bar{Y}$ in order to know this distribution of the test statistic. What feature(s) of the setup make it possible to know this distribution?

iii. The rejection rule

iv. The outcome of the test

(d) What is the p-value for this test? Use the standard normal distribution table below.

(e) Confirm the above test results using the built-in Stata command (Hint: to perform a t test, use the ttest command).

(f) Now use Stata to compute the mean of $Y_i$ for mothers who smoked. Test whether mothers who smoke have a different incidence of low birthweight than mothers who don't smoke. Conduct a two-sided test at the 5% significance level manually by constructing the following:

i. The null hypothesis

ii. The test statistic

iii. The distribution of the test statistic under the null

iv. The rejection rule

v. The outcome of the test

(g) Confirm the above test results using the built-in Stata command (Hint: to perform a t test, use the ttest command).

(h) Construct a 95% confidence interval for the difference in means tested in part (f). Explain what this interval represents.
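
For parts (a) through (c), the key fact is that a 0/1 variable has sum of squared deviations equal to $n\bar{Y}(1-\bar{Y})$, so the sample standard deviation follows from the mean and $n$ alone. The sketch below checks this in Python on a made-up sample: the sample size of 1000 is purely hypothetical (the actual $n$ must come from bwght2.dta), chosen so that the mean equals the .014 quoted in part (b), and the function name `bernoulli_sample_sd` is an assumption for illustration.

```python
from math import sqrt
from statistics import NormalDist, stdev


def bernoulli_sample_sd(ybar, n):
    """Sample standard deviation (n - 1 divisor) of a 0/1 variable,
    computed from its mean and the number of observations only."""
    return sqrt(n / (n - 1) * ybar * (1 - ybar))


# HYPOTHETICAL data: 14 low-birthweight babies out of 1000, so ybar = 0.014
ys = [1] * 14 + [0] * 986
n = len(ys)
ybar = sum(ys) / n

sd_from_formula = bernoulli_sample_sd(ybar, n)
sd_direct = stdev(ys)  # direct calculation for comparison

# One-sided test of H0: theta >= 0.02 against H1: theta < 0.02
standard_error = sd_from_formula / sqrt(n)
t_stat = (ybar - 0.02) / standard_error
critical_value = NormalDist().inv_cdf(0.05)  # lower-tail 5% cutoff
reject = t_stat < critical_value

print(f"sd (formula) = {sd_from_formula:.6f}, sd (direct) = {sd_direct:.6f}")
print(f"t = {t_stat:.3f}, critical value = {critical_value:.3f}, reject: {reject}")
```

The test outcome printed here depends entirely on the hypothetical sample size and should not be read as the answer for the actual dataset.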

Attachment:- Assignment.rar
