Chi-square distribution and standard deviation


1) The t and F Distributions

The T random variable with v degrees of freedom has the distribution

[Equation image: T and F distribution]

Show that the random variable F = T² has the F distribution with 1 and v degrees of freedom.
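A numerical sanity check can accompany the proof. The sketch below assumes the distribution referred to above is the standard Student t density with v degrees of freedom, and it uses the textbook F density; the function names are illustrative only. It compares the density of T² obtained by a change of variables, f_T(√x)/√x, with the F(1, v) density at a few points.

```python
import math

def t_pdf(t, v):
    """Student t density with v degrees of freedom (standard form, assumed here)."""
    log_c = math.lgamma((v + 1) / 2) - math.lgamma(v / 2) - 0.5 * math.log(v * math.pi)
    return math.exp(log_c - (v + 1) / 2 * math.log1p(t * t / v))

def f_pdf(x, d1, d2):
    """F density with d1 and d2 degrees of freedom (standard form)."""
    log_beta = math.lgamma(d1 / 2) + math.lgamma(d2 / 2) - math.lgamma((d1 + d2) / 2)
    log_pdf = ((d1 / 2) * math.log(d1 / d2)
               + (d1 / 2 - 1) * math.log(x)
               - ((d1 + d2) / 2) * math.log1p(d1 * x / d2)
               - log_beta)
    return math.exp(log_pdf)

def t_squared_pdf(x, v):
    """Density of T**2 at x > 0: both roots +sqrt(x) and -sqrt(x) contribute,
    giving 2 * f_T(sqrt(x)) * 1 / (2 * sqrt(x))."""
    r = math.sqrt(x)
    return t_pdf(r, v) / r

# Largest absolute difference over a small grid of v and x values.
max_gap = max(
    abs(t_squared_pdf(x, v) - f_pdf(x, 1, v))
    for v in (1, 2, 5, 30)
    for x in (0.1, 0.5, 1.0, 2.5, 7.0)
)
print(max_gap)
```

Agreement on a grid is not a proof, but it confirms that the change-of-variables step is the right route.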

2) An Unbiased Estimate of σ

The chi distribution models the square root of a chi-square random variable:

                        Y = √X,      X ~ χ²(v)

(a) Find the pdf of the chi distribution.

(b) We know that the sample variance from a normal distribution, suitably scaled, follows a chi-square distribution:

                        (n - 1)s²/σ² ~ χ²(n - 1)

Find the expected value of the sample standard deviation, s, and suggest an adjustment that would make it unbiased. You may find the following formulas useful:

[Equation image: expected value formulas]
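As a check on part (b), the following sketch assumes the standard result E[s] = c4(n)·σ with c4(n) = √(2/(n - 1)) · Γ(n/2) / Γ((n - 1)/2), so that s / c4(n) would be unbiased for σ. The name c4 is borrowed from quality-control usage and does not appear in the problem; the simulation settings are arbitrary.

```python
import math
import random

def c4(n):
    """Assumed bias factor: E[s] = c4(n) * sigma for normal samples of size n."""
    return math.sqrt(2.0 / (n - 1)) * math.exp(math.lgamma(n / 2) - math.lgamma((n - 1) / 2))

def mean_sample_sd(n, sigma, reps, seed=0):
    """Monte Carlo average of the sample standard deviation s (divisor n - 1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        s2 = sum((x - xbar) ** 2 for x in sample) / (n - 1)
        total += math.sqrt(s2)
    return total / reps

n, sigma = 5, 2.0
simulated = mean_sample_sd(n, sigma, reps=20000)
print("c4(n)            :", c4(n))
print("simulated E[s]   :", simulated)
print("predicted E[s]   :", c4(n) * sigma)
print("adjusted estimate:", simulated / c4(n))
```

The simulated mean of s falls below σ, and dividing by c4(n) brings it back close to σ.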

3) A Catch-and-Release Estimate

A park has N raccoons of which 10 were previously captured and tagged. Suppose that 20 raccoons are captured. Find the probability that n = 5 of these are found to be tagged. Denote this probability by p(N).

(a) Find the value of N that maximizes p(N); this is called a maximum likelihood estimate. Hint: compare the ratio p(N)/p(N -1) to unity.

(b) Plot the maximum likelihood estimate of N against varying values of n, from 1 to 10.
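A brute-force search is one way to check the answer to part (a) and to produce the values needed for part (b). This sketch assumes the hypergeometric model implied by the problem (10 tagged, 20 captured without replacement); the function names and the search limit of 1000 are illustrative choices. Exact fractions are used so that ties between neighbouring values of N are detected reliably, and ties are broken in favour of the larger N.

```python
from fractions import Fraction
from math import comb

TAGGED = 10     # raccoons tagged earlier
CAPTURED = 20   # raccoons in the second capture

def p(N, n, tagged=TAGGED, captured=CAPTURED):
    """Hypergeometric probability of n tagged animals among those captured."""
    return Fraction(comb(tagged, n) * comb(N - tagged, captured - n), comb(N, captured))

def mle_population(n, tagged=TAGGED, captured=CAPTURED, n_max=1000):
    """Largest N maximizing p(N); the search upper limit n_max is an assumption."""
    n_min = tagged + captured - n  # smallest population consistent with the data
    return max(range(n_min, n_max + 1), key=lambda N: (p(N, n, tagged, captured), N))

estimates = {n: mle_population(n) for n in range(1, 11)}
for n, N_hat in estimates.items():
    print(f"n = {n:2d}   N_hat = {N_hat}")
```

The printed pairs can be passed to any plotting tool for part (b). The ratio argument in the hint gives p(N)/p(N - 1) ≥ 1 exactly when N ≤ 10·20/n, which is why the search lands on the integer part of 200/n.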

4) Finite Population Correction Factor

For a finite population of size N with mean µ and variance σ², it can be shown that the covariance of any two observations in a sample is

[Equation image: covariance]

and that the sample variance is slightly biased

[Equation image: biased sample variance]

This means that the variance of the sample mean, when drawn from a finite population, includes a covariance term. This gives rise to the finite population correction factor. Use these relationships to show that the variance of the sample mean is

[Equation image: variance of the sample mean]

and that this is estimated by

[Equation image: estimate of the variance of the sample mean]
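The relationships in this problem can be checked exactly on a small population by enumerating every sample drawn without replacement. The sketch below assumes the standard textbook forms, with σ² defined using divisor N: Cov(x_i, x_j) = -σ²/(N - 1), E[s²] = Nσ²/(N - 1), Var(x̄) = (σ²/n)(N - n)/(N - 1), and the estimate (s²/n)(N - n)/N. The six-value population is made up for illustration.

```python
from fractions import Fraction
from itertools import combinations

population = [2, 3, 5, 7, 11, 13]   # hypothetical finite population
N = len(population)
n = 3                               # sample size, drawn without replacement

mu = Fraction(sum(population), N)
sigma2 = sum((x - mu) ** 2 for x in population) / N   # population variance, divisor N

samples = list(combinations(population, n))           # all equally likely samples
means = [Fraction(sum(s), n) for s in samples]

# Exact variance of the sample mean over all possible samples.
var_of_mean = sum((m - mu) ** 2 for m in means) / len(samples)

# Sample variance (divisor n - 1) for each sample, and its exact expectation.
s2_values = [sum((x - m) ** 2 for x in s) / (n - 1) for s, m in zip(samples, means)]
expected_s2 = sum(s2_values) / len(samples)

# Assumed closed forms with the finite population correction.
fpc_formula = sigma2 / n * Fraction(N - n, N - 1)
expected_estimate = sum(s2 / n * Fraction(N - n, N) for s2 in s2_values) / len(samples)

print(var_of_mean, fpc_formula)
print(expected_s2, sigma2 * N / (N - 1))
print(expected_estimate)
```

Because the arithmetic uses exact fractions, the enumerated and closed-form values can be compared for equality rather than approximate agreement.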

5) Zero-Intercept Regression

Some phenomena follow a linear model which has an inherent zero response for a zero predictor, so the candidate regression model is

y = βx + ε

(a) Find the ordinary least squares estimator for β by minimizing the sum of squared errors

SSE(β) = Σ (y_i - βx_i)²

(b) It seems reasonable that the variance around the regression line would increase as the predictor increases (think of the line swinging around the fixed "pivot" at the origin). If the error was normally distributed, this could give rise to the conditional distribution

[Equation image: conditional distribution]

Assuming this distribution, find the maximum likelihood estimator for β.

Explain why this is called a ratio estimator.

(c) For the following data, estimate β using both your OLS and MLE estimators.

x  0.5 1.5 3.2 4.2 5.1 6.5

y  1.3 3.4 6.7 8.0 10.0 13.2
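Part (c) can be sketched numerically as follows. The OLS estimator for the zero-intercept model is Σxy / Σx². For the MLE, this sketch assumes the conditional distribution is y | x ~ N(βx, σ²x), that is, a variance proportional to x; under that assumption the MLE reduces to Σy / Σx. If the intended variance model differs, the second estimator changes accordingly.

```python
x = [0.5, 1.5, 3.2, 4.2, 5.1, 6.5]
y = [1.3, 3.4, 6.7, 8.0, 10.0, 13.2]

# OLS through the origin: minimize sum((y_i - b * x_i)**2) over b.
beta_ols = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

# MLE under the ASSUMED model Var(y | x) = sigma^2 * x:
# the weighted score equation sum((y_i - b * x_i)) = 0 gives a ratio of totals.
beta_mle = sum(y) / sum(x)

print(f"OLS estimate: {beta_ols:.4f}")
print(f"MLE estimate: {beta_mle:.4f}")
```

Under the stated assumption the MLE equals ȳ / x̄, a ratio of two sample means.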

6) A Truncated Distribution

Students in several statistics classes were asked to complete a questionnaire. One of the quantities asked was the number of siblings a student had. This is a summary of the responses:

siblings    frequency
 0              4
 1             22
 2             22
 3             11
 4              8
 5              3
 6              3
12              1
20              1

6.1) Problem. Use the sibling data to estimate a distribution for the number of children in a family. Obviously, the number of children is one more than the number of siblings. However, there is a selection bias in this measurement; families with no children cannot be reported this way! Therefore the data follows a zero-truncated Poisson distribution (ZTPD)

P(K = k) = λ^k e^(-λ) / (k! (1 - e^(-λ))),   k = 1, 2, 3, ...

(a) Find the MLE for the parameter λ of the ZTPD.

(b) What are the mean and variance of the ZTPD? Rather than directly calculating the moments, you might find it simpler to use the probability generating function

                                                  G_K(z) = E[z^K]

and then take advantage of the properties of the pgf:

E[K] = G'_K(1),   Var[K] = G''_K(1) + G'_K(1) - [G'_K(1)]²

(c) Estimate the mean for the number of children and determine whether the data fits a ZTPD with that mean.
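The estimation step can be sketched numerically. This example assumes the standard zero-truncated Poisson form P(K = k) = λ^k e^(-λ) / (k! (1 - e^(-λ))) for k ≥ 1, for which the likelihood equation reduces to λ / (1 - e^(-λ)) = k̄, the sample mean of the number of children. That equation has no closed-form solution, so it is solved here by bisection; the helper names and the search bracket are illustrative.

```python
import math

sibling_counts = {0: 4, 1: 22, 2: 22, 3: 11, 4: 8, 5: 3, 6: 3, 12: 1, 20: 1}

# Number of children in the family = siblings + 1.
children_counts = {s + 1: f for s, f in sibling_counts.items()}
n_families = sum(children_counts.values())
mean_children = sum(k * f for k, f in children_counts.items()) / n_families

def ztpd_mean(lam):
    """Mean of the zero-truncated Poisson (assumed standard form)."""
    return lam / -math.expm1(-lam)   # lam / (1 - exp(-lam))

def solve_lambda(target_mean, lo=1e-9, hi=100.0, tol=1e-12):
    """Bisection on ztpd_mean(lam) = target_mean; the mean is increasing in lam."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if ztpd_mean(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

lam_hat = solve_lambda(mean_children)
mu = ztpd_mean(lam_hat)
variance = mu * (1 + lam_hat - mu)   # from E[K^2] = mu * (1 + lam)

print("families       :", n_families)
print("sample mean    :", mean_children)
print("lambda estimate:", lam_hat)
print("ZTPD variance  :", variance)
```

Comparing the fitted variance with the sample variance, or the fitted frequencies with the observed ones, is one way to approach the goodness-of-fit question in part (c).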

7) Estimating with Confidence

The times to failure (in hours) for a sample of n = 30 backup generators are

 7494.7    8801.7    9990.7   11277.7   10173.3    7746.8
 9003.6    8242.9    4532.2   12541.8    6766.9    9898.9
 8922.0   13429.8   17623.5    9135.6    6029.8    9038.7
20972.0    7605.1    5396.6    7528.2   10330.6    6475.4
12390.9    9857.0    7067.6    9704.2    5055.8    9942.4

(a) Find the mean time to failure and a 95% confidence interval (use a t distribution).

(b) Find a 95% confidence interval for the standard deviation (start with a chi-square interval for the variance).
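Both intervals can be sketched with the standard library alone. The critical values below are rounded entries from standard t and chi-square tables for 29 degrees of freedom, hard-coded as an assumption rather than computed; a statistics package could supply them to more digits.

```python
import math
import statistics

times = [
    7494.7, 8801.7, 9990.7, 11277.7, 10173.3, 7746.8,
    9003.6, 8242.9, 4532.2, 12541.8, 6766.9, 9898.9,
    8922.0, 13429.8, 17623.5, 9135.6, 6029.8, 9038.7,
    20972.0, 7605.1, 5396.6, 7528.2, 10330.6, 6475.4,
    12390.9, 9857.0, 7067.6, 9704.2, 5055.8, 9942.4,
]

n = len(times)
mean = statistics.mean(times)
s = statistics.stdev(times)          # sample standard deviation, divisor n - 1

# Table values for 29 degrees of freedom (rounded; assumed, not computed here).
T_975 = 2.045        # t quantile at 0.975
CHI2_025 = 16.047    # chi-square quantile at 0.025
CHI2_975 = 45.722    # chi-square quantile at 0.975

# (a) t interval for the mean.
half_width = T_975 * s / math.sqrt(n)
ci_mean = (mean - half_width, mean + half_width)

# (b) chi-square interval for the variance, then square roots for the SD.
ci_var = ((n - 1) * s ** 2 / CHI2_975, (n - 1) * s ** 2 / CHI2_025)
ci_sd = tuple(math.sqrt(v) for v in ci_var)

print("mean:", mean, "95% CI:", ci_mean)
print("sd  :", s, "95% CI:", ci_sd)
```

The interval for the standard deviation is not symmetric about s, because the chi-square distribution is skewed.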
