Table 1 shows the number of applicants to graduate school


1. Table 1 shows the number of applicants to graduate school at Berkeley for the six largest departments in 1973 by gender and department. Table 2 shows the number of rejected applicants by gender and department. Recall the notation GM, GF, DA, DB, DC, DD, DE, and DF.

Table 1: The contingency table by gender and department (counting the number of applicants)

Table 2: The contingency table by gender and department (counting the number of rejected applicants)

a. By dividing the number of rejected applicants by the number of applicants, complete Table 3. Note that this table is not a contingency table; it summarizes the rejection rate for each subgroup. For instance, 0.379 is the rejection rate for male applicants who applied to Department A, which translates to P(Rejected | GM ∩ DA) = 0.379. Similarly, interpret P(Rejected | DA) = 0.356 as the rejection rate in Department A regardless of gender.

Table 3: The proportions of rejected applicants by gender and department

b.  From Table 3, report the conditional probability that an applicant was rejected among male applicants, namely P(Rejected | GM).

c.  From Table 3, report the conditional probability that an applicant was rejected among female applicants, namely P(Rejected | GF).

d.  Based on P(Rejected | GM) and P(Rejected | GF) only, which gender has a greater rejection rate?

e.  From Table 1, report the six ratios of conditional probabilities P(GM | DA) / P(GF | DA), ..., P(GM | DF) / P(GF | DF). Round to two decimal places.

f.  From Table 3, report the six conditional probabilities P(Rejected | DA), . . . , P(Rejected | DF ).

g.  Find the six ratios of conditional probabilities P(Rejected | GM ∩ DA) / P(Rejected | GF ∩ DA), ..., P(Rejected | GM ∩ DF) / P(Rejected | GF ∩ DF).

h.  In part g, what does a ratio greater than one imply? What does a ratio close to one imply?

i.  Do you still believe that there was gender discrimination in the graduate admissions? Provide your reason based on part g.

j.  Using the Law of Total Probability, write P(Rejected | GF) = 0.696 as a weighted average of six probabilities.

k.  Using the Law of Total Probability, write P(Rejected | GM) = 0.555 as a weighted average of six probabilities.

l.  In one sentence, explain why P(Rejected | GF) > P(Rejected | GM) happened.
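The weighted-average structure in parts j to l can be sketched in code. The numbers below are hypothetical two-department counts invented for illustration; they are not the Berkeley data, and the names `male`, `female`, and `total_probability` are assumptions of this sketch. It shows how one group can have a lower rejection rate in every department and still have a higher overall rejection rate, because the weights P(department | gender) differ.

```python
# Hypothetical counts (NOT the Berkeley data): department -> (applicants, rejected).
male = {"easy": (800, 240), "hard": (200, 160)}
female = {"easy": (200, 50), "hard": (800, 600)}


def total_probability(groups):
    """Return the overall rejection rate and the (dept, weight, rate) terms.

    Law of Total Probability:
    P(Rejected | G) = sum over departments of P(D | G) * P(Rejected | G and D).
    """
    n_total = sum(n for n, _ in groups.values())
    terms = []
    for dept, (n, rejected) in groups.items():
        weight = n / n_total   # P(D | G): share of this gender applying to dept
        rate = rejected / n    # P(Rejected | G and D): within-department rate
        terms.append((dept, weight, rate))
    overall = sum(weight * rate for _, weight, rate in terms)
    return overall, terms


male_rate, male_terms = total_probability(male)
female_rate, female_terms = total_probability(female)

for label, rate, terms in [("male", male_rate, male_terms),
                           ("female", female_rate, female_terms)]:
    pieces = " + ".join(f"{w:.2f}*{r:.2f}" for _, w, r in terms)
    print(f"P(Rejected | {label}) = {pieces} = {rate:.3f}")
```

In this toy example the female rate is lower within each department, yet the overall female rate is higher because most female applicants apply to the department with the high rejection rate.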

2. Suppose a company that produces fire alarms has claimed that its fire alarms produce only one false alarm per year, on average. Let X denote the number of false alarms per year. Assume X ∼ Poisson(λ). Under the company's claim, the probability of observing x false alarms per year is $P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!} = \frac{e^{-1}}{x!}$, $x = 0, 1, \ldots$. A customer had a bad experience with a fire alarm he purchased before. He wants to test H0: λ = 1 versus H1: λ > 1, so he purchased another fire alarm from the same company. He allows a 1% chance of falsely rejecting H0. After a year, he observed three false alarms.

a.  Find the p-value based on the single observation (three false alarms for the year).

b.  Draw a conclusion based on the p-value in part a (in the context of this problem without using any symbols).

c.  He gathered one hundred people who observed three or more false alarms and observed (X1, X2, ..., X100) = (4, 6, ..., 3) with $\bar{X}_{100} = \frac{1}{100} \sum_{i=1}^{100} X_i = 3.32$. Ignoring any flaw of data collection, calculate the test statistic (which is compared to the standard normal distribution) and the approximate p-value for testing H0: λ = 1 versus H1: λ > 1. (Hint: If we observe Poisson random variables, the population mean and the population variance are both equal to λ.)

d.  In two sentences, argue why the sample of size n = 100 is not useful for the hypothesis testing.
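The two calculations in parts a and c can be sketched as follows. This is a minimal sketch under the stated null hypothesis λ = 1; the function names are assumptions of this example, and the normal approximation uses the hint that the Poisson mean and variance both equal λ, so the standard deviation of the sample mean under H0 is sqrt(λ / n).

```python
import math


def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


def poisson_upper_tail(x_obs, lam):
    """P(X >= x_obs), the one-sided p-value for H1: lambda > lambda0."""
    return 1.0 - sum(poisson_pmf(k, lam) for k in range(x_obs))


def normal_upper_tail(z):
    """P(Z >= z) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


lam0 = 1.0  # value of lambda under H0

# Part a: a single observation of three false alarms.
p_value_single = poisson_upper_tail(3, lam0)

# Part c: n = 100 observations with sample mean 3.32 (values from the problem).
n, xbar = 100, 3.32
z = (xbar - lam0) / math.sqrt(lam0 / n)
p_value_approx = normal_upper_tail(z)

print(f"single-observation p-value: {p_value_single:.4f}")
print(f"z statistic: {z:.2f}, approximate p-value: {p_value_approx:.3g}")
```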

3. In lecture we discussed the association between gestational age X and birth weight Y. Here is a portion of the R output.

                Estimate   Std. Error   t value   Pr(>|t|)
    Intercept    -1410.7        155.8    -9.055    < 2e-16
    x              124.1          4.0    31.026    < 2e-16

We estimate the slope β1 as

$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X}_n)(Y_i - \bar{Y}_n)}{\sum_{i=1}^{n} (X_i - \bar{X}_n)^2}$$

and the intercept β0 as

$$\hat{\beta}_0 = \bar{Y}_n - \hat{\beta}_1 \bar{X}_n .$$

If we transform a random sample (X1, Y1), ..., (Xn, Yn) as

$$T = \frac{\hat{\beta}_1 - \beta_1}{SE}, \qquad SE = \sqrt{\frac{\frac{1}{n-2} \sum_{i=1}^{n} (Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i)^2}{\sum_{i=1}^{n} (X_i - \bar{X}_n)^2}},$$

the transformed random variable T follows the t distribution with n - 2 degrees of freedom, where β1 is the true slope under the linear model. In this exercise, our goal is to derive a 95% confidence interval (CI) for the unknown slope β1. In the dataset, we observed 2500 babies.

a.  Find the constant $t^*$ such that

$$P\left( -t^* \le \frac{\hat{\beta}_1 - \beta_1}{SE} \le t^* \right) = 0.95 . \qquad (1)$$

You need to use R. Round $t^*$ to three decimals.

b.  Using algebra inside the probability statement, we are able to rewrite Equation (1) as P(L ≤ β1 ≤ U) = 0.95 for some L and U. Since the true value of β1 is in (L, U) with probability 0.95 (if we take a random sample of size 2500 many times), the random interval (L, U) becomes a 95% CI for β1. Using algebra inside the probability statement of Equation (1), derive L and U in terms of $t^*$, SE, and $\hat{\beta}_1$. Do not insert any numeric value yet.

c.  In the R output, the estimated β1 is 124.1 and the calculated SE is 4.0. (SE, called the standard error, quantifies the uncertainty associated with our estimate $\hat{\beta}_1$.) Report the observed 95% CI for β1. Round to two decimal places.

d.  We have 103 students in Stats 67. Suppose all students collect a random sample of size 2500 from the same population, and each student constructs a 95% CI for β1 from his/her own data. What is the expected number of students whose interval will miss the true value of β1? (Hint: The number of students who miss the true value of β1 follows a binomial distribution.)
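The algebra requested in part b can be sketched as the chain of equivalent inequalities below. It assumes SE > 0, so multiplying through by SE preserves the direction of each inequality, and multiplying by -1 in the last step reverses it.

```latex
\begin{align*}
-t^* \le \frac{\hat{\beta}_1 - \beta_1}{SE} \le t^*
  &\iff -t^* \, SE \le \hat{\beta}_1 - \beta_1 \le t^* \, SE \\
  &\iff -\hat{\beta}_1 - t^* \, SE \le -\beta_1 \le -\hat{\beta}_1 + t^* \, SE \\
  &\iff \hat{\beta}_1 - t^* \, SE \le \beta_1 \le \hat{\beta}_1 + t^* \, SE
\end{align*}
```

Reading off the endpoints of the last line gives the general form of L and U as the estimate minus and plus $t^* \, SE$.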

4. Suppose we observe a random sample (X1, . . . , Xn) with n = 10, where Xi ∼ Bernoulli(p) and p is the proportion of black cars at UCI. Suppose we observed (x1, . . . , x10) = (1, 0, 0, 1, 0, 1, 0, 0, 1, 0).

a. Find the likelihood function L(p) given the ten binary observations.
b. Find the log-likelihood function l(p) = log L(p).
c. Take the first derivative of l(p) with respect to p.
d. Report the estimate of the population proportion of black cars based on the method of maximum likelihood estimation.
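The steps in parts a to d can be checked numerically. The sketch below is an illustration rather than the intended pencil-and-paper solution: it evaluates the Bernoulli log-likelihood for the ten observations on a grid of candidate values of p (the grid and the names `log_likelihood`, `p_hat_grid`, and `p_hat_closed_form` are assumptions of this example) and compares the grid maximizer with the sample proportion.

```python
import math

# The ten binary observations given in the problem.
observations = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]


def log_likelihood(p, xs):
    """Bernoulli log-likelihood: sum of x*log(p) + (1 - x)*log(1 - p)."""
    successes = sum(xs)
    failures = len(xs) - successes
    return successes * math.log(p) + failures * math.log(1.0 - p)


# Candidate values strictly inside (0, 1) so that both logarithms are defined.
grid = [i / 1000 for i in range(1, 1000)]
p_hat_grid = max(grid, key=lambda p: log_likelihood(p, observations))

# Setting the derivative of the log-likelihood to zero gives the sample proportion.
p_hat_closed_form = sum(observations) / len(observations)

print(f"grid maximizer: {p_hat_grid}, sample proportion: {p_hat_closed_form}")
```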
