Calculate the principal components of the first two flowers


Problem 1: Suppose that X1, X2, X3 are i.i.d. normal random variables with mean 0 and variance 1.

(a) Define a random vector X = 860_Define a random vector.png. What is the distribution of X?

(Note: characterizing a multivariate distribution means providing its name and its parameters; in this case the parameters are the mean vector and the variance-covariance matrix.)

(b) What is the distribution of $\bar{X} = \frac{1}{3}\sum_{i=1}^{3} X_i$?
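A quick simulation can serve as a sanity check on the answer to (b). The sketch below is only illustrative: the seed, the number of replications, and the variable names are arbitrary choices, not part of the assignment. It draws many triples of i.i.d. N(0, 1) variables and compares the sample mean and variance of their averages with the theoretical values 0 and 1/3.

```r
set.seed(1)                                # arbitrary seed, for reproducibility only
n_rep <- 20000                             # number of simulated triples (arbitrary)
x     <- matrix(rnorm(3 * n_rep), ncol = 3)  # each row is one draw of (X1, X2, X3)
xbar  <- rowMeans(x)                       # sample mean of each triple

# Empirical mean and variance of the averages; compare with 0 and 1/3
c(mean = mean(xbar), var = var(xbar))
```

A histogram of `xbar` (for example `hist(xbar)`) should also look bell-shaped, which is consistent with a linear combination of independent normals being normal.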

(c) Let Y =  874_Define a random vector1.png. Find the distribution of Y.

(d) Suppose that Z ∼ N(1, 2^2) and is independent of all Xi. Define Zi = Z + Xi for i = 1, 2, 3. What is the distribution of the random vector

1797_Define a random vector2.png? Determine the correlation coefficient between Z1 and Z2 and Z.
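The covariance structure in (d) can be checked numerically. The sketch below assumes that the distribution of Z is N(1, 2^2), i.e. variance 4 (the exponent appears to have been lost in the text), and that the two correlations of interest are those between Z1 and Z2 and between Z1 and Z. Under those assumptions, Var(Zi) = 4 + 1 = 5 and Cov(Zi, Zj) = Cov(Zi, Z) = 4, so the simulated values should be close to 4/5 and 2/sqrt(5). The seed and sample size are arbitrary.

```r
set.seed(2)                                   # arbitrary seed
n_rep <- 50000                                # arbitrary number of replications
z  <- rnorm(n_rep, mean = 1, sd = 2)          # ASSUMPTION: Z ~ N(1, 2^2), variance 4
x  <- matrix(rnorm(3 * n_rep), ncol = 3)      # columns are X1, X2, X3
zi <- z + x                                   # column i holds Z_i = Z + X_i

colMeans(zi)                                  # compare with (1, 1, 1)
cov(zi)                                       # compare with 5 on the diagonal, 4 elsewhere

r12 <- cor(zi[, 1], zi[, 2])                  # compare with 4/5
r1z <- cor(zi[, 1], z)                        # compare with 2/sqrt(5)
c(r12 = r12, r1z = r1z)
```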

Problem 2. (a) Generate 200 random observations (a random sample) from the 3-dimensional multivariate normal distribution having mean vector µ =1328_Define a random vector3.png and covariance matrix

Σ =  1260_Define a random vector4.png

using the Cholesky factorization method (use the program given below). Use the R pairs plot to graph an array of scatter plots, one for each pair of variables. For each pair of variables, visually check that the location and correlation approximately agree with the theoretical parameters of the corresponding bivariate normal distribution. What should those parameters be for each of the scatter plots? Write them down near the corresponding plot. Then use the program below to compute the corresponding sample values. Write down the theoretical values of the means and variance-covariances and the sample values of the means and variance-covariances, and compare them. No need to turn in the program for this, since I am giving it to you.
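The instructor's program is not reproduced here, so the following is only a stand-in sketch of the Cholesky method, not the program referred to above. The mean vector `mu` and covariance matrix `Sigma` are hypothetical placeholders, because the actual values appear only as images. The idea is that if Z has i.i.d. N(0, 1) entries and Sigma = t(R) %*% R with R upper triangular (which is what R's `chol()` returns), then the rows of Z %*% R + mu have covariance Sigma and mean mu.

```r
set.seed(3)                                  # arbitrary seed
n     <- 200                                 # sample size from the problem
mu    <- c(0, 1, 2)                          # HYPOTHETICAL mean vector
Sigma <- matrix(c( 1.0, -0.5,  0.5,
                  -0.5,  1.0, -0.5,
                   0.5, -0.5,  1.0),
                nrow = 3, byrow = TRUE)      # HYPOTHETICAL covariance matrix

R <- chol(Sigma)                             # upper triangular, t(R) %*% R equals Sigma
Z <- matrix(rnorm(n * 3), nrow = n)          # n rows of independent N(0, 1) draws
X <- Z %*% R + matrix(mu, n, 3, byrow = TRUE)  # rows are draws from N(mu, Sigma)

pairs(X, labels = c("X1", "X2", "X3"))       # scatter plot for every pair of variables

colMeans(X)                                  # sample means, compare with mu
cov(X)                                       # sample covariance, compare with Sigma
```

For any pair (i, j), the theoretical bivariate parameters are simply the entries `mu[c(i, j)]` and the submatrix `Sigma[c(i, j), c(i, j)]`, which is what each panel of the pairs plot should be compared against.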

(b) Repeat the exercise in part (a) but now for µ = 2067_Define a random vector5.png and covariance matrix

Σ =  1260_Define a random vector4.png

Turn in an R script for the program used for this part.

Problem 3. For each of the following bivariate normal distributions, find its principal components (showing work), sketch a typical scatter plot of data from the distribution, and label the eigenvectors of its covariance matrix that you found on the scatter plot.

1220_Define a random vector6.png
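Hand calculations for this problem can be checked with `eigen()`. The covariance matrix below is a hypothetical example (the actual distributions are given only as an image), chosen so that the eigenvalues are easy to verify by hand: for Sigma = [2 1; 1 2] the characteristic equation (2 - λ)^2 - 1 = 0 gives λ = 3 and λ = 1, with eigenvectors proportional to (1, 1) and (1, -1). The simulated scatter plot and the arrow scaling are illustrative choices.

```r
Sigma <- matrix(c(2, 1,
                  1, 2), nrow = 2, byrow = TRUE)   # HYPOTHETICAL covariance matrix

e <- eigen(Sigma, symmetric = TRUE)
e$values     # variances along the principal directions (3 and 1 for this Sigma)
e$vectors    # columns are unit-length principal directions (signs are arbitrary)

# Typical scatter plot with the eigenvectors drawn from the mean (here the origin)
set.seed(4)                                        # arbitrary seed
X <- matrix(rnorm(2 * 500), ncol = 2) %*% chol(Sigma)
plot(X, asp = 1, xlab = "x1", ylab = "x2", col = "grey50")
for (j in 1:2) {
  v <- sqrt(e$values[j]) * e$vectors[, j]          # scale arrow by the standard deviation
  arrows(0, 0, v[1], v[2], lwd = 2, col = j + 1)
}
```

Setting `asp = 1` keeps the two axes on the same scale, so the eigenvectors appear perpendicular in the plot, as they are in the data space.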

Problem 4. In this problem we will perform principal component analysis on the sepal and petal measurements of the first 50 flowers of the Iris data, i.e., iris[1:50, 1:4].

(a) Center the data matrix and denote the centered data matrix by X. Find the covariance matrix Sx of X.

(b) Report the eigenvectors of Sx and the variance of X along each eigenvector direction.

(c) Calculate the principal components of the first two flowers.

(d) Make a scatterplot of the first two principal components of the data. Do you see any correlation between the two principal components?

(e) If we wish to keep at least 85% of the total variance, how many principal components do we need to keep?
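The steps in parts (a) through (e) can be sketched in base R as follows. This is one possible workflow, not a required one. It assumes the covariance matrix uses the usual n - 1 divisor (the default of `cov()`) and that "principal components of a flower" means that flower's scores, i.e. its centered measurements projected onto the eigenvectors. Variable names are illustrative.

```r
# (a) Center the data (no rescaling) and compute the sample covariance matrix
X  <- scale(as.matrix(iris[1:50, 1:4]), center = TRUE, scale = FALSE)
Sx <- cov(X)                              # uses divisor n - 1

# (b) Eigenvectors (columns of e$vectors) and the variance along each one (e$values)
e <- eigen(Sx, symmetric = TRUE)

# (c) Principal component scores; rows 1 and 2 belong to the first two flowers
scores <- X %*% e$vectors
scores[1:2, ]

# (d) Scatter plot of the first two principal components and their correlation
plot(scores[, 1], scores[, 2], asp = 1, xlab = "PC1", ylab = "PC2")
cor(scores[, 1], scores[, 2])             # zero up to rounding error, by construction

# (e) Cumulative proportion of total variance and the smallest k reaching 85%
cum_prop <- cumsum(e$values) / sum(e$values)
k <- which(cum_prop >= 0.85)[1]
cum_prop
k
```

The sample variance of each column of `scores` equals the corresponding entry of `e$values`, which gives a direct check on part (b).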

Problem 5. Suppose that Z1, Z2, Z3 are the principal components of a data set and Y is a vector of the response variable. The correlation coefficients between Y and Z1, Z2, Z3 are 0.25, -0.4, 0.7, respectively.

(a) If we decide to use only two principal components in the PC regression of Y, which two principal components should we choose? Why?

(b) If ||Z1|| = 2, ||Z2|| = 1, ||Z3|| = 5, and ||Y|| = 4, find the coefficients in the PC regression in (a).
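The arithmetic for this problem can be laid out in a few lines of R. The sketch assumes that Y and the principal components are centered, that the principal components are mutually orthogonal, and that the norm of Z3 is 5. Under those assumptions, the least-squares coefficient of Zj is <Zj, Y> / ||Zj||^2 = rj ||Y|| / ||Zj||, where rj is the correlation between Y and Zj, and each component contributes rj^2 to R^2, so the two components with the largest |rj| are the natural choice.

```r
r      <- c(Z1 = 0.25, Z2 = -0.4, Z3 = 0.7)   # correlations between Y and each PC
norm_z <- c(Z1 = 2, Z2 = 1, Z3 = 5)           # ASSUMPTION: ||Z3|| = 5
norm_y <- 4

# (a) Keep the two components with the largest squared correlation with Y
keep <- names(sort(abs(r), decreasing = TRUE))[1:2]
keep

# (b) Coefficient of each kept component: r_j * ||Y|| / ||Z_j||
coef_pc <- r[keep] * norm_y / norm_z[keep]
coef_pc

# Share of the variation in Y explained by the kept components
sum(r[keep]^2)
```

Because the components are orthogonal, dropping or adding a component does not change the coefficients of the others, which is why each coefficient can be computed separately.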
