the mock data file consists of 3 columns each


The mock data file consists of 3 columns, each containing 1000 numbers:

1. a flag (index) identifying the data row

2. the sampled x value (1 - 1000)

3. the corresponding sampled y value (1 - 1000)

The challenge is essentially to determine the linear relationship between x and y using these 1000 data pairs. It divides into three steps of increasing complexity.

Step 1: Use ordinary least squares to fit the linear model  y = a + bx  to the mock data:

          (a) compute the LS estimators of a and b;

          (b) estimate the variance of the (assumed Gaussian) noise which has been added to the mock y values;

          (c) estimate the errors on a_LS and b_LS, and their covariance.
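The three parts of Step 1 can be sketched with the closed-form least squares results. This is a minimal illustration, not a reference solution: the function name ols_fit is hypothetical, and because the mock data file is not reproduced here the example generates its own synthetic data with assumed true values a = 1, b = 2 and noise standard deviation 0.5.

```python
import math
import random


def ols_fit(x, y):
    """Fit y = a + b*x by ordinary least squares and estimate the errors."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))

    # (a) least squares estimators of slope and intercept
    b = sxy / sxx
    a = y_mean - b * x_mean

    # (b) unbiased estimate of the noise variance (two fitted parameters)
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    sigma2 = rss / (n - 2)

    # (c) variances of the estimators and their covariance
    var_a = sigma2 * (1.0 / n + x_mean ** 2 / sxx)
    var_b = sigma2 / sxx
    cov_ab = -sigma2 * x_mean / sxx

    return {
        "a": a,
        "b": b,
        "sigma2": sigma2,
        "err_a": math.sqrt(var_a),
        "err_b": math.sqrt(var_b),
        "cov_ab": cov_ab,
    }


# Synthetic stand-in for the mock data file (assumed values, not the real data)
rng = random.Random(0)
x = [rng.uniform(0.0, 10.0) for _ in range(1000)]
y = [1.0 + 2.0 * xi + rng.gauss(0.0, 0.5) for xi in x]

fit = ols_fit(x, y)
print(fit)
```

With the real file, x and y would instead be read from the second and third columns. The covariance comes out negative whenever the mean of x is positive, which is why the credible regions in the later steps appear as tilted ellipses.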

Step 2: Cast the data analysis challenge not as a least squares problem but as a maximum likelihood problem, and form an appropriate likelihood function for the mock data, which depends on the parameters (a, b).

Compute the log likelihood on a rectangular grid of values of a and b (think carefully about the range of a and b values you should consider, and the spacing between them). From the log likelihood, compute the value of chi-squared for each (a, b) pair on your grid and find the minimum value of chi-squared. Then turn your grid of values into a rectangular array of Delta chi-squared values, i.e. chi-squared minus its minimum. Finally, using the information in the table in Section 6, compute and plot Bayesian credible regions for the parameters at e.g. 68.3%, 95.4% and 99.73%. (Carrying out the calculations and making a contour plot from the results is straightforward in e.g. MATLAB, although you are welcome to use any programming language you wish.)
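A minimal sketch of the grid calculation in Step 2 follows. Several things here are assumptions rather than part of the challenge: the data are synthetic (true a = 1, b = 2), the noise standard deviation SIGMA is taken as known, and the grid ranges are hand-picked for that synthetic data. The table in Section 6 is not reproduced here, so the contour levels use the standard result that, for two jointly estimated parameters with a Gaussian likelihood, the Delta chi-squared level enclosing probability p is -2 ln(1 - p); check these against the table.

```python
import math
import random

# Synthetic stand-in for the mock data (assumed values, not the real file)
rng = random.Random(1)
SIGMA = 0.5  # noise standard deviation, assumed known for this sketch
x = [rng.uniform(0.0, 10.0) for _ in range(1000)]
y = [1.0 + 2.0 * xi + rng.gauss(0.0, SIGMA) for xi in x]

# Sums that let chi-squared be evaluated without looping over the data
n = len(x)
sx, sy = sum(x), sum(y)
sxx = sum(xi * xi for xi in x)
sxy = sum(xi * yi for xi, yi in zip(x, y))
syy = sum(yi * yi for yi in y)


def chi2(a, b):
    """chi-squared = sum((y - a - b*x)^2) / sigma^2 = -2 ln L + constant."""
    rss = syy - 2 * a * sy - 2 * b * sxy + n * a * a + 2 * a * b * sx + b * b * sxx
    return rss / SIGMA ** 2


def linspace(lo, hi, num):
    step = (hi - lo) / (num - 1)
    return [lo + i * step for i in range(num)]


# Hand-picked grid: wide enough to hold the 99.73% region for this data
a_grid = linspace(0.8, 1.2, 81)
b_grid = linspace(1.96, 2.04, 81)
chi2_grid = [[chi2(a, b) for b in b_grid] for a in a_grid]

# Locate the minimum and convert the grid to Delta chi-squared
i_best, j_best = min(
    ((i, j) for i in range(len(a_grid)) for j in range(len(b_grid))),
    key=lambda ij: chi2_grid[ij[0]][ij[1]],
)
chi2_min = chi2_grid[i_best][j_best]
delta = [[c - chi2_min for c in row] for row in chi2_grid]

# Delta chi-squared contour levels for two parameters
levels = {p: -2.0 * math.log(1.0 - p) for p in (0.683, 0.954, 0.9973)}

# Number of grid points inside each region (a quick sanity check)
inside = {p: sum(d <= lev for row in delta for d in row) for p, lev in levels.items()}

print("best grid point:", a_grid[i_best], b_grid[j_best])
print("levels:", levels)
print("points inside:", inside)
```

The array delta and the values in levels can be handed to any contour-plotting routine to draw the regions. If a contour touches the edge of the grid, the range is too narrow; if only a handful of points fall inside the 68.3% level, the spacing is too coarse.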

Step 3: Finally, using the Metropolis algorithm, and assuming a Gaussian likelihood function for the model parameters a and b, write an MCMC code to generate a sample from the likelihood function, thinking carefully about your choices of proposal density and prior range for a and b. Use this sample to estimate the mean values, errors and covariance of the parameters a and b from their sampled marginal distributions. Devise a method for estimating and plotting Bayesian credible regions for the parameters, using your MCMC sample.
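One possible shape for the Metropolis sampler in Step 3 is sketched below. The choices in it are illustrative assumptions, not prescribed by the challenge: synthetic data with true a = 1 and b = 2, a known noise level SIGMA, a flat prior over a hand-picked box, independent Gaussian proposals with step sizes STEP_A and STEP_B, and a fixed burn-in. For the credible regions it uses one candidate method among several: rank the samples by log likelihood and find the Delta chi-squared value that encloses a given fraction of them.

```python
import math
import random

# Synthetic stand-in for the mock data (assumed values, not the real file)
rng = random.Random(2)
SIGMA = 0.5  # noise standard deviation, assumed known for this sketch
x = [rng.uniform(0.0, 10.0) for _ in range(1000)]
y = [1.0 + 2.0 * xi + rng.gauss(0.0, SIGMA) for xi in x]

n = len(x)
sx, sy = sum(x), sum(y)
sxx = sum(xi * xi for xi in x)
sxy = sum(xi * yi for xi, yi in zip(x, y))
syy = sum(yi * yi for yi in y)


def log_like(a, b):
    """Gaussian log likelihood up to an additive constant: -chi^2 / 2."""
    rss = syy - 2 * a * sy - 2 * b * sxy + n * a * a + 2 * a * b * sx + b * b * sxx
    return -0.5 * rss / SIGMA ** 2


PRIOR_A = (0.0, 2.0)          # flat prior range for a (assumed)
PRIOR_B = (1.5, 2.5)          # flat prior range for b (assumed)
STEP_A, STEP_B = 0.03, 0.005  # Gaussian proposal widths (assumed, tune as needed)
N_STEPS, BURN_IN = 30000, 3000

a, b = 0.7, 2.1               # deliberately offset starting point
logl = log_like(a, b)
chain, accepted = [], 0
for _ in range(N_STEPS):
    a_new = a + rng.gauss(0.0, STEP_A)
    b_new = b + rng.gauss(0.0, STEP_B)
    in_prior = PRIOR_A[0] <= a_new <= PRIOR_A[1] and PRIOR_B[0] <= b_new <= PRIOR_B[1]
    if in_prior:
        logl_new = log_like(a_new, b_new)
        # Metropolis rule: accept with probability min(1, L_new / L_old)
        if logl_new >= logl or rng.random() < math.exp(logl_new - logl):
            a, b, logl = a_new, b_new, logl_new
            accepted += 1
    chain.append((a, b, logl))  # a rejected move repeats the current point

acc_rate = accepted / N_STEPS
samples = chain[BURN_IN:]
m = len(samples)

# Means, errors and covariance from the sampled marginal distributions
mean_a = sum(s[0] for s in samples) / m
mean_b = sum(s[1] for s in samples) / m
err_a = math.sqrt(sum((s[0] - mean_a) ** 2 for s in samples) / (m - 1))
err_b = math.sqrt(sum((s[1] - mean_b) ** 2 for s in samples) / (m - 1))
cov_ab = sum((s[0] - mean_a) * (s[1] - mean_b) for s in samples) / (m - 1)

# Credible regions: the fraction p of samples with the highest likelihood
logl_sorted = sorted((s[2] for s in samples), reverse=True)


def delta_chi2_level(p):
    """Delta chi-squared (relative to the best sample) enclosing fraction p."""
    cut = logl_sorted[int(p * m) - 1]
    return 2.0 * (logl_sorted[0] - cut)


print("acceptance rate:", acc_rate)
print("a =", mean_a, "+/-", err_a, " b =", mean_b, "+/-", err_b, " cov =", cov_ab)
print("Delta chi-squared at 68.3%:", delta_chi2_level(0.683))
```

Plotting the samples whose Delta chi-squared lies below each level gives a scatter-plot version of the credible regions. An acceptance rate very close to 0 or 1 suggests the proposal widths need adjusting, and the burn-in length should be checked by inspecting the start of the chain.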

Steps 2 and 3 both involve more sophisticated methods and will require you to write some simple computer code (e.g. in MATLAB).
