Discuss the parallel performance of the lu factorization


You are provided with a program matrix_serial . c to solve a linear system of equations Ax = b. The code includes routines to initialize A and b, compute the LU factorization of A, and solve triangular systems with lower and upper triangular matrices. The main program uses these routines to compute the solution of the system Ax = b by computing A = LU, and solving Ly = b for y, followed by Ux = y for x. Instructions to compile and execute the code are included in the file.

You need to parallelize the routines for computing LU factorization and solving the triangular systems using pthreads. The file should be named matrix . c.

1. You need to parallelize the routines for computing LU factorization and solving the triangular systems using pthreads. The file should be named matrix . c. A total of 20 points are reserved for performance of the code: speedup obtained by the multithreaded code and the overall execution time will be considered when awarding these points.

2. Execute the code for n= 212 with p chosen to be 2k, for k = 0, 1,...,6. Plot execution time versus p to demonstrate how time varies with the number of threads. Use logarithmic scale for the x-axis. Plot speedup versus p to demonstrate the change in speedup with p.

3. Discuss the parallel performance of the LU factorization routine and the triangular solver routines. Comment on the observed performance and the possible reasons for the observations.

4. You will receive bonus points equal to the amount the sum of speedups observed in the following routines - LU factorization, lower triangular solve, and upper triangular solve - exceeds 2.0. Speedup is computed as the speed improvement achieved by each routine over the execution time of the routine reported by matrix_serial . c for a single run. Bonus points are subject to a maximum of 10 points. Total speedup value will be rounded. Individual routine speedup values lower than 1.0 will be raised to 1.0 to compute bonus points. For example, a speedup of 3.5, 2.1, and 0.7, respectively, in the three routines is awarded 5 bonus points. In your submission, you need to specify the input arguments to the executable that produce the best speedup. Also indicate the speedup you observe in each of the routines. Compilation will be done using icc with default optimization.

Attachment:- hw4.txt

Request for Solution File

Ask an Expert for Answer!!
Theory of Computation: Discuss the parallel performance of the lu factorization
Reference No:- TGS01229162

Expected delivery within 24 Hours