Computer Architecture Project

Hardware performance counters

  • set of special-purpose registers built into modern microprocessors that store counts of hardware-related activities within the computer system
  • low overhead compared to software-based methods
  • types and meanings of hardware counters vary from one kind of architecture to another due to the variation in hardware organizations.

Overflow handling

  • generate an overflow signal each time a user-defined threshold number of events has been counted

- each counter has to be registered separately

- the value of each registered hardware counter is maintained separately

- (LONG_)LONG_MAX:

  • 32 bit: 2,147,483,647
  • 64 bit: 9,223,372,036,854,775,807
  • overflow_handler(): user-defined function to process overflow events

- function will be called by the PAPI library every time the threshold is reached

  • overflow_vector: a bit-array that can be processed to determine which event(s) caused the overflow

- e.g. using PAPI_get_overflow_event_index()

• Software vs. hardware overflows:

- if the processor does not support hardware overflow, software emulates it by periodically checking the counter values

- software overflow handling is less accurate and more expensive than hardware handling

- often implemented using a zero-crossing algorithm

  • value of counter is set to -threshold and counts upward; reaching zero signals the overflow

1st Assignment

  • Rules

- Each student should deliver

  • Source code (.c files)

- Please: no .o files and no executables!

  • Documentation (.pdf, .doc, .tex or .txt file)

- In case of questions:

  • ask the TAs first; if they don't know the answer, they will ask me.

About the Project

  • Given: the source code for a matrix-multiply operation (file hw-matmul.c).
  • The code contains a trivial implementation of the matrix multiply operation and a blocked implementation

- The blocked implementation is called with block sizes of 16, 32, 64 and 128

  • You can compile the C file, e.g. with

cc -O3 hw-matmul.c -o hw-matmul

  • Once you have added the PAPI functions, compile with

cc -O3 hw-matmul.c -o hw-matmul \
   -I/opt/papi/4.2.0/include -L/opt/papi/4.2.0/lib \
   -lpapi -lperfctr

- Run:

  • allocate a node (see later in the lecture)
  • type: ./hw-matmul


Notes

  • The PAPI version installed on shark is 4.2.0
  • On the front-end node you can find many examples in C and Fortran on how to use PAPI in /opt/papi/4.2.0/share/examples/ctests, e.g.

- all_events.c -> how to check on a processor whether a counter is available

- low-level.c -> how to use the low-level API of PAPI

- memory.c -> how to extract information about the memory subsystem (e.g. cache sizes)

- overflow_index.c -> how to handle overflow correctly


  • Ask early, not the day before the submission is due

 

 
