libezpm - Easy Performance Monitor
libezpm provides a simple access mechanism to the Itanium2 performance
monitor for applications that wish to monitor their own performance.
libezpm is not meant to replace libpfm or libpapi; it's just another option.
libezpm implements the EZPM API, which consists of just six functions:
1. Initialization.
C/C++: int ezpm_initialize(void)
Fortran: INTEGER*4 FUNCTION EZPM_INITIALIZE
Called once by each thread wanting to access performance monitor data.
2. Finalization
C/C++: int ezpm_finalize(void)
Fortran: INTEGER*4 FUNCTION EZPM_FINALIZE
Called once by each thread when no subsequent performance monitor data
is required.
3. Name to ID lookup
C/C++: int ezpm_lookup(const char *name)
Fortran: INTEGER*4 FUNCTION EZPM_LOOKUP(NAME)
CHARACTER*(*) NAME
Called to map a performance monitor event name to an integer
"event id", which is subsequently passed to ezpm_set_events.
ezpm_initialize must be called prior to calling this function.
4. Event set up
C/C++: int ezpm_set_events(const int *event_id, int num_events)
Fortran: INTEGER*4 FUNCTION EZPM_SET_EVENTS(EVENT_ID, NUM_EVENTS)
INTEGER*4 NUM_EVENTS
INTEGER*4 EVENT_ID(*)
Called to specify which events to monitor during the next
ezpm_begin / ezpm_end interval. ezpm_initialize must be called
prior to calling this function.
5. Begin counting
C/C++: int ezpm_begin(void)
Fortran: INTEGER*4 FUNCTION EZPM_BEGIN
Called to begin counting the events specified by the thread's
most recent call to ezpm_set_events. Each ezpm_begin call must be
followed by a call to ezpm_end. No begin/end nesting is allowed.
6. Finish counting
C/C++: int ezpm_end(unsigned long *count)
Fortran: INTEGER*4 FUNCTION EZPM_END(COUNT)
INTEGER*8 COUNT(*)
Called to stop counting events and to obtain the count of each event
specified by the thread's most recent call to ezpm_set_events.
Each ezpm_end call must be preceded by a call to ezpm_begin. No
begin/end nesting is allowed.
Return values:
All calls return 0 upon success, and a negative integer upon failure.
The return value depends on the type of failure. See
ezpm.h (C/C++) or ezpm.inc (Fortran) for details.
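Because failure is always reported through a negative return value, the checks can be collected in one small helper. The following is only a sketch: ezpm_check is a hypothetical name and is not part of the EZPM API. It also assumes that any non-negative value (for example an event id from ezpm_lookup) counts as success. The meaning of individual error codes is defined in ezpm.h and is not reproduced here.

```c
#include <stdio.h>

/* Hypothetical helper, not part of libezpm.
 * 'what' names the call for the error message; 'status' is its return value.
 * Assumption: any non-negative status means success.
 * Returns 1 on success and 0 on failure. */
int ezpm_check(const char *what, int status)
{
    if (status < 0) {
        fprintf(stderr, "%s failed with status %d (see ezpm.h)\n", what, status);
        return 0;
    }
    return 1;
}
```

A caller might then write, for instance, if (!ezpm_check("ezpm_initialize", ezpm_initialize())) return 1; and handle the other five calls the same way.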
Here is an example of C/C++ usage. Error returns are ignored for
brevity.
#include "ezpm.h"
...
int event_id[2];
unsigned long count[2];
ezpm_initialize();
...
event_id[0] = ezpm_lookup("CPU_CYCLES");
event_id[1] = ezpm_lookup("FP_OPS_RETIRED");
...
ezpm_set_events(event_id, 2);
...
for (iter=0; iter<num_iter; ++iter) {
ezpm_begin();
// ...interesting work here...
ezpm_end(count);
printf("iteration: %d megaflops: %.3lf\n", iter,
(double)count[1] / (double)count[0] * cpu_MHz);
}
...
ezpm_finalize();
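The megaflops figure printed in the example is floating-point operations per cycle multiplied by the clock rate in MHz (cycles per microsecond). A minimal sketch of that arithmetic as a standalone function follows. The name interval_mflops and the guard against a zero cycle count are additions for illustration, and the clock rate must be supplied by the caller, as with cpu_MHz in the example.

```c
/* Convert the counts from one ezpm_begin/ezpm_end interval into MFLOPS.
 *   fp_ops  : count reported for FP_OPS_RETIRED
 *   cycles  : count reported for CPU_CYCLES
 *   cpu_mhz : processor clock rate in MHz, supplied by the caller
 * Returns 0.0 when no cycles were counted, to avoid dividing by zero. */
double interval_mflops(unsigned long fp_ops, unsigned long cycles, double cpu_mhz)
{
    if (cycles == 0)
        return 0.0;
    /* (ops / cycle) * (cycles / microsecond) = ops per microsecond = MFLOPS */
    return (double)fp_ops / (double)cycles * cpu_mhz;
}
```

With the arrays from the example, the call would be interval_mflops(count[1], count[0], cpu_MHz).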
For OpenMP applications, the easiest way to initialize EZPM for all
threads is to introduce a new parallel region before any preexisting ones:
!$OMP PARALLEL
ISTAT = EZPM_INITIALIZE()
!$OMP END PARALLEL
Similarly, after all preexisting parallel regions, add:
!$OMP PARALLEL
ISTAT = EZPM_FINALIZE()
!$OMP END PARALLEL
For MPI ranks requiring access to performance monitor data, ezpm_initialize
should be called after MPI_Init, and ezpm_finalize just before MPI_Finalize.
For pthread applications, ezpm_initialize and ezpm_finalize should be
called at the beginning and end, respectively, of the start function
of each thread needing performance monitor data. Threads should also
call ezpm_finalize prior to any explicit calls to pthread_exit.
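The pthread rule above can be sketched as a thread start function. This is an illustration under assumptions, not code from libezpm. HAVE_EZPM is a hypothetical macro, the stub functions exist only so the sketch compiles without the library, and struct worker_arg is an invented way to hand the two status values back to the creator of the thread. The sketch also assumes that a thread whose ezpm_initialize call failed does not need to call ezpm_finalize.

```c
#include <stddef.h>

#ifdef HAVE_EZPM   /* hypothetical macro: define it when building against libezpm */
#include "ezpm.h"
#else
/* Stand-in stubs so the sketch compiles without libezpm; they only report success. */
int ezpm_initialize(void) { return 0; }
int ezpm_finalize(void)   { return 0; }
#endif

/* Hypothetical per-thread argument: records the status of both calls. */
struct worker_arg {
    int init_status;
    int fini_status;
};

/* Start function with the signature that pthread_create expects. */
void *worker(void *p)
{
    struct worker_arg *arg = p;

    arg->init_status = ezpm_initialize();   /* first thing the thread does */
    if (arg->init_status < 0)
        return NULL;   /* assumed: nothing to finalize after a failed initialize */

    /* ... ezpm_lookup, ezpm_set_events, ezpm_begin, work, ezpm_end ... */

    arg->fini_status = ezpm_finalize();     /* last thing before returning */
    return NULL;
}
```

Such a function would be started with pthread_create(&tid, NULL, worker, &arg). A thread that leaves through pthread_exit instead of returning would call ezpm_finalize immediately before it.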
How to compile:
module load histx
ifort ... myprog.f $HISTX_LIB
icc ... $HISTX_INC myprog.c $HISTX_LIB
Simple Fortran example:
INTEGER*4 EZPM_INITIALIZE
INTEGER*4 EZPM_FINALIZE
INTEGER*4 EZPM_LOOKUP
INTEGER*4 EZPM_SET_EVENTS
INTEGER*4 EZPM_BEGIN
INTEGER*4 EZPM_END
INTEGER *8 COUNTS(4)
INTEGER *4 EVENT(4)
INTEGER, PARAMETER :: M=1000000
REAL *8 X(M),Y(M),Z(M)
IERR=EZPM_INITIALIZE()
EVENT(1) = EZPM_LOOKUP("CPU_OP_CYCLES.ALL")
EVENT(2) = EZPM_LOOKUP("FP_OPS_RETIRED")
EVENT(3) = EZPM_LOOKUP("LOADS_RETIRED")
EVENT(4) = EZPM_LOOKUP("STORES_RETIRED")
WRITE(6,*) ' EVENTS ', EVENT
IERR=EZPM_SET_EVENTS(EVENT, 4)
X=1
Y=1
Z=1
DO IT=1,10
IERR=EZPM_BEGIN()
DO I=1,M/10*IT
X(I)=X(I)+Z(I)*Y(I)
ENDDO
IERR=EZPM_END(COUNTS)
WRITE(6,*) IT, COUNTS
ENDDO
WRITE(1,*) X(M)
IERR=EZPM_FINALIZE()
END
Simple C example:
#include <stdio.h>
#include "ezpm.h"
/* Error returns are ignored for brevity. */
int main(void)
{
    double x[10000];
    int i, iter;
    int event_id[2];
    unsigned long count[2];
    unsigned long allcounts[2] = {0, 0};  /* running totals over all intervals */
    ezpm_initialize();
    event_id[0] = ezpm_lookup("CPU_OP_CYCLES.ALL");
    event_id[1] = ezpm_lookup("FP_OPS_RETIRED");
    for (i = 0; i < 10000; i++) x[i] = 1.;
    ezpm_set_events(event_id, 2);
    for (iter = 0; iter < 10; ++iter) {
        /* each ezpm_begin is paired with exactly one ezpm_end */
        ezpm_begin();
        for (i = 0; i < 100*iter; i++) x[i] = x[i] + i;
        ezpm_end(count);
        allcounts[0] += count[0];
        allcounts[1] += count[1];
        printf("iteration: %d %lu %lu \n", iter, count[0], count[1]);
        printf(" %d %lu %lu \n", iter, allcounts[0], allcounts[1]);
    }
    ezpm_finalize();
    return 0;
}