ALIs

Coming soon.

libezpm

libezpm - Easy Performance Monitor

libezpm provides a simple access mechanism to the Itanium2 performance
monitor for applications that wish to monitor their own performance.
libezpm is not meant to replace libpfm or libpapi; it's just another option.

libezpm implements the EZPM API which consists of just 6 functions:

1. Initialization

   C/C++: int ezpm_initialize(void)

   Fortran: INTEGER*4 FUNCTION EZPM_INITIALIZE

Called once by each thread wanting to access performance monitor data.

2. Finalization

   C/C++: int ezpm_finalize(void)

   Fortran: INTEGER*4 FUNCTION EZPM_FINALIZE

Called once by each thread when no subsequent performance monitor data
is required.

3. Name to ID lookup

   C/C++: int ezpm_lookup(const char *name)

   Fortran: INTEGER*4 FUNCTION EZPM_LOOKUP(NAME)
            CHARACTER*(*) NAME

Called to map a performance monitor event name to an integer
"event id", which is subsequently passed to ezpm_set_events.
ezpm_initialize must be called prior to calling this function.

4. Event set up

   C/C++: int ezpm_set_events(const int *event_id, int num_events)

   Fortran: INTEGER*4 FUNCTION EZPM_SET_EVENTS(EVENT_ID, NUM_EVENTS)
            INTEGER*4 NUM_EVENTS
            INTEGER*4 EVENT_ID(*)

Called to specify which events to monitor during the next
ezpm_begin / ezpm_end interval.  ezpm_initialize must be called
prior to calling this function.

5. Begin counting

   C/C++: int ezpm_begin(void)

   Fortran: INTEGER*4 FUNCTION EZPM_BEGIN

Called to begin counting the events specified by the thread's
most recent call to ezpm_set_events.  Each ezpm_begin call must be
followed by a call to ezpm_end.  No begin/end nesting is allowed.

6. Finish counting

   C/C++: int ezpm_end(unsigned long *count)

   Fortran: INTEGER*4 FUNCTION EZPM_END(COUNT)
            INTEGER*8 COUNT(*)

Called to stop counting events, and to obtain the number of each event
specified by the thread's most recent call to ezpm_set_events.
Each ezpm_end call must be preceded by a call to ezpm_begin.  No
begin/end nesting is allowed.

Return values:
   All calls return 0 upon success, and a negative integer upon failure.
   The return value depends on the type of failure.  See
   ezpm.h (C/C++) or ezpm.inc (Fortran) for details.
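
Because every call reports failure through a negative return value, the
check can be factored into one small helper.  The sketch below is only an
illustration: check_status is a hypothetical name, not part of libezpm, and
treating every non-negative value as success is an assumption made so that
the same helper can also be used with ezpm_lookup, which returns an event id.

```c
#include <stdio.h>

/* Hypothetical helper (not part of libezpm): reports a failed EZPM call.
 * Returns 1 if rc signals failure (negative), 0 otherwise.
 * Assumption: any non-negative value counts as success. */
int check_status(const char *what, int rc)
{
    if (rc < 0) {
        fprintf(stderr, "%s failed with status %d\n", what, rc);
        return 1;
    }
    return 0;
}
```

A call site might then read

      if (check_status("ezpm_initialize", ezpm_initialize())) return 1;

The meaning of each negative code is still defined by ezpm.h.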

Here is an example of C/C++ usage.  Error returns are ignored for
brevity.

      #include "ezpm.h"

      ...

      int event_id[2];
      unsigned long count[2];

      ezpm_initialize();

      ...

      event_id[0] = ezpm_lookup("CPU_CYCLES");
      event_id[1] = ezpm_lookup("FP_OPS_RETIRED");

      ...

      ezpm_set_events(event_id, 2);

      ...

      for (iter=0; iter<num_iter; ++iter) {
        ezpm_begin();

        // ...interesting work here...

        ezpm_end(count);

        printf("iteration: %d  megaflops: %.3lf\n", iter,
               (double)count[1] / (double)count[0] * cpu_MHz);
      }

      ...

      ezpm_finalize();
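
The megaflops expression in the loop above follows from the two counters:
the interval lasts cycles / (cpu_MHz * 1e6) seconds, so the rate is
fp_ops / cycles * cpu_MHz * 1e6 operations per second, or
fp_ops / cycles * cpu_MHz megaflops.  The sketch below isolates that
arithmetic.  The function name megaflops is hypothetical, the guard against
a zero cycle count is an addition, and the formula assumes that the cycle
counter ticks at cpu_MHz.

```c
/* Hypothetical helper (not part of libezpm).
 * cycles  : value counted for CPU_CYCLES
 * fp_ops  : value counted for FP_OPS_RETIRED
 * cpu_mhz : processor clock in MHz (assumed to be known by the caller)
 * Returns the floating-point rate in megaflops, or 0.0 if cycles is 0. */
double megaflops(unsigned long cycles, unsigned long fp_ops, double cpu_mhz)
{
    if (cycles == 0)
        return 0.0;
    return (double)fp_ops / (double)cycles * cpu_mhz;
}
```

For instance, 1,000,000 floating-point operations retired in 2,000,000
cycles on a 1500 MHz processor gives 0.5 * 1500 = 750 megaflops.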

For OpenMP applications, the easiest way to initialize EZPM for all
threads is to introduce a new parallel region before any preexisting ones:

    !$OMP PARALLEL
          ISTAT = EZPM_INITIALIZE()
    !$OMP END PARALLEL

Similarly, after all preexisting parallel regions, add:

    !$OMP PARALLEL
          ISTAT = EZPM_FINALIZE()
    !$OMP END PARALLEL

For MPI ranks requiring access to performance monitor data, ezpm_initialize
should be called after MPI_Init, and ezpm_finalize just before MPI_Finalize.

For pthread applications, ezpm_initialize and ezpm_finalize should be
called at the beginning and end, respectively, of the start function
of each thread needing performance monitor data.  Threads should also
call ezpm_finalize prior to any explicit calls to pthread_exit.
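
The pthread pattern described above could look roughly like the following
sketch.  The names worker, run_workers, stub_init_calls and stub_fini_calls
are hypothetical, and the stub versions of ezpm_initialize and ezpm_finalize
exist only so that the sketch builds without the library; define
USE_LIBEZPM to compile against the real ezpm.h instead.

```c
#include <pthread.h>
#include <stddef.h>

#ifdef USE_LIBEZPM
#include "ezpm.h"
#else
/* Stand-in stubs, NOT part of libezpm: they only count the calls. */
int stub_init_calls = 0;
int stub_fini_calls = 0;
static pthread_mutex_t stub_lock = PTHREAD_MUTEX_INITIALIZER;

int ezpm_initialize(void)
{
    pthread_mutex_lock(&stub_lock);
    ++stub_init_calls;
    pthread_mutex_unlock(&stub_lock);
    return 0;
}

int ezpm_finalize(void)
{
    pthread_mutex_lock(&stub_lock);
    ++stub_fini_calls;
    pthread_mutex_unlock(&stub_lock);
    return 0;
}
#endif

/* Hypothetical start function: EZPM setup first, teardown last. */
void *worker(void *arg)
{
    (void)arg;
    if (ezpm_initialize() < 0)
        return NULL;            /* this thread runs without monitoring */

    /* ... ezpm_lookup, ezpm_set_events, ezpm_begin, work, ezpm_end ... */

    ezpm_finalize();            /* also needed before any pthread_exit */
    return NULL;
}

/* Starts up to 16 workers and joins them; returns 0 if all n started. */
int run_workers(int n)
{
    pthread_t tid[16];
    int i, started = 0;

    if (n > 16)
        n = 16;
    for (i = 0; i < n; ++i) {
        if (pthread_create(&tid[i], NULL, worker, NULL) != 0)
            break;
        ++started;
    }
    for (i = 0; i < started; ++i)
        pthread_join(tid[i], NULL);
    return started == n ? 0 : -1;
}
```

Keeping both calls inside the start function ensures that each thread
initializes and finalizes its own monitoring state, as required above.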

How to compile:

module load histx
ifort ... myprog.f $HISTX_LIB
icc ... $HISTX_INC myprog.c $HISTX_LIB

Simple Fortran example:

      ! EZPM functions; each returns an INTEGER*4 status or id
      INTEGER*4 EZPM_INITIALIZE
      INTEGER*4 EZPM_FINALIZE
      INTEGER*4 EZPM_LOOKUP
      INTEGER*4 EZPM_SET_EVENTS
      INTEGER*4 EZPM_BEGIN
      INTEGER*4 EZPM_END
      INTEGER*8 COUNTS(4)
      INTEGER*4 EVENT(4)
      INTEGER, PARAMETER :: M=1000000
      REAL*8 X(M),Y(M),Z(M)
      IERR=EZPM_INITIALIZE()
      ! Map the event names to event ids
      EVENT(1) = EZPM_LOOKUP("CPU_OP_CYCLES.ALL")
      EVENT(2) = EZPM_LOOKUP("FP_OPS_RETIRED")
      EVENT(3) = EZPM_LOOKUP("LOADS_RETIRED")
      EVENT(4) = EZPM_LOOKUP("STORES_RETIRED")
      WRITE(6,*) ' EVENTS ', EVENT
      IERR=EZPM_SET_EVENTS(EVENT, 4)
      X=1
      Y=1
      Z=1
      DO IT=1,10
        ! Count the four events over a loop whose length grows with IT
        IERR=EZPM_BEGIN()
        DO I=1,M/10*IT
           X(I)=X(I)+Z(I)*Y(I)
        ENDDO
        IERR=EZPM_END(COUNTS)
        WRITE(6,*) IT, COUNTS
      ENDDO
      ! Write a result to unit 1 so that X is used after the timed loops
      WRITE(1,*) X(M)
      IERR=EZPM_FINALIZE()
      END

Simple C example:

#include <stdio.h>
#include "ezpm.h"

int main(void)
{
      double x[10000];
      int i,iter;
      int event_id[2];
      unsigned long count[2];
      unsigned long allcounts[2] = {0, 0};   /* running totals */

      ezpm_initialize();
      event_id[0] = ezpm_lookup("CPU_OP_CYCLES.ALL");
      event_id[1] = ezpm_lookup("FP_OPS_RETIRED");
      for(i=0;i<10000;i++) x[i]=1.;
      ezpm_set_events(event_id, 2);
      for (iter=0; iter<10; ++iter) {
        /* count only the work loop; every begin is paired with one end */
        ezpm_begin();
        for(i=0;i<100*iter;i++) x[i]=x[i]+i;
        ezpm_end(count);
        allcounts[0]= allcounts[0]+count[0];
        allcounts[1]= allcounts[1]+count[1];
        printf("iteration: %d %lu %lu \n",iter, count[0] , count[1]);
        printf("           %d %lu %lu \n",iter, allcounts[0] , allcounts[1]);
      }
      ezpm_finalize();
      return 0;
}