ALIs

coming soon

persystreport

User documentation on how to generate reports using the persystreport tool.

Basic Report

Usage

    persystreport <command line options>

Called without any parameters, persystreport prints aggregate information about the performance properties of all your jobs. It produces an HTML file viewable with any modern browser (for example, Firefox or Internet Explorer).

See "Jobs which aren't monitored" for reasons why there may be no report for a particular job.

 

Overview of command line options:

To restrict the report to a time interval, specify a begin time and an end time:

    -b <yyyy-mm-dd hh:mm:ss>

    -e <yyyy-mm-dd hh:mm:ss>

To specify the minimum number of CPUs assigned to a job (defaults to zero):

    -m <min no of cpus>

For example, to see all your jobs with at least 128 cores:

    persystreport -m 128

To specify one job or several jobs:

    -j <jobid> or -j <jobid_1>,<jobid_2>,<jobid_3> 

To write the report to a specified file:

    -f <file>

The file defaults to report.html for a basic report and report.zip for a detailed report.

Example of usage:

    persystreport -b '2011-01-20 13:30:00' -e '2011-01-20 14:30:00' -j 456789,467890 -f myFile.html
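When reports are generated from a script, the command line can be assembled programmatically. The following is a minimal sketch in Python, not part of persystreport itself: the helper name build_persystreport_args is hypothetical, and it only uses the options documented on this page (-d is described under "Detailed report").

```python
from datetime import datetime

def build_persystreport_args(begin=None, end=None, min_cpus=None,
                             job_ids=None, out_file=None, detailed=False):
    """Assemble a persystreport argument list (hypothetical helper)."""
    fmt = "%Y-%m-%d %H:%M:%S"  # yyyy-mm-dd hh:mm:ss, as expected by -b and -e
    args = ["persystreport"]
    if detailed:
        args.append("-d")
    if begin is not None:
        args += ["-b", begin.strftime(fmt)]
    if end is not None:
        args += ["-e", end.strftime(fmt)]
    if min_cpus is not None:
        args += ["-m", str(min_cpus)]
    if job_ids:
        # Several job ids are passed as one comma-separated value.
        args += ["-j", ",".join(str(job_id) for job_id in job_ids)]
    if out_file is not None:
        args += ["-f", out_file]
    return args

# Rebuilds the usage example shown above.
args = build_persystreport_args(
    begin=datetime(2011, 1, 20, 13, 30, 0),
    end=datetime(2011, 1, 20, 14, 30, 0),
    job_ids=[456789, 467890],
    out_file="myFile.html",
)
print(args)
```

On a system where persystreport is installed, the list could be passed to subprocess.run(args, check=True). Because the arguments are passed as a list, the timestamps need no extra shell quoting.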

 

Basic Report - Output

    1. Jobs Overview
    2. Job Average Data

 

Jobs Overview

Initially, the report shows a list of the jobs that match the parameters passed to persystreport.

  • The list to the left shows all active properties and their hierarchy, even if some of them may have never been measured for the jobs displayed. Clicking on the name of a property scrolls down to its description.
  • The colour tags in the table to the right represent the average severity per job and property, ranging from green (severity is 0) to purple (severity is 1), while also taking into account how often a property appeared. Grey badges indicate that a property never appeared. The formula used to calculate this value is SUM(a1 ... an)/c, where a is the average severity at one timestamp, n the number of measurements of the property and c the number of measurements of the reference property (CPU_OP_CYCLES_ALL).
  • The column headers show the id of a batch job and the number of cores assigned to it.
  • Clicking on an id scrolls down to the corresponding job average data.
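The effect of dividing by c instead of n in the colour tag formula can be seen in a small worked example. This is a sketch with made-up numbers, not code taken from the tool; the function name weighted_average_severity is hypothetical.

```python
def weighted_average_severity(avg_severities, reference_count):
    """SUM(a1 ... an) / c from the formula above.

    avg_severities: average severity of the property at each timestamp
                    where it was measured (a1 ... an)
    reference_count: number of measurements of the reference property (c)
    """
    if reference_count <= 0:
        raise ValueError("reference_count must be positive")
    return sum(avg_severities) / reference_count

# Made-up data: the property was measured at 3 of 4 reference timestamps.
severities = [0.25, 0.5, 0.75]
reference_count = 4

plain_mean = sum(severities) / len(severities)
tag_value = weighted_average_severity(severities, reference_count)
print(plain_mean, tag_value)  # 0.5 0.375
```

The tag value (0.375) is lower than the plain mean (0.5) because the property did not appear at every timestamp at which the reference property was measured.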

Job Average Data

This list shows the occurrence counts and average values of the properties for a certain job. Description of the columns from left to right:

  • property: The property hierarchy with the severity colour badges (see Jobs Overview).
  • # of Occurrences: How often a certain property was measured.
  • % of Occurrences: How often a certain property was measured relative to the number of measurements of the reference property (CPU_OP_CYCLES_ALL).
  • % of Cores: How often cores were affected relative to the number of cores assigned to a job, in percent. The formula is 100*SUM(a1 ... an)/(t*c), where a is the number of cores for which the property was measured at one timestamp, n the number of measurements of the property, t the number of cores assigned to the job and c the number of measurements of the reference property (CPU_OP_CYCLES_ALL).
  • avg Value: The arithmetic average of the average values of a certain property in engineering notation.
  • Unit: The unit of a property.
  • avg Severity: The arithmetic average of the average severities of a certain property.
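The two percentage columns can be reproduced by hand. The sketch below uses made-up numbers and hypothetical function names; "% of Occurrences" is assumed to be 100*n/c, which follows from the column description above.

```python
def percent_of_occurrences(n_measurements, reference_count):
    """Assumed to be 100 * n / c (see the "% of Occurrences" description)."""
    return 100 * n_measurements / reference_count

def percent_of_cores(cores_per_timestamp, cores_assigned, reference_count):
    """100 * SUM(a1 ... an) / (t * c) from the "% of Cores" description."""
    return 100 * sum(cores_per_timestamp) / (cores_assigned * reference_count)

# Made-up job: 128 cores assigned (t), 4 reference measurements (c).
# The property was measured twice (n = 2), on 16 and then on 32 cores.
cores_per_timestamp = [16, 32]

print(percent_of_occurrences(len(cores_per_timestamp), 4))  # 50.0
print(percent_of_cores(cores_per_timestamp, 128, 4))        # 9.375
```

In this example the property shows up in half of the measurements, but on average fewer than one in ten of the job's cores are affected.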

 

Detailed report

    persystreport -d <command line options>

Use the switch -d to get a detailed report on performance properties. Like the basic report, it includes all the jobs you have submitted. The detailed report adds a timeline view with the severity distribution over the affected CPUs, and a data view with comparison graphs between properties. All other command line options of the basic report can also be used with the detailed report.

See "Jobs which aren't monitored" for reasons why there may be no report for a particular job.

Once you have unzipped the result file, open the report.html contained in the extracted folder to view the results.
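Unpacking can also be scripted. The following Python sketch extracts the archive and looks for report.html below the target directory. The function name extract_report is hypothetical, and the exact folder layout inside the archive is not assumed, which is why the file is searched for recursively.

```python
import zipfile
from pathlib import Path

def extract_report(zip_path, dest_dir):
    """Extract a detailed-report archive and return the report.html paths found."""
    dest = Path(dest_dir)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)  # creates dest if it does not exist yet
    # The folder name inside the archive is not assumed, so search recursively.
    return sorted(dest.rglob("report.html"))

# Example call (requires an existing report.zip in the current directory):
#   pages = extract_report("report.zip", "report_out")
#   print(pages[0].as_uri())  # file:// URL that can be opened in a browser
```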

Detailed Report - Output

1. User/Job List View
2. Timeline View
3. Data View
4. Browser Compatibility

 

User/Job List View

The table on the right-hand side contains the jobs that match the parameters passed as command line options. The panel on the left shows user details and the number of jobs.

  • Clicking a job shows detailed information about that job above the table.
  • The table is sortable by clicking the column headers. Columns can be shown/hidden using the dropdown menu to the upper right.
  • Entering a search string restricts the rows to those matching the string across all columns.
  • Double clicking a row switches to the timeline view for the corresponding job.

Timeline View

This view shows the average severity of each property over time. Every measurement is represented by a coloured rectangle, ranging from green (severity is 0) to purple (severity is 1). A grey box indicates that either PerSyst Monitoring didn't measure the property, or the measurement for this timestamp is missing completely.

The list to the left shows all active properties and their hierarchy, even if some of them may have never been measured for the job. The colour tag to the left of a property's name represents the average severity while also taking into account how often it appeared. The formula is SUM(a1 ... an)/c, where a is the average severity at one timestamp, n the number of measurements of the property and c the number of measurements of the reference property (CPU_OP_CYCLES_ALL).

  • Clicking a line or a property name selects the corresponding property.
  • An explanation as well as a hint for the selected property are shown above the timeline; details appear beneath it on the left side, for example:

    122 Occurrences (42.95%) / 1.03% of Cores
    avg Value: 411.61809e+0 1/cycles
    avg Severity: 0.1472

Where the first line shows:

  • the number of measurements (122)
  • the percentage of measurements (42.95%) relative to the number of measurements of the reference property (CPU_OP_CYCLES_ALL)
  • and the percentage of affected cores. The formula is 100*SUM(a1 ... an)/(t*c), where a is the number of cores for which the property was measured at one timestamp, n the number of measurements of the property, t the number of cores assigned to the job and c the number of measurements of the reference property (CPU_OP_CYCLES_ALL).

The second and third lines show the arithmetic average of the average values and of the average severities, respectively.

  • Hovering with the mouse along the timeline brings up a small overlay window which shows the distribution of the severities for the selected property at a certain point in time.
  • Clicking the back button in the top left corner switches back to the job list view.
  • Double clicking a line switches to the data view.

Data View

The data view is split horizontally to allow the comparison of the value/severity distribution between two properties of the selected job. Each part shows a plot of values and/or severities for the currently selected property and a table of the plotted data. The property for the upper part is set to the one selected in the timeline view by default.

  • The series dropdown menu allows the selection/deselection of all plottable data series. The first two options are shortcuts for the average, minimum, median and maximum series for the Values or Severities of the current property.
  • Mouse behaviour within the plot:
    • Hovering shows the corresponding values in an overlay window.
    • Dragging horizontally zooms the x-axis.
    • Dragging vertically zooms the y-axis.
    • Double clicking resets the zoom state of the plot.
  • The selector above the plot changes the currently displayed property. Greyed out options indicate missing data for the property.
  • If the property that is currently displayed is measured per second, a Unit dropdown menu above the data table lets you switch to per cycle values.
  • The table contains the raw data for the selected property. Dashes indicate that the property was not measured at a certain time. Contrary to the timeline view, missing measurements are not interpolated.
  • Clicking the Back button in the top left corner switches back to the timeline view.

Browser Compatibility

The browser must support the HTML5 canvas element and allow loading data from local files via XMLHttpRequest.

As of February 2011, this applies to Firefox 3.6, Safari 5 and IE 9 RC. Chrome 9 works if launched with the --allow-file-access-from-files switch.

Jobs which aren't monitored

Please note that some jobs will appear in neither the basic nor the detailed report, for one of the following reasons:

  • The job ran in the login partition (a01), which is not monitored.
  • Measurements are carried out every 10 minutes, beginning every day at 00:00:00. Jobs that run for less than 10 minutes might not be captured by our monitoring tool.
  • On rare occasions, the monitoring tool is switched off for an entire partition in order to carry out special performance measurements.
  • The job was submitted before December 2010.
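The 10-minute measurement schedule explains why a short job can be missed. The sketch below checks whether a job's run time contains at least one measurement time. It is an illustration only: the function names are hypothetical, and treating a job as captured when a measurement time falls within its start and end time is a simplifying assumption about how the monitoring tool samples.

```python
from datetime import datetime, timedelta

INTERVAL = timedelta(minutes=10)  # measurement period stated above

def next_measurement(t):
    """First measurement time at or after t (10-minute grid from 00:00:00)."""
    midnight = t.replace(hour=0, minute=0, second=0, microsecond=0)
    elapsed = t - midnight
    ticks = -(-elapsed // INTERVAL)  # ceiling division on timedeltas
    return midnight + ticks * INTERVAL

def spans_measurement(start, end):
    """True if at least one measurement time lies within [start, end] (assumption)."""
    return next_measurement(start) <= end

# A 7-minute job between two measurement times is missed ...
print(spans_measurement(datetime(2011, 1, 20, 13, 31), datetime(2011, 1, 20, 13, 38)))  # False
# ... while a 6-minute job that overlaps 13:40:00 is not.
print(spans_measurement(datetime(2011, 1, 20, 13, 36), datetime(2011, 1, 20, 13, 42)))  # True
```

Under this assumption, any job that runs for 10 minutes or longer always overlaps at least one measurement time, whereas a shorter job is captured only if it happens to be running at one.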

PerSyst Monitoring Code

Part of the code that was used to collect the data is available at the following links. Please note that we provide it as a framework (not the actual code running on our systems) with additional example classes for a cluster and a main function:

PerSyst Monitoring

Documentation