The reservation system and how to use it
This section gives a more detailed description of the inner workings of the remote visualisation servers at the LRZ. You should read the next section if you want to work for more than one hour with the remote desktop or want to have a bigger remote desktop. The rest of this page is written for the power user who prefers the command-line interface and wants to be in full control.
Advanced Options for TurboVNC users
The Quickstart section described the easiest and most basic method to log on to the remote visualisation server and get going as quickly as possible. For the sake of simplicity, that method is reduced to the bare minimum: you only get a remote desktop for one hour, and the size of the desktop is limited to 1024x768 pixels.
Here, you will learn the necessary commands to make a reservation for a prolonged period of time and request a TurboVNC session with the resolution you want. Continue reading in the following section to learn how to check the current status of the remote visualisation server.
When TurboVNC is started, a number of files whose names start with "vnc_" are created in your home directory on the server. They contain the information needed to connect to the VNC-server with the TurboVNC-Viewer on the client. Use e.g.
rvs1:> cat vnc_displaynumber
to show on which display the VNC-server is running (this will output something like '51'). If your client runs under Linux,
client:> /opt/TurboVNC/vncviewer rvs1.hlrb2.lrz-muenchen.de:51
(the number may be different in your case and Linux-Cluster users have to replace rvs1.hlrb2 by either gvs1 or gvs2) will then connect you to the VNC-server. For Windows clients you have to enter these values in the graphical user interface of the vncviewer (you have to give the full hostname, e.g. rvs1.hlrb2.lrz-muenchen.de).
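The two steps above (read the display number, then point the viewer at host and display) can be combined in a small helper. This is only a sketch: the function name vnc_target is hypothetical, and it assumes that the vnc_displaynumber file contains nothing but the number and is readable from the machine where the function runs.

```shell
# Hypothetical helper: build the "host:display" argument for the TurboVNC viewer.
# $1 = full hostname of the visualisation server
# $2 = path to the vnc_displaynumber file (assumed to contain only the number)
vnc_target() {
    display=$(tr -d '[:space:]' < "$2")
    printf '%s:%s\n' "$1" "$display"
}

# Example call (not executed here), using the hostname from the text above:
#   vncviewer "$(vnc_target rvs1.hlrb2.lrz-muenchen.de ~/vnc_displaynumber)"
```

Keeping the hostname as an argument makes the same helper usable for gvs1 or gvs2.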
Requesting a TurboVNC session for more than one hour
The most basic command 'rvnc' was already discussed in the Quickstart section. This command starts a TurboVNC remote desktop session for one hour. If you want to work longer on rvs1, use the following command:
rvs1:> rvrun -d <hours:minutes> tvncserver
On gvs1, use this command:
gvs1:> rvrun -d <hours:minutes:seconds> tvncserver
Replace 'hours:minutes' (or 'hours:minutes:seconds' on gvs1) by the desired reservation time. For example, 'rvrun -d 3:47 tvncserver' would give you a remote desktop for 3 hours and 47 minutes.
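Since the two servers expect slightly different duration formats, it can help to derive the string from a single number of minutes. The following sketch does this; rvrun_duration is a hypothetical helper name, and it only formats the argument, it does not call rvrun.

```shell
# Hypothetical helper: format a reservation time given in minutes for "rvrun -d".
# $1 = host (rvs1 or gvs1), $2 = total minutes as a plain integer
rvrun_duration() {
    hours=$(( $2 / 60 ))
    minutes=$(( $2 % 60 ))
    case "$1" in
        gvs1) printf '%d:%02d:00\n' "$hours" "$minutes" ;;  # hours:minutes:seconds
        *)    printf '%d:%02d\n' "$hours" "$minutes" ;;     # hours:minutes
    esac
}

rvrun_duration rvs1 227   # prints 3:47
rvrun_duration gvs1 227   # prints 3:47:00
```

The result could then be passed on, for example as rvrun -d "$(rvrun_duration rvs1 227)" tvncserver.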
Requesting a TurboVNC session with a bigger desktop
The command 'rvnc' starts a TurboVNC remote desktop session for one hour and with a standard size of 1024x768 pixels. If you are connected to a high-speed network and want to have a bigger desktop on rvs1, you can use the following command:
rvs1:> rvrun -d 1 'tvncserver -v GEOMETRY=1280x1024'
If you are working on gvs1, use this command:
gvs1:> rvrun -d 1:0:0 -v GEOMETRY=1280x1024 tvncserver
This would give you a remote desktop with 1280x1024 pixels. You can replace this geometry argument with any resolution you desire, but keep in mind that a desktop with twice the width and twice the height has four times as many pixels and therefore requires roughly four times the bandwidth of your internet connection!
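To get a feeling for how the amount of data grows with the geometry, the pixel counts can be compared directly. The sketch below uses a hypothetical helper, geometry_pixels, and treats the pixel count as a rough proxy for bandwidth; the real bandwidth also depends on compression and on what is displayed.

```shell
# Hypothetical helper: number of pixels in a WIDTHxHEIGHT geometry string.
geometry_pixels() {
    width=${1%x*}    # part before the "x"
    height=${1#*x}   # part after the "x"
    echo $(( width * height ))
}

geometry_pixels 1024x768    # prints 786432
geometry_pixels 1280x1024   # prints 1310720
geometry_pixels 2048x1536   # prints 3145728, four times the 1024x768 value
```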
More about the advance reservation system
LRZ uses the Sun N1 Grid Engine together with the extension 'advance reservation server' to manage the resources on the remote visualisation servers. This section describes the relevant features and shows their usage.
Checking the current usage of the remote visualisation server
Log on to the remote visualisation server using
client:> vglconnect rvs1.hlrb2.lrz-muenchen.de
If your firewall blocks port 4242, use this command:
client:> vglconnect -s rvs1.hlrb2.lrz-muenchen.de
(Replace 'rvs1.hlrb2.lrz-muenchen.de' with 'gvs1.lrz-muenchen.de' if you work on the Linux-Cluster and gvs1.)
Now, you can check the current status of the remote visualisation server with the rvstat command. You will see something like this (on rvs1 - output on gvs1 similar):
rvs1:> rvstat
============= status of remote visualisation server rvs1 ===============
V. 0.7
List of active reservations:
queue name | gfx | start date | starts @ | ends @ | mins.
-----------------------+-----+-------------+----------+----------+------
a2832ls1195036446206 | 1/1 | 14.nov.2007 | 11:35:00 | 14:35:00 | 180
a2832ci1194541913479 | 0/2 | 14.nov.2007 | 15:00:00 | 17:00:00 | 120
a2832ci1194541917891 | 0/1 | 15.nov.2007 | 10:00:00 | 15:00:00 | 300
Graphics card usage in the next 6 hours:
........................................................................
........................................................................
.........................................###############################
####################################.....###############################
| | | | | |
12 13 14 15 16 17
Next possibility to run a job with 1 graphics cards for 1 hours:
From 11:38 on this would be possible -- to reserve use this command:
runar Reserve -a 11141138 -duration 1:00 -l gfx=1
use rvstat -h for help
========================================================================
The first part shows the reservations of all users. The first column contains the queue-name. At the beginning of the queue-name, you will recognize your user-id (first 7 characters of the queue-name). The next 10 characters are a timestamp in Unix-format (i.e. the number of seconds since January 1st, 1970); in the examples on this page it corresponds to the time at which the reservation was made rather than to the start of the reserved period. The last three characters are numbers used by the Sun N1 Grid Engine.
The second column lists reserved and currently used graphics cards per queue, where e.g. "0/2" in the column labelled "gfx" means that two graphics cards are reserved but at the moment none is actually used (i.e. there is no job running in this queue). The other 4 columns are self-explanatory: date of the reservation, start and end time in 24-hour format and reservation time in minutes.
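The structure of a queue-name (7 characters of user-id, 10 digits of Unix timestamp, 3 trailing digits) can be taken apart with standard tools. The helper names below are hypothetical, and the sketch assumes the fixed 7 + 10 + 3 layout described above.

```shell
# Hypothetical helpers that split a queue-name into its three parts.
queue_user()      { printf '%s\n' "$1" | cut -c1-7; }    # user-id
queue_timestamp() { printf '%s\n' "$1" | cut -c8-17; }   # Unix timestamp
queue_suffix()    { printf '%s\n' "$1" | cut -c18-20; }  # Grid Engine digits

queue_user a2832ls1195036446206        # prints a2832ls
queue_timestamp a2832ls1195036446206   # prints 1195036446
queue_suffix a2832ls1195036446206      # prints 206

# With GNU date (an assumption about the installed tools) the timestamp
# can be turned into a readable date:
#   date -u -d @1195036446
```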
The second part ("Graphics card usage in the next 6 hours:") displays how many graphics cards are reserved in the next six hours. The four lines represent the four available graphics cards. A dot indicates that the graphics card is free, the pound sign ('#') indicates an active reservation.
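If the chart is saved to a file, the number of free graphics cards at a given position can be counted by looking at one column of the four lines. This is a sketch only: free_cards_at is a hypothetical helper, and it assumes that its input consists of exactly the chart lines made of dots and pound signs.

```shell
# Hypothetical helper: count free graphics cards at one column of the chart.
# Reads the chart lines from standard input; $1 = column number (1-based).
free_cards_at() {
    cut -c"$1" | grep -c '^\.$'   # a dot means "free", '#' means "reserved"
}

# Shortened four-line chart as example input:
printf '%s\n' '....' '....' '..##' '#.##' | free_cards_at 1   # prints 3
printf '%s\n' '....' '....' '..##' '#.##' | free_cards_at 3   # prints 2
```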
The third part ("Next possibility to ...") proposes a reservation command that will reserve one graphics card for one hour as soon as possible. If you require more graphics cards or a longer duration you can use the -g and -d command line options of rvstat.
You can also check the graphics card usage and get a reservation command suggestion for later times by using the -s option. For example, if you intend to reserve two graphics cards for three hours and 20 minutes tomorrow (i.e. at least 24 hours from now), you could get a reservation suggestion by executing
rvs1:> rvstat -g 2 -d 3:20 -s 24
(Note: This command option is not available on gvs1, use rvar instead.)
How to submit a (complex) reservation
The most general form to submit a reservation on rvs1 is this:
rvs1:> runar Reserve -a <month:day:hour:minute> -duration <hours:minutes> -l gfx=<number of graphics cards>
The '-a' switch determines the starting time of your reservation and has to be in the format <month:day:hour:minute> (in 24h-format, with two digits for each part, written without separators as in 08031400). The '-duration' switch has to be in the format <hours:minutes> and determines the length of the reservation. Finally, the '-l' switch determines the number of graphics cards that you want to use for one(!) application (if you want to start four applications that each use one graphics card, you would have to submit four separate reservation commands; if you have an application that supports parallel rendering and want to use more than one graphics card, you have to use the '-l' switch).
Example: To make a reservation for August 3rd, starting 2pm, running for 4 hours and 7 minutes and using two graphics cards, you would submit the following command:
rvs1:> runar Reserve -a 08031400 -duration 4:07 -l gfx=2
(The "-l" option can be omitted if you want to reserve only one graphics card.)
After a successful reservation you will get a message like this
Reservation requested for Fri Aug 03 16:58:00 CEST 2007
Confirmed: ReplyReservation for host 'rvs1.hlrb2.lrz-muenchen.de':
Reservation{ queue=a2832ls1186138867890 duration=01:00:00.000
Reserve Resources at Fri Aug 03 16:50:00 CEST 2007
[Start=Fri Aug 03 16:58:00 CEST 2007 to End=Fri Aug 03 17:58:00 CEST 2007]
Resources: [graphics=1]
User=a2832ls Email=root@
State=Confirmed Reservation_Made Fri Aug 03 13:01:07 CEST 2007 (changed) }
which contains the id of the queue that was created for you by the reservation system (in the example above "a2832ls1186138867890"). This queue should also show up in the output of rvstat and qstat -f.
If you are working on gvs1, the command for the example above has to be replaced by:
gvs1:> rvar -a 08031400 -d 4:07:00 -g 2
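The rvs1 and gvs1 command forms used in this section take the same information in slightly different notation. The following sketch prints (and does not run) the matching command for either host; reserve_cmd is a hypothetical helper, and only the options that appear in the examples above are used.

```shell
# Hypothetical helper: print the reservation command for a host.
# $1 = host (rvs1 or gvs1), $2 = month, $3 = day, $4 = hour, $5 = minute,
# $6 = duration hours, $7 = duration minutes, $8 = number of graphics cards.
# Pass plain integers (8, not 08): printf would read a leading zero as octal.
reserve_cmd() {
    start=$(printf '%02d%02d%02d%02d' "$2" "$3" "$4" "$5")
    case "$1" in
        gvs1) printf 'rvar -a %s -d %d:%02d:00 -g %d\n' "$start" "$6" "$7" "$8" ;;
        *)    printf 'runar Reserve -a %s -duration %d:%02d -l gfx=%d\n' "$start" "$6" "$7" "$8" ;;
    esac
}

reserve_cmd rvs1 8 3 14 0 4 7 2   # prints runar Reserve -a 08031400 -duration 4:07 -l gfx=2
reserve_cmd gvs1 8 3 14 0 4 7 2   # prints rvar -a 08031400 -d 4:07:00 -g 2
```

Printing the command first makes it easy to check the start time before submitting it.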
Why did my reservation not go through?
There are mainly two reasons why a reservation could fail: If the error message is
Failed: ResultFailure: Start time must be in the future.
you just have to increase the reservation starting time a bit (you submitted the reservation a few seconds before the next minute and the reservation system doesn't allow the current minute to be the starting time). If the error is
Failed: ResultFailure: Reservation could not be assured.
you probably tried to reserve more graphics cards than there are available in the specified period. This should not happen when you have used the runar command suggested by rvstat.
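When reservations are submitted from a script, the two messages can be mapped to the advice given above. This is a minimal sketch: explain_failure is a hypothetical helper, and it matches only the two message texts quoted in this section.

```shell
# Hypothetical helper: turn a reservation error message into a short hint.
# $1 = the message printed by the reservation command
explain_failure() {
    case "$1" in
        *"Start time must be in the future"*)
            echo "start time too early: move the start time forward by a minute" ;;
        *"Reservation could not be assured"*)
            echo "not enough free graphics cards in that period: check rvstat" ;;
        *)
            echo "unrecognised message" ;;
    esac
}

explain_failure "Failed: ResultFailure: Start time must be in the future."
```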
Please delete reservations that you don't need anymore!
If your reservation period has not yet ended but you do not need the reservation any more, on rvs1, please delete it with
rvs1:> rvdel <queue-id>
To get the queue-id, use the 'rvstat' command.
On gvs1, please use the command:
gvs1:> rvdel <ar-id>
To get the advance-reservation ID (ar-id), use the 'rvstat' command on gvs1.
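On rvs1, your own queue-ids can be picked out of the reservation table by their user-id prefix. The sketch below assumes the table layout shown in the rvstat example on this page (queue-name in the first column, columns separated by '|') and the 20-character queue-name; my_queues is a hypothetical helper, and the output format on gvs1 may differ.

```shell
# Hypothetical helper: list the queue-names in rvstat-style output that
# belong to one user. $1 = user-id; the table is read from standard input.
my_queues() {
    awk -F'|' -v user="$1" '
        { gsub(/ /, "", $1) }                                  # trim the first column
        index($1, user) == 1 && length($1) == 20 { print $1 }  # user-id prefix, 7+10+3 characters
    '
}

# Example call (not executed here), assuming your login name is your user-id:
#   rvstat | my_queues "$USER"
```

Each queue-id printed this way could then be passed to rvdel.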
How to (re-)submit a job to your queue
If you're inside a TurboVNC session, then you don't have to worry about this. If a program crashes or you want to start another application, simply click on the corresponding desktop icon and you're all set.
This section is for the command-line user. Say, you have a big paper coming up soon and therefore made a reservation for five hours straight. In the middle of your visualisation session, your application crashed. How can you resume your work?
The problem is that your current reservation is still active and blocking the resource. One way, of course, would be to delete your current reservation and submit a new one. A more elegant way, however, is to re-use the currently active queue and simply submit a new job to it.
First, you have to get your queue-id - use the 'rvstat' command for this (you need the queue-id on rvs1, or the ar-id on gvs1). You can recognize your queue-id by its start and end time and by its first seven characters, which match your user-id. Once you have your queue-id, use this command to submit a new job to this queue (i.e. start a new program that uses the reserved resources):
rvs1:> qsub -q <your queue-id> -l gfx=<#reserved gfx cards> <program-name>
More information on how to start which application can be found here.
If you are working on gvs1, use this command:
gvs1:> qsub -ar <ar-id> <your-application-name>
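The two qsub forms differ only in how the reservation is referenced. As a sketch, the following hypothetical helper, qsub_cmd, prints the command for either host without running it; the program name in the example is a placeholder.

```shell
# Hypothetical helper: print the qsub command for re-using a reservation.
# $1 = host (rvs1 or gvs1)
# $2 = queue-id (rvs1) or ar-id (gvs1)
# $3 = number of reserved graphics cards (only used on rvs1)
# $4 = program to start
qsub_cmd() {
    case "$1" in
        gvs1) printf 'qsub -ar %s %s\n' "$2" "$4" ;;
        *)    printf 'qsub -q %s -l gfx=%s %s\n' "$2" "$3" "$4" ;;
    esac
}

qsub_cmd rvs1 a2832ls1195036446206 1 my-program
# prints qsub -q a2832ls1195036446206 -l gfx=1 my-program
```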
After you have submitted your job, it may take one or two minutes until it starts. You can check the status of your jobs using
rvs1:> qstat -f
As long as the job is listed under "pending jobs" it is not yet started. If your job is no longer "pending" but fails to run, you can find information about the cause in a log file that is created in your home directory, e.g. visit.log in case of VisIt. There may also be a second log file in the /tmp folder that contains output of the queue script, e.g. /tmp/visit.124356.log.
Limitations
The advance reservation system can only handle one resource -- the graphics cards. Therefore, it is necessary to have a fixed relation between the number of reserved graphics cards and other resources like the number of CPUs and the amount of memory that will be used by a job. To avoid "overbooking" (and leave some resources for the operating system) please adhere to these rules on rvs1 (gvs1 has 32 CPU cores and 256 GByte of RAM with similar restrictions):
| #reserved graphics cards | 1  | 2  | 3  | 4   |
| max. #CPU cores          | 3  | 7  | 11 | 14  |
| max. memory [GB]         | 25 | 50 | 75 | 100 |
The queue scripts provided in /usr/local/qscripts automatically start parallel applications with the right number of CPU cores.
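For your own job scripts, the limits from the table above can be looked up from the number of reserved graphics cards. This is a sketch with hypothetical helper names; the values are the rvs1 limits from the table and do not apply unchanged to gvs1.

```shell
# Hypothetical helpers: rvs1 resource limits per number of reserved graphics cards.
max_cores() {
    case "$1" in
        1) echo 3 ;;
        2) echo 7 ;;
        3) echo 11 ;;
        4) echo 14 ;;
        *) echo "unsupported number of graphics cards: $1" >&2; return 1 ;;
    esac
}

max_memory_gb() {
    case "$1" in
        1|2|3|4) echo $(( $1 * 25 )) ;;   # 25 GB per reserved graphics card
        *) echo "unsupported number of graphics cards: $1" >&2; return 1 ;;
    esac
}

max_cores 2       # prints 7
max_memory_gb 3   # prints 75
```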