Resource Limits
Description of constraints under which jobs run on the Cluster systems: maximum run times, maximum memory and other SGE-imposed parameters
Policies for interactive shells and interactive jobs
Limitations
- On login shells, programs should not run for more than a few minutes. If possible, please start interactive runs with a nice value of at least 5. For example,
    nohup nice -n 10 ./my_program > prog.out 2> prog.err &
  raises the nice value of my_program from the default of 0 to 10. At LRZ's discretion, jobs that run for too long will be forcibly removed from the interactive nodes.
- Overloading of interactive nodes with jobs may also lead to job termination by LRZ personnel, especially if memory consumption exceeds available resources.
- Usage of the netscape/mozilla browsers is only allowed on interactive nodes; on all other nodes, instances of netscape/mozilla are regularly removed by the LRZ surveillance system.
- Usage of the cron system (e.g. via /usr/bin/crontab) as well as the /usr/bin/at or /usr/bin/batch commands is not allowed.
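The nice recommendation above can be wrapped in a small shell function. This is only a minimal sketch: the name run_nicely and the output-prefix argument are hypothetical and not part of any LRZ tooling, and the nice value of 10 is simply taken from the example above.

```shell
#!/bin/sh
# Hypothetical helper (not an LRZ tool): start a command in the background
# at nice value 10, detached from the terminal, and print its PID.
# Usage: run_nicely OUTPUT_PREFIX COMMAND [ARGS...]
run_nicely() {
    prefix=$1   # stdout goes to PREFIX.out, stderr to PREFIX.err (assumption)
    shift
    nohup nice -n 10 "$@" > "$prefix.out" 2> "$prefix.err" &
    # $! holds the PID of the background job that was just started
    echo $!
}
```

A call such as `pid=$(run_nicely prog ./my_program)` keeps the PID, so that a run which takes longer than expected can be stopped with `kill "$pid"` before it is removed by LRZ.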
Resource limits for interactive jobs
| Partition | Host Name | Remarks | Run time limit (hours) | Memory limit (GBytes) |
|---|---|---|---|---|
| x86_64 interactive node (login shell) | lx64ia2 | 2 load balanced Opteron nodes (4 sockets, 8 cores) | 4 | 32 (shared) |
| EM64T interactive node (login shell) | lx64ia3 | Intel Nocona (2 cores) | 4 | 2 (shared) |
Software licenses
Many commercial software packages have been licensed for usage on the cluster. Most of these require so-called floating licenses, of which only a limited number are typically available. Since it is not possible to check whether a license is available before a batch job starts, LRZ cannot guarantee that an SGE job requesting such a license will run.
Policies for queued batch jobs
Scheduling
The scheduler assigns an initiation priority to every queued batch job. The priority value increases while the job is waiting, until the job reaches the head of the queue; the job is then started as soon as the needed resources are available.
LRZ has also introduced user shares to prevent individual users from monopolizing the cluster. This means that if you used lots of cycles during the last few weeks and the cluster is very busy, your presently queued jobs may get started at a much lower rate until your share - as compared to other users' - has again dropped to the threshold value.
Jobs in Hold
Jobs in user hold may be removed by LRZ administrators if older than 8 weeks.
Memory use
Jobs exceeding the physical memory available on the selected node (set) will be removed at LRZ's discretion since such a usage typically has a negative impact on system stability.
Resource Limits
The following is an overview of the resource limits imposed on the various classes of jobs. These comprise run time limits and memory limits.
| Job Type | Architecture | Remarks | Run time limit (hours) | Memory limit (GBytes) |
|---|---|---|---|---|
| serial execution | 4-way Opteron or Intel EM64T | Single core in a multi-core node. -l march=x86_64 If more than 2 GB are needed, please request this explicitly with -l mf=6gb (e.g., for 6 GB) | 240 | 7.9 |
| long running serial execution (at increased risk) | 4-way Opteron | Single core in a multi-core node. The SGE specifications -l march=x86_64 -l h_rt=hh:mm:ss with hour values larger than 240 must be specified. The remarks on memory usage from the entry above also apply here. Warning: | 336 (two weeks) or 1344 (8 weeks) | 7.9 |
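The rules in the table above can be sketched as a small shell function that assembles the -l requests for a serial job. This is an illustration only: the function name sge_serial_opts is hypothetical, it accepts whole hours and whole GB only, and it assumes that the larger long-running limit (1344 hours) is the one that applies. Check the actual limits before relying on it.

```shell
#!/bin/sh
# Hypothetical helper (not an LRZ tool): print the SGE resource requests
# for a serial job, following the remarks in the table above.
# Usage: sge_serial_opts HOURS MEM_GB   (both whole numbers)
sge_serial_opts() {
    hours=$1
    mem_gb=$2
    # Limits from the table: at most 1344 hours (assumed to be the applicable
    # long-running limit) and 7.9 GB, i.e. at most 7 whole GB.
    if [ "$hours" -gt 1344 ] || [ "$mem_gb" -gt 7 ]; then
        echo "request exceeds the documented limits" >&2
        return 1
    fi
    opts="-l march=x86_64"
    # Run times above 240 hours must be requested explicitly via h_rt.
    if [ "$hours" -gt 240 ]; then
        opts="$opts -l h_rt=${hours}:00:00"
    fi
    # More than 2 GB of memory must be requested explicitly via mf.
    if [ "$mem_gb" -gt 2 ]; then
        opts="$opts -l mf=${mem_gb}gb"
    fi
    echo "$opts"
}
```

For a two-week job needing 6 GB, `sge_serial_opts 336 6` prints `-l march=x86_64 -l h_rt=336:00:00 -l mf=6gb`. These options can be passed to qsub on the command line or placed in the job script on lines starting with `#$`.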