File Systems and File Transfer on SuperMUC

Overview of available file systems

NAS-based shared file system

  Path / environment variable:   /home/hpc, accessed via $HOME
  Purpose:                       stores the user's sources, input data, and small and important result files; globally accessible from login and compute nodes
  Implementation:                NAS-Filer
  Overall size:                  1.5 PB
  Bandwidth:                     10 GB/s
  Backup and snapshots:          yes, backup to tape and snapshots
  Intended lifetime / cleanup:   project duration
  Quota size (per project):      default: 100 GB per project (1)

High-performance parallel file system on Phase 1 and Phase 2

  Path / environment variable:   /gss/scratch, accessed via $SCRATCH
  Purpose:                       temporary huge files (restart files, files to be pre-/postprocessed); globally accessible from login and compute nodes
  Implementation:                GPFS
  Overall size:                  5.2 PB
  Bandwidth:                     up to 150 GB/s
  Backup and snapshots:          no
  Intended lifetime / cleanup:   automatic deletion will happen
  Quota size (per project):      no quota, but high watermark deletion if necessary

  Path / environment variable:   /gpfs/work, accessed via $WORK
  Purpose:                       huge result files; globally accessible from login and compute nodes
  Implementation:                GPFS
  Overall size:                  10 PB (shared with the old scratch area)
  Bandwidth:                     up to 200 GB/s
  Backup and snapshots:          no
  Intended lifetime / cleanup:   project duration (beware of technical problems; archive important data to tape or another safe place!)
  Quota size (per project):      default: 1 TB per project (1)

Temporary file systems (different on thin/fat nodes and on compute/login nodes)

  Path / environment variable:   various, accessed via $TMPDIR
  Purpose:                       temporary file system for use by system commands; different on login and compute nodes
  Implementation:                setting may vary
  Backup and snapshots:          no
  Intended lifetime / cleanup:   may be deleted after job end or logout

  Path:                          /tmp
  Purpose:                       direct use of /tmp is strongly discouraged on compute nodes (it may impact system usability)!

(1) Can be increased upon request.

Data after the end of a project

Data on disk and in the tape archive will be deleted one year after the end of the project. However, for the data in the tape archive, the project manager can request that the project be converted into a data-only project in order to retain access to the archived data. Additionally, the project manager is warned by email after the project end that the data will be deleted.

User's responsibility for saving important data

With parallel file systems of several petabytes ($SCRATCH and $WORK), it is technically impossible (or too expensive) to back up these data automatically. Although the disks are protected by RAID mechanisms, other severe incidents might destroy the data. In most cases, however, it is the user who accidentally deletes or overwrites files. It is therefore the user's responsibility to transfer data to safer places (e.g. $HOME) and to archive them to tape. Due to the long offline times for dumping and restoring data, LRZ might not be able to recover data after any kind of outage or inconsistency of the scratch or work file systems. The name $WORK and the intended storage period until the end of your project must not be misinterpreted as an indication of data safety!

There is no automatic backup for $SCRATCH and $WORK. Besides automatic deletion, severe technical problems might destroy your data. It is your obligation to copy, transfer, or archive the files you want to keep!

Limitations and advantages of the GPFS file system

On SuperMUC, two GPFS file spaces named fs1 and fs2 have been established to provide high-performance parallel I/O. The environment variables $WORK and $SCRATCH point to the locations of file systems fs1 and fs2, respectively. These file systems are tuned for high bandwidth, but they are not optimal for handling large numbers of small files in a single directory under parallel access. In particular, creating more than about 1,000 files per directory at approximately the same time, either from a parallel program or from simultaneously running jobs, will probably cause your application(s) to experience I/O errors (due to timeouts) and crashes. If you require this usage pattern, please create a directory hierarchy with at most a few hundred files per subdirectory, as sketched below.
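As an illustration only (the directory names and the bucket size of 256 files per subdirectory are arbitrary choices, not LRZ recommendations), a job that writes many files could spread them over subdirectories like this:

  # Spread many output files over subdirectories so that no single
  # directory ever holds more than a few hundred entries.
  RUN_DIR=$SCRATCH/myrun                        # "myrun" is a placeholder
  for i in $(seq 0 4095); do
      bucket=$(( i / 256 ))                     # at most 256 files per subdirectory
      mkdir -p "$RUN_DIR/part_$bucket"
      touch "$RUN_DIR/part_$bucket/out_$i.dat"  # stands in for the real output file
  done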

Temporary filesystems $SCRATCH

Please use the environment variable $SCRATCH to access the temporary file system. This variable points to the location where the underlying file system delivers optimal I/O performance. Do not use /tmp or $TMPDIR for storing temporary files: the file system where /tmp resides is held in memory, is very small and slow, and files there will be regularly deleted by automatic procedures or system administrators.
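A minimal job-script sketch of this pattern is shown below; the directory layout and the program name my_solver are placeholders, not a prescribed scheme.

  MYTMP=$SCRATCH/$USER/run_$$            # per-run working directory on the scratch file system
  mkdir -p "$MYTMP"
  cd "$MYTMP"
  $HOME/mycode/my_solver > solver.log    # application writing large temporary/restart files
  cp important_result.dat $WORK/         # keep only the results that must survive scratch cleanup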

Coping with high watermark deletion in $SCRATCH

The high watermark deletion mechanism may remove files which are only a few days old if the file system is used heavily. In order to cope with this situation, please note:

  • The normal tar -x command preserves the modification time stored in the archive, not the time at which the archive was unpacked. Files unpacked from an older archive are therefore among the first candidates for deletion. To prevent this, use tar -xm to unpack your files, which gives them the current date (see the example after this list).
  • Please use the TSM system to archive files from $SCRATCH to the tape archive and to retrieve them from it.
  • Please always use $WORK or $SCRATCH for files which are considerably larger than 1 GB.
  • Please remove any files which are not needed any more as soon as possible. The high watermark deletion procedure is then less likely to be triggered.
  • More information about the fill level of the file systems and about the oldest files will be made available on a web site in the near future.
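For example (the archive and directory names are placeholders), unpacking with -m and refreshing the timestamps of data that is already unpacked could look like this:

  cd $SCRATCH
  tar -xmf results.tar                             # extracted files get the current date
  find $SCRATCH/mydata -type f -exec touch {} +    # refresh timestamps of existing files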

Snapshots, backup, archiving, and restoring

Snapshots

For all files in $HOME, backup copies are kept and made available in the special (read-only) subdirectory $HOME/.snapshot, or in any directory as .snapshot. Please note that the .snapshot directories are not visible to a simple ls command. A file can be restored by simply copying it from the appropriate snapshot directory to its original or any other location. Example:

cd .snapshot                             # enter the (hidden) snapshot directory
ls -l                                    # list the available snapshots
cp daily.YYYY-MM-DD_hhmm/missingfile ..  # copy the lost file back to the original directory

Archive

To use TSM tape archiving and backup, it is necessary to log in to the archiving nodes supermuc-tsm.lrz.de; the regular login nodes do not support TSM usage. Conversely, the archiving nodes should not be used for any purpose other than TSM data handling.
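As a rough sketch only (the paths, the wildcard file specification, and the description text are placeholders, and the exact invocation depends on the local TSM configuration), typical archive, query, and retrieve calls with the TSM command line client dsmc look like this:

  dsmc archive -subdir=yes -description="production run results" "$WORK/myproject/results/*"
  dsmc query archive -subdir=yes "$WORK/myproject/results/*"    # list what has been archived
  dsmc retrieve -subdir=yes "$WORK/myproject/results/*"         # copy the data back from tape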


Transferring files from/to other systems

Please see the appropriate subsection in the login document for a description.


Quotas

General information on quota

To see your quota please issue the following command:

budget_and_quota

Quota/Volume limits in $HOME

The storage for $HOME is limited. Each project is assigned a separate volume on the NAS-Filer, which is mounted at /home/hpc/<project_name> and contains the home directories of the project's users. The maximum size of the volumes is limited. The commands to get information about your quota and the disk space used are:

df $HOME
df -Bg $HOME      # output in GByte

The disk space in $HOME is occupied not only by your current data but also by snapshots ("backup copies") from the last 10 days. Typically your file space consists of the 150 GB quota plus an additional 150 GB for the snapshots. If you change and delete so many files in your home directory that the amount of changes exceeds 150 GB within 10 days, the additional space is not sufficient and the snapshots will also take up space from the "real" quota until they are automatically deleted.

It might help if you do not place any temporary files in your home directory ($HOME) but instead use the large parallel project file system $WORK or the parallel temporary file system $SCRATCH.

Quota limits in $WORK on SuperMUC

The storage is limited at the level of projects. Each project is assigned a file set; individual user quotas or sub-quotas are not possible. To see the quotas directly, use:

 /usr/lpp/mmfs/bin/mmlsquota -j work_projectID fs1

where projectID is your project ID/Unix group ID.



Conversion of a SuperMUC project into a Data-Only Project (after project end)

Data in the tape archive will be deleted one year after the project end if the project is not converted into a data-only project. The project manager can, however, request such a conversion in order to retain access to the archived data, and is warned by email after the project end that the data will otherwise be deleted.

On request, it is possible to convert a SuperMUC project into a data-only project. Within such a data-only project, the project manager can continue to retain and access the data archived on tape, thus using the tape archive as a safe and reliable long-term storage for the data generated by a SuperMUC project.

The data can then be accessed via the gateway node "tsmgw.abs.lrz.de" using the SuperMUC username and password of the project manager. Access to the server is possible via SSH with no restrictions on the IP address. However, access to SuperMUC itself is not possible after the end of a project. Currently, the server is equipped with 37 TB of local disk storage (/tsmtrans) to buffer the data retrieved from tape. There is a directory /tsmtrans/<username> where you can store the data and transfer them via scp.
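For example (the user name and the target directory are placeholders), data staged in /tsmtrans can be copied to another machine with:

  scp -r username@tsmgw.abs.lrz.de:/tsmtrans/username/mydata /path/on/your/machine/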

The project manager can access all data of the project that are stored in the tape archive, but for data archived by another project member it is necessary to use the -fromowner=otheruser flag. Also, the password for accessing the tape archive (TSM node) is not stored on the gateway node and must be set and remembered by the project manager; a retrieval sketch is shown below.
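As a rough sketch only (the user names and paths are placeholders, and the exact file specification depends on how the data was originally archived), retrieving another member's data into the transfer directory could look like this:

  dsmc retrieve -fromowner=otheruser -subdir=yes "/gpfs/work/projectID/otheruser/*" /tsmtrans/manager/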

  • When a SuperMUC project ends, the project manager will receive a reminder email explaining the steps necessary to convert the project.
