Selected Intel Compiler Options

Option Description
-O0 Disables all optimizations. Recommended for program development and debugging.
-O1 Enables optimization for speed while limiting code size (e.g. no loop unrolling).
-O2 Default optimization. Optimizations for speed, including global code scheduling, software pipelining, predication, and speculation.
-O3 Enables the -O2 optimizations plus more aggressive ones such as data prefetching, scalar replacement, and loop transformations. Intended for loop-intensive code, e.g. technical computing applications.
-Oi Inline expansion of intrinsic functions
-xcode

SSE4.2: On the Westmere fat nodes of SuperMUC Phase 1: generates SSE4.2 instructions.
AVX: On the Sandy Bridge cores of the thin nodes of SuperMUC: generates Intel Advanced Vector Extensions (AVX).
CORE-AVX2: On the Haswell cores of SuperMUC Phase 2 or on CoolMUC2 of the Linux-Cluster: generates Advanced Vector Extensions 2 (AVX2).
MIC-AVX512: On the Knights Landing many-core nodes of the Linux-Cluster: may generate Intel® Advanced Vector Extensions 512 (AVX-512).
CORE-AVX512/COMMON-AVX512: On Skylake nodes: may generate Intel® Advanced Vector Extensions 512 (AVX-512).

host: Tells the compiler to generate instructions for the highest instruction set available on the compilation host.

-xcode SANDYBRIDGE, HASWELL, KNL, SKYLAKE, SKYLAKE-AVX512: May generate instructions for processors that support the specified Intel® microarchitecture code name. These keywords are available only in Intel compilers version 18.0 and higher.
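
To check which instruction set a given -x setting actually enabled, a source file can inspect the compiler's predefined macros. The following is a minimal sketch, assuming the compiler predefines __SSE4_2__, __AVX__, __AVX2__ and __AVX512F__ in the same way GCC and Clang do (the Intel compiler on Linux is expected to behave alike). The function name compiled_isa is hypothetical.

```c
/* Report the widest vector instruction set that was enabled at
   compile time. The macro names are the conventional GCC/Clang ones;
   that the compiler in use defines them is an assumption. */
const char *compiled_isa(void)
{
#if defined(__AVX512F__)
    return "AVX-512";
#elif defined(__AVX2__)
    return "AVX2";
#elif defined(__AVX__)
    return "AVX";
#elif defined(__SSE4_2__)
    return "SSE4.2";
#else
    return "baseline";
#endif
}
```

The result reflects only the flags used for this source file; it says nothing about the processor the program later runs on.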

-axcode1,code2

This option tells the compiler to generate multiple, processor-specific auto-dispatch code paths for Intel processors if there is a performance benefit. It also generates a baseline code path which can run on non-AVX processors. The Intel processor-specific auto-dispatch path is usually more optimized than the baseline path. May generate Intel(R) Advanced Vector Extensions 2 (AVX2), AVX, SSE4.2, SSE4.1, SSSE3, SSE3, SSE2, and SSE instructions for Intel(R) processors.
The compiler looks for opportunities to generate separate versions of functions that take advantage of the specified instruction sets. When it finds one, it first checks whether a feature-specific version of the function is likely to result in a performance gain. If so, it generates both a feature-specific version and a baseline version of the function. At run time, one of the versions is chosen for execution, depending on the Intel(R) processor in use.
With -axAVX,CORE-AVX2, three versions are generated: baseline, Sandy Bridge, and Haswell.
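
The run-time selection described above can be pictured with a hand-written equivalent. This is only a sketch of the idea, not the code the Intel compiler emits: it assumes a GCC- or Clang-compatible compiler on x86 (for the __builtin_cpu_supports builtin and the target attribute), and all function names are hypothetical.

```c
#include <stddef.h>

/* Use the x86 builtins only where they are known to exist; every other
   platform takes the baseline path (assumption made for this sketch). */
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#define HAVE_X86_DISPATCH 1
#else
#define HAVE_X86_DISPATCH 0
#endif

typedef void (*scale_add_fn)(size_t n, float a, const float *x, float *y);

/* Baseline version: runs on any processor. */
static void scale_add_baseline(size_t n, float a, const float *x, float *y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] += a * x[i];
}

#if HAVE_X86_DISPATCH
/* Feature-specific version: the same loop, compiled for AVX2. */
__attribute__((target("avx2")))
static void scale_add_avx2(size_t n, float a, const float *x, float *y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] += a * x[i];
}
#endif

/* Choose a version based on the processor the program is running on. */
static scale_add_fn select_scale_add(void)
{
#if HAVE_X86_DISPATCH
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))
        return scale_add_avx2;
#endif
    return scale_add_baseline;
}

/* Entry point: callers do not see which path was taken.
   The lazy initialization is not thread-safe; kept simple on purpose. */
void scale_add(size_t n, float a, const float *x, float *y)
{
    static scale_add_fn impl = NULL;
    if (impl == NULL)
        impl = select_scale_add();
    impl(n, a, x, y);
}
```

With -ax the compiler produces the versions and the dispatch itself, so none of this has to be written by hand; the sketch only shows why the same binary can run on both older and newer processors.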

-qopt-zmm-usage=[low|high]
low: Tells the compiler that the compiled program is unlikely to benefit from zmm register usage. The compiler avoids zmm registers unless it can prove a gain from using them (default for CORE-AVX512).
high: Tells the compiler to generate zmm code without restrictions (default for COMMON-AVX512).
-fno_alias Specifies that aliasing should not be assumed in the program. Allows the compiler to generate faster code.
-ftz Flushes denormal results to zero (default with -O3)
-ipo Enables interprocedural (IP) optimizations, e.g. inline function expansion for calls to functions defined in separate files
-p Compiles and links for function profiling with gprof
-prof_genx Instruments a program for profiling with codecov
-prof_use Uses previously collected profiling information during optimization
-g Produces a symbol table, i.e. line numbers are available for profiling.
-openmp Enables the parallelizer to generate multithreaded code based on OpenMP directives. Newer compiler versions spell this option -qopenmp.
-parallel Tells the auto-parallelizer to generate multithreaded code for loops that can be safely executed in parallel. To use this option, you must also specify -O2 or -O3.
-opt_report Generates an optimization report on stderr.
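
As an illustration of what the OpenMP option acts on, the following minimal sketch shows a loop carrying an OpenMP directive. The function name dot is hypothetical. The directive is guarded by the standard _OPENMP macro, so the file also compiles cleanly, and runs serially, when OpenMP is not enabled.

```c
#include <stddef.h>

/* Dot product. The iterations are independent apart from the
   accumulation into sum, which the reduction clause handles. */
double dot(size_t n, const double *x, const double *y)
{
    double sum = 0.0;
    long i;

    /* Active only when the compiler is invoked with its OpenMP option;
       otherwise the loop below runs on a single thread. */
#ifdef _OPENMP
#pragma omp parallel for reduction(+:sum)
#endif
    for (i = 0; i < (long)n; ++i)
        sum += x[i] * y[i];

    return sum;
}
```

Without the directive, a loop of this shape is also the kind of candidate the auto-parallelizer examines under -parallel, although whether it is actually parallelized depends on the compiler's own analysis.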