pgfortran

Section: User Commands (1)
Updated: May 2016
 

NAME

pgfortran - The Portland Group Fortran compiler  

SYNOPSIS

pgfortran [ -flag ]... sourcefile...  

DESCRIPTION

pgfortran is the interface to the PGI Fortran compiler for AMD and Intel processors. pgfortran invokes the Fortran compiler, assembler, and linker with options derived from its command-line arguments. pgfortran is an alias for pgf90 and pgf95.

Suffixes of source file names indicate the type of processing to be done:

.f, .for, .ftn
fixed-format Fortran source; compile
.f90, .f95, .f03
free-format Fortran source; compile
.F, .FOR, .FTN, .fpp, .FPP
fixed-format Fortran source; preprocess, compile
.F90, .F95, .F03
free-format Fortran source; preprocess, compile
.cuf
free-format CUDA Fortran source; compile
.CUF
free-format CUDA Fortran source; preprocess, compile
.s
assembler source; assemble
.S
assembler source; preprocess, assemble
.o
object file; passed to linker
.a
library archive file; passed to linker

If coinstalled with pgcc, C file suffixes are also recognized and compiled with the pgcc compiler; see pgcc and PGI User's Guide. Other files are passed to the linker (if linking is requested) with a warning message.
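
The suffix rules above can be read as a lookup table. The following is a minimal Python sketch of that table, not the driver's actual implementation; the names SUFFIX_ACTIONS and classify are hypothetical, and C suffixes handled through pgcc are deliberately left out.

```python
import os

# Table transcribed from the suffix list above. Matching is case-sensitive:
# an upper-case suffix adds a preprocessing step.
SUFFIX_ACTIONS = {}


def _register(suffixes, kind, steps):
    for suffix in suffixes:
        SUFFIX_ACTIONS[suffix] = (kind, steps)


_register([".f", ".for", ".ftn"], "fixed-format Fortran source", ["compile"])
_register([".f90", ".f95", ".f03"], "free-format Fortran source", ["compile"])
_register([".F", ".FOR", ".FTN", ".fpp", ".FPP"],
          "fixed-format Fortran source", ["preprocess", "compile"])
_register([".F90", ".F95", ".F03"],
          "free-format Fortran source", ["preprocess", "compile"])
_register([".cuf"], "free-format CUDA Fortran source", ["compile"])
_register([".CUF"], "free-format CUDA Fortran source", ["preprocess", "compile"])
_register([".s"], "assembler source", ["assemble"])
_register([".S"], "assembler source", ["preprocess", "assemble"])
_register([".o"], "object file", ["pass to linker"])
_register([".a"], "library archive file", ["pass to linker"])


def classify(filename):
    """Return (kind, steps) for a file name, following the list above."""
    _, suffix = os.path.splitext(filename)
    # Anything else goes to the linker with a warning (if linking is requested).
    return SUFFIX_ACTIONS.get(
        suffix, ("unrecognized", ["pass to linker with a warning"]))
```

For example, classify("solver.F90") reports a free-format source that is preprocessed and then compiled, while classify("solver.f90") reports the same kind of source with no preprocessing step.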

Unless one overrides the default action using a command-line option, pgfortran deletes the intermediate preprocessor and assembler files (see the options -c, -E, -F, and -Mkeepasm); if a single Fortran program is compiled and linked with one pgfortran command, the intermediate object file is also deleted. Linking is the last stage of the compile process, unless you use one of the -c, -E, -F, or -S options, or unless compilation errors stop the whole process.  

OPTIONS

Options must be separate; -cs is different from -c -s. Here is a list of all options, grouped by type. More detailed explanations are in following sections.
Overall Options

-- -# -### -c -[no]defaultoptions -dryrun -drystdinc -echo --flagcheck -flags -help[=option] -Manno -Minform=level -Mkeepasm -M[no]list -noswitcherror -o file -rc rcfile -S -show -silent -time -V -V<ver> -v --version -w -Wpass,option -Ypass,directory
Optimization Options

-fast -fastsse -fPIC -fpic -KPIC -Kpic -Mcache_align -Mconcur=option -M[no]depchk -M[no]dse -Mextract=option -M[no]fma -M[no]frame -Minfo=option -Minline=option -Minstrument=option -M[no]ipa=option -M[no]lre[=assoc|noassoc] -M[no]movnt -Mneginfo=option -Mnoopenmp -Mnosgimp -Mnovintr -Mpfi[=option] -Mpfo[=option] -M[no]pre -M[no]prefetch=option -Mprof=option -M[no]propcond -Mquad -Msafe_lastval -M[no]smart -M[no]smartalloc[=option] -M[no]stride0 -M[no]unroll=option -M[no]unsafe_par_align -M[no]vect=option -M[no]zerotrip -mp[=option] -Olevel -pg
Debugging Options

-C -g -gopt -M[no]bounds -Mchkfpstk -Mchkptr -Mchkstk -Mcoff -Mdwarf1 -Mdwarf2 -Mdwarf3 -Melf -Mnodwarf -M[no]pgicoff -[no]traceback
Preprocessor Options

-Dmacro -E -F -Idirectory -Mcpp=[[no]comment|m|md|mm|mmd|mq:target|mt:target|suffix:suff] -Mnostddef -Mnostdinc -Mpreprocess -Umacro -YI,directory -Yp,directory
Assembler Options

-Wa,argument[,argument]... -Ya,directory
Linker Options

-acclibs --[no-]as-needed -Bdynamic -Bstatic -Bstatic_pgi -cudalibs -g77libs -Ldirectory -llibrary -m -Mcudalib=libname -M[no]eh_frame -Mlfs -Mmpi=option -Mnostartup -Mnostdlib -M[no]rpath -Mscalapack -pgc++libs -pgcpplibs -pgf77libs -pgf90libs -Rdirectory -r -rpath directory -s -shared -soname name -uname --[no-]whole-archive -Wl,argument[,argument]... -YC,directory -Yl,directory -YL,directory -YS,directory -YU,directory
Language Options

-asmsuffix=suffix -byteswapio -csuffix=suffix -fsuffix=suffix -FSUFFIX=suffix -i2 -i4 -i8 -i8storage -Mallocatable[=95|03] -M[no]backslash -Mbyteswapio -Mcray=pointer -Mcuda=option -M[no]dalign -M[no]dclchk -M[no]defaultunit -M[no]dlines -Mdollar=char -Mextend -Mfixed -M[no]free[form] -M[no]i4 -M[no]iomutex -Mlibsuffix=suffix -M[no]llalign -Mnomain -Mobjsuffix=suffix -M[no]onetrip -M[no]r8 -M[no]r8intrinsics=float -M[no]recursive -M[no]ref_externals -M[no]save -M[no]signextend -M[no]stack_arrays -Mstandard -M[no]unixlogical -M[no]upcase -module directory -r4 -r8 -Wh,argument[,argument]...
Target-specific Options

-acc -K[no]ieee -Ktrap=option -M[no]daz -M[no]flushz -M[no]fpapprox=option -M[no]fpmisalign -M[no]fprelaxed=option -M[no]func32 -M[no]large_arrays -M[no]longbranch -M[no]loop32 -M[no]second_underscore -M[no]varargs -Mwritable-strings -m32 -m64 -mcmodel=small|medium -pc=val -ta=target -tp=target

When source files are compiled using any of the -g, -mp, -Mconcur, -Mipa, or -Mprof options, the same option(s) should be passed when using pgfortran to link the objects.

 

Overall Options

--
Anything after this switch is treated as a filename. Note that most tools will not allow a filename starting with a dash, so these should be avoided.
-#
Display the invocations of the compiler, assembler, and linker. These invocations are the command lines created by pgfortran.
-###
Display invocations of the compiler, assembler and linker, but do not execute them.
-c
Skip the link step; compile and assemble only.
-defaultoptions (default) -nodefaultoptions
Use (don't use) the default options set in site-specific or user-specific PREOPTIONS or POSTOPTIONS driver variables.
-dryrun
Display the invocations of the compiler, assembler, and linker, but do not execute them.
-drystdinc
Display the standard include directories without invoking the compiler.
-echo
Echo the command line flags and stop. This is useful when the compiler is invoked by a script.
--flagcheck
Don't compile anything; just emit any messages for command-line switch errors. Return a success exit code if there are no command-line switch errors.
-flags
Display all valid pgfortran command-line options in alphabetical order.
-help[=option]
Displays command-line options recognized by pgfortran on the standard output. pgfortran -help -otherswitch will give help about -otherswitch. The default is to list pgfortran command line options by group; options are:
groups
Print out the groups into which the switches are organized.
asm
Print help for assembler command-line options.
debug
Print help for debugging command-line options.
language
Print help for language-specific command-line options.
linker
Print help for linker options.
opt
Print help for optimization command-line options.
other
Print help for any other command-line options.
overall
Print help for overall command-line options.
phase
Print help for the known compiler phases.
prepro
Print help for preprocessor command-line options.
suffix
Describe the known file suffixes.
switch
Print all switches in alphabetical order.
target
Print help for target-specific command-line options.
variable
Show the pgfortran configuration; this is the same as -show.
-Manno
Produce annotated assembly files, where source code is intermixed with assembly language; implies -Mkeepasm.
-Minform=level
Specify the minimum level of error severity that the compiler displays during compilation.
fatal
Instructs the compiler to display fatal error messages.
file (default) nofile
Print out (don't print out) the names of files as they are compiled; this is only active when there is more than one file on the command line.
severe
Instructs the compiler to display severe and fatal error messages.
warn
Instructs the compiler to display warning, severe and fatal error messages.
inform
Instructs the compiler to display all error messages (inform, warn, severe and fatal).

The default is -Minform=warn.
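
The four severity levels of -Minform form an ordered threshold: a message is shown when its severity is at or above the chosen level. The sketch below models only that ordering in Python; the names SEVERITY and is_displayed are hypothetical, and the file/nofile suboptions are not modeled.

```python
# Severity levels in increasing order, as described for -Minform=level.
SEVERITY = ["inform", "warn", "severe", "fatal"]


def is_displayed(message_severity, minform_level="warn"):
    """True if a message of this severity is shown at the given -Minform level.

    The default level is "warn", matching the documented default.
    """
    return SEVERITY.index(message_severity) >= SEVERITY.index(minform_level)
```

Under this model, -Minform=severe hides warnings, which is consistent with the description of -silent as equivalent to -Minform=severe.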

-Mkeepasm
Keep the assembly file for each source file, but continue to assemble and link the program. This is mainly for use in compiler performance analysis and debugging.
-Mlist -Mnolist (default)
Create (don't create) a listing file.
-noswitcherror
Ignore unknown command-line switches after printing a warning message; the default behavior is to print an error message and halt.
-o file
Use file as the name of the executable program, rather than the default a.out. If used with -c or -S and a single input file, file is used as the name of the object or assembler output file.
-rc rcfile
Specifies the name of a pgfortran startup configuration file. If rcfile is a full pathname, then use the specified file. If rcfile is a relative pathname, use the file name as found in the $DRIVER directory.
-S
Skip the assembly and link steps. Leave the output from the compile step in a file named file.s for each file named, for instance, file.f. See also -o.
-show
Produce help information describing the current pgfortran configuration.
-silent
Do not print warning messages. Same as -Minform=severe.
-time
Print execution times for the various steps in the compiler itself.
-V
Display version messages and other information.
-V<ver>
If the specified version of the compiler is installed, that version of the compiler is invoked.
-v
Verbose mode; print out the command line for each tool before it is executed.
--version
Display version messages and other information.
-w
Do not print warning messages.
-Wpass,option[,option...]
Pass option to the specified pass. Each comma-delimited option is passed as a separate argument. The passes are:
h
for the Fortran 90/95 front end,
0
for the compiler back end,
a
for the assembler,
i
for the interprocedural analyzer, and
l
for the linker.
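
As an illustration of how a -Wpass,option[,option...] switch breaks down, here is a minimal Python sketch, assuming the simplest reading of the rule above: the character after -W selects the pass, and every comma-delimited piece becomes its own argument. The function name split_wpass is hypothetical, and the way the real driver treats empty pieces or quoting is not modeled.

```python
# Pass letters listed above for -Wpass.
PASSES = {
    "h": "Fortran 90/95 front end",
    "0": "compiler back end",
    "a": "assembler",
    "i": "interprocedural analyzer",
    "l": "linker",
}


def split_wpass(flag):
    """Split a -Wpass,opt[,opt...] switch into (pass name, argument list)."""
    if not flag.startswith("-W") or len(flag) < 4:
        raise ValueError("not a -Wpass,option switch: %r" % flag)
    pass_letter, rest = flag[2], flag[3:]
    if pass_letter not in PASSES or not rest.startswith(","):
        raise ValueError("unknown pass or missing comma: %r" % flag)
    # Each comma-delimited option is handed to the pass as a separate argument.
    return PASSES[pass_letter], rest[1:].split(",")
```

With the illustrative input "-Wl,-rpath,/opt/lib" (the path is made up for this example), the sketch returns the linker as the target pass and the two arguments -rpath and /opt/lib.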
-Ypass,directory
Look in directory for pass pass, rather than in the standard area. The passes are:
h
Search for the Fortran 90/95 front end executable in directory.
0
Search for the compiler back end executable in directory.
a
Search for the assembler executable in directory.
C
Search for the compiler library in directory.
i
Search for the InterProcedural Analyzer (IPA) in directory.
l
Search for the linker in directory.
I
Set the compiler's standard include directory to directory. The standard include directory is set to a default value by the driver and can be overridden by this option.
L
If the linker supports the -YL option, then pass the option -YL,directory to the linker. Otherwise, use directory as the standard library location.
S
Search for the startup object files in directory.
U
If the linker supports the -YU option, then pass the option -YU,directory to the linker. Otherwise this option is ignored.

 

Optimization Options

-fast
Chooses generally optimal flags for the target platform. Use pgfortran -fast -help to see the equivalent switches. Note this sets the optimization level to a minimum of 2; see -O.
-fastsse
Chooses generally optimal flags, including vectorization, for a processor that supports SSE/AVX instructions. Use pgfortran -fastsse -help to see the equivalent switches.
-fPIC
Equivalent to -fpic; provided for compatibility with other compilers.
-fpic
(Linux only) Instructs the compiler to generate position-independent code which can be used to create shared object files (dynamically linked libraries).
-KPIC
Equivalent to -fpic; provided for compatibility with other compilers.
-Kpic
Equivalent to -fpic; provided for compatibility with other compilers.
-Mcache_align
Align unconstrained data objects of size greater than or equal to 16 bytes on cache-line boundaries. An unconstrained object is a variable or array that is not a member of an aggregate structure or common block, is not allocatable, and is not an automatic array.
-Mconcur[=option[,option,...]]
Instructs the compiler to enable auto-concurrentization of loops. This also sets the optimization level to a minimum of 2; see -O. If -Mconcur is specified, multiple processors will be used to execute loops which the compiler determines to be parallelizable. When linking, the -Mconcur switch must be specified or unresolved references will occur. The OMP_NUM_THREADS or NCPUS environment variables control how many processors will be used to execute parallelized loops. The options can be one or more of the following:
allcores
Use all available cores when the environment variables OMP_NUM_THREADS and NCPUS are not set. This must be specified at link time.
bind
Bind threads to cores or processors. This must be specified at link time.
altcode:n noaltcode
Generate (don't generate) alternate scalar code for parallelized loops. The parallelizer generates scalar code to be executed whenever the loop count is less than or equal to n. If noaltcode is specified, the parallelized version of the loop is always executed regardless of the loop count.
altreduction[:n]
Generate alternate scalar code for parallelized loops containing a reduction. If a parallelized loop contains a reduction, the parallelizer generates scalar code to be executed whenever the loop count is less than or equal to n.
assoc (default) noassoc
Enable (disable) parallelization of loops with reductions.
cncall nocncall (default)
Assume (don't assume) that loops containing calls are safe to parallelize. Also, no minimum loop count threshold must be satisfied before parallelization will occur, and last values of scalars are assumed to be safe.
dist:block
Parallelize with block distribution. Contiguous blocks of iterations of a parallelizable loop are assigned to the available processors.
dist:cyclic
Parallelize with cyclic distribution. The outermost parallelizable loop in any loop nest is parallelized. If a parallelized loop is innermost, its iterations are allocated to processors cyclically. For example, if there are 3 processors executing a loop, processor 0 performs iterations 0, 3, 6, etc; processor 1 performs iterations 1, 4, 7, etc; and processor 2 performs iterations 2, 5, 8, etc.
innermost noinnermost (default)
Enable (disable) parallelization of innermost loops.
levels:n
Parallelize loops nested at most n levels deep; the default is 3.
numa nonuma
(Linux only) Use (don't use) thread/processor affinity for NUMA architectures; use this option when linking the program. -Mconcur=numa will link in a numa library and objects to prevent the operating system from migrating threads from one processor to another.
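
The difference between the dist:block and dist:cyclic suboptions of -Mconcur can be sketched as two ways of assigning iteration numbers to processors. This Python model is only an illustration: the function names are hypothetical, and the block size used when the iteration count does not divide evenly (ceiling division here) is an assumption, not something this page specifies.

```python
def cyclic_distribution(n_iters, n_procs):
    """Processor p gets iterations p, p + n_procs, p + 2*n_procs, ..."""
    return [list(range(p, n_iters, n_procs)) for p in range(n_procs)]


def block_distribution(n_iters, n_procs):
    """Each processor gets one contiguous block of iterations.

    Assumption: blocks are sized by ceiling division, so trailing
    processors may receive fewer (or no) iterations.
    """
    chunk = -(-n_iters // n_procs)  # ceiling division
    return [list(range(p * chunk, min((p + 1) * chunk, n_iters)))
            for p in range(n_procs)]
```

For 9 iterations on 3 processors, the cyclic model reproduces the example given under dist:cyclic (processor 0 performs iterations 0, 3, 6), while the block model gives processor 0 iterations 0, 1, 2.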
-Mdepchk (default) -Mnodepchk
Assume (don't assume) that potential data dependencies exist. -Mnodepchk may result in incorrect code.
-Mdse -Mnodse (default)
Enable (disable) the dead store elimination optimization.
-Mextract=[option[,option,...]]
Run the subprogram extraction phase to prepare for inlining. The =lib:filename option must be used with this switch to name an extract library. See -Minline for more details on inlining.
subprogram[,subprogram]
A non-numeric option not containing a period is assumed to be the name of a subprogram to be extracted.
name:subprogram[,subprogram]
Specifies the name of a subprogram or subprograms to be extracted.
lib:directory
Specifies the name of a directory to contain the extracted subprograms; this directory will be created if it does not exist.
[size:]number
A numeric option is assumed to be a size. Functions containing number or fewer statements are extracted. If both number and function are specified, then functions matching the given name(s) or meeting the size requirements are extracted.
-Mfma -Mnofma (default)
Generate (don't generate) fused multiply-add (FMA) instructions for targets that support it. FMA instructions are generally faster than separate multiply-add instructions, and can generate higher precision results since the multiply result is not rounded before the addition. However, because of this, the result may be different than the unfused multiply and add instructions. FMA instructions are enabled with higher optimization levels.
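
The remark that fused and unfused results may differ can be reproduced outside the compiler. The Python sketch below emulates a fused multiply-add by computing a*b+c exactly with rational arithmetic and rounding once; it is an emulation for illustration, not how the compiler or hardware implements FMA, and the function names are hypothetical.

```python
from fractions import Fraction


def unfused_multiply_add(a, b, c):
    """a*b is rounded to a double, then the sum is rounded again."""
    product = a * b
    return product + c


def fused_multiply_add(a, b, c):
    """Emulated FMA: compute a*b + c exactly, then round once."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


# Inputs chosen so the exact product 1 - 2**-60 rounds to 1.0 as a double.
a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0
```

With these inputs the unfused form returns 0.0, because the small term is lost when the product is rounded, while the emulated fused form keeps it and returns -2**-60.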
-Mframe -Mnoframe (default)
Set up (don't set up) a true stack frame pointer for functions; -Mnoframe allows slightly more efficient operation when a stack frame is not needed, but some options override -Mnoframe.
-Minfo[=option[,option,...]]
Emit useful information to stderr. The options are:
all
Includes options accel, inline, ipa, loop, lre, mp, opt, par, unified, vect.
accel
Emit information about accelerator region targeting.
ccff
Append complete CCFF information to the object files.
ftn
Emit Fortran-specific information.
inline
Emit information about functions extracted and inlined.
intensity
Emit compute intensity information about loops.
ipa
Emit information about the optimizations enabled by interprocedural analysis (IPA).
loop | opt
Emit information about loop optimizations. This includes information about vectorization and loop unrolling.
lre
Emit information about loop-carried redundancy elimination.
mp
Emit information about OpenMP parallel regions.
par
Emit information about loop parallelization.
pfo
Emit profile feedback information.
time | stat
Emit compilation statistics.
unified
Emit information about which routines are selected for target-specific optimizations using the PGI Unified Binary.
vect
Emit information about automatic loop vectorization.
With no options, -Minfo is the same as -Minfo=accel,inline,ipa,loop,lre,mp,opt,par,unified,vect.
-Minline[=option[,option,...]]
Pass options to the function inliner. The options are:
lib:filename.ext
Specify an inline library created by a previous -Mextract option. Functions from the specified library are inlined. If no library is specified, functions are extracted from a temporary library created during an extract prepass.
except:func
Specifies which functions should not be inlined.
[name:]function
A non-numeric option is assumed to be a function name. If name: is specified, what follows is always the name of a function.
[size:]number
A numeric option is assumed to be a size. Functions containing number or fewer statements are inlined. If both number and function are specified, then functions matching the given name(s) or meeting the size requirements are inlined.
levels:number
Perform number levels of inlining. The default is 1.
reshape
For Fortran, the default is to not inline subprograms with array arguments if the array shape does not match the shape in the caller. This overrides the default.
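
The rules for interpreting -Minline suboptions (a bare number is a size, a bare non-numeric word is a function name, and prefixed forms are explicit) can be sketched as a small classifier. This Python model follows only the rules stated above; the name parse_minline and the sample function names in the usage note are hypothetical, and corner cases such as a function literally named reshape are not modeled.

```python
PREFIXED = ("lib", "except", "name", "size", "levels")


def parse_minline(flag):
    """Classify each comma-separated suboption of -Minline=... as (kind, value)."""
    _, _, option_list = flag.partition("=")
    result = []
    for option in option_list.split(","):
        key, sep, value = option.partition(":")
        if sep and key in PREFIXED:
            # Explicit prefix: size and levels take numbers, the rest take names.
            result.append((key, int(value) if key in ("size", "levels") else value))
        elif option == "reshape":
            result.append(("reshape", True))
        elif option.isdigit():
            # A numeric option is assumed to be a size.
            result.append(("size", int(option)))
        else:
            # A non-numeric option is assumed to be a function name.
            result.append(("name", option))
    return result
```

For instance, parse_minline("-Minline=100,name:foo,bar,levels:2,reshape") yields a size of 100, the two function names foo and bar, two levels of inlining, and the reshape override.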
-Minstrument [=option]
(linux86-64 only) Generate additional code to enable function-level instrumentation. This option implies -Minfo=ccff and -Mframe. The option is
functions (default)
-Mipa [=option[,option,...]] -Mnoipa (default)
Enable and specify options for InterProcedural Analysis (IPA). Note: IPA is not compatible with parallel make environments (e.g., pmake). IPA also sets the optimization level to a minimum of 2; see -O. If no option list is specified, then it is equivalent to -Mipa=const. The options are:
align noalign (default)
Enable (disable) recognition when pointer targets are all cache-line aligned, allowing better SSE code generation.
arg noarg (default)
Remove (don't remove) arguments replaced by -Mipa=ptr,const. -Mipa=noarg implies -Mipa=nolocalarg.
cg nocg (default)
Generate (don't generate) information for the pgicg call graph display tool. Run pgicg executable to see the call graph information.
const (default) noconst
Enable (disable) propagation of constants across procedure calls.
f90ptr nof90ptr (default)
Enable (disable) Fortran 90 pointer disambiguation across procedure calls.
fast
Chooses generally optimal -Mipa flags for the target platform; use pgfortran -Mipa -help to see the equivalent options.
force
Force all objects to recompile regardless of whether IPA information has changed.
globals noglobals (default)
Analyze (don't analyze) which globals are modified by procedure calls.
inline:n
Determine additional functions to inline, allowing up to n levels of inlining. Additional suboptions are:
except:proc
Disables inlining of procedure proc.
nopfo
Ignore any profile frequency information from -Mpfo when choosing which functions to inline.
reshape noreshape (default)
Enable (disable) Fortran inlining with mismatched array shapes.
ipofile
Save IPA information in a .ipo file instead of the default of appending the information to the object file.
jobs:n
Use up to n jobs in parallel to reoptimize object files.
keepobj (default) nokeepobj
Keep (don't keep) the optimized object files, using file name mangling, to reduce recompile time in subsequent application builds.
libc nolibc (default)
Optimize (don't optimize) calls to certain standard C library routines.
libinline nolibinline (default)
Allow (don't allow) inlining from routines in libraries; -Mipa=libinline implies -Mipa=inline.
libopt nolibopt (default)
Allow (don't allow) recompiling and reoptimizing routines from libraries with IPA information.
localarg nolocalarg (default)
Enable (disable) feature to externalize local variables to allow arguments to be replaced by -Mipa=ptr. -Mipa=localarg implies -Mipa=arg.
main:func
Specify a function to serve as a global entry point; may appear multiple times; disables linking.
ptr noptr (default)
Enable (disable) pointer disambiguation across procedure calls.
pure nopure (default)
Detect (don't detect) pure functions.
quiet
Don't print out messages about which files are recompiled at link time.
reaggregation noreaggregation (default)
Enable (disable) global struct reaggregation. This can change the order of struct members, or split structs into multiple structs, to improve memory locality and cache utilization.
required
Return an error condition if IPA is inhibited for any reason, rather than the default behavior of linking without IPA optimization.
safe:[function|library]
Declares that the named function, or all functions in the named library are safe; a safe procedure does not call back into the known procedures and does not change any known global variables. Without -Mipa=safe, any unknown procedures will cause IPA to fail.
safeall nosafeall (default)
Declares that all unknown functions are safe (not safe); see -Mipa=safe.
shape noshape (default)
Perform (don't perform) Fortran 90 shape propagation.
summary
Only collect IPA summary information when compiling; this prevents IPA optimization of this file, but allows optimization for other files linked with this file.
vestigial novestigial (default)
Remove (don't remove) functions that are not called.
-Mlre[=assoc|noassoc] -Mnolre
Enable (disable) loop-carried redundancy elimination. The assoc option allows expression reassociation, and the noassoc option disallows expression reassociation.
-Mmovnt -Mnomovnt
Force (disable) generation of nontemporal moves. -Mmovnt used with -fastsse can sometimes be faster than -fastsse alone. By default nontemporal moves are generated for loops with large loop counts.
-Mneginfo=option[,option...]
Instructs the compiler to produce information on why certain optimizations are not performed. Use the -Minfo flag instead.
-Mnoopenmp
When -mp is present, ignore the OpenMP pragmas.
-Mnosgimp
When -mp is present, ignore the SGI parallelization pragmas.
-Mnovintr
Do not generate vector intrinsic calls.
-Mpfi[=option]
Generate profile feedback instrumentation; this includes extra code to collect run-time statistics to be used in a subsequent compile; -Mpfi must also appear when the program is linked. When the program is run, a profile feedback file pgfi.out will be generated; see -Mpfo. The allowed options are:
indirect noindirect (default)
Enable (disable) collection of indirect function call targets, which can be used for indirect function call inlining.
-Mpfo[=option[,option,...]]
Enable profile feedback optimizations; there must be a profile feedback file pgfi.out in the current directory, which contains the result of an execution of the program compiled with -Mpfi. The options are:
indirect noindirect (default)
Enable (disable) indirect function call inlining; this requires a pgfi.out file generated from a binary built with -Mpfi=indirect.
layout (default) nolayout
Enable (disable) basic block layout to take advantage of instruction cache locality by keeping hot paths close together.
dir=directory
Specify the directory containing the pgfi.out profile feedback information file; the default is the current directory.
-Mpre -Mnopre (default)
Enable (disable) the partial redundancy elimination optimization.
-Mprefetch[=option:n] -Mnoprefetch
Add (don't add) prefetch instructions for those processors that support them (Pentium 4, Opteron); -Mprefetch is default on Opteron; -Mnoprefetch is default on other processors. The options are:
distance:d
Set the fetch-ahead distance for prefetch instructions to d cache lines.
n:n
Set the maximum number of prefetch instructions to generate in a loop to n.
nta
Use the prefetchnta instruction.
plain
Use the prefetch instruction.
t0
Use the prefetcht0 instruction.
w
Allow the AMD-specific prefetchw instruction.
-Mprof[=option[,option,...]]
Set performance profiling options. Use of these options will cause the resulting executable to create a performance profile that can be viewed and analyzed with the PGPROF performance profiler. In the descriptions below, PGI-style profiling implies compiler-generated source instrumentation. MPICH-style profiling implies the use of instrumented wrappers for MPI library routines. The -Mprof options are:
ccff
dwarf
Generate limited DWARF symbol information sufficient for most performance profilers.
func
Perform PGI-style function level profiling.
hwcts
Generate a profile using event-based sampling of hardware counters via the PAPI interface (linux86-64 only, PAPI must be installed).
lines
Perform PGI-style line level profiling.
mpich1
(PGI CDK only) Perform MPICH-style profiling for MPICH-1. Implies -Mmpi=mpich1. Use MPIDIR to point to the MPICH-1 libraries. This flag is no longer fully supported.
mpich2
(PGI CDK only) Perform MPICH-style profiling for MPICH-2. Implies -Mmpi=mpich2. Use MPIDIR to point to the MPICH-2 libraries. This flag is no longer fully supported.
mvapich1
(PGI CDK only) Perform MPICH-style profiling for MVAPICH. Implies -Mmpi=mvapich1. Use MPIDIR to point to the MVAPICH libraries. This flag is no longer fully supported.
sgimpi
(PGI CDK only) Perform MPICH-style profiling for the SGI MPI library. Implies -Mmpi=sgimpi.
time
Generate a profile using time-based instruction-level statistical sampling. This is equivalent to -pg, except that the profile is saved in a file named pgprof.out instead of gmon.out.

On Linux systems that have OProfile installed, PGPROF supports collection of performance data without recompilation. Use of -Mprof=dwarf is useful for this mode of profiling.

-Mpropcond (default) -Mnopropcond
Enable (disable) propagation of constant values derived from conditional branches with equality tests.
-Mquad
Align large objects on quad-word boundaries.
-Msafe_lastval
In the case where a scalar is used after a loop, but is not defined on every iteration of the loop, the compiler does not by default parallelize the loop. However, this option tells the compiler it is safe to parallelize the loop.
-Msmart -Mnosmart (default)
Enable (disable) optional AMD64-specific post-pass instruction scheduling.
-Msmartalloc=option[,...] -Mnosmartalloc (default)
Add (don't add) a call to the routine mallopt in the main routine; this can have a dramatic impact on the performance of programs that dynamically allocate memory. To be effective, this switch must be specified when compiling the file containing the Fortran, C, or C++ main routine. This is currently only available on 64-bit Linux systems. The behavior of -Msmartalloc can be modified with the following options:
huge
Link in the huge page runtime library, so dynamic memory will be allocated in huge pages.
huge:n
Link in the huge page runtime library and allocate n huge pages.
hugebss
(x86-64 only) Link in the huge page runtime library and allocate the BSS section (containing uninitialized static symbols) in huge pages. This requires that the huge page runtime library be linked dynamically, so the -rpath option for that directory will be added regardless of the setting of -Mnorpath.
nohuge
Override any previous -Msmartalloc=huge or -Msmartalloc=hugebss switches; do not link in the huge page runtime library.
-Mstride0 -Mnostride0 (default)
Generate (don't generate) alternate code for a loop that contains an induction variable whose increment may be zero.
-Munroll[=option[,option...]] -Mnounroll (default)
Invoke (don't invoke) the loop unroller. This also sets the optimization level to a minimum of 2; see -O. The option is one of the following:
c:m
Instructs the compiler to completely unroll loops with a constant loop count less than or equal to m, a supplied constant. If this value is not supplied, the m count is set to 4. If m is set to 1, a compiler heuristic determines the maximum loop count at which such loops will be completely unrolled.
n:u
Instructs the compiler to unroll u times any single-block loop that is not completely unrolled or that has a non-constant loop count. If u is not supplied, the unroller computes the number of times a candidate loop is unrolled.
m:u
Instructs the compiler to unroll u times any multi-block loop that is not completely unrolled or that has a non-constant loop count. If u is not supplied, the unroller computes the number of times a candidate loop is unrolled.

-Mnounroll instructs the compiler not to unroll loops.

-Munsafe_par_align -Mnounsafe_par_align
Use (don't use) aligned moves for array loads in parallelized loops as long as the first element of the array is aligned; this is only effective with -Mvect=simd. It is unsafe because there are situations where the array elements allocated to some processors are not aligned.
-Mvect [=option[,option,...]] -Mnovect (default)
Pass options to the internal vectorizer. This also sets the optimization level to a minimum of 2, the equivalent of -O; for more information see optimization levels under -O. If no option list is specified, then the following vector optimizations are used: assoc,cachesize:c,nosimd, where c is the actual cache size of the machine. The -Mvect options are:
altcode (default) noaltcode
Enable (disable) alternate code generation for vector loops, depending on such characteristics as array alignments and loop counts.
fuse nofuse (default)
Enable (disable) loop fusion to combine adjacent loops into a single loop.
prefetch
Use prefetch instructions in loops where profitable.
simd[:128|256] nosimd (default)
Use vector SIMD instructions (SSE, AVX). The argument may be used to limit usage to 128-bit SIMD instructions. Specifying 256-bit SIMD instructions is only possible for target processors that support AVX.
uniform nouniform (default)
Perform the same optimizations in the vectorized and residual loops. This may affect the performance of the residual loop.
These options are also supported, but are not recommended for use in new development, except by experienced users, and may be phased out in future releases:
assoc (default) noassoc
Enable (disable) certain associativity conversions that can change the results of a computation due to floating point roundoff error differences. A typical optimization is to change the order of additions, which is mathematically correct, but can be computationally different, due to roundoff error.
cachesize:number (default=automatic)
Instructs the vectorizer, when performing cache tiling optimizations, to assume a cache size of number.
gather (default) nogather
Enable (disable) vectorization of loops with indirect array references.
idiom noidiom (default)
Enable idiom recognition; this currently has no effect.
levels:n
Set maximum nest level of loops to optimize.
partial
Enable partial loop vectorization via innermost loop distribution.
short noshort (default)
Enable (disable) recognition of short vector operations that arise from scalar code outside of loops or within the body of loops.
sizelimit[:number] nosizelimit (default)
Limit the size of loops that are vectorized; the default is to attempt to vectorize all loops.
sse nosse (default)
Use (don't use) SSE, SSE2, 3Dnow, and prefetch instructions in loops where possible. The sse option is now deprecated, and the simd option should be used instead.
tile notile (default)
Enable (disable) loop tiling to optimize for cache locality.

-Mnovect disables the vectorizer, and is the default.
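
The warning attached to the assoc suboption of -Mvect, that reordering additions is mathematically valid but can change the computed result, can be seen with plain double-precision arithmetic. The Python sketch below only demonstrates the roundoff effect; it says nothing about which reorderings the vectorizer actually performs, and the names are hypothetical.

```python
def sum_in_order(values):
    """Add the values strictly left to right, as written in the source."""
    total = 0.0
    for value in values:
        total += value
    return total


BIG = 2.0 ** 53  # above this magnitude, adding 1.0 to a double is lost to rounding

original_order = [BIG, 1.0, -BIG, 1.0]
reassociated_order = [BIG, -BIG, 1.0, 1.0]  # same terms, different order
```

Summing original_order gives 1.0, because the first 1.0 is absorbed by BIG before BIG is cancelled; summing reassociated_order gives 2.0, the mathematically exact answer.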

-Mzerotrip (default) -Mnozerotrip
Include (don't include) a zero-trip test for loops. Use -Mnozerotrip only when all loops are known to execute at least once.
-mp[=option]
Interpret OpenMP directives to explicitly parallelize regions of code for execution by multiple threads on a multi-processor system. Most OpenMP directives as well as the SGI parallelization directives are supported. See Chapters 5 and 6 of the PGI User's Guide for more information on these directives. The options allowed are:
align noalign (default)
Modify (don't modify) default loop iteration scheduling to align iterations with array references. The default is to use simple static scheduling.
allcores
Use all available cores when the environment variables OMP_NUM_THREADS and NCPUS are not set. This must be specified at link time.
bind
Bind threads to cores or processors. This must be specified at link time.
numa nonuma
Use (don't use) libraries to give affinity between threads and processors; this is useful with NUMA (non-uniform memory access) parallel architectures, so memory allocated by a particular thread will be allocated close to that processor, and will remain close to that thread. The default depends on the host machine.
-O[level]
Set the optimization level. If -O is not specified, then the default level is 1 if -g is not specified, and 0 if -g is specified. If a number is not supplied with -O then the optimization level is set to 2. The optimization levels and their meanings are as follows:
-O0
Sets the optimization level to 0. A basic block is generated for each statement. No scheduling is done between statements. No global optimizations are performed.
-O1
Sets the optimization level to 1. Scheduling within extended basic blocks is performed. No global optimizations are performed.
-O
Sets the optimization level to 2, with no SIMD vectorization enabled. All level 1 optimizations are performed. In addition, traditional scalar optimizations such as induction recognition and loop invariant motion are performed by the global optimizer.
-O2
All -O optimizations are performed. In addition, more advanced optimizations such as SIMD code generation, cache alignment and partial redundancy elimination are enabled.
-O3
All -O1 and -O2 optimizations are performed. In addition, this level enables more aggressive code hoisting and scalar replacement optimizations that may or may not be profitable.
-O4
All -O1, -O2, and -O3 optimizations are performed. In addition, hoisting of guarded invariant floating point expressions is enabled.
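The default-level rules for -O and -g can be summarized as a small decision function. This is a sketch in Python, not the driver's actual logic: the function name is hypothetical, it looks only at -O and -g, and it assumes that the last -O flag on the command line wins.

```python
def effective_opt_level(args):
    """Return the optimization level implied by a list of command-line flags.

    Models only the -O and -g rules described above. It does not model the
    SIMD difference between plain -O and -O2.
    """
    level = None
    for arg in args:
        if arg == "-O":
            level = 2                 # -O with no number means level 2
        elif arg.startswith("-O") and arg[2:].isdigit():
            level = int(arg[2:])      # -O0 through -O4
    if level is None:
        # No -O on the command line: -g selects 0, otherwise 1.
        level = 0 if "-g" in args else 1
    return level

print(effective_opt_level(["-g"]), effective_opt_level(["-g", "-O3"]))
```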
-pg
(Linux only) Enable gprof-style sample-based profiling; implies -Mframe.

 

Debugging Options

-C
Add array bounds checking; the same as -Mbounds.
-g
Generate symbolic debug information. This also sets the optimization level to zero, unless a -O switch is present on the command line. Symbolic debugging may give confusing results if an optimization level other than zero is selected. With -O0, the generated code will be slower than code generated at other optimization levels.
-gopt
Generate symbolic debug information, without affecting optimizations. This may give confusing results when debugging with optimizations; it is intended for use with other tools that use the debug information.
-Mbounds -Mnobounds (default)
Add (don't add) array bound checking.
-Mchkfpstk
Check for internal consistency of the IA-32 floating point stack in the prologue of a function and after returning from a function or subroutine call. If the PGI_CONTINUE environment variable is set, the stack will be automatically cleaned up and execution will continue. There is a performance penalty associated with the stack cleanup. If PGI_CONTINUE is set to verbose, the stack will be automatically cleaned up and execution will continue after a warning message is printed.
-Mchkptr
Check for unintended de-referencing of NULL pointers.
-Mchkstk
Check the stack for available space upon entry to and before the start of a parallel region. Useful when many private variables are declared.
-Mcoff
Generate a COFF formatted object.
-Mdwarf1
(IA-32 only) Generate DWARF1 debug information with -g.
-Mdwarf2
Generate DWARF2 debug information with -g.
-Mdwarf3
Generate DWARF3 debug information with -g.
-Melf
Generate an ELF formatted object.
-Mnodwarf
Don't add the default dwarf information.
-Mpgicoff -Mnopgicoff
Generate additional symbolic debug information.
-traceback (default) -notraceback
Add (don't add) debug information for runtime traceback.

 

Preprocessor Options

-Dname[=def]
Define name to be def in the preprocessor. If def is missing, it is assumed to be empty. If the = sign is missing, then name is defined to be the string 1.
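The three forms of -D can be told apart as follows. This is a sketch in Python of the rule stated above, not the preprocessor's own parser; the function name is hypothetical.

```python
def parse_define(flag):
    """Split a -Dname[=def] flag into (name, definition)."""
    if not flag.startswith("-D") or len(flag) == 2:
        raise ValueError("expected -Dname[=def]")
    body = flag[2:]
    if "=" not in body:
        return body, "1"          # no '=' sign: name is defined as the string 1
    name, definition = body.split("=", 1)
    return name, definition       # '-DNAME=' yields an empty definition

print(parse_define("-DDEBUG"), parse_define("-DN="), parse_define("-DN=10"))
```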
-E
Preprocess each source file and send the result to standard output. No compilation, assembly, or linking is performed.
-F
Stop after preprocessing.
-Idirectory
Add directory to the compiler's search path for include files. For include files surrounded by < >, each -I directory is searched followed by the standard area. For include files surrounded by " ", the directory containing the file containing the #include directive is searched, followed by the -I directories, followed by the standard area.
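The search order described for -I can be written out explicitly. The following Python sketch only lists directories in the order given above; the function and parameter names are hypothetical, and no files are actually opened.

```python
def include_search_order(quoted, including_dir, i_dirs, standard_dir):
    """List the directories searched for an include file, in order.

    quoted        -- True for #include "file", False for #include <file>
    including_dir -- directory of the file that contains the #include
    i_dirs        -- directories given with -I, in command-line order
    standard_dir  -- the standard include area
    """
    order = []
    if quoted:
        order.append(including_dir)   # searched first for " " includes only
    order.extend(i_dirs)
    order.append(standard_dir)
    return order

print(include_search_order(True, "src", ["inc1", "inc2"], "std"))
```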
-Mcpp=[[no]comment|m|md|mm|mmd|mq:target|mt:target|suffix:suff]
Only runs the preprocessor on the input file(s); by default, the output is written to file.i, unless renamed with the -o switch. The options are:
comment nocomment
Keep (don't keep) C-style comments in the preprocessed output.
include:file
Include this file before processing the source file.
m
Print makefile dependencies to stdout, a la -M.
md
Print makefile dependencies to file.d, a la -MD.
mm
Print makefile dependencies to stdout, ignoring system includes (includes with angle brackets), a la -MM.
mmd
Print makefile dependencies to file.d, ignoring system includes (includes with angle brackets), a la -MMD.
mq:'target'
Print makefile dependencies to stdout, a la -MQ.
mt:target
Print makefile dependencies to stdout, a la -MT.
line
Include line numbers in the preprocessed output.
suffix:suff
When generating makefile dependencies, name the dependent file file.suff; the default is to name the dependent file file.o.
-Mnostddef
Do not predefine any macros to the preprocessor.
-Mnostdinc
Do not search in the standard location for include files when those files are not found elsewhere.
-Mpreprocess
Run the preprocessor on Fortran or assembler source files. By default, the preprocessor is run when the source's suffix is .fpp, .F, .F90, .F95, or .HPF.
-Uname
Remove the definition of the name macro in the preprocessor.
-YI,directory
Change the standard include directory to directory.
-Yp,directory
Look in directory for the preprocessor executable.

 

Assembler Options

-Wa,option[,option...]
Pass each comma-delimited option to the assembler.
-Ya,directory
Look in directory for the assembler executable.

 

Linker Options

-acclibs
Link-time option to add the accelerator libraries to the link line.
--as-needed --no-as-needed
(Linux only; not supported by all linkers) Passed to the linker. Instructs the linker to only set the DT_NEEDED flag for subsequent shared libraries, requiring those libraries at run time, if they are used to satisfy references. --no-as-needed restores the default behavior.
-Bdynamic
(Linux only) Passed to the linker to specify dynamic binding.
-Bstatic
(Linux only) Passed to the linker to specify static binding.
-Bstatic_pgi
(Linux only) Statically link in the PGI libraries, while using dynamic linking for the system libraries; implies -Mnorpath.
-cudalibs
Link-time option to add the CUDA runtime API library.
-g77libs
(Linux only) Link-time option which allows object files generated by GNU g77 (or gcc) to be linked in to pgfortran main programs.
-Ldirectory
Passed to the linker; add directory to the list of directories in which the linker searches for libraries.
-llibrary
Passed to the linker; load the library liblibrary.a from the standard library directory. See also the -L option.
-m
Cause the linker to display a link map.
-Mcudalib[=libname[,libname...]]
Add the named CUDA libraries to the link line. -Mcudalib will use the version of the library appropriate to the CUDA version being used. The libraries recognized are:
cublas
cufft
curand
cusparse
-Meh_frame -Mnoeh_frame
Add (don't add) arguments to the link line to preserve the stack frame information for zero-cost exception handling frames. The default is -Mnoeh_frame unless changed in a site or user rcfile.
-Mlfs
(32-bit Linux only) Link in the Large File Support routines available on Linux versions later than Red Hat 7.0 or SuSE 7.1. This will support files from Fortran I/O that are larger than 2GB. Equivalent to -L$PGI/linux86/16.5/liblf.
-Mmpi=option
(PGI CDK only) -Mmpi adds the include and library options to the compile and link commands necessary to build an MPI application using MPI libraries installed with the PGI Cluster Development Kit (CDK). -Mmpi inserts -I$MPIDIR/include into the compile line, and -L$MPIDIR/lib -lfmpich -lmpich into the link line. The specified option is used to determine whether to select MPICH-1 or MPICH-2 headers and libraries. The base directories for MPICH-1 and MPICH-2 are set in localrc. The -Mmpi options are:
mpich
Use the MPICH v3.0 libraries; if MPIDIR is set, the MPI libraries in that directory are used.
mpich1
Use the MPICH-1 libraries. Deprecated; requires that MPIDIR be set to the MPICH v1 directory.
mpich2
Use the MPICH-2 libraries. Deprecated; requires that MPIDIR be set to the MPICH v2 directory.
mvapich1
Use the MVAPICH libraries. Deprecated; requires that MPIDIR be set to the MVAPICH directory.
sgimpi
Use the SGI MPI libraries.

The user can set the environment variables MPIDIR and MPILIBNAME to override the default values for the MPI directory and library name.

-Mnostartup
Do not link in the usual startup routine. This routine contains the entry point for the program.
-Mnostdlib
Do not link in the standard libraries when linking a program.
-Mrpath (default) -Mnorpath
The default is to add -rpath to the link line giving the directories containing the PGI shared objects. Use -Mnorpath to instruct the driver not to add any -rpath switches to the link line.
-Mscalapack
(PGI CDK only) Add the Scalapack libraries.
-pgc++libs
Link-time option to add the C++ runtime libraries, allowing mixed-language programming.
-pgcpplibs
Link-time option to add the C++ runtime libraries, allowing mixed-language programming.
-pgf77libs
Link-time option to add the pgf77 runtime libraries, allowing mixed-language programming.
-pgf90libs
Link-time option to add the pgf90 runtime libraries, allowing mixed-language programming.
-Rdirectory
Passed to the linker; instructs the linker to hard-code the pathname directory into the search path for generated shared object files. Note that there cannot be a space between R and directory.
-r
Passed to the linker; generate a re-linkable object file.
-rpath directory
Passed to the linker to add the directory to the runtime shared library search path.
-s
Passed to the linker; strip symbol table information.
-shared
(Linux only) Passed to the linker. Instructs the linker to generate a shared object file (dynamically linked library). Implies -fpic.
-soname name
(Linux only) Passed to the linker. When creating a shared object, instructs the linker to set the internal DT_SONAME field to the specified name.
-uname
Passed to the linker; generate undefined reference.
--whole-archive --no-whole-archive
(Linux only) Passed to the linker. Instructs the linker to include all objects in subsequent archive files. --no-whole-archive restores the default behavior.
-Wl,option[,option...]
Pass each comma-delimited option to the linker.
-YC,directory
Look in directory for the standard compiler library files.
-Yl,directory
Look in directory for the linker.
-YL,directory
Change the standard library directory to directory.
-YS,directory
Look in directory for the standard system startup object files.
-YU,directory
Passed to the linker; change library search path.

 

Language Options

-asmsuffix=suffix
Define that a file with the given suffix is an assembly language file.
-byteswapio
Swap bytes from big-endian to little-endian or vice versa on input/output of unformatted Fortran data. Use of this option enables reading/writing of Fortran unformatted data files compatible with those produced on Sun or SGI systems.
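What byte swapping means for a single data item can be sketched with Python's struct module. This is an illustration only: it handles one 4-byte integer and does not model the record structure of a Fortran unformatted file.

```python
import struct

# A 4-byte integer as a big-endian system would write it.
raw = struct.pack(">i", 1)

# Read on a little-endian host without swapping: the bytes are misinterpreted.
unswapped = struct.unpack("<i", raw)[0]

# Reverse the byte order first, as a byte-swapping read would for this item.
swapped = struct.unpack("<i", raw[::-1])[0]

print(raw.hex(), unswapped, swapped)
```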
-csuffix=suffix
Define that a file with the given suffix is a C source file.
-fsuffix=suffix
Define that a file with the given suffix is a Fortran source file.
-FSUFFIX=suffix
Define that a file with the given suffix is a Fortran source file.
-i2
Treat INTEGER variables as two bytes.
-i4
Treat INTEGER variables as four bytes.
-i8
Treat default INTEGER and LOGICAL variables as eight bytes. For operations involving integers, use 64-bits for computations.
-i8storage
Allocates 8 bytes for INTEGER and LOGICAL.
-Mallocatable[=95|03]
Select whether to use Fortran 1995 or Fortran 2003 semantics for assignments to allocatable objects and allocatable components of derived types. Fortran 1995 semantics require the user to allocate the object or component and that an array object or component be conformant before the assignment. Fortran 2003 semantics require the compiler to add code to check whether the object or component is allocated and whether an array object is conformant before the assignment, and to allocate or reallocate if not.
-Mbackslash -Mnobackslash (default)
Treat (don't treat) backslash as a normal (non-escape) character in strings. -Mnobackslash causes the standard C backslash escape sequences to be recognized in quoted strings; -Mbackslash causes the backslash to be treated like any other character.
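The difference between the two settings is the same as the difference between ordinary and raw string literals in Python, which is used here only as an analogy; the Fortran compiler itself is not involved in this sketch.

```python
# Ordinary literals process backslash escapes (like -Mnobackslash);
# raw literals keep the backslash as data (like -Mbackslash).
escaped = "a\nb"     # three characters: 'a', newline, 'b'
literal = r"a\nb"    # four characters: 'a', backslash, 'n', 'b'

print(len(escaped), len(literal))
```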
-Mbyteswapio
Swap bytes from big-endian to little-endian or vice versa on input/output of unformatted Fortran data. Use of this option enables reading/writing of Fortran unformatted data files compatible with those produced on Sun or SGI systems.
-Mcray=pointer
Force Cray Fortran (CF77) compatibility with respect to the listed options. Possible options include:
pointer
For purposes of optimization, assume that pointer-based variables do not overlap the storage of any other variable.
-Mcuda[=option[,option...]]
Enable CUDA Fortran extensions, and link with the CUDA Fortran libraries. -Mcuda is required on the link line if there are no CUDA Fortran source files specified on the command line. The options are:
emu
Enable emulation mode; in emulation mode, all code is executed on the host processor, allowing host-level debugging.
cc20 cc30 cc35 cc50
Generate code for a device with compute capability 2.0, 3.0, 3.5 or 5.0.
fermi kepler maxwell
Generate code for a Fermi (compute capability 2.0), Kepler (compute capability 3.x) or Maxwell (compute capability 5.x) device.
cuda7.0 (default) cuda7.5
Use the CUDA 7.0 (default) or 7.5 toolkit to build the GPU code.
7.0 7.5
Aliases for -Mcuda=cuda7.0 and -Mcuda=cuda7.5.
fastmath
Use the faster (but lower precision) versions of math library routines.
flushz noflushz (default)
Enable (disable) flush-to-zero mode on the GPU.
fma nofma
Generate (do not generate) fused multiply-add operations. This is enabled by default at optimization level -O3.
keepbin
Keep the generated CUDA binary files, with a .bin suffix.
keepgpu
Keep the generated CUDA GPU source files, with a .gpu suffix.
keepptx
Keep the generated portable assembly files, with a .ptx suffix.
lineinfo nolineinfo (default)
Generate debugging line information.
loadcache:[L1|L2]
Generate code to cache global memory loads in the L1 or L2 hardware cache.
madconst
Generate code so that module array descriptors are placed in CUDA constant memory. The array descriptor holds the bounds for allocatable arrays and array pointers. Putting these in CUDA constant memory makes accesses much faster, but prevents any modifications from device code.
maxregcount:n
Set the maximum number of registers to use in the generated GPU code.
ptxinfo
Print the resource usage for each kernel routine from the PTX assembler.
rdc (default) nordc
Generate (do not generate) relocatable device code for separate compilation, and invoke the device linker before the host linker at the link step.
unroll nounroll
Automatically unroll (do not unroll) inner loops. This is enabled by default at optimization level -O3.
Note that multiple compute capabilities can be specified, and one version will be generated for each capability specified.
-Mdalign (default) -Mnodalign
Align (don't align) doubles in structures on 8-byte boundaries. -Mnodalign may lead to data alignment exceptions.
-Mdclchk -Mnodclchk (default)
Require (don't require) that all variables be declared.
-Mdefaultunit -Mnodefaultunit (default)
Treat (don't treat) '*' as stdout/stdin regardless of the status of units 6/5. -Mnodefaultunit causes * to be a synonym for 5 on input and 6 on output; -Mdefaultunit causes * to be a synonym for stdin on input and stdout on output.
-Mdlines -Mnodlines (default)
Treat (don't treat) lines beginning with D in column 1 as executable statements, ignoring the D.
-Mdollar=char
Set the character used to replace dollar signs in names to be char. Default is an underscore (_).
-Mextend
Allow 132-column source lines.
-Mfixed
Process Fortran source using fixed form specifications. The -Mfree options specify free form formatting. By default files with a .f or .F extension use fixed form formatting.
-Mfree -Mfreeform -Mnofree -Mnofreeform
Process Fortran source using free form specifications. The -Mnofree and -Mfixed options specify fixed form formatting. By default files with a .f90, .F90, .f95 or .F95 extension use freeform formatting.
-Mi4 (default) -Mnoi4
Treat (don't treat) INTEGER as INTEGER*4. -Mnoi4 treats INTEGER as INTEGER*2.
-Miomutex -Mnoiomutex (default)
Generate (don't generate) critical section calls around Fortran I/O statements.
-Mlibsuffix=suffix
Define that a file with the given suffix is an object library file.
-Mllalign -Mnollalign (default)
Align (don't align) long longs or INTEGER*8 in structures or common blocks on 8-byte boundaries. -Mnollalign is the default, and this is a change beginning with release 4.0. Releases prior to 4.0 aligned long longs on 8-byte boundaries.
-Mnomain
When the link step is called, don't include the object file which calls the Fortran main program. Useful for using the pgfortran driver to link programs with the main program written in C or C++ and one or more subroutines written in Fortran.
-Mobjsuffix=suffix
Define that a file with the given suffix is a binary object file.
-Monetrip -Mnoonetrip (default)
Force (don't force) each DO loop to be iterated at least once.
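The effect of -Monetrip on the number of iterations can be sketched as follows. The Python function below is hypothetical and applies the standard Fortran iteration-count rule for integer DO loops; it is not taken from the compiler.

```python
def trip_count(start, end, step=1, onetrip=False):
    """Number of iterations of DO i = start, end, step for integer bounds."""
    if step == 0:
        raise ValueError("step must be nonzero")
    # Standard Fortran rule: MAX((end - start + step) / step, 0). Floor division
    # is safe here because any negative quotient is clamped to zero anyway.
    count = max((end - start + step) // step, 0)
    if onetrip:
        count = max(count, 1)     # -Monetrip: the body runs at least once
    return count

print(trip_count(5, 1), trip_count(5, 1, onetrip=True))
```

A loop such as DO i = 5, 1 normally executes zero times; with -Monetrip it executes once.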
-Mr8 -Mnor8 (default)
Treat (don't treat) REAL as DOUBLE PRECISION and real constants as double precision constants.
-Mr8intrinsics[=float] -Mnor8intrinsics (default)
Treat (don't treat) the intrinsics CMPLX as DCMPLX and REAL as DBLE.
float
Also treat the FLOAT intrinsic as DBLE.
-Mrecursive -Mnorecursive (default)
Allocate (don't allocate) local variables on the stack, thus allowing recursion. SAVEd, data-initialized, or namelist members are always allocated statically, regardless of the setting of this switch.
-Mref_externals -Mnoref_externals (default)
Force (don't force) references to names appearing in EXTERNAL statements.
-Msave -Mnosave (default)
Assume (don't assume) that all local variables are subject to the SAVE statement. -Msave may allow many older Fortran programs to run but can greatly reduce performance.
-Msignextend (default) -Mnosignextend
Sign extend (don't sign extend) when a narrowing conversion overflows. For example, when -Msignextend is in effect and an integer containing the value 65535 is converted to a short, the value of the short will be -1. ANSI C specifies that the result of such conversions is undefined.
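The 65535 example can be reproduced with integer arithmetic. The Python function below is a hypothetical sketch of a sign-extending conversion to a 16-bit short, not the code the compiler generates.

```python
def to_int16_signextend(value):
    """Keep the low 16 bits of value and interpret them as a signed short."""
    low = value & 0xFFFF
    return low - 0x10000 if low & 0x8000 else low

print(to_int16_signextend(65535))
```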
-Mstack_arrays -Mnostack_arrays (default)
Allocate automatic arrays on the stack (on the heap).
-Mstandard
Flag non-ANSI-Fortran usage.
-Munixlogical -Mnounixlogical (default)
When -Munixlogical is in effect, a logical is considered to be .TRUE. if its value is non-zero and .FALSE. otherwise. When -Mnounixlogical is in effect (the default), a logical is considered to be .TRUE. if its value is odd and .FALSE. if its value is even.
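The two interpretations differ for even non-zero values. The Python sketch below states both rules directly; the function names are hypothetical.

```python
def is_true_unixlogical(value):
    """-Munixlogical: any non-zero value is .TRUE."""
    return value != 0

def is_true_default(value):
    """-Mnounixlogical (default): odd values are .TRUE., even are .FALSE."""
    return value % 2 == 1

# The value 2 is .TRUE. under -Munixlogical but .FALSE. by default.
print(is_true_unixlogical(2), is_true_default(2))
```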
-Mupcase -Mnoupcase (default)
Preserve (don't preserve) case in names. -Mnoupcase causes all names to be converted to lower case. Note that, if -Mupcase is used, then variable name 'X' is different from variable name 'x', and keywords must be in lower case.
-module directory
Save/search for module files in directory.
-r4
Interpret DOUBLE PRECISION variables as REAL.
-r8
Interpret REAL variables as DOUBLE PRECISION. Equivalent to using the options -Mr8 and -Mr8intrinsics.
-Wh,option[,option...]
Pass each comma-delimited option to the Fortran 90/95 front end.

 

Target-specific Options

-acc
Enable OpenACC pragmas and directives to explicitly parallelize regions of code for execution by accelerator devices. See the -ta flag to select target accelerators for which to compile. The options are:
autopar (default) noautopar
Enable loop autoparallelization within parallel constructs.
routineseq noroutineseq (default)
Compile every routine for the device, as if it had a routine seq directive.
sync
Ignore async clauses, and run every data transfer and kernel launch on the default sync queue.
wait nowait (default)
Wait for each compute kernel to finish.
-Kieee -Knoieee (default)
Perform floating-point operations in strict conformance with the IEEE 754 standard. Some optimizations are disabled with -Kieee, and a more accurate math library is used. The default -Knoieee uses faster but very slightly less accurate methods.
-Ktrap=[option,[option]...]
Controls the behavior of the processor when exceptions occur. Possible options include
align
Trap on memory alignment errors, currently ignored.
denorm
Trap on denormalized operands.
divz
Trap on divide by zero.
fp
Trap on floating point exceptions.
inexact
Trap on inexact result.
inv
Trap on invalid operation.
none (default)
Disable all traps.
ovf
Trap on floating point overflow.
unf
Trap on floating point underflow.
-Ktrap is only processed when compiling a main function/program. -Ktrap=fp is equivalent to -Ktrap=divz,inv,ovf. These options correspond to the processor's exception mask bits. Normally, the processor's exception mask bits are on, meaning floating-point exceptions are masked; the processor recovers from the exception and continues. If a mask bit is off (unmasked) and the corresponding exception occurs, execution terminates with a floating-point exception (Linux FPE signal).
-Mdaz -Mnodaz
Enable (disable) mode to treat denormalized floating point numbers as zero. -Mdaz is default for -tp p7 -m64 targets; -Mnodaz is default otherwise.
-Mflushz -Mnoflushz
Set floating point operations to flush-to-zero mode; -Mflushz is set at optimization level -O2 and higher.
-Mfpapprox [=option[,option,...]] -Mnofpapprox (default)
Perform (don't perform) certain single-precision floating point operations using low-precision approximation. This can be very dangerous; the low-precision approximations are much faster than the full precision computation, but the results will be different. This option should be used only with the utmost care. The options are
div
Approximate single precision floating point division.
rsqrt
Approximate single precision floating point reciprocal square root.
sqrt
Approximate single precision floating point square root.
With no options, -Mfpapprox will approximate all three operations.
-Mfpmisalign -Mnofpmisalign
Allow (don't allow) vector arithmetic instructions with memory operands that are not aligned on 16-byte boundaries.
-Mfprelaxed [=option[,option,...]] -Mnofprelaxed (default)
Perform (don't perform) certain floating point operations using relaxed precision when it improves speed. The options are
div
Perform divide using relaxed precision.
intrinsic
Perform certain intrinsic functions using relaxed precision.
order noorder
Allow (don't allow) expression reordering, including factoring such as computing a*b+a*c as a*(b+c).
recip
Perform reciprocal operations using relaxed precision.
rsqrt
Perform reciprocal square root (1/sqrt) using relaxed precision.
sqrt
Perform square root using relaxed precision.
With no options, -Mfprelaxed will choose to generate relaxed precision code for those operations that generate a significant performance improvement, depending on the target processor.
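The kind of reordering that the order suboption allows can change a result, as the following Python sketch suggests. It uses double-precision values chosen for illustration and says nothing about which expressions the compiler actually rewrites.

```python
a, b, c = 3.0, 1.0e16, 1.0

distributed = a * b + a * c   # as written in the source
factored = a * (b + c)        # after factoring; b + c rounds back to b

print(distributed, factored)
```

Here the two forms differ by 4, one unit in the last place at this magnitude.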
-Mfunc32 (default) -Mnofunc32
Align (don't align) functions on 32 byte boundaries.
-Mlarge_arrays -Mnolarge_arrays (default)
(linux86-64 only). Allow (don't allow) arrays larger than 2GB; -Mlarge_arrays is default with -mcmodel=medium.
-Mlongbranch -Mnolongbranch (default)
Enable (disable) long branches.
-Mloop32 -Mnoloop32 (default)
Align (don't align) innermost loops on 32 byte boundaries for -tp barcelona.
-Msecond_underscore -Mnosecond_underscore (default)
Add (don't add) a second underscore to the name of a Fortran global if its name already contains an underscore. This option is useful for maintaining compatibility with g77, which adds a second underscore to such symbols by default.
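The naming rule can be sketched as follows. The Python function is hypothetical, and it assumes the common convention of lower-casing the name and appending one underscore, which this manual page does not state.

```python
def global_symbol(name, second_underscore=False):
    """Sketch of the external symbol for a Fortran global name.

    Assumes one trailing underscore is always appended (an assumption, see
    the text above); -Msecond_underscore adds another when the name itself
    contains an underscore.
    """
    symbol = name.lower() + "_"
    if second_underscore and "_" in name:
        symbol += "_"             # g77-compatible form
    return symbol

print(global_symbol("my_sub"), global_symbol("my_sub", second_underscore=True))
```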
-Mvarargs -Mnovarargs (default)
(x86-64 only) Generate code for calls made from Fortran to C routines to use varargs calling sequence.
-Mwritable-strings
Store string constants in the writable data segment.
-m32
Compile for 32-bit target.
-m64
Compile for 64-bit target.
-mcmodel=small|medium
(AMD64 and Intel 64 only) Use the memory model that limits objects to less than 2GB (small) or allows data sections to be larger than 2GB (medium). -mcmodel=medium implies -Mlarge_arrays.
-pc=val
The IA-32 architecture implements a floating-point stack using 8 80-bit registers. Each register uses bits 0-63 as the significand, bits 64-78 for the exponent, and bit 79 is the sign bit. This 80-bit real format is the default format (called the extended format). When values are loaded into the floating point stack they are automatically converted into extended real format. The precision of the floating point stack can be controlled, however, by setting the precision control bits (bits 8 and 9) of the floating control word appropriately. In this way, the programmer can explicitly set the precision to standard IEEE double using 64 bits, or to single precision using 32 bits. The default precision setting is system dependent. If you use -pc to alter the precision setting for a routine, the main program must be compiled with the same value for -pc. The command line option -pc val lets the programmer set the compiler's precision preference. Valid values for val are:

    32 single precision

    64 double precision

    80 extended precision
Operations performed exclusively on the floating point stack using extended precision, without storing into or loading from memory, can cause problems with accumulated values within the extra 16 bits of extended precision values. This can lead to answers, when rounded, that do not match expected results.
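The 80-bit register layout described for -pc can be decoded field by field. The Python sketch below is hypothetical and makes two assumptions that come from the x87 format rather than from this manual page: an exponent bias of 16383 and an explicit integer bit in bit 63 of the significand. Zero exponents, infinities, and NaNs are not handled.

```python
def decode_extended(bits):
    """Decode an 80-bit extended-format value given as a Python integer."""
    significand = bits & ((1 << 64) - 1)      # bits 0-63
    exponent = (bits >> 64) & 0x7FFF          # bits 64-78
    sign = (bits >> 79) & 1                   # bit 79
    # Assumed: bias 16383, integer bit stored explicitly in bit 63.
    magnitude = (significand / 2**63) * 2.0 ** (exponent - 16383)
    return -magnitude if sign else magnitude

one = (16383 << 64) | (1 << 63)               # encoding of 1.0 under these assumptions
print(decode_extended(one), decode_extended(one | (1 << 79)))
```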
-ta=target
Specify the type of accelerator for which accelerator regions are compiled; accepted values are:
-ta=tesla
Compile the accelerator regions for a CUDA-enabled NVIDIA GPU. Additional suboptions valid after -ta=tesla are:
cc20 cc30 cc35 cc50
Generate code for a device with compute capability 2.0, 3.0, 3.5 or 5.0.
fermi kepler maxwell
Generate code for a Fermi (compute capability 2.0), Kepler (compute capability 3.x) or Maxwell (compute capability 5.x) device.
cuda7.0 (default) cuda7.5
Use the CUDA 7.0 (default) or 7.5 toolkit to build the GPU code.
7.0 7.5
Aliases for cuda7.0 and cuda7.5.
fastmath
Enable the fast math library, which includes faster, but lower precision, implementations of certain math and intrinsic functions.
flushz noflushz (default)
Enable (disable) flush-to-zero mode on the GPU.
fma nofma
Generate (do not generate) fused multiply-add operations. This is enabled by default at optimization level -O3.
keepbin
Keep the generated CUDA binary, with a .bin suffix.
keepgpu
Keep the generated CUDA GPU source files, with a .gpu suffix.
keepptx
Keep the generated portable assembly files, with a .ptx suffix.
lineinfo nolineinfo (default)
Generate debugging line information.
llvm (default) nollvm
Compile using the LLVM device code generator or the CUDA C code generator.
loadcache:[L1|L2]
Generate code to cache global memory loads in the L1 or L2 hardware cache.
maxregcount:n
Set the maximum number of registers to use in the generated GPU code.
managed (Beta feature)
Allocate any dynamically allocated data in CUDA Unified (managed) memory. This option must appear in both the compile and link lines. This may not be used with -ta=tesla:pinned.
pinned
Allocate any dynamically allocated data in CUDA Pinned host memory. This option must appear in both the compile and link lines. This may not be used with -ta=tesla:managed.
rdc (default) nordc
Generate (do not generate) relocatable device code for separate compilation, and invoke the device linker before the host linker at the link step.
unroll nounroll
Automatically unroll (do not unroll) inner loops. This is enabled by default at optimization level -O3.
Note that multiple compute capabilities can be specified, and one version will be generated for each capability specified. The default is equivalent to -ta=tesla:fermi+.
-ta=multicore (beta feature)
Compile the OpenACC compute regions for parallel execution across the cores of the host multicore CPU.
-ta=nvidia
This flag is equivalent to -ta=tesla, and has all the same suboptions.
-ta=radeon
Compile the accelerator regions for an AMD Radeon GPU. Additional suboptions valid after -ta=radeon are:
tahiti
Generate code for AMD Tahiti architecture GPUs.
capeverde
Generate code for AMD Cape Verde architecture GPUs.
spectre
Generate code for AMD Spectre architecture APUs.
buffercount:n
Specify the number of OpenCL buffers to use for the device; the same value must be used on all OpenACC source files to generate useful code. The default value is 3.
keep
Keep the generated OpenCL source files.
Multiple AMD GPU architectures can be specified. The default is -ta=radeon:tahiti.
-ta=host
Compile the accelerator regions to run sequentially on the host processor.

The default in the absence of the -ta flag is to ignore the accelerator directives and compile for the host. Multiple targets are allowed, such as -ta=tesla,host, in which case code is generated for the NVIDIA GPU as well as the host for each accelerator region.

-tp=target
Specify the type of the target processor; possibilities are
-tp=k8
AMD Opteron or Athlon-64
-tp=barcelona
AMD Barcelona processor
-tp=shanghai
AMD Shanghai architecture Opteron processor
-tp=istanbul
AMD Istanbul architecture Opteron processor
-tp=bulldozer
AMD Bulldozer processor
-tp=piledriver
AMD Piledriver architecture Opteron processor
-tp=p7
Intel 64 processor
-tp=core2
Intel core2 processor
-tp=penryn
Intel Penryn architecture Pentium processor
-tp=nehalem
Intel Nehalem architecture Core processor
-tp=sandybridge
Intel SandyBridge architecture Core processor
-tp=haswell
Intel Haswell architecture processor
-tp=px
Blended code generation that will work on any x86-compatible processor
-tp=x64
Equivalent to -tp=k8,p7.

The default in the absence of the -tp flag is to compile for the type of CPU on which the compiler is running. Where available, -tp=target-64 is equivalent to -m64 -tp=target, and -tp=target-32 is equivalent to -m32 -tp=target. When 32- and 64-bit targets are available for a target, -tp=target by itself will compile for a 32-bit or 64-bit target depending on whether the 32-bit or 64-bit compiler is invoked from your command line path.


 

FILES

a.out
executable output file
pgpf.out
Profile feedback data file; see -Mpfi
pgprof.out
PGPROF output file; see -Mprof
file.a
library of object files
file.f
fixed-format Fortran source file
file.F
fixed-format Fortran source file that requires preprocessing
file.f90
free-format Fortran source file
file.F90
free-format Fortran source file that requires preprocessing
file.f95
free-format Fortran source file
file.F95
free-format Fortran source file that requires preprocessing
file.f03
free-format Fortran source file
file.F03
free-format Fortran source file that requires preprocessing
file.for
fixed-format Fortran source file
file.fpp
fixed-format Fortran source file that requires preprocessing
file.cuf
free-format CUDA Fortran source file
file.CUF
free-format CUDA Fortran source file that requires preprocessing
file.ipa
InterProcedural Analyzer (IPA) file
file.ipo
InterProcedural Analyzer (IPA) file
file.o
object file
file.s
assembler source file
.mypgfortranrc
You may add custom switches or make other additions to pgfortran by creating a file named .mypgfortranrc in your home directory.

The installation of this version of the compiler resides in $PGI/target/16.5/; other versions may coexist in $PGI/target/release/. $PGI is an environment variable that points to the root of the compiler installation directory. If $PGI is not set, the default is /usr/pgi. The target is one of the following:

linux86
for 32-bit IA32 Linux targets
linux86-64
for 64-bit AMD64 or Intel 64 Linux targets

The compiler installation subdirectories are:

bin/
compiler and tool executables and configuration (rc) files
include/
compiler include files
lib/
libraries and object files
liblf/
libraries and object files
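The installation layout described above can be assembled from the environment as follows. This Python sketch is illustrative only: the function name is hypothetical, and /opt/pgi in the usage line is an invented example root.

```python
import os

def install_dir(target, release="16.5", environ=None):
    """Return $PGI/target/release, falling back to /usr/pgi when PGI is unset."""
    env = os.environ if environ is None else environ
    root = env.get("PGI", "/usr/pgi")
    return f"{root}/{target}/{release}"

# /opt/pgi is a hypothetical installation root used only for this example.
print(install_dir("linux86-64", environ={"PGI": "/opt/pgi"}))
```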

 

SEE ALSO

pgcc (1), pgCC (1), pgf77 (1), pghpf (1), pgprof (1), pgdbg (1), and the PGI User's Guide.

 

DIAGNOSTICS

The compiler produces information and error messages as it translates the input program. The linker and assembler may issue their own error messages.


 

This document was created by man2html, using the manual pages.
Time: 16:14:12 GMT, July 18, 2016