OpenMP Directives

1. Directives Format

1.1. Fortran Directives Format

Format: (not case sensitive)

sentinel directive-name [clause ...]

All Fortran OpenMP directives must begin with a sentinel. The accepted sentinels depend on the type of Fortran source. Possible sentinels are:

!$OMP
C$OMP
*$OMP

Example:

!$OMP PARALLEL DEFAULT(SHARED) PRIVATE(BETA,PI)

Fixed Form Source:

  • !$OMP, C$OMP and *$OMP are accepted sentinels and must start in column 1.
  • All Fortran fixed form rules for line length, white space, continuation and comment columns apply for the entire directive line.
  • Initial directive lines must have a space/zero in column 6.
  • Continuation lines must have a non-space/zero in column 6.

Free Form Source:

  • !$OMP is the only accepted sentinel. It can appear in any column, but must be preceded by white space only.
  • All Fortran free form rules for line length, white space, continuation and comment columns apply for the entire directive line.
  • Initial directive lines must have a space after the sentinel.
  • Continuation lines must have an ampersand as the last non-blank character in a line. The following line must begin with a sentinel and then the continuation directives.

General Rules:

  • Comments cannot appear on the same line as a directive.
  • Only one directive name may be specified per directive.
  • Fortran compilers which are OpenMP enabled generally include a command line option which instructs the compiler to activate and interpret all OpenMP directives.
  • Several Fortran OpenMP directives come in pairs and have the form shown below. For some constructs, such as DO, the "end" directive is optional but advised for readability; for others, such as PARALLEL, it is required.

!$OMP directive
    [ structured block of code ]
!$OMP end directive

1.2. C / C++ Directives Format

Format:

#pragma omp directive-name [clause, ...] newline

A valid OpenMP directive name must appear after the pragma and before any clauses. Clauses can be placed in any order, and repeated as necessary, unless otherwise restricted. The directive must immediately precede the structured block to which it applies.

Example:

#pragma omp parallel default(shared) private(beta,pi)

General Rules:

* Case sensitive

* Directives follow conventions of the C/C++ standards for compiler directives.

* Only one directive-name may be specified per directive.

* Each directive applies to at most one succeeding statement, which must be a structured block.

* Long directive lines can be "continued" on succeeding lines by escaping the newline character with a backslash ("\") at the end of a directive line.
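
The rules above can be sketched with a short C function. This is only an illustration: the function name array_sum and the choice of a reduction loop are assumptions made for the example, not part of the original text. The directive is written in lower case, is continued with a trailing backslash, and applies to the single for statement that follows it.

```c
/* Sum the first n elements of a.
   The directive spans two lines (note the trailing backslash) and
   applies only to the for loop that immediately follows it.
   If the compiler is not OpenMP-enabled, the pragma is ignored and
   the loop runs serially with the same result. */
double array_sum(const double *a, int n)
{
    double sum = 0.0;
    int i;

#pragma omp parallel for default(shared) private(i) \
        reduction(+:sum)
    for (i = 0; i < n; i++) {
        sum += a[i];
    }
    return sum;
}
```

Calling array_sum on an array holding 1.0, 2.0, 3.0 and 4.0 should give 10.0 whether or not OpenMP is enabled at compile time.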

PARALLEL Region Construct

Purpose: A parallel region is a block of code that will be executed by multiple threads. This is the fundamental OpenMP parallel construct.

Format:

Fortran

!$OMP PARALLEL [clause ...]
               IF (scalar_logical_expression)
               PRIVATE (list)
               SHARED (list)
               DEFAULT (PRIVATE | FIRSTPRIVATE | SHARED | NONE)
               FIRSTPRIVATE (list)
               REDUCTION (operator: list)
               COPYIN (list)
               NUM_THREADS (scalar-integer-expression)

   block

!$OMP END PARALLEL

C/C++

#pragma omp parallel [clause ...] newline
                     if (scalar_expression)
                     private (list)
                     shared (list)
                     default (shared | none)
                     firstprivate (list)
                     reduction (operator: list)
                     copyin (list)
                     num_threads (integer-expression)

   structured_block

Notes:

- When a thread reaches a PARALLEL directive, it creates a team of threads and becomes the master of the team. The master is a member of that team and has thread number 0 within that team.

- Starting from the beginning of this parallel region, the code is duplicated and all threads will execute that code.

- There is an implicit barrier at the end of a parallel region. Only the master thread continues execution past this point.

- If any thread terminates within a parallel region, all threads in the team will terminate, and the work done up until that point is undefined.

How Many Threads?

The number of threads in a parallel region is determined by the following factors, in order of precedence:

1. Evaluation of the IF clause

2. Setting of the NUM_THREADS clause

3. Use of the omp_set_num_threads() library function

4. Setting of the OMP_NUM_THREADS environment variable

5. Implementation default - usually the number of CPUs on a node, though it could be dynamic.

Threads are numbered from 0 (master thread) to N-1.
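
As a rough illustration of the precedence list above, the following C sketch requests one team size through omp_set_num_threads() and a different one through the NUM_THREADS clause, then reports the size the runtime actually granted. The function name team_size_with_clause and its parameters are hypothetical, and the serial fallbacks at the top are an assumption added so the example also builds without OpenMP.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks (assumption: behave like a one-thread runtime). */
static void omp_set_num_threads(int n) { (void)n; }
static int omp_get_num_threads(void) { return 1; }
static int omp_get_thread_num(void) { return 0; }
#endif

/* Ask for `requested` threads with the library routine, then open a
   parallel region whose num_threads clause asks for `clause_value`.
   The clause has higher precedence, so the team has at most
   `clause_value` threads. Returns the team size seen by thread 0. */
int team_size_with_clause(int requested, int clause_value)
{
    int size = 0;

    omp_set_num_threads(requested);

#pragma omp parallel num_threads(clause_value)
    {
        /* Threads are numbered 0..N-1; only thread 0 writes. */
        if (omp_get_thread_num() == 0) {
            size = omp_get_num_threads();
        }
    }
    return size;
}
```

With an OpenMP-enabled build, team_size_with_clause(4, 2) is expected to return at most 2; the runtime may still grant fewer threads than requested.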

Dynamic Threads:

Use the omp_get_dynamic() library function to determine if dynamic threads are enabled. If supported, the two methods available for enabling dynamic threads are:

1. The omp_set_dynamic() library routine;

2. Setting of the OMP_DYNAMIC environment variable to TRUE.
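
A minimal sketch of the library-routine method follows. The helper name set_dynamic_and_query is hypothetical, and the serial fallbacks are an assumption so that the code compiles without OpenMP. Because an implementation is free not to support dynamic threads, the value read back after enabling them may still be zero.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks (assumption: remember the flag like a real runtime). */
static int dynamic_flag = 0;
static void omp_set_dynamic(int flag) { dynamic_flag = (flag != 0); }
static int omp_get_dynamic(void) { return dynamic_flag; }
#endif

/* Enable (non-zero) or disable (zero) dynamic adjustment of the
   number of threads, then return what the runtime reports.
   The setting stays in effect after the call. */
int set_dynamic_and_query(int enable)
{
    omp_set_dynamic(enable);
    return omp_get_dynamic();
}
```

Calling set_dynamic_and_query(0) before a timing run is one way to make sure every parallel region gets the team size that was asked for, subject to the other limits described above.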

Nested Parallel Regions:

Use the omp_get_nested() library function to determine if nested parallel regions are enabled. The two methods available for enabling nested parallel regions (if supported) are:

1. The omp_set_nested() library routine

2. Setting of the OMP_NESTED environment variable to TRUE

If not supported, a parallel region nested within another parallel region results in the creation of a new team, consisting of one thread, by default.
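
The behaviour described above can be observed with a sketch like the one below, which opens a parallel region inside another and reports the size of the inner team. The function name inner_team_size is hypothetical and the serial fallbacks are an assumption for builds without OpenMP. Note that OpenMP 5.0 and later deprecate omp_set_nested() in favour of omp_set_max_active_levels(), so newer compilers may warn about it.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks (assumption: behave like a one-thread runtime). */
static void omp_set_nested(int flag) { (void)flag; }
static int omp_get_num_threads(void) { return 1; }
static int omp_get_thread_num(void) { return 0; }
#endif

/* Open a two-thread region; inside it, outer thread 0 opens a second
   two-thread region. Returns the size of that inner team.
   The nested setting stays in effect after the call. */
int inner_team_size(int enable_nested)
{
    int inner = 0;

    omp_set_nested(enable_nested);

#pragma omp parallel num_threads(2)
    {
        /* Only outer thread 0 opens the nested region. */
        if (omp_get_thread_num() == 0) {
#pragma omp parallel num_threads(2)
            {
                /* Inner thread 0 records the inner team size. */
                if (omp_get_thread_num() == 0) {
                    inner = omp_get_num_threads();
                }
            }
        }
    }
    return inner;
}
```

With nesting disabled and an outer team of two threads, the inner team is expected to consist of one thread; with nesting enabled and supported, it may have two.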

Clauses:

IF clause: If present, it must evaluate to .TRUE. (Fortran) or non-zero (C/C++) in order for a team of threads to be created. Otherwise, the region is executed serially by the master thread.
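
For instance, the IF clause can keep small problems serial so that the cost of starting a team is only paid when there is enough work. The sketch below is illustrative only: the function name team_size_if and the threshold parameter are hypothetical, and the serial fallbacks are an assumption for builds without OpenMP.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks (assumption: behave like a one-thread runtime). */
static int omp_get_num_threads(void) { return 1; }
static int omp_get_thread_num(void) { return 0; }
#endif

/* Open a parallel region only when n exceeds threshold.
   When the IF expression is zero, the region is executed by a team
   of one thread. Returns the team size that was used. */
int team_size_if(int n, int threshold)
{
    int size = 0;

#pragma omp parallel if(n > threshold)
    {
        if (omp_get_thread_num() == 0) {
            size = omp_get_num_threads();
        }
    }
    return size;
}
```

Here team_size_if(10, 1000) returns 1 because the condition is false, while team_size_if(5000, 1000) returns whatever team size the runtime grants.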

Restrictions:

A parallel region must be a structured block that does not span multiple routines or code files. It is illegal to branch into or out of a parallel region. Only a single IF clause is permitted. Only a single NUM_THREADS clause is permitted.

Example: Parallel Region - Simple "Hello World" program

- Every thread executes all code enclosed in the parallel region

- OpenMP library routines are used to obtain thread identifiers and total number of threads

Fortran - Parallel Region Example

      INTEGER NTHREADS, TID, OMP_GET_NUM_THREADS,
     +        OMP_GET_THREAD_NUM

C     Fork a team of threads with each thread having a private TID variable
!$OMP PARALLEL PRIVATE(TID)

C     Obtain and print thread id
      TID = OMP_GET_THREAD_NUM()
      PRINT *, 'Hello World from thread = ', TID

C     Only master thread does this
      IF (TID .EQ. 0) THEN
        NTHREADS = OMP_GET_NUM_THREADS()
        PRINT *, 'Number of threads = ', NTHREADS
      END IF

C     All threads join master thread and disband
!$OMP END PARALLEL

      END

C / C++ - Parallel Region Example

#include <omp.h>
#include <stdio.h>

int main(void) {
    int nthreads, tid;

    /* Fork a team of threads with each thread having a private tid variable */
    #pragma omp parallel private(tid)
    {
        /* Obtain and print thread id */
        tid = omp_get_thread_num();
        printf("Hello World from thread = %d\n", tid);

        /* Only master thread does this */
        if (tid == 0)
        {
            nthreads = omp_get_num_threads();
            printf("Number of threads = %d\n", nthreads);
        }
    }   /* All threads join master thread and terminate */

    return 0;
}

General rules of directives:

- They follow the standards and conventions of the C/C++ or Fortran compilers;

- They are case sensitive in C/C++ (Fortran directives are not);

- In a directive, only one directive name can be specified;

- A directive applies only to the statement following it, which must be a structured block.

- "Long" directives can be continued on the next line by adding a backslash ("\") at the end of the line being continued.

 

2. The OpenMP Directives

PARALLEL region: a block of code that is executed in parallel by a team of threads (OMP_NUM_THREADS threads unless overridden). It is the fundamental construct in OpenMP.

Work-sharing constructs:

DO/for - shares the iterations of a loop among the threads of the team (data parallelism);
SECTIONS - splits the work into separate sections, each executed by one thread (functional parallelism);
SINGLE - serialises a code section, so that only one thread of the team executes it.
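
The three work-sharing constructs can be sketched together in C as follows. The function name worksharing_demo and its parameters are hypothetical and chosen only for this illustration. Without an OpenMP-enabled compiler the pragmas are ignored and the code runs serially with the same results.

```c
/* Demonstrates for, sections and single inside one parallel region.
   squares must have room for n elements. */
void worksharing_demo(int n, long *squares, long *sum, long *sum_sq,
                      int *single_runs)
{
    long s = 0, q = 0;   /* each is written by exactly one section */
    int runs = 0;        /* incremented by the single construct    */
    int i;

#pragma omp parallel default(shared) private(i)
    {
        /* for: loop iterations are divided among the threads. */
#pragma omp for
        for (i = 0; i < n; i++) {
            squares[i] = (long)i * i;
        }
        /* Implicit barrier: squares[] is complete past this point. */

        /* sections: each section is executed by one thread. */
#pragma omp sections
        {
#pragma omp section
            {
                int k;
                for (k = 0; k < n; k++) s += k;
            }
#pragma omp section
            {
                int k;
                for (k = 0; k < n; k++) q += squares[k];
            }
        }

        /* single: only one thread of the team executes this. */
#pragma omp single
        runs += 1;
    }

    *sum = s;
    *sum_sq = q;
    *single_runs = runs;
}
```

For n = 5 this yields squares 0, 1, 4, 9, 16, a sum of 10, a sum of squares of 30, and a single-run count of 1 regardless of the team size.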

Synchronization constructs:

MASTER - only the master thread executes the region of code;
CRITICAL - the region of code is executed by only one thread at a time;
BARRIER - all threads in the team wait until every thread has reached this point;
ATOMIC - a specific memory location is updated atomically, a lightweight form of critical section;
FLUSH - identifies a synchronization point at which memory must present a consistent view to all threads;
ORDERED - the loop iterations enclosed by this directive are executed in the same order as in the corresponding serial execution;
THREADPRIVATE - gives each thread its own copy of a global variable, which persists across parallel regions.
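
As a small illustration of two of these constructs, the sketch below uses ATOMIC for a simple counter update and CRITICAL for a compare-and-update that needs more than one statement. The function name sync_demo and its parameters are hypothetical; the remaining constructs in the list are not shown here.

```c
/* Counts the positive entries of data and finds the largest entry.
   Assumes n >= 1. */
void sync_demo(const int *data, int n, int *count_positive,
               int *largest_out)
{
    int count = 0;
    int largest = data[0];
    int i;

#pragma omp parallel for default(shared) private(i)
    for (i = 0; i < n; i++) {
        if (data[i] > 0) {
            /* atomic: protects a single update of one location. */
#pragma omp atomic
            count++;
        }

        /* critical: one thread at a time runs the whole block,
           so the test and the assignment cannot interleave. */
#pragma omp critical
        {
            if (data[i] > largest) {
                largest = data[i];
            }
        }
    }

    *count_positive = count;
    *largest_out = largest;
}
```

In real code a reduction would usually be a better fit for both results; the explicit constructs are used here only to show how they are written.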

Clauses to set the context:

These clauses control data scoping, which is central to a shared-memory programming model. They are used together with the PARALLEL, DO/for and SECTIONS directives.

PRIVATE - each thread gets its own copy of the variables in the list;
SHARED - the variables in the list are shared by all threads of the current team;
DEFAULT - sets the default scope ("PRIVATE", "SHARED" or "NONE") for all variables in a parallel region (C/C++ accepts only "SHARED" or "NONE");
FIRSTPRIVATE - like PRIVATE, but each private copy is initialised with the value the original variable had before the construct;
LASTPRIVATE - like PRIVATE, but the value from the sequentially last iteration or section is copied back to the original variable;
COPYIN - assigns the master thread's value of a THREADPRIVATE variable to the copies held by all other threads in the team;
REDUCTION - combines the private copies of each variable in the list into a single result using the specified operator (+, -, *, etc.).
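
A short C sketch may help to tell FIRSTPRIVATE, LASTPRIVATE and REDUCTION apart. The function name clause_demo and its parameters are hypothetical. The code gives the same results when compiled without OpenMP.

```c
/* Adds offset to each index 0..n-1 and sums the results.
   Assumes n >= 1.
   - offset is firstprivate: every thread's copy starts with the
     caller's value.
   - last is lastprivate: after the loop it holds the value assigned
     in the sequentially last iteration (i == n-1).
   - total is a reduction: per-thread partial sums are combined. */
long clause_demo(int n, int offset, int *last_out)
{
    long total = 0;
    int last = -1;
    int i;

#pragma omp parallel for firstprivate(offset) lastprivate(last) \
        reduction(+:total)
    for (i = 0; i < n; i++) {
        last = i + offset;
        total += last;
    }

    *last_out = last;
    return total;
}
```

With n = 4 and offset = 10 the loop produces 10, 11, 12 and 13, so the function returns 46 and stores 13 in *last_out.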

 

OpenMP Environment Variables

OpenMP provides the following environment variables for controlling the execution of parallel code. All environment variable names are uppercase. The values assigned to them are not case sensitive.


OMP_SCHEDULE

Applies only to the DO and PARALLEL DO (Fortran) and the for and parallel for (C/C++) directives whose schedule clause is set to RUNTIME. The value of this variable determines how iterations of the loop are scheduled on processors. For example:

setenv OMP_SCHEDULE "guided, 4"
setenv OMP_SCHEDULE "dynamic"
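
On the source side, a loop picks up this setting only if its schedule clause says runtime, as in the C sketch below. The function name sum_runtime_schedule is hypothetical. The result does not depend on the schedule chosen; only the distribution of iterations over threads changes.

```c
/* Sums the integers 1..n. The schedule is not fixed in the source:
   with schedule(runtime) the OpenMP runtime reads it from the
   OMP_SCHEDULE environment variable when the program runs. */
long sum_runtime_schedule(int n)
{
    long total = 0;
    int i;

#pragma omp parallel for schedule(runtime) reduction(+:total)
    for (i = 1; i <= n; i++) {
        total += i;
    }
    return total;
}
```

This makes it possible to compare, say, "guided, 4" against "dynamic" by changing only the environment variable between runs, without recompiling.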


OMP_NUM_THREADS

Sets the maximum number of threads to use during execution. For example:

setenv OMP_NUM_THREADS 8
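
A program can read the resulting upper bound with omp_get_max_threads(), for example to size per-thread scratch storage before a parallel region starts. The sketch below assumes this use case; the function name count_even is hypothetical and the serial fallbacks are an assumption for builds without OpenMP.

```c
#include <stdlib.h>

#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks (assumption: behave like a one-thread runtime). */
static int omp_get_max_threads(void) { return 1; }
static int omp_get_thread_num(void) { return 0; }
#endif

/* Counts the even entries of data using one counter per thread.
   Returns -1 if the scratch array cannot be allocated. */
long count_even(const int *data, int n)
{
    /* Upper bound on the team size of the region below. */
    int slots = omp_get_max_threads();
    long *partial = calloc((size_t)slots, sizeof *partial);
    long total = 0;
    int t;

    if (partial == NULL) {
        return -1;
    }

#pragma omp parallel
    {
        int id = omp_get_thread_num();   /* 0 .. team size - 1 */
        int i;

#pragma omp for
        for (i = 0; i < n; i++) {
            if (data[i] % 2 == 0) {
                partial[id]++;
            }
        }
    }

    /* Combine the per-thread counters serially. */
    for (t = 0; t < slots; t++) {
        total += partial[t];
    }
    free(partial);
    return total;
}
```

A reduction clause would do the same job more compactly; the explicit array is used here only to show where the thread-count limit enters the program.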


OMP_DYNAMIC

Enables or disables dynamic adjustment of the number of threads available for execution of parallel regions. Valid values are TRUE or FALSE. For example:

setenv OMP_DYNAMIC TRUE


OMP_NESTED

Enables or disables nested parallelism. Valid values are TRUE or FALSE. For example:

setenv OMP_NESTED TRUE

Implementation notes:

Your implementation may or may not support nested parallelism and/or dynamic threads. If nested parallelism is supported, it is often only nominal, meaning that a nested parallel region may only have one thread. Consult your implementation's documentation for details - or experiment and find out for yourself.


OMP_STACKSIZE

New feature available with OpenMP 3.0. Controls the size of the stack for created (non-Master) threads. Examples:

setenv OMP_STACKSIZE 2000500B
setenv OMP_STACKSIZE "3000 k "
setenv OMP_STACKSIZE 10M
setenv OMP_STACKSIZE " 10 M "
setenv OMP_STACKSIZE "20 m "
setenv OMP_STACKSIZE " 1G"
setenv OMP_STACKSIZE 20000


OMP_WAIT_POLICY

New feature available with OpenMP 3.0. Provides a hint to an OpenMP implementation about the desired behaviour of waiting threads. A compliant OpenMP implementation may or may not abide by the setting of the environment variable. Valid values are ACTIVE and PASSIVE. ACTIVE specifies that waiting threads should mostly be active, i.e. consume processor cycles, while waiting. PASSIVE specifies that waiting threads should mostly be passive, i.e. not consume processor cycles, while waiting. The details of the ACTIVE and PASSIVE behaviours are implementation defined. Examples:

setenv OMP_WAIT_POLICY ACTIVE
setenv OMP_WAIT_POLICY active
setenv OMP_WAIT_POLICY PASSIVE
setenv OMP_WAIT_POLICY passive


OMP_MAX_ACTIVE_LEVELS

New feature available with OpenMP 3.0. Controls the maximum number of nested active parallel regions. The value of this environment variable must be a non-negative integer. The behaviour of the program is implementation-defined if the requested value of OMP_MAX_ACTIVE_LEVELS is greater than the maximum number of nested active parallel levels an implementation can support or if the value is not a non-negative integer. Example:

setenv OMP_MAX_ACTIVE_LEVELS 2


OMP_THREAD_LIMIT

New feature available with OpenMP 3.0. Sets the maximum number of OpenMP threads to use for the whole OpenMP program. The value of this environment variable must be a positive integer. The behaviour of the program is implementation-defined if the requested value of OMP_THREAD_LIMIT is greater than the number of threads an implementation can support or if the value is not a positive integer. Example:

setenv OMP_THREAD_LIMIT 8
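
The limits set through OMP_THREAD_LIMIT and OMP_MAX_ACTIVE_LEVELS can be read back at run time with the OpenMP 3.0 routines omp_get_thread_limit() and omp_get_max_active_levels(). The sketch below is only an illustration: the struct omp_limits and the function query_limits are hypothetical names, and the serial fallback values (1 thread, 0 active levels) are an assumption for builds without OpenMP 3.0 support.

```c
#if defined(_OPENMP) && _OPENMP >= 200805
#include <omp.h>
#else
/* Serial fallbacks (assumption: one thread, no active nesting). */
static int omp_get_thread_limit(void) { return 1; }
static int omp_get_max_active_levels(void) { return 0; }
#endif

/* Hypothetical container for the two limits. */
struct omp_limits {
    int thread_limit;       /* cap on threads for the whole program */
    int max_active_levels;  /* cap on nested active parallel levels */
};

/* Reads the limits currently in effect. */
struct omp_limits query_limits(void)
{
    struct omp_limits limits;

    limits.thread_limit = omp_get_thread_limit();
    limits.max_active_levels = omp_get_max_active_levels();
    return limits;
}
```

Printing these two values at program start is a simple way to check that the environment variables were picked up as intended.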

 
