This lesson is being piloted (Beta version)

Introduction

Overview

Teaching: 20 min
Exercises: 0 min
Questions
  • How do shared-memory parallel programs work?

  • What is OpenMP?

  • How do you write and compile parallel programs in C?

Objectives
  • Understand the shared memory programming environment provided by OpenMP

  • Learn how to write and compile simple parallel programs in C

Shared Memory OpenMP Programming

As we learned in the General Parallel Computing lesson, parallel programs come in two broad flavors: shared-memory and distributed-memory (message-passing). In this lesson, we will look at shared-memory programming, with a focus on Open Multi-Processing (OpenMP).

In any parallel program, the general idea is to have multiple threads of execution so that you can break up your problem and have each thread handle one part. These threads need to be able to communicate with each other as your program runs. In a shared-memory program, this communication happens through shared variables stored in the main memory of the computer running the code, which every thread can read and write. This means that communication between threads is extremely fast, as it happens at the speed of local memory (RAM) access. The drawback is that your program is limited to a single physical machine (one compute node of an HPC cluster), since all threads need to be able to see the same RAM.

OpenMP is one way of writing shared-memory parallel programs. OpenMP is a specification, which has been implemented by many vendors.

The OpenMP effort began in 1996, when the Accelerated Strategic Computing Initiative of the DOE brought together a handful of computer vendors, including HP, IBM, Intel, SGI and DEC, to create a portable API for shared-memory computers. Vendors do not typically work well together unless an outside force encourages cooperation, so the initiative communicated that the DOE would purchase only systems with a portable API for shared-memory programming.

The current OpenMP v.5.1 specification is 600+ pages long, but you need to know only a very small fraction of it to be able to use it in your code.

The OpenMP standard describes extensions to a C/C++ or FORTRAN compiler. OpenMP libraries are built into the compiler, which means that you need to use a compiler that supports OpenMP. There are different OpenMP implementations (compilers), and not all parts of the OpenMP specification are equally supported by all compilers. It is up to programmers to investigate the compiler they want to use and check that it supports the parts of the OpenMP specification they wish to use. Luckily, the vast majority of OpenMP behaves the way you expect it to with most modern compilers. When possible, we will try to highlight any odd behaviors. All commonly used compilers, such as gcc, clang, Intel, Nvidia HPC, and Absoft, support OpenMP.

View OpenMP specifications
View Full List of Compilers supporting OpenMP
For an overview of the past, present and future of OpenMP, read the paper “The Ongoing Evolution of OpenMP”.

OpenMP Execution Model

The philosophy of OpenMP is to not sacrifice ease of coding and maintenance in the name of performance.

Compiling OpenMP programs

Since OpenMP is meant to be used with either C/C++ or FORTRAN, you will need to know how to work with at least one of these languages. This workshop uses C for the examples. Before we start using a compiler, let’s load the most recent environment module:

module load StdEnv/2020

You don’t need to run this command on the real CC clusters because StdEnv/2020 is the default, but we need to run it on the training cluster.

As a reminder, a simple hello world program in C would look like the following.

Compiling C code

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
  printf("Hello World\n");
}

In order to compile this code, you would need to use the following command:

gcc -o hello hello_world.c

This gives you an executable file “hello” that will print out the text “Hello World”. You run it with the command:

./hello
Hello World

If you don’t specify the output filename with the -o option, the compiler will use the default output name “a.out” (assembler output).

GCC on Compute Canada

Currently the default environment on the general purpose clusters (Beluga, Cedar, Graham) is StdEnv/2020. The default compilers available in this environment on Graham and Beluga are Intel/2020.1.217 and gcc/9.3.0. On Cedar the default compiler is gcc/8.4.0.

To load another compiler, use the module load command. For example, the command to load gcc version 10.2.0 is:

module load gcc/10.2.0

A Very Quick Introduction to C

Preprocessor directives

Basic Syntax

Defining Functions

Function definitions have the following format:

  return_type function_name( parameter list ) {
     body of the function
  }

Using Memory

Dynamic memory allocation is when an executing program requests that the operating system give it a block of main memory. The program then uses this memory for some purpose.

Key Points

  • Shared-memory parallel programs break up large problems into a number of smaller ones and execute them simultaneously

  • OpenMP programs are limited to a single physical machine

  • OpenMP libraries are built into all commonly used C, C++, or Fortran compilers