POSIX Shared Memory

POSIX shared memory is an inter-process communication (IPC) mechanism defined in the POSIX specification. After the shared memory is set up, two or more processes may read from and write to the same memory region. Unlike other IPC mechanisms (e.g. pipes and sockets), POSIX shared memory does not copy data between processes, which makes it appealing to applications that exchange large amounts of data.

Overview

A program using POSIX shared memory usually consists of these steps:

Create or open a shared memory object with shm_open(). On success, shm_open() returns a file descriptor.
Set the shared memory object size with ftruncate().
Map the shared memory object into the current address space with mmap() and MAP_SHARED.
Read/write the shared memory.
Unmap the shared memory with munmap().
Close the shared memory object with close().
Delete the shared memory object with shm_unlink().

Several of these steps use file-related system calls, such as ftruncate(), mmap(), munmap(), and close(). These system calls behave similarly on regular files and shared memory objects, so your experience with them carries over. For example, mmap() succeeds even if the mapped size is larger than the file size, but the program receives SIGBUS if it accesses the part of the mapping that lies beyond the end of the file. As a result, you have to set the size with ftruncate() before using the shared memory.
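
One way to stay clear of the SIGBUS pitfall is to ask for the object's actual size with fstat() and map exactly that many bytes. The helper below is a minimal sketch of that idea; map_entire_object() is a hypothetical name, not a POSIX function, and it accepts any descriptor that supports mmap().

```c
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: map the whole object behind fd, read-only.
 * Returns the mapping and stores its length in *len, or returns NULL
 * if the object is still empty or a call fails. */
void *map_entire_object(int fd, size_t *len) {
  struct stat st;
  if (fstat(fd, &st) != 0 || st.st_size <= 0) {
    return NULL;  /* Nothing to map, e.g. ftruncate() has not run yet. */
  }
  void *addr = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
  if (addr == MAP_FAILED) {
    return NULL;
  }
  *len = (size_t)st.st_size;
  return addr;
}
```

Because the length comes from fstat(), the caller never maps more bytes than the object holds. The caller is still responsible for calling munmap() with the returned length.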

shm_open() and shm_unlink() are functions specific to shared memory. We will elaborate on them in the next section.

API Summary

shm_open() is the shared memory analogue of open(). It creates a shared memory object with the specified name or opens an existing shared memory object by name:

int shm_open(const char *name, int flag, mode_t mode);

The first argument name should start with a slash (/) followed by one or more non-slash characters. The second argument flag must contain exactly one of O_RDONLY or O_RDWR, optionally combined with O_CREAT and/or O_EXCL.

O_RDONLY stands for read-only. If a program opens a shared memory object with O_RDONLY, it can only read the shared memory and must not write to the shared memory.
O_RDWR stands for read and write. If a program opens a shared memory object with O_RDWR, it can read from or write to the shared memory.
O_CREAT stands for create. If the shared memory object does not exist, a new shared memory object will be created. Conversely, if O_CREAT is not set and the shared memory object does not exist, an error will be returned.
O_EXCL stands for exclusive. This must be set with O_CREAT. If the shared memory object does not exist, a new shared memory object will be created. If the shared memory object exists, an error will be returned.

The third argument mode is the file permission of the created shared memory object. If O_CREAT is not specified or the shared memory object already exists, then mode is ignored.
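
The flags above can be combined so that a program learns whether it created the object or merely opened one that another process created. The following is a sketch under that assumption; open_or_create() and its created output parameter are hypothetical names introduced for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/mman.h>
#include <sys/stat.h>

/* Hypothetical helper: try to create the object exclusively; if it
 * already exists, open the existing one instead. *created tells the
 * caller whether it is responsible for sizing and initializing it. */
int open_or_create(const char *name, bool *created) {
  int fd = shm_open(name, O_CREAT | O_EXCL | O_RDWR, 0600);
  if (fd >= 0) {
    *created = true;
    return fd;
  }
  if (errno != EEXIST) {
    return -1;  /* A real failure, e.g. EACCES or EINVAL. */
  }
  *created = false;
  return shm_open(name, O_RDWR, 0);  /* mode is ignored without O_CREAT */
}
```

With O_CREAT | O_EXCL, at most one process sees created set to true, so only that process needs to call ftruncate() and initialize the contents.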

Shared memory objects are kernel persistent: unless they are deleted, they are kept until the computer reboots. shm_unlink() is the shared memory analogue of unlink():

int shm_unlink(const char *name);

The first argument name is the name of the shared memory object which you would like to delete.

Shared Memory Objects and Linux Kernels

On Linux, all shared memory objects can be found in /dev/shm. You may list them with ls -l /dev/shm. You may also remove a shared memory object with rm /dev/shm/[name]. This is handy when you are debugging your program.

Prior to Linux 3.16, the default shared memory size limit (shmmax) was 32MB. You may check the current value in procfs:

$ cat /proc/sys/kernel/shmmax
18446744073692774399

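Since Linux 3.16, the default limit is effectively unlimited. If the system administrator did not change it, the value is ULONG_MAX - (1 << 24) (i.e. 18446744073692774399 on 64-bit machines). Of course, this is the theoretical upper bound. The physical RAM size and swap size may impose other limits. Note also that the kernel documents shmmax as the limit for System V shared memory segments; POSIX shared memory objects on Linux live in the tmpfs mounted at /dev/shm and are additionally bounded by the size of that file system.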
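
The arithmetic behind the default can be checked with a few lines of C. This is only an illustrative sketch; default_shmmax() and read_shmmax() are hypothetical helper names, and the procfs path is the one used by the cat command above.

```c
#include <limits.h>
#include <stdio.h>

/* The default described in the text: ULONG_MAX - (1 << 24). */
unsigned long default_shmmax(void) {
  return ULONG_MAX - (1UL << 24);
}

/* Read the current limit from procfs. Returns 0 if the file cannot be
 * read or parsed (for example, on a system without procfs). */
unsigned long read_shmmax(void) {
  unsigned long value = 0;
  FILE *f = fopen("/proc/sys/kernel/shmmax", "r");
  if (f == NULL) {
    return 0;
  }
  if (fscanf(f, "%lu", &value) != 1) {
    value = 0;
  }
  fclose(f);
  return value;
}
```

On a 64-bit machine, default_shmmax() evaluates to 18446744073692774399, the value printed by the cat command above. Comparing it with read_shmmax() shows whether the administrator has changed the limit.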

Sender and Receiver

Let's start with a sender-receiver example. In this example, the sender creates a shared memory object named /shmem-example and writes 3 integers into the shared memory. The receiver opens the shared memory object and reads the 3 integers from the shared memory.

For the sake of brevity, we assume the receiver runs after the sender completes. The synchronization problem will be covered in the next section.

This is the source code of protocol.h:

#ifndef PROTOCOL_H
#define PROTOCOL_H

#define NAME "/shmem-example"

#define NUM 3

#define SIZE (NUM * sizeof(int))

#endif  /* PROTOCOL_H */

This is the source code of sender.c:

#include "protocol.h"

#include <stdio.h>
#include <stdlib.h>

#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

int main() {
  int fd = shm_open(NAME, O_CREAT | O_EXCL | O_RDWR, 0600);
  if (fd < 0) {
    perror("shm_open()");
    return EXIT_FAILURE;
  }

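  /* Set the object size; accessing the mapping before this raises SIGBUS. */
  if (ftruncate(fd, SIZE) != 0) {
    perror("ftruncate()");
    return EXIT_FAILURE;
  }

  int *data = mmap(NULL, SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
  if (data == MAP_FAILED) {
    perror("mmap()");
    return EXIT_FAILURE;
  }
  printf("sender mapped address: %p\n", (void *)data);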

  for (int i = 0; i < NUM; ++i) {
    data[i] = i;
  }

  munmap(data, SIZE);

  close(fd);

  return EXIT_SUCCESS;
}

This is the source code of receiver.c:

#include "protocol.h"

#include <stdio.h>
#include <stdlib.h>

#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

int main() {
  int fd = shm_open(NAME, O_RDONLY, 0666);
  if (fd < 0) {
    perror("shm_open()");
    return EXIT_FAILURE;
  }

  int *data = mmap(NULL, SIZE, PROT_READ, MAP_SHARED, fd, 0);
  if (data == MAP_FAILED) {
    perror("mmap()");
    return EXIT_FAILURE;
  }
  printf("receiver mapped address: %p\n", (void *)data);

  for (int i = 0; i < NUM; ++i) {
    printf("%d\n", data[i]);
  }

  munmap(data, SIZE);

  close(fd);

  shm_unlink(NAME);

  return EXIT_SUCCESS;
}

Compile and link the code with -lrt:

$ gcc -o sender sender.c -lrt
$ gcc -o receiver receiver.c -lrt

Run ./sender:

$ ./sender
sender mapped address: 0x7fe68e346000

After running ./sender, a shared memory object is created. On Linux, it can be found under /dev/shm:

$ ls -l /dev/shm/ | grep shmem-example
-rw------- 1 user    user      12 Jan 7 20:59 shmem-example

Now, run ./receiver:

$ ./receiver
receiver mapped address: 0x7f02df4cd000
0
1
2

After calling shm_unlink() from receiver.c, /dev/shm/shmem-example is removed:

$ ls -l /dev/shm/shmem-example
ls: cannot access '/dev/shm/shmem-example': No such file or directory

Synchronize with C11 Atomics

C11, the C programming language standard released in 2011, provides several atomic types and operations that enable programmers to synchronize memory accesses between threads.

All variables that have atomic types (e.g. atomic_int, atomic_char, etc) are atomic objects. The function atomic_load() can load a value from an atomic object. The function atomic_store() can store a value to an atomic object:

atomic_int state;
int value = atomic_load(&state);
atomic_store(&state, value);

These two functions are atomic operations with sequential consistency. When we reason about the possible executions, we can assume that:

There is a total ordering for all atomic operations from all threads.
Within a thread, all atomic operations obey the program order.
If a thread's atomic_load(A) observes the value written by an atomic_store(A, x), then all memory writes (including non-atomic ones) that the storing thread made before atomic_store(A, x) are visible to the loading thread.

Based on these rules, a simple spinlock (strictly speaking, a busy-wait on a flag) can be implemented as:

while (atomic_load(&state) != x) {}
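
A loop like this spins forever if the other side never stores the expected value. As a hedged variation, the sketch below bounds the number of polls; spin_until() and its max_spins parameter are hypothetical names chosen for this example.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical helper: poll *state until it equals expected, giving up
 * after max_spins polls. Returns true if the value was observed. */
bool spin_until(atomic_int *state, int expected, long max_spins) {
  for (long i = 0; i < max_spins; ++i) {
    if (atomic_load(state) == expected) {
      return true;  /* Writes made before the matching store are visible. */
    }
  }
  return false;  /* Gave up; the caller decides how to recover. */
}
```

A caller might use it as if (!spin_until(&state, 1, 1000000)) { /* report a timeout */ }, which keeps a crashed peer from hanging the waiting process.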

To demonstrate how C11 atomic synchronization works with POSIX shared memory, the example below includes a request program and a worker program. They communicate through a shared structure:

struct Data {
  atomic_int state;
  int data[];
};

Their protocol consists of the following steps:

ftruncate() will fill the structure with zeros.
The worker program will wait until state becomes 1.
The request program will fill in the data array and set state to 1.
The request program will wait until state becomes 2.
The worker program will update the data array, set state to 2, and exit.
The request program will read the data array, which was updated by the worker program.

This is the source code of protocol.h:

#ifndef PROTOCOL_H
#define PROTOCOL_H

#include <stdatomic.h>

struct Data {
  atomic_int state;
  int data[];
};

#define NAME "/shmem-example"

#define NUM 3

#define SIZE (sizeof(struct Data) + NUM * sizeof(int))

#endif  /* PROTOCOL_H */

This is the source code of request.c:

#include "protocol.h"

#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>

#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

int main() {
  int fd = shm_open(NAME, O_CREAT | O_EXCL | O_RDWR, 0600);
  if (fd < 0) {
    perror("shm_open()");
    return EXIT_FAILURE;
  }

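  /* ftruncate() sizes the object and zero-fills it, so state starts at 0. */
  if (ftruncate(fd, SIZE) != 0) {
    perror("ftruncate()");
    return EXIT_FAILURE;
  }

  struct Data *data =
      mmap(NULL, SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
  if (data == MAP_FAILED) {
    perror("mmap()");
    return EXIT_FAILURE;
  }
  printf("request: mapped address: %p\n", (void *)data);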

  for (int i = 0; i < NUM; ++i) {
    data->data[i] = i;
  }

  printf("request: release initial data\n");
  atomic_store(&data->state, 1);

  printf("request: waiting updated data\n");
  while (atomic_load(&data->state) != 2) {}
  printf("request: acquire updated data\n");

  printf("request: updated data:\n");
  for (int i = 0; i < NUM; ++i) {
    printf("%d\n", data->data[i]);
  }

  munmap(data, SIZE);

  close(fd);

  shm_unlink(NAME);

  return EXIT_SUCCESS;
}

This is the source code of worker.c:

#include "protocol.h"

#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>

#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

int main() {
  int fd = -1;
  while (fd == -1) {
    fd = shm_open(NAME, O_RDWR, 0666);
    if (fd < 0 && errno != ENOENT) {
      perror("shm_open()");
      return EXIT_FAILURE;
    }
  }

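  /* The request program may not have sized the object yet; setting the
   * same size here is harmless and avoids SIGBUS on first access. */
  if (ftruncate(fd, SIZE) != 0) {
    perror("ftruncate()");
    return EXIT_FAILURE;
  }

  struct Data *data =
      mmap(NULL, SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
  if (data == MAP_FAILED) {
    perror("mmap()");
    return EXIT_FAILURE;
  }
  printf("worker: mapped address: %p\n", (void *)data);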

  printf("worker: waiting initial data\n");
  while (atomic_load(&data->state) != 1) {}
  printf("worker: acquire initial data\n");

  printf("worker: update data\n");
  for (int i = 0; i < NUM; ++i) {
    data->data[i] = data->data[i] * 42;
  }

  printf("worker: release updated data\n");
  atomic_store(&data->state, 2);

  munmap(data, SIZE);

  close(fd);

  return EXIT_SUCCESS;
}
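
The retry loop at the top of worker.c calls shm_open() as fast as it can until the object appears. A gentler variant, sketched here as an assumption rather than as part of the original program, sleeps briefly between attempts and gives up after a fixed number of tries; open_when_ready() and max_tries are hypothetical names.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical helper: poll for the object, sleeping 1 ms between
 * attempts. Returns a file descriptor, or -1 with errno set. */
int open_when_ready(const char *name, int max_tries) {
  const struct timespec pause = {.tv_sec = 0, .tv_nsec = 1000000};
  for (int i = 0; i < max_tries; ++i) {
    int fd = shm_open(name, O_RDWR, 0);
    if (fd >= 0) {
      return fd;
    }
    if (errno != ENOENT) {
      return -1;  /* A real error; retrying will not help. */
    }
    nanosleep(&pause, NULL);
  }
  errno = ENOENT;  /* The object never appeared. */
  return -1;
}
```

The worker could call open_when_ready(NAME, 5000) to wait roughly five seconds for the request program before reporting an error.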

Then, compile the request program with:

$ gcc -o request request.c -lrt

And compile the worker program with:

$ gcc -o worker worker.c -lrt

To test the code, first run ./worker & in the background and then run ./request:

$ ./worker &
$ ./request
request: mapped address: 0x7f31c5b0c000
request: release initial data
request: waiting updated data
worker: mapped address: 0x7fcdbcfd8000
worker: waiting initial data
worker: acquire initial data
worker: update data
worker: release updated data
request: acquire updated data
request: updated data:
0
42
84

The output shows that the spinlocks work as expected. The execution has the order:

The request program releases the initial data.
The worker program acquires the initial data.
The worker program releases the updated data.
The request program acquires the updated data.

And the request program gets the expected 0, 42, 84.

Note

The order of waiting is interesting as well. If you add sleep(1) to the line after the shm_open() function call in request.c, you will see a different ordering. But the order of acquire and release must remain the same.

Epilogue

In this post, we covered two POSIX shared memory APIs: shm_open() and shm_unlink(). In addition, we explained how to synchronize shared memory accesses with atomic_load(), atomic_store(), and spinlocks. Although a spinlock is easy to understand, it wastes CPU cycles while waiting. POSIX semaphores are an alternative for inter-process synchronization. We will cover POSIX semaphores in the next post. Stay tuned.

A Historical Note

Before POSIX shared memory was standardized, System V shared memory (a part of XSI Interprocess Communication) was a common alternative. To use System V shared memory, you have to generate a key with ftok(), create a shared memory segment with shmget(), and then attach or detach the segment with shmat() or shmdt(). However, I prefer POSIX shared memory because shm_open() returns a file descriptor, which works with other file-related system calls, such as ftruncate(), fstat(), fcntl(), and mmap().

