POSIX shared memory is an inter-process communication (IPC) mechanism defined in the POSIX specification. Once a shared memory region is set up, two or more processes can read from and write to it. Unlike other IPC mechanisms (e.g. pipes and sockets), POSIX shared memory does not copy data between processes, which makes it appealing to some applications.
Overview
A program using POSIX shared memory usually consists of these steps:
- Create or open a shared memory object with shm_open(). A file descriptor is returned if shm_open() succeeds.
- Set the shared memory object size with ftruncate().
- Map the shared memory object into the current address space with mmap() and MAP_SHARED.
- Read/write the shared memory.
- Unmap the shared memory with munmap().
- Close the shared memory object with close().
- Delete the shared memory object with shm_unlink().
Several of these are file-related system calls: ftruncate(), mmap(), munmap(), and close(). They behave similarly on regular files and shared memory objects, so your experience with them carries over to shared memory objects. For example, mmap() succeeds even if the mapped size is larger than the file size, but the signal SIGBUS is raised if the program accesses pages of the mapping that lie beyond the end of the file. The same holds for shared memory objects, which are created with a size of zero. As a result, you have to call ftruncate() before using the shared memory.
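One way to guard against the SIGBUS case is to compare the object's current size with the size you intend to map. The sketch below is only an illustration: object_size() and map_if_large_enough() are hypothetical helper names, not part of POSIX. It relies on fstat() reporting the size of a shared memory object in st_size.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: current size of the object behind fd, or -1 on error. */
off_t object_size(int fd) {
    struct stat st;
    if (fstat(fd, &st) < 0) {
        return -1;
    }
    return st.st_size;
}

/* Hypothetical helper: maps `size` bytes read/write only if the object is
 * already at least that large, so that accesses inside the mapping cannot
 * run past the end of the object. Returns MAP_FAILED otherwise. */
void *map_if_large_enough(int fd, size_t size) {
    off_t actual = object_size(fd);
    if (actual < 0 || (size_t)actual < size) {
        return MAP_FAILED;
    }
    return mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
}
```

With a freshly created object, map_if_large_enough(fd, 16) returns MAP_FAILED until ftruncate(fd, 16) has been called.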
shm_open() and shm_unlink() are functions specific to shared memory. We will elaborate on them in the next section.
API Summary
shm_open() is the shared memory analogue of open(). It creates a shared memory object with the specified name or opens an existing shared memory object with that name:
int shm_open(const char *name, int flag, mode_t mode);
For portability, the first argument name should start with a slash (/) character, followed by one or more non-slash characters. The second argument flag is a combination of O_RDONLY, O_RDWR, O_CREAT, and/or O_EXCL. Exactly one of O_RDONLY and O_RDWR must be specified.
- O_RDONLY stands for read-only. If a program opens a shared memory object with O_RDONLY, it can only read the shared memory and must not write to the shared memory.
- O_RDWR stands for read and write. If a program opens a shared memory object with O_RDWR, it can read from or write to the shared memory.
- O_CREAT stands for create. If the shared memory object does not exist, a new shared memory object will be created. Conversely, if O_CREAT is not set and the shared memory object does not exist, an error will be returned.
- O_EXCL stands for exclusive. This must be set with O_CREAT. If the shared memory object does not exist, a new shared memory object will be created. If the shared memory object exists, an error will be returned.
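The error cases described above can be observed through errno. The following is a minimal sketch; try_shm_open() is a hypothetical helper written for this illustration, and it assumes that a missing object is reported as ENOENT and an existing one (under O_CREAT | O_EXCL) as EEXIST, as the shm_open() specification describes.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: tries to open `name` with `flags` and closes the
 * descriptor again on success. Returns 0 on success, or the errno value
 * reported by shm_open() on failure. */
int try_shm_open(const char *name, int flags) {
    int fd = shm_open(name, flags, 0600);
    if (fd < 0) {
        return errno;
    }
    close(fd);
    return 0;
}
```

For a name that does not exist yet, try_shm_open(name, O_RDWR) returns ENOENT, the first try_shm_open(name, O_CREAT | O_EXCL | O_RDWR) returns 0, and a second identical call returns EEXIST because the object now exists.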
The third argument mode is the file permission of the created shared memory object. If O_CREAT is not specified or the shared memory object already exists, then mode is ignored.
Shared memory objects have kernel persistence: unless they are deleted, they are kept until the computer reboots. shm_unlink() is the shared memory analogue of unlink():
int shm_unlink(const char *name);
The argument name is the name of the shared memory object that you would like to delete.
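Like unlink() on a regular file, shm_unlink() removes only the name. Processes that already have the object mapped can keep using it until they unmap it. The sketch below illustrates this in a single process; write_after_unlink() is a hypothetical function name, and `value` is assumed to be non-negative so that -1 can signal an error.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical demo: creates `name`, maps one int, unlinks the name, and then
 * keeps using the mapping. Returns the value read back through the mapping,
 * or -1 on error. */
int write_after_unlink(const char *name, int value) {
    int fd = shm_open(name, O_CREAT | O_EXCL | O_RDWR, 0600);
    if (fd < 0) {
        return -1;
    }
    if (ftruncate(fd, sizeof(int)) < 0) {
        close(fd);
        shm_unlink(name);
        return -1;
    }
    int *p = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED) {
        shm_unlink(name);
        return -1;
    }
    shm_unlink(name); /* removes the name only; the memory is still mapped */

    *p = value;
    int result = *p;
    munmap(p, sizeof(int)); /* last reference gone; the memory can be freed */
    return result;
}
```

After the call returns, shm_open(name, O_RDWR, 0) fails with ENOENT because the name no longer exists.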
Sender and Receiver
Let's start with a sender-receiver example. In this example, the sender creates a shared memory object named /shmem-example and writes 3 integers into the shared memory. The receiver opens the shared memory object and reads the 3 integers from the shared memory.
For the sake of brevity, we assume the receiver runs after the sender completes. The synchronization problem will be covered in the next section.
This is the source code of protocol.h:
#ifndef PROTOCOL_H
#define PROTOCOL_H
#define NAME "/shmem-example"
#define NUM 3
#define SIZE (NUM * sizeof(int))
#endif /* PROTOCOL_H */
This is the source code of sender.c:
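#include "protocol.h"
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>
int main() {
    /* Create the object; O_EXCL makes this fail if it already exists. */
    int fd = shm_open(NAME, O_CREAT | O_EXCL | O_RDWR, 0600);
    if (fd < 0) {
        perror("shm_open()");
        return EXIT_FAILURE;
    }
    /* A new object has size zero, so grow it before mapping. */
    if (ftruncate(fd, SIZE) < 0) {
        perror("ftruncate()");
        shm_unlink(NAME);
        return EXIT_FAILURE;
    }
    int *data =
        (int *)mmap(0, SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (data == MAP_FAILED) {
        perror("mmap()");
        shm_unlink(NAME);
        return EXIT_FAILURE;
    }
    printf("sender mapped address: %p\n", (void *)data);
    for (int i = 0; i < NUM; ++i) {
        data[i] = i;
    }
    /* The object is not unlinked here: the receiver still needs it. */
    munmap(data, SIZE);
    close(fd);
    return EXIT_SUCCESS;
}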
This is the source code of receiver.c:
#include "protocol.h"
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>
int main() {
    /* The mode argument is ignored because O_CREAT is not set. */
    int fd = shm_open(NAME, O_RDONLY, 0666);
    if (fd < 0) {
        perror("shm_open()");
        return EXIT_FAILURE;
    }
    int *data =
        (int *)mmap(0, SIZE, PROT_READ, MAP_SHARED, fd, 0);
    if (data == MAP_FAILED) {
        perror("mmap()");
        return EXIT_FAILURE;
    }
    printf("receiver mapped address: %p\n", (void *)data);
    for (int i = 0; i < NUM; ++i) {
        printf("%d\n", data[i]);
    }
    munmap(data, SIZE);
    close(fd);
    /* Remove the name so that the object does not outlive the example. */
    shm_unlink(NAME);
    return EXIT_SUCCESS;
}
Compile and link the code with -lrt:
$ gcc -o sender sender.c -lrt
$ gcc -o receiver receiver.c -lrt
Run ./sender:
$ ./sender
sender mapped address: 0x7fe68e346000
After running ./sender, a shared memory object is created. On Linux, it can be found under /dev/shm:
$ ls -l /dev/shm/ | grep shmem-example
-rw------- 1 user user 40 Jan 7 20:59 shmem-example
Now, run ./receiver:
$ ./receiver
receiver mapped address: 0x7f02df4cd000
0
1
2
After receiver.c calls shm_unlink(), /dev/shm/shmem-example is removed:
$ ls -l /dev/shm/shmem-example
ls: cannot access '/dev/shm/shmem-example': No such file or directory
Synchronize with C11 Atomics
C11, the C programming language standard released in 2011, provides several atomic types and operations that enable programmers to synchronize memory accesses between threads.
All variables that have atomic types (e.g. atomic_int, atomic_char, etc.) are atomic objects. The function atomic_load() loads a value from an atomic object. The function atomic_store() stores a value to an atomic object:
atomic_int state;
int value = atomic_load(&state);
atomic_store(&state, value);
These two functions are atomic operations with sequential consistency. When reasoning about the possible executions, we can assume that:
- There is a total ordering for all atomic operations from all threads.
- Within a thread, all atomic operations obey the program order.
- If a thread loads from an atomic object with atomic_load(A) and observes the value stored by an atomic_store(A, x), all memory operations (including non-atomic operations) prior to that atomic_store(A, x) will be visible to the thread.
Based on these rules, a spinlock can be implemented as:
while (atomic_load(&state) != x) {}
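To see the third rule at work across processes, here is a minimal single-program sketch. It assumes that atomic_int is lock-free on the target, which is what makes it reasonable to place one in memory shared between processes. struct Message and publish_and_consume() are hypothetical names chosen for this illustration. The child writes a plain int and then publishes it with atomic_store(); the parent spins on atomic_load() and only then reads the plain int.

```c
#include <fcntl.h>
#include <stdatomic.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

struct Message {
    atomic_int ready; /* 0 = not published yet, 1 = payload is valid */
    int payload;      /* plain, non-atomic data */
};

/* Hypothetical demo: forks a child that publishes `value`. The parent waits
 * for it and returns the payload it observed, or -1 on error. */
int publish_and_consume(const char *name, int value) {
    int fd = shm_open(name, O_CREAT | O_EXCL | O_RDWR, 0600);
    if (fd < 0) {
        return -1;
    }
    shm_unlink(name); /* the child inherits the mapping, so no name is needed */
    if (ftruncate(fd, sizeof(struct Message)) < 0) { /* zero-fills: ready == 0 */
        close(fd);
        return -1;
    }
    struct Message *msg = mmap(NULL, sizeof(struct Message),
                               PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    if (msg == MAP_FAILED) {
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        munmap(msg, sizeof(struct Message));
        return -1;
    }
    if (pid == 0) {
        msg->payload = value;         /* non-atomic write ... */
        atomic_store(&msg->ready, 1); /* ... published by the atomic store */
        _exit(0);
    }

    while (atomic_load(&msg->ready) != 1) {} /* spin until the store is seen */
    int seen = msg->payload; /* visible because the store above was observed */

    waitpid(pid, NULL, 0);
    munmap(msg, sizeof(struct Message));
    return seen;
}
```

Calling publish_and_consume("/some-unused-name", 123) should return 123. Reading payload without first observing ready == 1 would be a data race.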
To demonstrate how C11 atomic synchronization works with POSIX shared memory, the example below includes a request program and a worker program. They communicate through a shared structure:
struct Data {
atomic_int state;
int data[];
};
Their protocol consists of the following steps:
- ftruncate() will fill the structure with zeros.
- The worker program will wait until state becomes 1.
- The request program will fill in the data array and set state to 1.
- The request program will wait until state becomes 2.
- The worker program will update the data array, set state to 2, and exit.
- The request program will read the data array, which was updated by the worker program.
This is the source code of protocol.h:
#ifndef PROTOCOL_H
#define PROTOCOL_H
#include <stdatomic.h>
struct Data {
atomic_int state;
int data[];
};
#define NAME "/shmem-example"
#define NUM 3
#define SIZE (sizeof(struct Data) + NUM * sizeof(int))
#endif /* PROTOCOL_H */
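Two details of this header are easy to miss: sizeof(struct Data) does not include the flexible array member data[], which is why SIZE adds NUM * sizeof(int) explicitly, and sharing an atomic_int between processes is only reasonable when it is lock-free. The sketch below restates both points as small functions. The struct is repeated so the block stands alone; data_size() and atomic_int_always_lock_free() are hypothetical helper names.

```c
#include <stdatomic.h>
#include <stddef.h>

struct Data {
    atomic_int state;
    int data[]; /* flexible array member: not counted by sizeof(struct Data) */
};

/* Hypothetical helper: bytes needed for a struct Data holding `count` ints. */
size_t data_size(size_t count) {
    return sizeof(struct Data) + count * sizeof(int);
}

/* Hypothetical helper: 1 if atomic_int is always lock-free on this target,
 * 0 otherwise (ATOMIC_INT_LOCK_FREE is 2 when it is always lock-free). */
int atomic_int_always_lock_free(void) {
    return ATOMIC_INT_LOCK_FREE == 2;
}
```

Here data_size(NUM) computes the same value as the SIZE macro in protocol.h.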
This is the source code of request.c:
#include "protocol.h"
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>
int main() {
    int fd = shm_open(NAME, O_CREAT | O_EXCL | O_RDWR, 0600);
    if (fd < 0) {
        perror("shm_open()");
        return EXIT_FAILURE;
    }
    /* Growing the new object zero-fills it, so state starts at 0. */
    if (ftruncate(fd, SIZE) < 0) {
        perror("ftruncate()");
        shm_unlink(NAME);
        return EXIT_FAILURE;
    }
    struct Data *data = (struct Data *)
        mmap(0, SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (data == MAP_FAILED) {
        perror("mmap()");
        shm_unlink(NAME);
        return EXIT_FAILURE;
    }
    printf("request: mapped address: %p\n", (void *)data);
    for (int i = 0; i < NUM; ++i) {
        data->data[i] = i;
    }
    printf("request: release initial data\n");
    /* Publish the initial data to the worker. */
    atomic_store(&data->state, 1);
    printf("request: waiting updated data\n");
    /* Spin until the worker publishes its update. */
    while (atomic_load(&data->state) != 2) {}
    printf("request: acquire updated data\n");
    printf("request: updated data:\n");
    for (int i = 0; i < NUM; ++i) {
        printf("%d\n", data->data[i]);
    }
    munmap(data, SIZE);
    close(fd);
    shm_unlink(NAME);
    return EXIT_SUCCESS;
}
This is the source code of worker.c:
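#include "protocol.h"
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>
int main() {
    /* Retry until the request program has created the object. */
    int fd = -1;
    while (fd == -1) {
        fd = shm_open(NAME, O_RDWR, 0666);
        if (fd < 0 && errno != ENOENT) {
            perror("shm_open()");
            return EXIT_FAILURE;
        }
    }
    /* The request program may not have resized the object yet, so set the
       size here as well; setting the same size twice is harmless. */
    if (ftruncate(fd, SIZE) < 0) {
        perror("ftruncate()");
        return EXIT_FAILURE;
    }
    struct Data *data = (struct Data *)
        mmap(0, SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (data == MAP_FAILED) {
        perror("mmap()");
        return EXIT_FAILURE;
    }
    printf("worker: mapped address: %p\n", (void *)data);
    printf("worker: waiting initial data\n");
    /* Spin until the request program publishes the initial data. */
    while (atomic_load(&data->state) != 1) {}
    printf("worker: acquire initial data\n");
    printf("worker: update data\n");
    for (int i = 0; i < NUM; ++i) {
        data->data[i] = data->data[i] * 42;
    }
    printf("worker: release updated data\n");
    /* Publish the updated data to the request program. */
    atomic_store(&data->state, 2);
    munmap(data, SIZE);
    close(fd);
    return EXIT_SUCCESS;
}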
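The shm_open() loop in worker.c retries as fast as it can and never gives up. A gentler variant, sketched here under the assumption that a short sleep between attempts is acceptable, bounds the number of attempts. open_when_ready() is a hypothetical helper, and the 10 ms delay and the use of ETIMEDOUT are arbitrary choices for this illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: tries to open an existing object up to `max_attempts`
 * times, sleeping 10 ms after each attempt that fails with ENOENT. Returns a
 * file descriptor, or -1 with errno set (ETIMEDOUT if it never appeared). */
int open_when_ready(const char *name, int max_attempts) {
    struct timespec delay = {0, 10 * 1000 * 1000}; /* 10 ms */
    for (int i = 0; i < max_attempts; ++i) {
        int fd = shm_open(name, O_RDWR, 0);
        if (fd >= 0) {
            return fd;
        }
        if (errno != ENOENT) {
            return -1; /* a real error, not just "not created yet" */
        }
        nanosleep(&delay, NULL);
    }
    errno = ETIMEDOUT;
    return -1;
}
```

In worker.c, the loop could then be replaced by a single call such as open_when_ready(NAME, 500), followed by the usual error check.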
Then, compile the request program with:
$ gcc -o request request.c -lrt
And compile the worker program with:
$ gcc -o worker worker.c -lrt
To test the code, first run ./worker & in the background and then run ./request:
$ ./worker &
$ ./request
request: mapped address: 0x7f31c5b0c000
request: release initial data
request: waiting updated data
worker: mapped address: 0x7fcdbcfd8000
worker: waiting initial data
worker: acquire initial data
worker: update data
worker: release updated data
request: acquire updated data
request: updated data:
0
42
84
The output shows that the spinlocks work as expected. The execution has the order:
- The request program releases the initial data.
- The worker program acquires the initial data.
- The worker program releases the updated data.
- The request program acquires the updated data.
And the request program gets the expected 0, 42, 84.
Note
The order of the "waiting" messages is interesting as well. If you add sleep(1) to the line after the shm_open() function call in request.c, you will see a different ordering. But the order of acquire and release must remain the same.
Epilogue
In this post, we covered two POSIX shared memory APIs: shm_open() and shm_unlink(). In addition, we explained how to synchronize shared memory with atomic_load(), atomic_store(), and spinlocks.
Although a spinlock is easy to understand, it wastes CPU cycles while waiting. POSIX semaphores are an alternative for inter-process synchronization. We will cover POSIX semaphores in the next post. Stay tuned.
A Historical Note
Before POSIX shared memory was standardized, System V shared memory (a part of XSI Interprocess Communication) was a common alternative. To use System V shared memory, you have to generate a key with ftok(), create a shared memory segment with shmget(), and then attach or detach the shared memory with shmat() or shmdt(). However, I prefer POSIX shared memory because shm_open() returns a file descriptor, which works with other file-related system calls, such as ftruncate(), fstat(), fcntl(), and mmap().