Scheduler and stealing work in progress
VoxSciurorum committed Aug 15, 2022
1 parent 56cab7b commit 8863fe4
Showing 1 changed file with 38 additions and 26 deletions.
64 changes: 38 additions & 26 deletions src/posts/scheduler.md
@@ -3,18 +3,22 @@
I want to explain how the OpenCilk compiler implements spawn, but
first I need to explain what spawn means. The `cilk_spawn` keyword is
fundamentally different from C's `pthread_create` and Java's
-`Thread.start`.
+`Thread.start`. This distinction goes back to the origins of Cilk in
+the 1990s and my description applies to the whole Cilk family of
+languages.

-The old approach to multithreading made programs that depended on
-threads to work. A program with a producer and consumer thread will
-hang if one of the threads doesn't run.
+By making threads explicit in the programming model, the old approach
+to multithreading made programs that depended on threads to work. A
+program with a producer and consumer thread will hang if one of the
+threads doesn't run.

A correct Cilk program behaves the same whether it runs on one thread
-or many. Usually you don't specify how many processors to use. By
-default Cilk uses as many _workers_ (threads running user code) as
-your system has processors. Then spawns enable parallelism. You can
-also ask it to run single-threaded. Then spawns don't do anything,
-but the program is still the same.
+or many. You do not have to create threads and it is poor style to
+examine thread state. By default Cilk uses as many _workers_ (threads
+running user code) as your system has processors. Spawns tell the
+system that part of the program can be moved to these workers. You
+can also ask it to run single-threaded. Then spawns don't do
+anything, but the program is still the same.
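
The claim that the program stays the same when spawns do nothing can be illustrated with a small sketch. The `fib` function below is an illustrative example, not code from the post. It assumes the compiler defines the `__cilk` macro when Cilk support is enabled (for example OpenCilk's clang with `-fopencilk`); when it is not defined, the keywords are erased by empty macros, which models the single-threaded meaning of the same source.

```c
#include <assert.h>

/* Assumption: __cilk is defined when the compiler has Cilk enabled.
   Without it, the keywords are erased and the code is ordinary C. */
#ifdef __cilk
#include <cilk/cilk.h>
#else
#define cilk_spawn
#define cilk_sync
#endif

/* Illustrative example (not from the original post). */
long fib(int n) {
    assert(n >= 0);
    if (n < 2)
        return n;
    long x = cilk_spawn fib(n - 1); /* child: may be run by another worker */
    long y = fib(n - 2);            /* parent keeps going in the meantime */
    cilk_sync;                      /* wait for the spawned child, if needed */
    return x + y;
}
```

Calling `fib(10)` should give the same result whether the keywords are active or erased, which is the sense in which a correct Cilk program does not depend on how many workers run it.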

## Spawning usually does nothing

@@ -50,26 +54,27 @@ When a function is spawned, i.e. a function call is prefixed with
`cilk_spawn`, the spawned function is called the _child_ and the function
with the `cilk_spawn` keyword is called the _parent_.

-Each worker has a deque (double-ended queue) of functions that have
-spawned. We call one end the _head_ and the other the _tail_.
-Spawning pushes the *parent* onto the tail. Returning from a spawn pops
-the parent off the tail.
+Each worker has a deque (double-ended queue) of parent functions,
+functions that have spawned. We call one end the _head_ and the other
+the _tail_. Spawning pushes the *parent* onto the tail. Returning
+from a spawn pops the parent off the tail.

Usually that's all that happens. Functions get pushed, functions get
popped, and in the end push and pop cancel out.

-Once in a while a worker has nothing to do. It looks at other workers
-and _steals_ a function by popping it off the *head* of the other
-worker's deque. This _work stealing_ is how parts of the program move
-between processors.
+Sometimes, especially at the start of a parallel region of code, a
+worker has nothing to do. An idle worker _steals_ a function from a
+busy worker by popping it off the *head* of the other worker's deque.
+This _work stealing_ is how parts of the program move between
+processors.
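
The push, pop, and steal operations described in this hunk can be sketched as a toy, single-threaded model. All names here (`deque_t`, `push_tail`, `pop_tail`, `steal_head`, `DEQUE_CAP`) are hypothetical and chosen for illustration; this is not the OpenCilk runtime's data structure, which also has to synchronize the owner with thieves.

```c
#include <assert.h>
#include <stddef.h>

#define DEQUE_CAP 64 /* arbitrary capacity for this sketch */

/* Toy model of one worker's deque. Integers stand in for the
   suspended parent functions. Hypothetical names throughout. */
typedef struct {
    int frames[DEQUE_CAP];
    size_t head; /* thieves take from this end (oldest parent) */
    size_t tail; /* the owner pushes and pops at this end */
} deque_t;

void deque_init(deque_t *d) { d->head = d->tail = 0; }

/* Spawn: the owner pushes the parent onto the tail. */
void push_tail(deque_t *d, int parent) {
    assert(d->tail < DEQUE_CAP);
    d->frames[d->tail++] = parent;
}

/* Return from a spawn: the owner pops the parent off the tail.
   Returns 0 when the deque is empty, i.e. the parent was stolen. */
int pop_tail(deque_t *d, int *parent) {
    if (d->tail == d->head)
        return 0;
    *parent = d->frames[--d->tail];
    return 1;
}

/* Steal: an idle worker pops the head of a busy worker's deque. */
int steal_head(deque_t *d, int *parent) {
    if (d->head == d->tail)
        return 0;
    *parent = d->frames[d->head++];
    return 1;
}
```

In this model, pushing parents 1 and 2, popping the tail, and then stealing the head leaves the deque empty, so the owner's next `pop_tail` fails. That failed pop corresponds to a child returning and finding its parent gone.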

## Only monsters steal children

-The thing that was popped off the head of the other worker's deque is
-a function that is suspended at a function call. More specifically,
-it is a data structure with enough information to resume the parent
-function as if the spawned child had returned. The thief does this on
-a new processor.
+The thing that was popped off the head of the other worker's deque
+describes a function that is suspended at a function call. More
+specifically, it is a data structure with enough information to resume
+the parent function as if the spawned child had returned. The thief
+does this on a new processor.

The worker from which work was stolen, sometimes called the _victim_,
is so far oblivious. It continues doing whatever it was doing in the
@@ -81,10 +86,12 @@ This scheduling policy avoids deadlocks and unnecessary migration of
work between processors.

The parent function, running on a new processor, has a flag set
-indicating that it has been stolen and it is not _synced_. It might
-spawn again and be stolen again. In any case it will eventually
-execute a `cilk_sync`. This triggers a call to the Cilk runtime which
-suspends the function until all spawned children return.
+indicating that it has been stolen. It might spawn again and be
+stolen again. In any case it will eventually execute a `cilk_sync`.
+If the function has never been stolen (the usual case) `cilk_sync`
+does nothing. If the function has been stolen `cilk_sync` calls into
+the Cilk runtime. The runtime suspends the function until all spawned
+children return.
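
The two cases for `cilk_sync` described in the added lines can be modeled roughly as follows. This is a toy sketch under stated assumptions: `frame_t`, its fields, `sync_result`, and `model_sync` are hypothetical names, not the OpenCilk runtime's API, and the `SYNC_CONTINUE` case (stolen, but no children still running) is an assumption about what happens when there is nothing left to wait for.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a function's frame. Hypothetical fields. */
typedef struct {
    bool stolen;          /* set once a thief has resumed this function */
    int pending_children; /* spawned children that have not yet returned */
} frame_t;

typedef enum {
    SYNC_NOOP,     /* never stolen: cilk_sync does nothing */
    SYNC_CONTINUE, /* stolen, but no children outstanding (assumed case) */
    SYNC_SUSPEND   /* stolen with children outstanding: runtime suspends */
} sync_result;

/* Decide what a cilk_sync would do for this frame in the model. */
sync_result model_sync(const frame_t *f) {
    if (!f->stolen)
        return SYNC_NOOP; /* the usual case: no call into the runtime */
    if (f->pending_children > 0)
        return SYNC_SUSPEND; /* wait until all spawned children return */
    return SYNC_CONTINUE;
}
```

The point of the fast path is that the common, never-stolen case costs only a flag check, while the runtime is involved only after a steal.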

The spawned child will eventually return. When it returns the worker
tries to pop the tail of the deque. This fails: the deque is empty.
@@ -94,3 +101,8 @@ continue. The worker is now idle and can start stealing.
(We have an optimization for the common case where the parent reached
a `cilk_sync` and is waiting for the spawn to complete.)

+## Stay tuned
+
+Having described at a high level what `cilk_spawn` does, next time I
+will describe what the compiler does to your code when you spawn.
