WWAD 1: Concurrency

LANG, WWAD

Sometimes in a program, we'd like to do several things simultaneously, which we commonly call concurrency [1]. In this first installment of What Would Ada Do?, we'll look at Ada's tasking system – how it works, its benefits, and some of its disadvantages and dangers.

Table of contents
  1. Tasks
  2. Structured concurrency
  3. Pitfalls
  4. Conclusion
  5. Notes

Tasks

A task is some piece of code that runs concurrently with the rest of the program [2].

As an example, let's write a simple concurrent program. This program will have a task Storage, in which we can first store an integer and later fetch it back. This will be done with two “messages” Store and Fetch, which must be sent in that order. The main task will first send a message to the Storage task to store a number, and will later fetch it.

First, note that a task must be declared within some declarative region (such as inside a procedure, a package or another task). In this example, we'll just put it inside the Main procedure.

with Ada.Text_IO; use Ada.Text_IO;

procedure Main is

Now we need to declare the external interface of the Storage task: which “messages” (called entries in Ada) can be called from another task. In this case, we have just two.

  task Storage is
    entry Store (X :     Integer);
    entry Fetch (X : out Integer);
  end Storage;

Entry declarations are largely similar to procedure declarations. Store takes a single Integer parameter, while Fetch takes an out Integer. Out parameters can be assigned to, which gives us a way to return results.

Next, we will implement the task. It is fairly straightforward, though take notice of the accept statements. These “implement” the entries declared above, although they are statements, not declarations. An accept statement effectively blocks until the given entry is called from some other task, at which point its body is executed and control moves on to the next statement.

  task body Storage is
    Value : Integer;
  begin
    Put_Line ("Storage 1");

    -- Handle the 'Store' entry
    accept Store (X : Integer) do
      Value := X;
    end Store;

    Put_Line ("Storage 2");

    -- Handle the 'Fetch' entry
    accept Fetch (X : out Integer) do
      X := Value;
    end Fetch;

    Put_Line ("Storage 3");
  end Storage;

That concludes the specification and implementation of the Storage task. Finally, we'll implement the Main procedure, which communicates with the task through its entries. Note that entry calls also block, until the corresponding accept has finished processing them. This means that the program will deadlock if Storage.Fetch is called before Storage.Store: the task body above accepts only Store at first, so both tasks would block forever.

  X : Integer;
begin
  Put_Line ("Main 1");

  -- Call the 'Store' entry on the 'Storage' task
  Storage.Store (3);

  Put_Line ("Main 2");

  -- Call the 'Fetch' entry
  Storage.Fetch (X);

  Put_Line ("Main " & Integer'Image(X));
end Main;

If we run this program, we'll get some output that looks something like this:

Storage 1
Main 1
Storage 2
Main 2
Main 3
Storage 3

Note that the order in which lines with the same number appear may vary, as those bits of code run concurrently. However, both lines with 1 will always come before the lines with 2, which in turn will always come before the lines with 3. This is because entries are synchronizing – on an entry call, the caller blocks until the corresponding accept finishes, while an accept blocks until the corresponding entry is called. In this way, they act like a powerful inter-task function call mechanism.

One final thing to note is that we are guaranteed to get all six lines. This is because a scope in Ada waits for all the tasks declared within it to finish before it itself exits. If you think of a task like a thread, imagine that there's an implicit thread.join() at the end of every block. One consequence of this is that a task can safely access stack-allocated variables declared outside it (modulo race conditions).
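
To make this concrete, here is a minimal sketch of the implicit join (Join_Demo and Worker are names made up for this example, not from the program above). The inner block cannot exit until Worker terminates, so reading the variable afterwards is safe:

with Ada.Text_IO; use Ada.Text_IO;

procedure Join_Demo is
  Result : Integer := 0;
begin
  declare
    task Worker;

    task body Worker is
    begin
      -- Write to a stack-allocated variable
      -- declared outside the task
      Result := 42;
    end Worker;
  begin
    null;
  end; -- The block waits here until Worker has terminated

  -- Worker is guaranteed to be done, so this always prints 42
  Put_Line (Integer'Image (Result));
end Join_Demo;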

Ada's concurrency support is more featureful than this, including protected objects to guard against data races; select statements with delay and terminate alternatives to accept any of several entries, time out, or end the task once all other tasks are done; and task types to store tasks in variables and arrays.
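
As a taste of the select statement, here is a hedged sketch (not the version used in the program above) of how the Storage body could be rewritten to accept Store and Fetch in any order, any number of times, and to end gracefully:

  task body Storage is
    Value : Integer := 0;
  begin
    loop
      select
        accept Store (X : Integer) do
          Value := X;
        end Store;
      or
        accept Fetch (X : out Integer) do
          X := Value;
        end Fetch;
      or
        -- End the task once all other tasks are done
        -- and no further entry call can ever arrive
        terminate;
      end select;
    end loop;
  end Storage;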

Structured concurrency

One of the greatest advantages of Ada's concurrency support, in my opinion, is that it serves as a genuinely high-level abstraction over lower-level concurrency primitives, like threads, mutexes and channels. Instead of having to manually manage and keep track of different methods of data sharing and synchronization, the task system neatly abstracts over it and takes care of it for you.

That's not to say that such lower-level primitives are unnecessary, but in my experience, the “common case” is one where the implementation specifics are unimportant. One interesting consequence of this abstraction is that Ada has potentially one of the simplest examples of a deadlock I know of:

procedure Main is
  task Locky is
    entry A;
    entry B;
  end Locky;

  task body Locky is
  begin
    -- Block until 'A' is called
    accept A do
      null;
    end A;
  end Locky;
begin
  -- Block until 'B' is accepted, which never
  -- happens: Locky only ever accepts 'A'
  Locky.B;
end Main;

Whether this is a symptom of a good abstraction of concurrency, a result of bad design, or just a lacking one is arguable.

Deadlocks aside, the fact that all tasks are guaranteed to finish (even in the face of exceptions) represents one of the earlier forms of structured concurrency [3].

Pitfalls

As mentioned, deadlocks and data races are dead easy in Ada [4]. Although protected objects prevent the latter, they can be quite heavyweight (given that they are essentially fancy mutexes). Ada 202x improves on this area with standard-library support for atomic operations, and with parallel blocks and loops for more lightweight structured concurrency (and better parallelism support).
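
To illustrate the kind of mutual exclusion protected objects provide, here is a minimal sketch of one guarding a shared counter (Counter is a made-up name; like a task, it would be declared in some declarative region). Calls to the procedure Increment are mutually exclusive, while the function Value only needs read access:

protected Counter is
  procedure Increment;
  function Value return Integer;
private
  Count : Integer := 0;
end Counter;

protected body Counter is
  procedure Increment is
  begin
    -- Protected procedures get exclusive access,
    -- so this read-modify-write cannot race
    Count := Count + 1;
  end Increment;

  function Value return Integer is
  begin
    -- Protected functions only need read access,
    -- so multiple readers may run concurrently
    return Count;
  end Value;
end Counter;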

Part of the reason that deadlocks are so easy is that the ordering of entry calls is not part of a task's interface. Integrating something like session types in order to statically enforce this ordering may be an interesting avenue for exploration [5].

The fact that tasks run to completion even in the face of exceptions means that long-running or infinitely spinning tasks need to include complicated logic if they want exceptions to bubble up fast. Exceptions raised in tasks other than the main task will also disappear, which is unfortunate given that the structured concurrency means they could potentially all end up on the same thread.
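
For instance, in this small sketch (Vanishing and Doomed are made-up names) the exception never reaches the main task; Doomed simply terminates, and the rest of the program carries on none the wiser:

with Ada.Text_IO; use Ada.Text_IO;

procedure Vanishing is
  task Doomed;

  task body Doomed is
  begin
    -- An unhandled exception here silently
    -- terminates the task; nothing propagates
    raise Constraint_Error;
  end Doomed;
begin
  Put_Line ("Main never hears about the exception.");
end Vanishing;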

Finally, exceptions are also a great way to deadlock:

procedure Main is
  task Locky is
    entry A;
  end Locky;
  
  task body Locky is
  begin
    -- This task won't terminate until
    -- 'A' is called
    accept A do
      null;
    end A;
  end Locky;

  E : exception;
begin
  -- This exception won't unwind until
  -- all tasks have terminated
  raise E;
end Main;

Conclusion

Perhaps the main thing I wanted to present here is what I believe to be a pretty good abstraction for writing concurrent code. Although it is far from perfect – with odd interactions with various other features, a complicated and at times heavyweight feature set, and accessible ways of shooting yourself in the foot – I think it serves as a better form of concurrency than the common fire-and-forget threading model.

Other features like generators and async-await serve a similar niche with an equally (if not more) high level of abstraction. Although the tasking model is perhaps more flexible than either of these, you could argue that the flexibility is largely detrimental (when was the last time you caused a deadlock with a generator?). In that sense, tasks lie perhaps somewhere in between manually working with primitives like threads, channels and mutexes, and writing asynchronous functions which hide the low-level machinery.