Coroutines
You’ve likely heard about this new C++20 feature, coroutines. I think this is a really important subject, and there are several cool use-cases for coroutines. A coroutine, in the simplest terms, is just a function that you can pause in the middle. Imagine that you are executing a function top-down, and at some point in between you say: I am going to hit the pause button here and return control to the caller. At a later point, the caller decides to resume the execution of the function right where you left off. Unlike a function, therefore, a coroutine is always stateful: at the very least, it needs to remember where it left off in the function body. Usually it needs to remember more than that, namely the values of all the local variables at the point where it paused. As a consequence, you can think of the coroutine you are calling not as a function in the classical sense of the term, but more like a factory function that returns a coroutine object back to you. This coroutine object is the one that holds all the state, and it is what you resume to continue the computation at a future time.
Coroutines can simplify our code! They are a great tool, for example, when it comes to implementing parsers.
Compared to functions, the control flow of coroutines looks as follows:
%%itikz --temp-dir --tex-packages=tikz --tikz-libraries=arrows.meta --implicit-standalone
\begin{tikzpicture}[scale=1.5]
\draw[blue, rounded corners] (0,0) rectangle (3,5);
\node(A) at (1.5,4.75) {Caller};
\node(B) at (1.5,3.50) {};
\node(C) at (1.5,3.20) {};
\node(call) at (2.20,4.20) {call};
\node(D) at (1.5,0.20) {};
\node(F) at (5.5,4.75) {Function};
\node(Func) at (5.55,4.50) {};
\node(Return) at (5.55,0.25) {};
\node(return_back) at (6.2,0.50) {\texttt{return}};
\draw[dashed, -{Stealth[length=3mm]}] (A) -- (B);
\draw[dashed, -{Stealth[length=3mm]}] (B) -- (Func);
\draw[dashed, -{Stealth[length=3mm]}] (Func) -- (Return);
\draw[dashed, -{Stealth[length=3mm]}] (Return) -- (C);
\draw[dashed, -{Stealth[length=3mm]}] (C) -- (D);
\draw[blue, rounded corners] (4,0) rectangle (7,5);
\end{tikzpicture}

Fig. Functions - control flow
%%itikz --temp-dir --tex-packages=tikz --tikz-libraries=arrows.meta --implicit-standalone
\begin{tikzpicture}[scale=1.5]
\draw[blue, rounded corners] (0,0) rectangle (3,7);
\node(F) at (5.5,6.75) {Coroutine};
\node(A) at (1.5,6.75) {Caller};
\node(B) at (1.5,5.50) {};
\node(coro_A) at (5.55,6.50) {};
\node(coro_B) at (5.55,5.20) {};
\node(suspend_1) at (4.75,5.70) {suspend};
\node(coyield_1) at (6.50,5.25) {\texttt{co\_yield}};
\node(call_1) at (2.20,6.20) {call};
\node(C) at (1.5,5.20) {};
\node(D) at (1.5,3.70) {};
\node(coro_C) at (5.55,3.70) {};
\node(resume_1) at (4.75,4.10) {resume};
\node(coro_D) at (5.55,2.20) {};
\node(suspend_2) at (4.75,2.70) {suspend};
\node(coyield_2) at (6.50,2.25) {\texttt{co\_yield}};
\node(E) at (1.5, 2.20) {};
\node(F) at (1.5, 1.25) {};
\node(resume_2) at (4.75,1.55) {resume};
\node(coro_F) at (5.55, 1.25) {};
\node(coro_G) at (5.55, 0.25) {};
\node(return_back) at (6.2,0.25) {\texttt{co\_return}};
\node(H) at (1.5, 0.25) {};
\draw[dashed, -{Stealth[length=3mm]}] (A) -- (B);
\draw[dashed, -{Stealth[length=3mm]}] (B) -- (coro_A);
\draw[dashed, -{Stealth[length=3mm]}] (coro_A) -- (coro_B);
\draw[dashed, -{Stealth[length=3mm]}] (coro_B) -- (C);
\draw[dashed, -{Stealth[length=3mm]}] (C) -- (D);
\draw[dashed, -{Stealth[length=3mm]}] (D) -- (coro_C);
\draw[dashed, -{Stealth[length=3mm]}] (coro_C) -- (coro_D);
\draw[dashed, -{Stealth[length=3mm]}] (coro_D) -- (E);
\draw[dashed, -{Stealth[length=3mm]}] (E) -- (F);
\draw[dashed, -{Stealth[length=3mm]}] (F) -- (coro_F);
\draw[dashed, -{Stealth[length=3mm]}] (coro_F) -- (coro_G);
\draw[dashed, -{Stealth[length=3mm]}] (coro_G) -- (H);
\draw[blue, rounded corners] (4,0) rectangle (7,7);
\end{tikzpicture}

Fig. Coroutines - control flow
Coroutine frames (the state of all the local variables at the point where the coroutine paused, plus the resume point) may be stored on the stack or on the heap. C++ implements the latter approach: stackless coroutines, whose frames are dynamically allocated.
When using coroutines, we are talking about cooperative multitasking. As a quick 101: in preemptive multitasking, a special hardware timer interrupt is sent to the CPU periodically, the register state of the currently running process is saved to main memory (in a data structure called the process context), and the OS kernel acquires control of the CPU. The OS scheduler uses a scheduling policy such as round-robin to pick another process from the process queue, copies its context from main memory into the CPU registers and transfers control to it. This is called a context switch. In cooperative multitasking, each process/thread, once running, decides for how long to keep the CPU and (crucially) when it is time to give it up so that another thread can use it.
A simple example
Let’s start with a simple example. Say you want to compute the Fibonacci sequence. The first thing you’d do is code up a function fibo(int) that returns the sequence in a vector.
// Returns a vector containing the
// first n elements of the Fibonacci
// series:
// 1, 1, 2, 3, 5, 8, 13, 21, ...
std::vector<int> fibo(int n)
{
std::vector<int> results{};
int a_prev{1}, a_curr{1};
int a_next{0};
results.push_back(a_prev);
results.push_back(a_curr);
for(int i = 2; i < n; ++i)
{
a_next = a_prev + a_curr;
results.push_back(a_next);
a_prev = a_curr;
a_curr = a_next;
}
return results;
}

This naive implementation has a number of disadvantages. For example, this function always requires \(O(n)\) storage. Fibonacci has a simple recurrence, but if I have a complex computation and I am only ever processing the numbers sequentially, one at a time, then it makes no sense to pay for the entire \(O(n)\) storage. Another problem: if you are dealing with an infinite range, this approach becomes much harder to handle.
A different way of implementing this would be as follows. Instead of returning the whole range in a vector, we just return a generator object; this generator object has a next() member function, and every call to next() gives us back the next number in the sequence.
struct FiboGenerator
{
// Successive calls to next()
// return the numbers from the
// Fibonacci series
int next();
};
// Returns a new FiboGenerator object
// that will start from the first
// Fibonacci number
FiboGenerator makeFiboGenerator();

The interesting question is: is makeFiboGenerator() a coroutine? We really can’t tell; it’s an implementation detail. All we see is an interface. This is actually the most important thing about coroutines. From the outside, they just look and behave like functions. There is no way to tell whether they’ve been implemented as a coroutine or not. The only thing we need to know as users of this function is the interface of the FiboGenerator object.
The coroutine return type
The initial call to the coroutine function produces a return object of a certain ReturnType and hands it back to the caller. The interface of this type determines what the coroutine is capable of. Since coroutines are super-flexible, we can do a whole lot with this return object. If you have some coroutine and you want to understand what it’s doing, the first thing you should look at is the ReturnType and what its interface is. The important thing here is that we design this ReturnType: if you are writing a coroutine, you decide what goes into this interface.
How to turn a function into a coroutine?
The compiler looks for one of three keywords in the implementation: co_yield, co_await and co_return.
| Keyword | Action | State |
|---|---|---|
| `co_yield` | Output | Suspended |
| `co_return` | Output | Ended |
| `co_await` | Input | Suspended |
In the preceding table, we see that after co_yield and co_await the coroutine suspends itself, and after co_return it terminates (co_return is the equivalent of the return statement in a classical C++ function).
A classical function starts executing when it’s called and normally terminates with a return statement, or when the end of the function body is reached. It may call another function (or even itself, if it is recursive), it may throw exceptions, and it may have several return points, but it always runs from beginning to end in one go.
A coroutine is different. A coroutine is a function that can suspend itself. The flow for a coroutine may be like the following pseudo-code:
ReturnType coroutine(){
do_something();
co_yield;
do_something_else();
co_yield;
do_more_work();
co_return;
}

You should think of the coroutine function not as a function in the classical sense, but rather as a factory function that returns a coroutine object; this coroutine object holds all the state, and it is the one you resume to continue the computation later on.
Use-cases for coroutines
Asynchronous computation. Suppose we are tasked with designing a simple echo server. We listen for incoming data from a client socket and we simply send it back to the client. At some point in our code for the echo server, we will have a piece of logic like below:
void session(Socket sock){
char buffer[1024];
int len = sock.read({buffer});
sock.write({buffer,len});
log(buffer);
}

We create a buffer, then call read(), which reads packets from the NIC and stores them in the buffer. The return value len tells us how many bytes we read. Then we write back only the first len bytes of the buffer. Finally, we perform some logging. If we run it, it will probably work, but we certainly don’t want to write a server like this. Say one of the clients requests communication and we are in a session. They say they are ready to send the data, so we block on the read, but maybe they send us this data in 2 minutes, or 5 minutes, or even more. Meanwhile, other clients keep waiting.
What we could do is start this session and run it in a separate thread. This would work, because if this thread is blocked and there are other threads in the server, they run in the meantime. If we have a thread pool, we’ll just grab a thread from the pool and tell whatever system manages the pool to run this session(Socket) function there.
This solution is still far from ideal. These are OS-managed threads, and the OS has to decide which threads get CPU time: it will preempt one thread and run another. This preemption occurs at random moments, when we least expect it. We could have data races, and we will likely need to protect critical sections/resources with mutexes.
One alternative is to use an asynchronous framework and rewrite our code as follows:
void session(Socket sock){
    struct State{ Socket sock; char buffer[1024]; };
    // Heap-allocate the state, so that it outlives this call
    auto state = std::make_shared<State>();
    state->sock = std::move(sock);
    auto on_read_finished_callback = /* ... */
    // Perform an asynchronous read
    state->sock.async_read( state->buffer,
                            on_read_finished_callback );
}

The async_read() call is a request to the framework: we are interested in data being read into our buffer at some point convenient to the framework, and when that happens, it should call our on_read_finished_callback. We are just registering this association at this point, and the server can move on to doing other things.
The read operation may fail for a number of reasons, so a good approach is to supply two arguments to the on_read_finished_callback: (i) an error code, and (ii) the number of bytes received, if there was no error. We can implement this by checking the error code.
void session(Socket sock){
    struct State{ Socket sock; char buffer[1024]; };
    // Heap-allocate the state
    auto state = std::make_shared<State>();
    state->sock = std::move(sock);
    // Capture the shared_ptr by value: the lambda outlives
    // this function call, so a reference would dangle
    auto on_read_finished_callback = [state](
        error_code ec,
        size_t len
    )
    {
        auto done = /* ... */
        if(!ec)
        {
            // Perform an asynchronous write
            state->sock.async_write(
                state->buffer,
                done
            );
        }
    };
    // Perform an asynchronous read
    state->sock.async_read( state->buffer,
                            on_read_finished_callback );
}