Introduction

Once you instantiate a std::function object, how is it, that you are able to stick objects of different actual types e.g. an anonymous lambda, a free-standing function or a function-pointer (with only a common function signature) to it? This is achieved through type erasure.

Type erasure is a programming technique by which the explicit type information is removed from the program. It is a type of abstraction that ensures that the program does not explicitly depend on some of the data-types. You might wonder, how is it, that a program is written in a strongly typed language but does not use the actual types?

How does type erasure look like?

Let’s see what a program with explicit types looks like. Consider a smart pointer such as std::unique_ptr<T,Deleter> in the standard library that models exclusive ownership of the managed resource.

int main(){
    std::unique_ptr<int> ptr{new int{10}};
}

Here is a unique pointer. It creates and deletes an int. The deletion is not explicitly visible, in fact nothing here tells you how the deletion will take place. But, it’s done by calling a callable object std::default_delete, which is the default deleter and under the hood, it calls the default delete ptr.

Suppose, however, we are interested to allocate/deallocate memory from our own heap/memory pool. In such case, we’d have to override the global new operator and pass the Heap* pointer. That’s what will be used for allocations.

class Heap{
    /* ... */
    void allocate(size_t size);
    void deallocate(void* ptr_to_block);
};

void* operator new(size_t n, Heap* heap){
    return heap->allocate(n);
}

Notice that, you don’t have operator delete() with arguments. You can write one, it will compile, but you can’t invoke it. In order to actually release memory on the heap, you have to do something else. You have to write your own custom deleter.

struct MyDeleter{
    /* ... */
    Heap* heap_;
    
    MyDeleter(Heap* heap) : heap_(heap)
    {}

    template <typename T>
    void operator()(T* ptr){
        ptr->~T();                  // invoke d'tor
        heap_->deallocate(ptr);     // release memory
    }
}

The function call operator() is overloaded and it accepts a pointer-to-T. It invokes the destructor ~T() and releases the memory occupied on the heap. The destructor has to be explicitly invoked, because you can’t call operator delete with arguments. Now, how do you hook this up to the unique_ptr? You pass it as a constructor argument. But, we also need to refer to a different type unique_ptr<int,MyDeleter>.

int main(){
    Heap myHeap;

    std::unique_ptr<int,MyDeleter> ptr{new (&heap) int(10), MyDeleter(&heap)};
}

This creates and deletes an int on the heap.

Notice that, unique_ptrs to the same type, but with different deleters are different types too. So, for example, you can’t assign from one to the other. You can actually deduce the Deleter type. If you have a unique_ptr object, you can actually interrogate it’s deleter_type. The deleter_type is embedded in the unique_ptr type.

#include <iostream>
#include <memory>

int main(){
    std::unique_ptr<int> p{new int(10)};
    static_assert(std::is_same_v<decltype(p)::deleter_type,std::default_delete<int>>);
    return 0;
}

Compiler Explorer

Now, let’s look at the shared_ptr<T> and contrast it with unique_ptr. If you don’t specify any deleters, they look exactly the same:

std::unique_ptr<int> u_ptr{ new int(10) };
std::shared_ptr<int> s_ptr{ new int(10) };

If you do specify a deleter, there’s a big difference. The constructor looks exactly the same.

int main(){
    Heap myHeap;
    std::unique_ptr<int,MyDeleter>  u_ptr{ new (&heap) int(10), MyDeleter(&heap) };
    std::shared_ptr<int>            s_ptr{ new (&heap) int(10), MyDeleter(&heap) };
}

But, in the unique_ptr the type of the deleter is the second template argument, whereas with the shared_ptr there is no mention of the deleter with the type. So, if you have an object of type shared_ptr, you cannot deduce which deleter was used to construct it. All shared_ptr instances with the same pointer type T*, are of the same type, even if they have different deleters.

So, the deleter type has been erased. That’s what it means to erase a type. Observe that the constructor call site is the last mention of the deleter type. From this point forward, you won’t see this type again.

Since, shared pointers with different deleters have the same type, you can assign one to the other.

int main(){
    Heap myHeap;
    std::shared_ptr<int>    p{ new int(10) };
    std::shared_ptr<int>    q{ new (&myHeap) int(10), MyDeleter(&myHeap) };
    q = p;      // Ok, they are the same type
}// Proper deleters are called

Also, when each shared pointer goes out of scope, the correct deleter is invoked. So, erased types are not explicitly visible in the program; they are hidden somewhere.

Type erasure as a design pattern

The ultimate type-erased object in C++ is std::function. Another one is std::any.

std::function<F> is a type that is instantiated from the signature of a callable.

#include <iostream>
#include <functional>
#include <vector>
#include <numeric>
#include <algorithm>
#include <cmath>

double gravitational_potential(std::vector<double> x){
    double r_squared = std::accumulate(x.begin(),x.end(),0.0,[](double accum, double element){
        accum += element * element;
        return accum;
    });
    double r = sqrt(r_squared);

    const double G = 6.6743e-11;
    const double M = 5.972e+24;
    double potential = -G*M/r;
    return potential;
};

int main(){
    std::function<double(std::vector<double>)> scalarValuedFunc;

    // the paraboloid z = x_0^2 + x_1^2 + ... + x_{n-1}^2 
    scalarValuedFunc = [](std::vector<double> x){
        return std::accumulate(x.begin(),x.end(),0.0,[](double accum, double element){
            accum += element * element;
            return accum;
        });
    };

    // Gravitational potential U at the point (x_0,...,x_{n-1}) in space
    scalarValuedFunc = gravitational_potential;
    return 0;
}

Compiler Explorer

The lambda and the free-standing function have different types. scalarValuedFunc has only one type, but can store any of these callable objects. Our std::function has only one type, and somehow, we can stick all these different types into it.

If you look at it from the design point of view, what it does is, it abstracts away all the behavior of the type you erase, except the set of behaviors you consider relevant. It’s a very flexible abstraction. In my case, I say, what’s relevant is, I can invoke this type with a vector of reals and I get back a real. That’s the only behavior of this type, that to me, at this moment is relevant. All other behavior is abstracted away.

Fundamentally, type erasure is an abstraction technique that allows you to separate interface from implementation, and when we talk about the interface, it’s actually a subset of the interface. It’s a subset of the interface, that we deem relevant for our particular problem.

Inheritance does the same thing. But, inheritance is significantly less flexible. Firstly, with inheritance, the interface you inherit - that has to be the whole interface, you can’t pick and choose. You can separate it from the implementation, but you don’t get to pick and choose like half of the interface. Second, it is much more intrusive. You can’t take an arbitrary class with the same interface and say, I want to use it through inheritance. No, you have to derive it from the base class.

How does type erasure work?

Consider the following C code.

void qsort(void* base, size_t nmemb, size_t size,
int (*compare)(const void*, const void*)        // no mention of specific types
);

int less(const void* a, const void* b){
    return *(const int*)a - *(const int*)b;     // type information recovered
}

int a[10] = [1, 10, 2, 9, 3, 8, 4, 7, 5, 6];
qsort(a, 10, size(int), less);                  // type of less erased here

qsort() comes from the standard C library. It takes void* pointer-to-array, size_t array count, size_t size of types you are trying to sort, and a comparator. The comparison function takes void*. It is used to compare whatever types you are actually sorting. Inside qsort, there is no mention of the type that you are going to sort.

The function call site is the last time in your execution flow, where you have the mention of the type int. From that moment on, there is no mention of int during the sorting process.

When you write the comparison function, the signature int (*)(void*, void*) is fixed. But, you know what you are writing this for. You are writing a comparator for ints. So, inside the function, I am going to recover the type information. So, the entire type erasure mechanism is seen here. The type is known at the invocation point and its the last time it is known. This is type erasure in C. The general code does not depend on which type we are sorting. All interfaces are completely generic and do not contain any type information.

I still need to perform a type dependent action at some point.

int less(const void* a, const void* b){
    return *(const int*)a - *(const int*)b;     // type information recovered
}

So, I have to generate some code that has a type-less or type agnostic interface, but the code itself is aware of the types. This act of recovering the types from typeless information is called reification (recovery). The comparator int less(void*, void*) performs the reification and executes type dependent code. Type reification in C is manual. From an understanding point of view, the only thing C++ adds to this, is that C++ automatically generates type reification functions.

The general code does not depend on the erased type. The call site is the last place where the actual type is known. Type is reified when type dependent action is performed. The type is hidden in the code of the function that performs this action. The function is invoked through a type-agnostic interface. The type dependent code converts from the abstract to the concrete type. In C++, we can have the compiler generate the type dependent code.

Type erasure - the basic mechanics

The classic type erasure pattern can be realized by first coding up a type-agnostic interface (a Concept class). Then we use an Impl class that wraps up the concrete type & provides the type-dependent implementation. Finally, we use dynamic polymorphism via virtual functions, but the caller only sees the interface. Finally, we use dynamic polymorphism via virtual functions, but the caller only sees the interface.

// Type erasure 101
#include <iostream>
#include <memory>

namespace dev{
    class function{
        public:

        function()
        : ptr{nullptr}
        {}

        template<typename Callable>
        function(Callable callable)
        : ptr{std::make_unique<Impl<Callable>>(callable)}
        {}

        void operator()(){
            (*ptr)();
        }
        // Type-agnostic interface
        struct Concept{
            virtual void operator()() = 0;
            virtual ~Concept(){}
        };

        // Type-dependent implementation
        template<typename Callable>
        struct Impl : public Concept{

            Impl(Callable callable)
            : m_callable(callable)
            {}

            void operator()() override{
                m_callable();
            }
            Callable m_callable;
        };

        std::unique_ptr<Concept> ptr;
    };
};

void foo(){
    std::cout << "\n" << "foo()";
}

void bar(){
    std::cout << "\n" << "bar()";
}

int main(){
    dev::function f;
    f = [](){
        std::cout << "\n" << "Anonymous lambda";
    };
    f();

    f = foo;
    f();

    void(*func_ptr)();
    func_ptr = bar;
    f = func_ptr;
    f();
    return 0;
}

Compiler Explorer

Type erasure - Adding support for a custom deleter to the `shared_ptr`

Suppose we’d like to add support for a custom deleter to the shared_ptr.

#include <iostream>

template<typename T>
class shared_ptr{

    private:
    T* m_underlying_ptr;
    std::atomic<unsigned long>* m_ref_count;
    // something about the deleter

    public:
    // Templated constructor
    template<typename Deleter>
    shared_ptr(T* ptr, Deleter deleter)
    : m_underlying_ptr{ptr}
    // ???  
    {}

    // Destructor
    ~shared_ptr(){
        // Decrement the ref-count. If m_ref_count == 0, 
        // delete m_underlying_ptr using deleter 
    }

    // Pointer like functions
    const T* operator->() const{
        return m_underlying_ptr;
    }
};

Compiler Explorer

We add a templated constructor shared_ptr(T*, Deleter ) that accepts a raw pointer and a custom deleter as the second argument to the design of our shared_ptr.

Let’s now write a Concept and Impl class that support destruction using an instance of the custom Deleter type.

    struct destroy_wrapper // RAII class
    {
        /* ... */

        // Concept class
        struct destroyer_base
        {
            virtual void operator()(void* ptr) = 0;
            virtual ~destroyer_base() = default;
        };

        // Impl class
        template<typename Deleter>
        struct destroyer : public destroyer_base
        {
            explicit destroyer(Deleter deleter)
              : destroyer_base()
              , m_deleter{ deleter }
            {
            }

            void operator()(T* ptr) override { m_deleter(ptr); }

            Deleter m_deleter;
        };

        ~destroy_wrapper() {}
        std::unique_ptr<destroyer_base> m_destroyer_ptr;
    };

The destroyer_base is the Concept class and the destroyer is its type-dependent implementation.

The wrapper class destroyer_wrapper wraps objects of different types (destoyer<Deleter>) into a common wrapper and treats them as though they are the same unified interface.

template<typename T>
class shared_ptr
{
    private:
    T* m_underlying_ptr;
    std::atomic<unsigned long>* m_ref_count_ptr;
    destroyer_wrapper* m_destroyer_ptr;

    public:
    // Templated constructor
    template<typename Deleter>
    shared_ptr(T* ptr, Deleter&& deleter)
    : m_underlying_ptr{ ptr }
    , m_ref_count_ptr{ nullptr }
    , m_destroyer_ptr{ nullptr }
    {
        try{
            m_destroyer_ptr = new destroy_wrapper(std::forward<Deleter>(deleter));
            m_ref_count_ptr->store(1u);
        } catch(std::exception& ex) {
            delete ptr;
            throw ex;
        }
    }

    /* ... */

    struct destroy_wrapper // RAII class
    {
        template<typename Deleter>
        explicit destroy_wrapper(Deleter&& deleter)
          : m_destroyer_ptr{ std::make_unique<destroyer>(std::forward<Deleter>(deleter)) }
        {
        }
        // destroy_wrapper is a wrapper over a unique_ptr<destroyer_base>.
        // It is intended to be move-constructible ONLY.
        destroy_wrapper(destroy_wrapper&& other) noexcept
          : m_destroyer_ptr{ std::exchange(other.m_destroyer_ptr, nullptr) }
        {
        }

        void operator()(pointer ptr)
        {
            (*m_destroyer_ptr)(ptr); // Virtual polymorphism
        }

        struct destroyer_base
        {
            virtual void operator()(pointer ptr) = 0;
            virtual ~destroyer_base() = default;
        };

        template<typename Deleter>
        struct destroyer : public destroyer_base
        {
            explicit destroyer(Deleter deleter)
              : destroyer_base()
              , m_deleter{ deleter }
            {
            }

            void operator()(pointer ptr) override { m_deleter(ptr); }

            Deleter m_deleter;
        };

        ~destroy_wrapper() {}
        std::unique_ptr<destroyer_base> m_destroyer_ptr;
    };
};

Introduction

How does type erasure look like?

Type erasure as a design pattern

How does type erasure work?

Type erasure - the basic mechanics

Type erasure - Adding support for a custom deleter to the shared_ptr

Type erasure - Adding support for a custom deleter to the `shared_ptr`