Introduction

Once you instantiate a std::function object, how are you able to store objects of different actual types in it, e.g. an anonymous lambda, a free-standing function or a function pointer, which share only a common function signature? This is achieved through type erasure.

Type erasure is a programming technique by which explicit type information is removed from the program. It is a form of abstraction that ensures that the program does not explicitly depend on some of its data types. You might wonder how a program written in a strongly typed language can avoid using the actual types.

What does type erasure look like?

The best-known type-erased object in C++ is std::function. Another one is std::any. Consider the following code snippet:

#include <print>
#include <functional>

void print_num(int i){
    std::println("{}", i);
}

auto display_lambda = [](int i){
    std::println("{}", i);
};

struct PrintFunctor{
    void operator()(int i){
        std::println("{}", i);
    }
};

int main()
{
    std::function<void(int)> f_print_num = print_num;
    std::function<void(int)> f_display_lambda = display_lambda;
    std::function<void(int)> f_print_functor = PrintFunctor{};
}


The free-standing function print_num, the lambda display_lambda and the functor PrintFunctor are objects of different types. So the type of the object being assigned to the std::function changes, but on the left-hand side we always have the same type: std::function<void(int)> can store any of these callable objects. Somehow, we can stick all these different types into it.

From a design point of view, type erasure abstracts away all the behavior of the type you erase, except the set of behaviors you consider relevant. It’s a very flexible abstraction. In my case, what’s relevant is that I can invoke this type with an int and get back void.

Type erasure - the basic mechanics

Step 1 - How to write a container that holds unrelated types?

On cppreference.com, std::function is defined as follows:

template< class R, class... Args >
class function<R(Args...)>;

The class template std::function is a general-purpose polymorphic function wrapper.

std::function has to be polymorphic, meaning it has to be able to hold completely unrelated types. They don’t have to be bound by an inheritance-hierarchy or any other sort of thing.

Our end-goal looks something like this:

struct MagicFunctionContainer
{};

MagicFunctionContainer f1 = print_num;
MagicFunctionContainer f2 = [](int i){ std::println("{}", i); };
MagicFunctionContainer f3 = PrintFunctor{};

We should be able to assign different objects of types like a free-standing function, a lambda expression or a functor to this MagicFunctionContainer.

Let’s start with designing a container, which is constructible from completely unrelated types:

struct MagicFunctionContainer{
    template<typename Func>
    MagicFunctionContainer(Func&& func)
    : m_func{std::forward<Func>(func)} 
    {}

    void operator()(int i){
        m_func(i);
    }

    private:
    //void(*)(int) m_func;   // we need to think
                             // of m_func's type
};

We now have a container called MagicFunctionContainer. The most important thing to note is that its constructor is a templated constructor, so you can pass any type into the MagicFunctionContainer constructor, and it simply forwards the object func into m_func. We still need to decide what the type of m_func is. The container also implements a function call operator operator(), which accepts an integer and returns void. So, when a container object is invoked with an integer i, it simply calls m_func(i) under the hood.

As long as we have defined the type of m_func, and it’s the correct type, this code satisfies our requirements. We now have a container that can be constructed from completely unrelated types. How do we store these unrelated types? How do we define what the type of m_func should be? The answer to this puzzle is step 2 of our design.

Step 2 - Can `m_func` be a polymorphic pointer to a place on the heap that will hold these unrelated types?

The classic type erasure pattern can be realized by first coding up a type-agnostic interface (a Concept class). Then we use an Impl class that wraps up the concrete type & provides the type-dependent implementation. Finally, we use dynamic polymorphism via virtual functions, but the caller only sees the interface.

// Type erasure 101
#include <iostream>
#include <memory>

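namespace dev{
    // Type-agnostic interface: the only behavior we keep is
    // "can be called with an int".
    struct Concept{
        virtual void operator()(int i) = 0;
        virtual ~Concept(){}
    };

    // Type-dependent implementation: wraps one concrete Callable
    // and inherits from Concept so it can sit behind a Concept*.
    template<typename Callable>
    struct Impl : Concept{

        Impl(const Callable& callable) 
        : m_callable{callable}
        {}

        void operator()(int i) override{
            m_callable(i);
        }
        
        Callable m_callable;
    };
}  // namespace dev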

Each time I get a new type Callable, I instantiate an implementation Impl<Callable> and copy the callable object into it.

Any type Callable that implements operator()(int) can be stored in Impl<Callable>. And Impl<Callable> inherits from Concept.

What we’ve achieved so far is that any Impl<Callable> object, as long as Callable implements a function call operator operator() that accepts an int and returns void, can be assigned to a pointer to Concept, Concept*. Remember, all the types Impl<Callable_1>, Impl<Callable_2>, …, Impl<Callable_n> inherit from Concept.

We can now revisit and finish the definition of MagicFunctionContainer.

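struct MagicFunctionContainer{
    // std::decay_t strips references, so Impl stores its own copy even
    // when an lvalue is passed in. Needs <functional>, <type_traits>
    // and <utility>.
    template<typename Func>
    MagicFunctionContainer(Func&& func)
    : m_func{new dev::Impl<std::decay_t<Func>>(std::forward<Func>(func))} 
    {}

    // m_func is an owning raw pointer: release it, and forbid copies
    // so that two containers never delete the same Impl.
    ~MagicFunctionContainer(){ delete m_func; }
    MagicFunctionContainer(const MagicFunctionContainer&) = delete;
    MagicFunctionContainer& operator=(const MagicFunctionContainer&) = delete;

    void operator()(int i){
        if(m_func == nullptr)
            throw std::bad_function_call();

        (*m_func)(i);
    }

    private:
    dev::Concept* m_func{nullptr};   
};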

In the MagicFunctionContainer, I have still got the templated constructor. I have still got the function call operator. But now I have a type for m_func: it is a pointer to Concept. In the templated constructor, each time I get any type Func, I pass that object into the new type Impl<Func>. So, essentially, the Concept* pointer is always pointing to an Impl<Func>.

As a result, although the MagicFunctionContainer is not templated itself, its constructor is templated and its m_func member variable is able to store different unrelated types.

Polishing our design for `std::function` like container

We can add some template magic and use concepts to constrain our template type parameter Func.

#include <print>
#include <functional>
#include <memory>
#include <concepts>

namespace dev{
    template<typename R, typename... Args>
    struct Concept{
        virtual R operator()(Args... args) = 0;
        virtual ~Concept(){}
    };

    template<typename Func, typename R, typename... Args>
    struct Impl : Concept<R, Args...>{
        Impl(Func func)
        : m_func{ func }
        {}

        R operator()(Args... args) override{
            return m_func(args...);
        }

        Func m_func;
    };

    template<typename R, typename... Args>
    class function{
        public:
        template<typename Func>
        requires std::invocable<Func, Args...>
        function(Func func) 
        : m_func{ std::make_unique<Impl<Func,R,Args...>>(func)}
        {}

        R operator()(Args... args){
            return (*m_func)(args...);
        }

        private:
        std::unique_ptr<Concept<R, Args...>> m_func;
    };
}


void print_num(int i){
    std::println("{}", i);
}

auto display_lambda = [](int i){
    std::println("{}", i);
};

struct PrintFunctor{
    void operator()(int i){
        std::println("{}", i);
    }
};

int main()
{
    dev::function<void, int> f_print_num = print_num;
    dev::function<void, int> f_display_lambda = display_lambda;
    dev::function<void, int> f_print_functor = PrintFunctor{};

    f_print_num(42);
    f_display_lambda(5);
    f_print_functor(17);
    return 0;
}


Type safety in type erasure

The type erased container should have the ability to hold unrelated types. In addition to this, there is one other requirement. The type erased container should retain the type information of the assigned object.

Consider the following example:

struct Foo{
    Foo(){ std::println("Constructed Foo()"); }
    ~Foo(){ std::println("Destructed ~Foo()"); }
};

void* foo_ptr = new Foo();
delete foo_ptr;

I have a type agnostic foo_ptr and I am assigning it an object of type Foo. When a Foo object is constructed, it prints the text Constructed Foo() to the console, when it is destructed, it prints Destructed ~Foo() to the console. Do you think this code compiles?

gcc issues the following warning:

<source>: In function 'int main()':
<source>:11:12: warning: deleting 'void*' is undefined [-Wdelete-incomplete]
   11 |     delete foo_ptr;
      |            ^~~~~~~

Let’s try something else:

std::unique_ptr<void> uptr{ new Foo() };

Do you think this code builds?

It’s the same problem. As per the standard, deleting through a void* pointer is undefined behavior. In this particular case, std::default_delete, the deleter that unique_ptr uses by default, has a static assertion built in to ensure that we don’t delete through a type-agnostic pointer.

<source>:11:43:   
   11 |     std::unique_ptr<void> uptr{ new Foo() };
      |                                           ^
/cefs/e6/e6c9babfba5a70d326be2358_gcc-trunk-20251219/include/c++/16.0.0/bits/unique_ptr.h:88:38: error: static assertion failed: can't delete pointer to incomplete type
   88 |         static_assert(!is_void<_Tp>::value,
      |                                      ^~~~~
  '!(bool)std::integral_constant<bool, true>::value' evaluates to false

We’ve seen that raw-pointers and unique_ptr don’t work. How about a shared_ptr? Do you think this builds?

std::shared_ptr<void> sptr{ new Foo() };

In this case, we see the following output:

Constructed Foo()
Destructed ~Foo()

This is surprising! Is the shared_ptr not storing a raw void* pointer, just like the unique_ptr? How is shared_ptr even able to delete the object correctly? Usually, once you assign a typed address T* to a pointer-to-void void*, you have lost the type information. How is shared_ptr still able to remember the original type Foo in order to destruct it correctly?

This is what we mean by type-safety in type erasure. From cppreference.com, if you look at the declarations for the unique_ptr and shared_ptr, they look something like this:

template< class T > class shared_ptr;

template<
    class T,
    class Deleter = std::default_delete<T>> class unique_ptr;

shared_ptr only has a template type parameter T. But, magically, it also stores the object type information and calls the correct destructor. unique_ptr<T>, by contrast, only knows the template type T, through its default deleter std::default_delete<T>.

The signature of the shared_ptr class template kind of gives this away. It doesn’t have a deleter as a template type parameter. The deleter type is erased. In shared_ptr, the deleter is based on the object type passed during construction (and not on the template type).

Type erasure - Adding support for a custom deleter to the `shared_ptr`

Let’s use our learnings above to add support for a custom deleter to the shared_ptr. The shared_ptr needs to be aware of the actual type being passed during construction in order to delete the object correctly. Let’s see how we can achieve that.

A shared_ptr<T1> can store objects of any type T2 as long as T2* is convertible to T1*. It type erases T2. Upon destruction, the shared_ptr will call the correct destructor of type T2, that is the destructor of the actual object stored, instead of type T1.

Here is a quick overview of the shared_ptr. Typical implementations of shared_ptr look as follows:

+-------------------+
|    shared_ptr<T>  |
+-------------------+
|                   |
|  +--------------+ |        +------------+
|  |   T* ptr     |-|------->|  T object  | 
|  +--------------+ |        |   (heap)   | 
|                   |        +------------+ 
|  +--------------+ |                       
|  | ControlBlock*| |        +------------------------------------------+               
|  |    cb_ptr    |-|------->|         Control Block (heap)             |
|  +--------------+ |        +------------------------------------------+
|                   |        |                                          |
+-------------------+        |  +------------------------------------+  |
                             |  |  size_t reference_count            |  |
                             |  +------------------------------------+  |
                             |                                          |
                             |  +------------------------------------+  |
                             |  |  size_t weak_count                 |  |
                             |  +------------------------------------+  |
                             |                                          |
                             |  +------------------------------------+  |
                             |  |  Deleter (custom or                |  |
                             |  |  default delete)                   |  |
                             |  +------------------------------------+  |
                             |                                          |
                             |  +------------------------------------+  |
                             |  |  Allocator (optional)              |  |
                             |  +------------------------------------+  |
                             |                                          |
                             +------------------------------------------+

You have a shared_ptr of type T. It stores a raw underlying pointer-to-T and a pointer to a control block. The control block is a separate object from the managed object. It stores the reference count, the weak count and some additional data such as the deleter. The custom deleter deletes the object through the type passed during shared_ptr construction, not through the type T.

Step 1 - Code up a container that can hold unrelated different types

We start with thinking about our shared_ptr class. So, we write a templated constructor shared_ptr(Y* ptr). You should be used to this by now - I can pass in any object to this. I later constrain this using the std::is_convertible_v<Y*, T*> type trait, to enforce that Y* is indeed convertible to T*.

We have another constructor that takes two parameters : a pointer to Y and a custom deleter.

Now, the pointer to Y, Y*, is convertible to a pointer to T, T*. So the pointer (the address) itself is simply stored in the class as a T* member variable: I have a member variable T* m_underlying_ptr at the bottom. The question is how we store the original object type Y and the deleter type Deleter. As soon as I assign Y* ptr to T* m_underlying_ptr, I have lost the information about the original object type Y. How do we store the deleter and the true object type Y?

#include <memory>       // std::default_delete
#include <type_traits>  // std::is_convertible_v

template<typename T>
class shared_ptr{
    public:

    template<typename Y>
    requires std::is_convertible_v<Y*, T*>
    shared_ptr(Y* ptr)
    : shared_ptr(ptr, std::default_delete<Y>{})
    {}

    // Templated constructor
    template<typename Y, typename Deleter>
    shared_ptr(Y* ptr, Deleter deleter)
    : m_underlying_ptr{ptr}
    // ???  
    {}

    // Destructor
    ~shared_ptr(){
        // Decrement the ref-count. If m_ref_count == 0, 
        // delete m_underlying_ptr using deleter 
    }

    // Pointer like functions
    const T* operator->() const{
        return m_underlying_ptr;
    }

    private:
    T* m_underlying_ptr;
};

Step 2 - Coding up a type-agnostic interface and a type-dependent implementation

Let’s now write a Concept and an Impl class that support destruction through an instance of the Impl<ObjType, Deleter> type. Concretely, we define the classes ControlBlockBase and ControlBlock<ObjType, Deleter>.

#include <atomic>
#include <cstddef>

namespace dev {
    // Type-agnostic Concept class
    struct ControlBlockBase{
        std::atomic<std::size_t> m_ref_count{1uz};
        virtual ~ControlBlockBase() = default;
    };

    // Type dependent Impl<ObjType,Deleter> implementation
    template<typename ObjType, typename Deleter>
    struct ControlBlock : ControlBlockBase{
        ControlBlock( ObjType* object_ptr, Deleter deleter)
        : m_object_ptr{ object_ptr }
        , m_deleter{ deleter }
        {}

        ~ControlBlock()
        {
            m_deleter(m_object_ptr);
        }

        private:
        ObjType* m_object_ptr;
        Deleter m_deleter;
    };
}  // namespace dev

ControlBlock inherits from ControlBlockBase and is templated. ControlBlockBase (our Concept) is not templated, but the implementation class ControlBlock is: it is templated on ObjType and on the Deleter type. The constructor ControlBlock(ObjType*, Deleter) takes two parameters: a pointer to the object and the deleter. ObjType is not the type parameter of the shared_ptr, which is T, but the type of the object passed to the shared_ptr during construction.

On destruction of the control block, it calls the deleter on a pointer to ObjType, which is the correct type. It’s not going to call the deleter on the template type. We can now finish up our definition of the shared_ptr.

#include <memory>       // std::default_delete
#include <type_traits>  // std::is_convertible_v

template<typename T>
class shared_ptr{
    public:

    template<typename Y>
    requires std::is_convertible_v<Y*, T*>
    shared_ptr(Y* ptr)
    : shared_ptr(ptr, std::default_delete<Y>{})
    {}

    // Templated constructor: the control block remembers both the
    // original pointer type Y* and the deleter.
    template<typename Y, typename Deleter>
    shared_ptr(Y* ptr, Deleter deleter)
    : m_underlying_ptr{ptr}
    , m_control_block_ptr{ new dev::ControlBlock<Y,Deleter>(ptr, deleter) }
    {}

    // Copy operations (which would increment m_ref_count) are omitted here.

    // Destructor
    ~shared_ptr(){
        // Decrement the ref-count. If it drops to 0, delete the control
        // block, whose destructor runs the deleter on the managed object.
        if(m_control_block_ptr->m_ref_count.fetch_sub(1) == 1)
            delete m_control_block_ptr;
    }

    // Pointer like functions
    const T* operator->() const{
        return m_underlying_ptr;
    }

    private:
    T* m_underlying_ptr;
    dev::ControlBlockBase* m_control_block_ptr;  // Concept* pointer
};

Observe that, as long as two shared_ptr objects have the same static type (here shared_ptr<void>), they can be assigned to each other and the destruction is still going to be correct.

#include <memory>

struct Foo{};
struct Bar{};
int main()
{
    std::shared_ptr<void> ptr1{ new Foo() };
    std::shared_ptr<void> ptr2{ new Bar() };
    ptr1 = ptr2;
}


This also explains why a shared_ptr<void> is allowed whereas a unique_ptr<void> won’t compile.

In the single-parameter constructor, when you don’t pass a deleter, the shared_ptr design uses a default deleter of the actual object type. It retains the information about the actual object type. It does not use a default deleter of the template type parameter T.

#include <memory>       // std::default_delete
#include <type_traits>  // std::is_convertible_v

template<typename T>
class shared_ptr{
    public:

    template<typename Y>
    requires std::is_convertible_v<Y*, T*>
    shared_ptr(Y* ptr)
    : shared_ptr(ptr, std::default_delete<Y>{}) // default deleter of ObjType
    {}

    // Templated constructor
    template<typename Y, typename Deleter>
    shared_ptr(Y* ptr, Deleter deleter)
    : m_underlying_ptr{ptr}
    , m_control_block_ptr{ new ControlBlock<Y,Deleter>(ptr, deleter) }
    {}

    // ...

    private:
    T* m_underlying_ptr;
    ControlBlockBase* m_control_block_ptr;  // Concept* pointer
};
