Introduction

C unions are simple and crude. You don’t have a way to know what’s the currently active type. The destructor is implicitly deleted, and if the user supplies a destructor, we don’t know which underlying type should be destroyed.

#include <iostream>
#include <vector>
#include <string>
#include <print>

union S{
    std::string str;
    std::vector<int> vec;
    ~S(){}  // what to delete?
};

int main(){
    S s = {"Hello World"};
    std::println("s.str = {}", s.str);

    // you have to call the destructor of the contained objects
    s.str.~basic_string<char>();

    // and a constructor
    new (&s.vec) std::vector<int>{};

    s.vec.push_back(42);
    std::println("s.vec.size() = {}", s.vec.size());

    // another destructor
    s.vec.~vector<int>();
}

Compiler Explorer

As you see, the S union needs a lot of maintenance from our side. We have to know, which type is active and call appropriate constructors/destructors before switching to a new type.

`std::variant` usage

The std::variant template is essentially a smart union. std::variant<T1, T2, T3> is similar to union{T1 v1; T2 v2; T3 v3} in that both can store a value of one of the specified types and only one value can be stored at a time. The key difference is that a variant objects knows which type it contains, while with a union the programmer is wholly responsible for reading the same type as what was written earlier. It is undefined behavior to access a union as a type that differs from the one used to initialize it:

In contrast std::variant offers a safe way to store values of different types in the same memory. It is easy to check at runtime which alternative type is currently stored in the vartiant, and accessing a variant as the wrong type throws an exception.

std::variant<int, double, std::string> v;
v = 0;
std::cout << v.index();
++std::get<0>(v);
// std::get<1>(v);  //throws std::bad_variant_access

In many ways, std::variant offers capabilities similar to inheritance-based run-timne polymorphism: both let us write code where the same variable name can refer to objects of different types at run-time. The two major differences are: first, a std::variant does not require that all its types come from the same hierarchy(they need not be classes at all) and second, a variant object can only store one of the types listed in its declaration, while a base class pointer can point to any derived class.

Visitors are objects that are used to implement the Visitor design pattern. Typically, they are used to add an operation to all classes in a class hierarchy, that is the equivalent of adding a virtual function, but without modifying the classes themselves. When a visitor object visits a variant, they invoke the best matching function call operator for the actual value of the variant.

#include <variant>
#include <string>
#include <iostream>

struct MyVisitor{
    void operator()(int i) const{
        std::cout << "int: " << i << "\n";
    }

    void operator()(std::string s) const{
        std::cout << "string: " << s << "\n"; 
    }

    void operator()(double d) const{
        std::cout << "double: " << d << "\n";
    }
};

int main(){
    std::variant<int, std::string, double> var(42);
    std::visit(MyVisitor(), var);   // calls operator()(int)

    var = "hello";
    std::visit(MyVisitor(), var);   // calls operator()(std::string)

    var = 3.14;
    std::visit(MyVisitor(), var);   // calls operator()(double)
    return 0;
}

Compiler Explorer

Using generic lambdas as visitors

#include <variant>
#include <string>
#include <iostream>

auto printVisitor() = [](const auto& v){
    std::cout << "\n" << v;
};

int main(){
    std::variant<int, std::string, double> var(42);
    std::visit(printVisitor, var);
    var = "hello";
    std::visit(printVisitor, var);
    var = 42.7;
    std::visit(printVisitor, var);
    return 0;
}

The cracked-man’s match

template<typename... Ts>
struct overload : Ts... {
    using Ts::operator()...;
};

// base types are deduced from the passed arguments
template<typename... Ts>
overload(Ts...) -> overload<Ts...>;

int main(){
    std::variant<int, std::string> var(42);
    std::visit(overload{
        [](int i){ std::cout << "\n" << "int : " << i; },
        [](std::string s){ std::cout << "\n" << "string : " << s; }
        [](double d){ std::cout << "\n" << "double : " << d; }
    }, var);
}

Since the struct overload inherits from the closure types (__my_lambda_with_int_param_struct and __my_lambda_with_string_param_struct), we need to bring their function call operators in scope.

Multiple Dispatch

As mentioned before, the killer feature of std::variant is multiple dispatch. Here is a common interview problem. Imagine you have an excel spreadsheet, and a spreadsheet has cells. By definition, in a cell, you can have either an int or a string. And then, there’s an operation defined like plus operator+(). For example, you can add two numbers. Adding two strings results in concatentation. string + int : you convert into int to string and concatenate. And int + string : you throw an error. This interview problem is much easier to solve, if you have a variant.

Here, you have to select an overload not based on just one type (like in virtual functions), but on two types or in general \(N\) types.

#include <string>
#include <variant>
#include <stdexcept>

template<typename... Ts>
struct overloaded : public Ts...{
    using Ts::operator()...;
};

// base types are deduced from passed arguments
template<typename... Ts>
overloaded(Ts...) -> overloaded<Ts...>;

struct Cell{
    std::variant<int, std::string> data;

    Cell& operator+=(Cell& other){
        std::visit(overloaded{
            [](int a, int b){ a += b; },
            [](std::string a, int b){ a += std::to_string(b); },
            [](int a, std::string b){ throw std::runtime_error("Invalid inputs!"); },
            [](std::string a, std::string b){ a += b; }
        }, data, other.data);

        return *this;
    }
};

int main(){}

Compiler Explorer

std::visit() is a way to dispatch on a variant. It does multiple dispatch - based on my own self Cell type and the other Cell type.

What is a `std::variant` under the hood?

// Pseudo-code
namespace dev{
    template<typename... Ts>
    struct variant{
        union{
            Ts... elements;
        };

        // ....
        index_type index;
    };

    auto visit(auto&& visitor, auto&&... args){
        switch(...)
            case 0:
                visitor.visit(get<0>(args)...);
            case 1:
                visitor.visit(get<1>(args)...);
    }
}

Under the hood, a std::variant is a union of all of the possible alternatives. It has an active index. And then, in visit there is some sort of switch statement, between all of the possible types. And we’re going to implement all of these components.

TMP Utilities

We are going to start with just a bunch of template metaprogramming utilities.

`find_index_of_v`

#include <print>
#include <concepts>
#include <initializer_list>
#include <string>

namespace dev{

namespace tools{
    namespace type_lists{
        template<typename T, typename... Ts>
        constexpr size_t find_index_of_impl(){
            size_t i{0uz};
            for(bool is_same_type : { std::is_same_v<T, Ts>... }){
                if(is_same_type)
                    break;

                ++i;
            }

            return i;
        }

        // define a concept that checks if the index is not
        // out of bounds
        template<typename T, typename... Ts>
        concept out_of_bounds_check = (
            find_index_of_impl<T,Ts...>() < sizeof...(Ts)
        );
    } // namespace type_lists

    template<typename T, typename... Ts>
    requires type_lists::out_of_bounds_check<T, Ts...>
    constexpr size_t find_index_of_v = type_lists::find_index_of_impl<T,Ts...>();

    template<typename T, typename... Ts>
    concept exists = requires{
        { find_index_of_v<T,Ts...> };
    };

} // namespace tools

}

// -------------- tests ------------
static_assert(dev::tools::find_index_of_v<int, int, float, double> == 0);
static_assert(dev::tools::find_index_of_v<float, int, float, double> == 1);
static_assert(dev::tools::exists<double, int, float, std::string, double>);
static_assert(!dev::tools::exists<std::string, int, float>);

int main(){}

Compiler Explorer

`uint_atleast_t`

The next utility we are going to write is uint_atleast_t. I want to represent a number and I want to use the smallest unsigned int possible to store this number. For a 100, uint8_t will be enough. For 256, I already need uint16_t.

#include <cstdint>
// -------------- tests ------------
static_assert(std::same_as<uint8_t,dev::tools::uint_atleast<100>>);
static_assert(std::same_as<uint8_t,dev::tools::uint_atleast<255>>);
static_assert(std::same_as<uint16_t,dev::tools::uint_atleast<256>>);
static_assert(std::same_as<uint32_t,dev::tools::uint_atleast<std::numeric_limits<uint32_t>>>);
static_assert(std::same_as<uint64_t,dev::tools::uint_atleast<5'000'000'000>>);

We define uint_atleast_t as the type returned by the function uint_atleast_impl().

namespace dev{

namespace tools{
    template<size_t max>
    using uint_atleast_t = decltype(int_utils::uint_atleast_impl<max>());
}

}

The implementation is very easy - we just define a constexpr table. If you see, I am just checking conditions one after the other, in a fallthrough.

#include <cstdint>

namespace dev{

namespace tools{
    namespace int_utils{
        template<size_t max>
        constexpr auto uint_atleast_impl(){
            if constexpr(max <= std::numeric_limits<std::uint8_t>::max())
                return uint8_t{};
            else if constexpr(max <= std::numeric_limits<std::uint16_t>::max())
                return uint16_t{};
            else if constexpr(max <= std::numeric_limits<std::uint32_t>::max())
                return uint32_t{};
            else if constexpr(max <= std::numeric_limits<std::uint64_t>::max())
                return uint64_t{};                
        }
    }
    
    template<size_t max>
    using uint_atleast_t = decltype(int_utils::uint_atleast_impl<max>());
    
}

// -------------- tests ------------
static_assert(std::is_same_v<uint8_t, dev::tools::uint_atleast_t<100>>);
static_assert(std::is_same_v<uint8_t, dev::tools::uint_atleast_t<255>>);
static_assert(std::is_same_v<uint16_t, dev::tools::uint_atleast_t<256>>);
static_assert(std::is_same_v<uint32_t, dev::tools::uint_atleast_t<std::numeric_limits<uint32_t>>>);
static_assert(std::is_same_v<uint64_t, dev::tools::uint_atleast_t<5'000'000'000>>);

Compiler Explorer

`get_nth_type`

Writing a good get_nth_type implementation is a surprisingly difficult problem. There is an excellent CppCon talk by Kris Jusiak on this, called the nth_element case study, where he talks specifically about this problem - how to get the nth type from a pack. This problem is not trivial to solve in optimal compile-time.

What we do is, to create a constexpr table. We are going to unroll the first 10 types. If the type index is 0, we are going to return the first type, if the type index is 1, we going to return the second type and so forth. If, for example, you request for the type at index 5 or even index 9, the function can directly return the type. If you request the type at index 10, the function calls itself with the template argument list get_nth_type<10-10, T10>() and the template is instantiated again.

namespace dev{
namespace tools{
    namespace type_lists{
        template <size_t n,
                  typename T0 = void,
                  typename T1 = void,
                  typename T2 = void,
                  typename T3 = void,
                  typename T4 = void,
                  typename T5 = void,
                  typename T6 = void,
                  typename T7 = void,
                  typename T8 = void,
                  typename T9 = void,
                  typename... Ts>
        constexpr auto get_nth_type_impl(){
            /**/ if constexpr ( n == 0 ) return std::type_identity<T0>{};
            else if constexpr ( n == 1 ) return std::type_identity<T1>{};
            else if constexpr ( n == 2 ) return std::type_identity<T2>{};
            else if constexpr ( n == 3 ) return std::type_identity<T3>{};
            else if constexpr ( n == 4 ) return std::type_identity<T4>{};
            else if constexpr ( n == 5 ) return std::type_identity<T5>{};
            else if constexpr ( n == 6 ) return std::type_identity<T6>{};
            else if constexpr ( n == 7 ) return std::type_identity<T7>{};
            else if constexpr ( n == 8 ) return std::type_identity<T8>{};
            else if constexpr ( n == 9 ) return std::type_identity<T9>{};
            else                         return get_nth_type_impl<n-10, Ts...>();
        }
    } // namespace type_lists

    template<size_t n, typename... Ts>
    requires(n < sizeof...(Ts))
    using get_nth_type_t = typename decltype(type_lists::get_nth_type_impl<n, Ts...>())::type;
} // namespace tools
}

// -------------- tests ------------
static_assert(std::is_same_v<
    dev::tools::get_nth_type_t<0, char, short, int, long, float, double>, 
    char>);

static_assert(std::is_same_v<
    dev::tools::get_nth_type_t<5, char, short, int, long, float, double>, 
    double>);

Compiler Explorer

Generally, the compile-time performance depends upon the number of function template instantiations. If I would have a single if constexpr(n == 0) condition, and requested get_nth_type<4, char, short, int, float, double>, the function would instantiate itself 4 times. In our implementation, we instantiate only once for every ten types.

There is another interesting thing here. In the uint_atleast_t implementation, we just default constructed the types uint8_t etc. But, in the general case, you can’t just default construct the type. So, we wrap it up in the identity meta-function std::type_identity<T>.

`instance_of`

The next utility we would like to build is instance_of. We want a concept that will tell us, if a concrete type is an instance of a certain templated type. For example, vector<int> is an instance of vector<T>. But, int is not an instance of a vector. A little bit of a tricky one is : vector<int>& is not an instance of vector.

// -------------- tests ------------
static_assert(dev::tools::instance_of<std::vector<int>, std::vector>);
static_assert(!dev::tools::instance_of<int, std::vector>);
static_assert(!dev::tools::instance_of<std::vector<int>&, std::vector>);

How do we do that? First, we define the concept, which just calls into the implementation impl. The impl is a meta-function that accepts two parameters, a template and a concrete type.

template<template <typename ...> class Template, typename What>
//       ^--------------------------------^       ^--------^
//       1st param: a template that accepts       2nd param: a
//       any number of args                       concrete type
struct impl : std::false_type {};

The default implementation of impl is to define as std::false_type. The partial specialization impl<What, What<Ts...>>, says, if we have a template What and What<Ts...> is an instantiation of the template, that inherit from true_type.

template<template<typename...> typename What, typename... Ts>
        struct impl<What, What<Ts...>> : std::true_type {};

Together, we have:

#include <vector>

namespace dev{
namespace tools{
    namespace _instance_of{
        template<template <typename...> typename Template, typename What>
        //       ^--------------------------------^       ^--------^
        //       1st param: a template that accepts       2nd param: a
        //       any number of args                       concrete type        
        struct impl : std::false_type {};

        template<template<typename...> typename What, typename... Ts>
        struct impl<What, What<Ts...>> : std::true_type {};
    }

    template<typename T, template <typename... > typename Template>
    concept instance_of = _instance_of::impl<Template, T>::value;
} // namespace tools 
}

// -------------- tests ------------
static_assert(dev::tools::instance_of<std::vector<int>, std::vector>);
static_assert(!dev::tools::instance_of<int, std::vector>);
static_assert(!dev::tools::instance_of<std::vector<int>&, std::vector>);

Compiler Explorer

A macro for forwarding arguments

We all know how std::forward works. It unconditionally casts the type of its argument T to T&&. Sometimes you have to specify the types of the arguments, so you use decltype. For example,

std::forward<decltype(x)>(x);

This can be annoying to type, people often use a macro for economy.

#define TOOLS_FWD(...)
    static_cast<decltype(__VA_ARGS__)&&>(__VA_ARGS__)

`union_` type definition

The union_ type is something which is used to store the data. It is similar to the built-in union but it allows unpacking. That is, you could, for example request to construct the element at index 2 or get the element at index 2.

Let’s first understand how to write the union_ type. A while back people just created a bunch of chars - a buffer area to accomodate the largest type and they would cast pointers to the alternative types in the union.

The modern approach is to write a recursive solution. Using some mathematical formalism, the union of N+1 types T0, T1, ..., TN written \(\bigcup_{i=0}^{N} \texttt{Ti}\) is equivalent to pulling out the first type, union’ing with the union of the rest of the types, that is \(\texttt{T0} \cup \left(\bigcup_{i=1}^N \texttt{Ti}\right)\). In code:

template<typename First, typename... Rest>
struct union_{
    union{
        First head;
        union_<Rest...> tail; 
    }
};

A union must contain atleast one type. This is the default case.

template<typename T>
struct union_<T>{
    T head;
};

We implement some constructors. There is a default constructor, a constructor for the head and a constructor for the Ith element.

The API of these constructors is templated by a parameter size_t I - the index of the element to be constructed. The first argument to the constructor is of the type std::in_place_index_t, which is simply a wrapper type for storing indexes at compile-time.

namespace dev{
namespace tools{

    template<typename First, typename... Rest>
    struct union_{
        union{
            First head;
            union_<Rest...> tail; 
        }

        constexpr union_() = default;
        constexpr union_(std::in_place_index_t<0u>, auto&&... args)
        : head{ TOOLS_FWD(args)... }
        {}

        template<size_t I>
        constexpr union_(std::in_place_index_t<I> idx, auto&&... args)
        : tail{ std::in_place_index<I - 1>{}, TOOLS_FWD(args)... }
        {}
    };

    template<typename T>
    struct union_<T>{
        T head;

        constexpr union_() = default;
        constexpr union_(std::in_place_index_t<0u>, auto&&... args)
        : head{ TOOLS_FWD(args)... }
        {}
    };

} // namespace tools
}

`union_size`

There is union_size to tell me, how many elements are there in the union. This is very simple. I have a base case and then have a partial specialization, where I say that sizeof...(Ts) is my template union size.

namespace dev{
namespace tools{
    template<typename>
    struct union_size;

    template<typename... Ts>
    struct union_size<union_<Ts...>>
    : std::integral_constant<size_t, sizeof...(Ts)>
    {}

    template <typename Union>
    constexpr size_t union_size_v = union_size<Union>::value;
} // namespace tools
}

`get`

The implementation of get is very simple. If the index equals 0, get the head. If its not 0, get the tail.

namespace dev{
namespace tools{
    template<size_t i, typename Self>
    constexpr auto&& get(Self&& self)
    requires instance_of<std::remove_cvref_t<Self>, union_> &&
    (i < union_size<std::remove_cvref_t<Self>>)
    {
        if constexpr(i == 0)
            return TOOLS_FWD(self).head;
        else
            return get<i - 1>(TOOLS_FWD(self).tail);
    }
} // namespace tools
}

Note that, because of how perfect forwarding works, Self is sometimes not deduced as union_, but rather union_&. Hence, to strip off the reference qualifiers, I call the meta-function std::remove_cvref_t.

`construct_at` and `destroy_at`

Given a union_<T0,...,Ti,...,TN>, construct_at, destroy_at are used to construct/destroy an object of one of the possible alternatives T0,T1,...Ti,...Tn specified using an index i.

We use the STL algorithms std::construct_at and std::destroy_at. These algorithms perform placement new at compile-time; they construct/destroy an object at a specified address.

namespace dev{
namespace tools{
    template<size_t i, typename... Ts>
    constexpr void construct_at(union_<Ts...>& self, auto&&... args){
        // std::construct_at(<address>, <ctor args>...)
        std::construct_at(&get<i>(self), std::in_place_index<i>, TOOLS_FWD(args)...);
    }

    template<size_t i, typename... Ts>
    constexpr void destroy_at(union_<Ts...>& self){
        // std::destroy_at(<address>)
        std::destroy_at(&get<i>(self));
    }
} // namespace tools
}

Wrapping it up - the `variant` class

A basic implementation is listed below. Perhaps, the most challenging problem to solve is writing a destructor/copy ctor.

When implementing ~variant() we know the active index, so we can use std::get<I> to get underlying T object. However, template<size_t I> std::get() and other meta-functions are designed to accept compile-time constants as template parameters. We don’t know, the active index until run-time.

The solution is to generate an array of if-conditionals, at compile-time using pack expansion of the form:

// Generate if-conditionals sizeof...(Ts) times
if(m_idx == 0)
    tools::destroy_at<0>(m_data);
else if(m_idx === 1)
    tools::destroy_at<1>(m_data);
//...

We use std::index_sequence for this purpose, which represents a compile-time sequence of numbers. We construct the sequence 0,1,2,3,...,sizeof...(Ts)-1 and then define a lambda function that accepts this as a parameter pack Indices. Inside the lambda function, we use a ternary expression to check if the active index m_idx == Indices (the pack variable) and perform destruction in the true case. We apply C++17 fold expressions to expand Indices and generate multiple if statements.

// Destructor
constexpr ~variant(){
   constexpr auto idx_seq = std::make_index_sequence<sizeof...(Ts)>();
   [&]<auto... Indices>(std::index_sequence<Indices...>){
       (((m_idx == Indices) ? (tools::destroy_at<Indices>(m_data), 0) : 0), ...);
   }(idx_seq);
}

Code listing

namespace dev{

template<typename... Ts>
class variant;

// Forward declaration 
// so the friend declaration inside variant resolves
template<size_t I, typename... Ts>
constexpr decltype(auto) get(const variant<Ts...>& v);

template<typename T, typename... Ts>
constexpr T get(const variant<Ts...>& v);

template<typename... Ts>
class variant{
    using index_type = tools::uint_atleast_t<sizeof...(Ts)>;
    static constexpr size_t size = sizeof...(Ts);

    tools::union_<Ts...> m_data;
    index_type m_idx;

    public:

    constexpr variant()
    {
        tools::construct_at<0>(m_data, tools::get_nth_type_t<0, Ts...>());
        m_idx = 0;
    }

    template<tools::exists<Ts...> T>
    constexpr explicit variant(T&& x){
        tools::construct_at<tools::find_index_of_v<T, Ts...>>(m_data, TOOLS_FWD(x));
        m_idx = tools::find_index_of_v<T, Ts...>;
    }

    
    constexpr variant(const variant& other){
        index_type idx = other.m_idx;
        constexpr auto idx_seq = std::make_index_sequence<sizeof...(Ts)>();
        [&]<auto... Indices>(std::index_sequence<Indices...>){
              (((idx == Indices) 
            ? (tools::construct_at<Indices>(m_data, tools::get<Indices>(other.m_data)), 0) 
            : 0), ...);
        }(idx_seq);
        m_idx = idx;
    }

    // Move ctor - steal data from other
    constexpr variant(variant&& other) noexcept{
        index_type idx = other.m_idx;
        constexpr auto idx_seq = std::make_index_sequence<sizeof...(Ts)>();
        [&]<auto... Indices>(std::index_sequence<Indices...>){
              (((idx == Indices) 
            ? (tools::construct_at<Indices>(m_data, tools::get<Indices>(std::move(other.m_data))), 0) 
            : 0), ...);
        }(idx_seq);
        m_idx = std::exchange(other.m_idx, 0);
    }

    template<typename... Us>
    void swap(variant<Us...>& other) noexcept{
        using std::swap;
        if(m_idx == other.m_idx){
            swap(get<m_idx>(*this), tools::get<m_idx>(other));
        }else{
            variant<Ts...> old_value = std::move(*this);
            this->~variant();
            std::construct_at(this, std::move(other));
            other->~variant();
            std::construct_at(&other, std::move(old_value));
        }
    }

    template<typename... Us>
    variant<Ts...>& operator=(const variant<Us...>& other){
        variant(other).swap(*this);
        return *this;
    }

    template<typename... Us>
    variant<Ts...>& operator=(variant<Us...>&& other){
        variant(std::move(other)).swap(*this);
        return *this;
    }

    template<tools::exists<Ts...> T>
    constexpr variant<Ts...>& operator=(T&& element){
        this->~variant();
        constexpr size_t i = tools::find_index_of_v<T, Ts...>;
        tools::construct_at<i>(m_data, TOOLS_FWD(element));
        m_idx = i;
        return *this;
    }

    constexpr ~variant(){
        constexpr auto idx_seq = std::make_index_sequence<sizeof...(Ts)>();
        [&]<auto... Indices>(std::index_sequence<Indices...>){
            (((m_idx == Indices) ? (tools::destroy_at<Indices>(m_data), 0) : 0), ...);
        }(idx_seq);
    }

    template<size_t I, typename... Us>
    constexpr friend decltype(auto) get(const variant<Us...>& v);

    template<typename T, typename... Us>
    constexpr friend T get(const variant<Us...>& v);
}; 

template<size_t I, typename... Ts>
constexpr decltype(auto) get(const variant<Ts...>& v){
    return tools::get<I>(v.m_data);
}

template<typename T, typename... Ts>
constexpr T get(const variant<Ts...>& v){
    return tools::get<tools::find_index_of_v<T, Ts...>>(v.m_data);
}
}

Compiler Explorer

Introduction

std::variant usage

Using generic lambdas as visitors

The cracked-man’s match

Multiple Dispatch

What is a std::variant under the hood?

TMP Utilities

find_index_of_v

uint_atleast_t

get_nth_type

instance_of

A macro for forwarding arguments

union_ type definition

union_size

get

construct_at and destroy_at

Wrapping it up - the variant class

Code listing

`std::variant` usage

What is a `std::variant` under the hood?

`find_index_of_v`

`uint_atleast_t`

`get_nth_type`

`instance_of`

`union_` type definition

`union_size`

`get`

`construct_at` and `destroy_at`

Wrapping it up - the `variant` class