C++ Ranges

What is a range?

C++ ranges are a programmatic abstraction for any container/type T that allows iteration over its elements by providing begin and end iterators. A std::ranges::range is defined as a concept that requires a container type T satisfy 2 constraints: it has a begin and an end.

template< class T >
concept range = requires( T& t ) {
    ranges::begin(t); // equality-preserving for forward iterators
    ranges::end (t);
};

The C++ ranges library also includes rangified algorithms which are applied to ranges eagerly, and range adaptors that are applied to views lazily.

There are three kind of ranges : they can be an abstraction on

a pair of iterators
an iterator and a count
an iterator and a predicate

Here’s a quick code example:

#include<iostream>
#include<vector>
#include<ranges>
#include<type_traits>

template<std::ranges::input_range Rng>
auto sum(Rng&& rng){
    Rng result{};
    for(auto&e : rng){
        result += e;
    }
    return result;
}

int main()
{
    std::vector v{1, 2, 3, 4, 5};
    auto result = sum(v);
}

Play on Compiler Explorer

Using universal references to accept ranges

You always accept ranges using universal references. The motivation for this is as follows. Consider the function find_second_occurrence that finds the second occurrence of a character in the string.

The function call find_second_occurrence( "Hello World", 'l') invokes find_second_occurrence with a string literal. The string literal is copied to a temporary std::string instance. We can bind this to the const lvalue reference str. All good so far. In this instance, we even find the second occurrence of the character l at the index 3. We return str[3]. Except, that once we return, the temporary goes out of scope and is destroyed. So, we have a dangling reference.

#include<string>
const char& find_second_occurrence(const std::string& str, char c){
    static char not_found = '\0';

    std::size_t idx = str.find(c);
    if(idx == std::string::npos) return not_found;

    idx = str.find(c, idx+1);
    if(idx == std::string::npos) return not_found;

    return str[idx];
}

Something very interesting happens if we change the signature of the function to take a string_view. This function is no longer broken. We are kind of doing the same thing. We call the function with the string literal Hello World. We construct the temporary instance of the string_view. We bind the temporary instance of a string to a const reference. Then, we find the second instance of l. We return reference to that. We destroy the string_view. But, the important difference is that string_view doesn’t hold it’s own data. It just points to some place else. That some place else in this case is a global object - a string literal. Recall, string literals are lvalues. So, we have a reference into a string literal.

#include<string>
const char& find_second_occurrence(const std::string_view& str, char c){
    static char not_found = '\0';

    std::size_t idx = str.find(c);
    if(idx == std::string::npos) return not_found;

    idx = str.find(c, idx+1);
    if(idx == std::string::npos) return not_found;

    return str[idx];
}

This difference is something that is captured under the name borrowed_range. You probably know std::string_view and std::span. These are ranges that don’t hold their own data, but simply point to some place else. They are called borrowed ranges.

If you use range algorithms, they actually take this into account.

If you call ranges_find() with a std::string_view, it will work absolutely fine. You will get an iterator back.

auto it1 = std::ranges::find(std::string_view("Hello World!"), 'o');

// decltype(it1) == std::string_view::iterator

If you call ranges_find() with a temporary std::string instance, you will get std::ranges::dangling, which is a special type and this is just a empty type, meaning if you try to do anything with it, you will get a compilation error, because it doesn’t support any operations.

But, importantly, if you call ranges_find with an lvalue, meaning that the lifetime of the argument is outside of the function call, well, then the type of the range in the function signature actually doesn’t matter; it would not lead to a dangling reference. It would be considered that the function is borrowing from the outside.

std::string str1("Hello World");
auto it3 = std::ranges::find(str1, 'o');
// decltype(it3) == std::string::iterator

std::string_view str2("Hello World!");
auto it4 = std::ranges::find(str2, 'o');
// decltype(it4) == std::string_view::iterator

This is precisely when const references break down, because we cannot actually distinguish which of these two situations we are actually in:

void fun(const std::string& rng) {}

fun(std::string(""));   // passing a temporary - we are taking ownership

std::string str;
fun(str);               // passing an lvalue - we are borrowing

If we switch to universal references, we can actually interrogate our argument inside of the function, to see if we are actually borrowing the data, and therefore it’s safe to return references and iterators from it without the danger of dangling or if you are actually taking ownership of the data, in which case, you better not.

template<typename Rng>
void fun(Rng&& rng) {
    if constexpr(std::ranges::borrowed_range<decltype(rng)>)
    {
        //borrowed range
    }
    else{
        //taking ownership
    }
}

fun(std::string(""));       // taking ownership

std::string str;
fun(str);                   // borrowing
fun("");                    // borrowing
fun(std::string_view(""));  // borrowing

Universal references are also necessary, because of views.

C++ 20 Views

A view is a light-weight object. A view is a range that is:

Cheap to move
Cheap to destroy when moved-from
Cheap to copy if copyable

Let’s imagine that, we wish to code up a generic print function:

#include<ranges>
#include<iostream>
#include<concepts>
#include<vector>
#include<list>

template<std::ranges::input_range Rng>
void print(Rng&& rng){
    std::cout << "\n";
    for(int i{0}; auto elem : rng){
        std::cout << "\n" << "[" << i++ << "] = " << elem;
    }
}

int main(){
    std::vector vec{0, 8, 15, 47, 11, 42};
    std::list lst{0, 8, 15, 47, 11, 42};

    print(vec);
    print(lst);

    print(std::views::take(vec, 3));    //print first 3 elements
    print(std::views::take(lst, 3));    //print first 3 elements

    print(vec | std::views::take(3));   //print first 3 elements
    print(lst | std::views::take(3));   //print first 3 elements

    return 0;
}

Play of Compiler Explorer

What you can do since C++ 20 is, you can say, I can take this print function and instead of printing the entire collection as a whole, we can say, well print a view on this collection.

So, I can, for example take the first 3 elements of the vector vec and pass them to the print function. And I can do the same for a list.

There is some nice syntax for this. You can pipe the vector or list into the view using the pipe symbol |.

We can have real pipelines doing consecutive specifications of what to do with the elements, which elements to use. In C++23, we have a zip_view which can zip the elements of two views.

A std::views::iota(1) view generates a sequence of values \(\{1,2,3,4,5,\ldots\}\), its an infinite sequence and then we have a second collection, a vector \(\{0, 8, 15, 47, 11, 42\}\). We can zip the elements of these two views together. The elements are then tuples where the first member is the index, the next one is the vector element.

for(auto [idx, elem] : std::views::zip(std::views::iota(1), vec))
    std::cout << "\n" <<  idx << " : " << elem;

Member functions of views

Views do not provide expensive member functions.

#include<ranges>
#include<iostream>
#include<concepts>
#include<vector>
#include<list>

int main(){
    std::vector vec{1, 2, 3, 4, 5};
    std::list lst{1, 2, 3, 4, 5};

    auto vVec = vec | std::views::drop(3);      
    vVec.begin();                               // fast: vec.begin() + n
    vVec.empty();                               // fast: vec.size() <= n
    vVec.size();                                // fast: n >= vec.size() ? 0 : vec.size() - n
    vVec[2];                                    // vec[n + idx]

    auto vLst = lst | std::views::drop(3);
    vLst.begin();                               // slow: lst.begin() and n times ++
    vLst.empty();                               // fast
    vLst.size();                                // fast
    //vLst[2];                                  // Very slow, n + idx times ++

    auto vFlt = vec | std::views::filter([](int x){ return x >= 3; });
    vFlt.begin();                               // slow: pred for all elements until first is true
    vFlt.empty();                               // slow: pred for all elements until first is true
    //vFlt.size();                                // Not supported. 
                                                // slow: pred for all elements until first is true
    //vFlt[2];                                    // Not supported. Slow.
    return 0;
}

Example of pipeline of range adapters

Let’s say, we have a map of composers of classic music. We want to deal with that collection, but only those composers who were born after 1700.

#include<map>
#include<ranges>
#include<concepts>
#include<iostream>

int main(){
    std::map<std::string, int> composers{
        {"Bach", 1685},
        {"Mozart", 1756},
        {"Beethoven",1770},
        {"Tchaikovsky",1840},
        {"Chopin",1810},
        {"Vivaldi",1678},
    };

    for(const auto& elem : composers 
                            | std::views::filter([](std::pair<std::string,int> composer){
                                return std::get<1>(composer) > 1700;
                            })
                            | std::views::take(3)
                            | std::views::keys
    )
        std::cout << " - " << elem << "\n";
    return 0;
}

Play on Compiler Explorer

Let’s only take the first 3 of those composers and we only need their keys, which are the names. And, let’s use that as the right hand side collection we iterate over in a range-based for loop. So, that’s the output of the program.

Numerical sequences

In numerical algorithms, one needs often sequences of numerical values. We can use ranges to implement numerical sequences, dynamical systems and numerical algorithms. The range becomes a proxy for the algorithm.

#include<iostream>
#include<type_traits>
#include<concepts>
#include<vector>
#include<ranges>
#include<cmath>
#include<numbers>
#include<functional>

void printSequence(auto seq) {
    std::cout << "\n";
    for (auto x : seq)
        std::cout << x << " ";
}
int main()
{
    /* Generating numerical sequences */
    int n{ 1024 };
    
    // 0, 1, 2, 3, ...
    auto seq = std::ranges::iota_view(0, n);
    printSequence(seq | std::views::take(5));

    // Let's say we are interested to sample a continuous function F
    // at x_0 = 0.0, x_1 = 0.1, x_2 = 0.2, x_3 = 0.3, ....
    // We can generate the sampling domain as:
    double sampling = 0.1;
    auto seq2 = seq | std::views::transform([sampling](double i) {
        return sampling * i;
        });

    printSequence(seq2 | std::views::take(5));

    // f_n = sin(2x_n) + 0.1
    auto seq3 = seq2 | std::views::transform([](double x_n) {
        return sin(2 * x_n) + 0.1;
        });

    printSequence(seq3 | std::views::take(5));

    // We can wrap this logic into a lambda that accepts a sampling (frequency),
    // an arbitrary function F and generates the sequence F(x[0]), F(x[1]), ...
    auto sequence = [](int n, double sampling, auto&& F) {
        auto seq = std::ranges::iota_view(0, n);
        return seq | std::views::transform([sampling, F](double i) {
            return F(sampling * i);
            });
    };

    /* Great for scientific computing! */
    printSequence(sequence(1024, 0.1, [](double x) {return sin(2 * x) + 0.1;})
        | std::views::take(5));

    /* Custom break conditions */
    auto identity = [](double x) { return x;};
    auto seq4 = std::views::take_while(sequence(1024, 0.1, identity),
        [](double x) { return x < 0.5;}
    );

    printSequence(seq4);

    /* Combine sequences : You get a sequence of tuples */
    auto result = std::ranges::zip_view(sequence(1024, 0.1, identity), sequence(1024, 0.1, identity));

    return 0;
}

Play on Compiler Explorer

0 1 2 3 4 
0 0.1 0.2 0.3 0.4 
0.1 0.298669 0.489418 0.664642 0.817356 
0.1 0.298669 0.489418 0.664642 0.817356 
0 0.1 0.2 0.3 0.4

Custom Ranges

Consider the Newton’s root-finding algorithm. The Newton’s algorithm is:

Choose \(x_0\).
Iterate \(x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}\)

To implement numerical schemes like Newton’s, we can hand roll-out a custom range called map_range. A map_range represents the fixed-point iteration algorithm. Given an initial-value \(x_0\) and a function \(f\), map_range represents the recursive sequence

\[x_{n+1} = f(x_n)\]

that is

\[ \{x_0, f(x_0), f(f(x_0)), \ldots, \} \]

From basic analysis, it is a well-known fact, that if \(f\) is a contraction, then the sequence \((y_n)_{n=0}^{\infty}\), where \(y_{n+1} = f(x_n)\) converges to a finite value.

map_range holds three member-variables : the current state m_x, the function m_f and the break condition m_break_condition. map_range must satisfy the std::ranges::range concept.

#include<iostream>
#include<type_traits>
#include<concepts>
#include<vector>
#include<ranges>
#include<cmath>
#include<numbers>
#include<functional>
/* 
We write a new type map_range that will be used to implement fixed-point iteration in C++.
Beginning with the initial value x[0], map_range represents the recursive sequence

x[n+1] = F(x[n])

that is {x[0], F(x[0]), F(F(x[0])), F(F(F(x[0]))), ...
*/
template<typename T, typename Func, typename Cond>
class map_range{
private:
    T m_x;
    Func m_f;
    Cond m_break_condition;

    struct iterator {
        map_range* r;

        iterator(map_range * r_) : r{r_} {}

        /* Compute the next iterate x[n+1] = f(x[n]) */
        iterator& operator++() {
            r->m_x = r->m_f(r->m_x);
            if (r->m_break_condition(r->m_x))
                r = nullptr;
            return (*this);
        }

        /* Dereference the iterator and return the current state x[n]*/
        T operator*() {
            return r->m_x;
        }

        bool operator==(iterator & o) {
            return (o.r == r);
        }

        bool operator!=(iterator & o)  {
            return !(o == *this);
        }
    };
public:
    map_range(T x, Func func, Cond cond )
    : m_x {x}
    , m_f {func}
    , m_break_condition{cond}
    { }

    // begin() and end() methods which return iterators
    iterator begin() { return iterator{ this }; }
    iterator end()  { return iterator{ nullptr }; }
    T value() { return m_x; }
};

template<typename T, typename Func, typename Cond>
auto make_range_map(T value, Func func, Cond cond) {
    return map_range(value, func, cond);
}

int main(){
    /* Let's solve exp(-x^2 / 2 ) = 0 */
    std::function<double(double)> f = [](auto x) { return exp(-0.5 * x * x); };
    std::function<double(double)> df = [](auto x) { return -x * exp(-0.5 * x * x); };
    auto cond = [f](double x) { return std::abs(f(x)) < 1.0e-12; };
    auto range = newton_range( 1.0, f, df, cond );

    std::cout << "\n" << "Solving f(x) = exp(-0.5 * x * x) = 0" << "\n";
    for(auto r : range)
        std::cout << r << " ";
    return 0;
}

Play on Compiler Explorer

References

Ranges and Iterators for numerical Problems, Karsten Ahnert at Meeting C++ 2014
What is a range in C++, Simon Toth, C++ on the Sea, 2024