Introduction
An iterator is a generalization of a pointer. C++ STL containers usually expose iterators as part of their interface. They abstract away lower-level implementation details of traversing through container types, thus freeing the container-user to focus on algorithm design/business logic.
Traditional C++ iterators
Iterators have been around since before C++11, but they really hit mainstream since C++11 started shipping. The STL containers all implement their own iterators, however, it’s possible for developers to write their own iterators for custom collections.
In the past, you’d implement iterators using tagging. A tag is simply an empty struct, with no data or behavior. It is often used to perform static dispatching (compile-time polymorphism). Here is a minimalistic example:
#include<iostream>
namespace dev{
struct random_access_iterator_tag{};
struct forward_iterator_tag{};
template<typename T>
struct vector{
T* m_data;
std::size_t m_size;
std::size_t m_capacity;
struct iterator{
using iterator_category = random_access_iterator_tag;
/* .... */
T* m_ptr;
};
iterator begin(){
return(iterator(m_data));
}
};
template<typename T>
struct list{
struct node{
T data;
node* next;
};
node* head;
std::size_t m_size;
struct iterator{
using iterator_category = forward_iterator_tag;
/* .... */
node* m_ptr;
};
iterator begin(){
return(iterator(head));
}
};
template<typename It>
It advance(It iterator, std::size_t n, forward_iterator_tag){
std::cout << "\n" << "Advance a foward iterator";
return iterator;
}
template<typename It>
It advance(It iterator, std::size_t n, random_access_iterator_tag){
std::cout << "\n" << "Advance a random access iterator";
return iterator;
}
}
int main(){
dev::vector<int> vec;
dev::list<int> lst;
auto it = vec.begin();
dev::advance(it, 3, dev::vector<int>::iterator::iterator_category());
auto it2 = lst.begin();
dev::advance(it2, 3, dev::list<int>::iterator::iterator_category());
}
An iterator over a custom array-like sequence of elements would look like the following:
#include <iterator>
template <class T>
struct Iterator {
using iterator_category = std::forward_iterator_tag;
using value_type = T;
using difference_type = std::ptrdiff_t;
using pointer = T*;
using reference = T&;
// ...rest of iterator implementation here
};
This requires you to write a lot of boiler-plate code. While tagging works, it puts an extra-burden on the developer.
Modern C++ iterators
C++20 has a language feature called concepts - a set of named constaints that a type should satisfy. So, we can now implement iterators based on their behavior, rather than their identity. This is often referred to as duck typing: the idea that if an entity looks like a duck and quacks like a duck, it must be a duck.
A new system of iterators based on concepts has been introduced.
std::input_iterator
std::output_iterator
std::forward_iterator
std::birdirectional_iterator
std::random_access_iterator
std::contiguous_iterator
Before we deep-dive into these iterator concepts, let’s understand std::sentinel_for
concept.
A sentinel signals the end of a sequence of values. Prior to C++20, when traversing a collection, the way you’d check if you’ve hit the end of the collection was to compare your current iterator with an end()
iterator.
#include<vector>
int main(){
std::vector<double> v{1.0, 2.0, 3.0, 4.0, 5.0};
// Traverse a std::vector
for(auto it {v.begin()}; it!=v.end(); ++it){
std::cout << *it << "\n";
}
}
Usually, this end iterator was just a case of your normal iterator that had some internal state identifying it as one past the last element of the container.
Beginning C++20, you can actually use any type as a sentinel for an iterator so long as the type satisfies the std::sentinel_for
concept. std::sentinel_for
concept requires
std::input_or_output_iterator
The input_or_output_iterator
is the basis of the iterator concept taxonomy. It only requires that an iterator type It
supports the operations for dereferencing and incrementing the iterator.
std::output_iterator
std::output_iterator
concept models the idea of a write-only iterator. E.g. such an iterator can be used to write to the standard output stream. Hence, they can only be dereferenced on the left-hand side of an assignment operator.
Since, they are single pass, we don’t even need to implement an equality comparison operator, because they don’t have an end iterator or sentinel value to compare against.
#include <cstddef>
#include <iterator>
#include <concepts>
template<typename T>
struct SimpleOutputIterator {
using difference_type = std::ptrdiff_t;
using value_type = T;
using pointer = T*;
pointer m_buffer_ptr;
// Default constructor
SimpleOutputIterator() = default;
// Constructor
SimpleOutputIterator(pointer start)
: m_buffer_ptr{start} {}
// Dereference operator
T& operator*() {
return *m_buffer_ptr;
}
// Pre-increment
SimpleOutputIterator& operator++() {
++m_buffer_ptr;
return (*this);
}
// Post-increment
SimpleOutputIterator operator++(int) {
auto tmp = *this;
++(*this);
return tmp;
}
};
static_assert(std::output_iterator<SimpleOutputIterator<int>, int>);
int main(){
return 0;
}
std::input_iterator
std::input_iterator
concept models the idea of a read-only iterator. Such an iterator, for example, can be used read packets data from a network socket.
Input iterators are also single-pass, because once you’ve read a byte of data from a network socket, you can’t read it again. They must also be comparable to some sentinel value such as EOF
, \0
, to signal the end of data etc.
However, the equality comparison operator bool operator==(It, Sen)
is only used by the algorithm operating on the container, and therefore it’s the responsibility of the algorithm writer to supply an implementation of bool operator==(It, Sen)
. This definition is not required in the container implementation.
#include <cstddef>
#include <iterator>
#include <concepts>
template<typename T>
struct SimpleInputIterator {
using difference_type = std::ptrdiff_t;
using value_type = T;
using pointer = T*;
using reference = T&;
pointer m_socket_fd;
// Default constructor
SimpleInputIterator() = default;
// Constructor
SimpleInputIterator(pointer start)
: m_socket_fd{start} {}
// Dereference operator
const T& operator*() const {
return *m_socket_fd;
}
// Pre-increment
SimpleInputIterator& operator++() {
++m_socket_fd;
return (*this);
}
// Post-increment
SimpleInputIterator operator++(int) {
auto tmp = *this;
++(*this);
return tmp;
}
};
static_assert(std::input_iterator<SimpleInputIterator<int>>);
int main(){
return 0;
}
std::forward_iterator
std::forward_iterator
requires that the iterator type be an input (read-only) iterator and also be std::incrementable.
std::input_iterator
only requires the iterator be std::weakly_incrementable
. So while it supports the increment operator++()
, if i
and j
are two instances of the iterator type It
, i == j
does not imply ++i == ++j
. That is, algorithms on weakly-incrementable types must be single-pass algorithms.
std::incrementable
concept informally means that i == j
\(\implies\) ++i == ++j
. Algorithms on incrementable types are multi-pass algorithms.
You might use an iterator satisfying std::forward_iterator
concept to traverse through a std::forward_list
(a singly linked-list).
#include <cstddef>
#include <iterator>
#include <concepts>
template <typename T>
struct list_node {
T m_data;
list_node* m_next;
};
template <typename T>
struct SimpleForwardIterator {
using difference_type = std::ptrdiff_t;
using value_type = T; // The value type is T, not list_node<T>
using pointer = T*;
using reference = T&;
list_node<T>* m_node_ptr;
// Default constructor
SimpleForwardIterator() = default;
// Constructor
SimpleForwardIterator(list_node<T>* start)
: m_node_ptr{start} {}
// Dereference operator
reference operator*() const {
return m_node_ptr->m_data; // Return the data stored in the node
}
// Pre-increment
SimpleForwardIterator& operator++() {
m_node_ptr = m_node_ptr->m_next;
return *this;
}
// Post-increment
SimpleForwardIterator operator++(int) {
auto tmp = *this;
++(*this);
return tmp;
}
// Equality comparison
bool operator==(const SimpleForwardIterator& other) const {
return m_node_ptr == other.m_node_ptr;
}
// Inequality comparison
bool operator!=(const SimpleForwardIterator& other) const {
return !(*this == other);
}
};
static_assert(std::forward_iterator<SimpleForwardIterator<int>>);
int main() {
return 0;
}
std::bidirectional_iterator
A std::list
is a doubly linked that supports both traversals in the forward as well as reverse direction. When we want to be able to move forward and backwards across our collection, we must implement an iterator satisfying std::bidirectional_iterator
concept.
You need to implement pre-increment, post-increment, pre-decrement and post-decrement operations.
#include <cstddef>
#include <iterator>
#include <concepts>
template <typename T>
struct list_node {
T m_data;
list_node* m_next;
list_node* m_prev;
};
template <typename T>
struct SimpleBidirectionalIterator {
using difference_type = std::ptrdiff_t;
using value_type = T; // The value type is T, not list_node<T>
using pointer = T*;
using reference = T&;
list_node<T>* m_node_ptr;
// Default constructor
SimpleBidirectionalIterator() = default;
// Constructor
SimpleBidirectionalIterator(list_node<T>* start)
: m_node_ptr{start} {}
// Dereference operator
reference operator*() const {
return m_node_ptr->m_data; // Return the data stored in the node
}
// Pre-increment
SimpleBidirectionalIterator& operator++() {
m_node_ptr = m_node_ptr->m_next;
return *this;
}
// Post-increment
SimpleBidirectionalIterator operator++(int) {
auto tmp = *this;
++(*this);
return tmp;
}
// Pre-decrement
SimpleBidirectionalIterator& operator--(){
m_node_ptr = m_node_ptr->m_prev;
return *this;
}
// Post-decrement
SimpleBidirectionalIterator operator--(int){
auto tmp = *this;
--(*this);
return tmp;
}
// Equality comparison
bool operator==(const SimpleBidirectionalIterator& other) const {
return m_node_ptr == other.m_node_ptr;
}
// Inequality comparison
bool operator!=(const SimpleBidirectionalIterator& other) const {
return !(*this == other);
}
};
static_assert(std::bidirectional_iterator<SimpleBidirectionalIterator<int>>);
int main() {
return 0;
}
std::random_access_iterator
Containers such as std::vector<T>
and std::array<T,N>
are a collection of elements that are stored contiguously in memory. Hence, the element at index i
can be accessed in \(O(1)\) constant-time.
What if I want to code up an iterator for jumping around the collection? Such an iterator must satisfy the std::random_access_iterator
concept. The std::random_access_iterator
concept requires that advancement with +=
, -=
, +
and -
, computation of distance between two elements and element access using the indexing operator []
be constant-time operations.
#include <cstddef>
#include <iterator>
#include <concepts>
template <typename T>
struct SimpleRandomAccessIterator {
using difference_type = std::ptrdiff_t;
using value_type = T;
using pointer = T*;
using reference = T&;
T* m_raw_data_ptr;
// Default constructor
SimpleRandomAccessIterator() = default;
// Constructor
SimpleRandomAccessIterator(T* start)
: m_raw_data_ptr{start} {}
// Dereference operator
reference operator*() const {
return *m_raw_data_ptr;
}
// Pre-increment
SimpleRandomAccessIterator& operator++() {
++m_raw_data_ptr;
return *this;
}
// Post-increment
SimpleRandomAccessIterator operator++(int) {
auto tmp = *this;
++(*this);
return tmp;
}
// Pre-decrement
SimpleRandomAccessIterator& operator--() {
--m_raw_data_ptr;
return *this;
}
// Post-decrement
SimpleRandomAccessIterator operator--(int) {
auto tmp = *this;
--(*this);
return tmp;
}
// Array subscript operator
reference operator[](difference_type n) const {
return m_raw_data_ptr[n];
}
// Compound addition
SimpleRandomAccessIterator& operator+=(difference_type n) {
m_raw_data_ptr += n;
return *this;
}
// Compound subtraction
SimpleRandomAccessIterator& operator-=(difference_type n) {
m_raw_data_ptr -= n;
return *this;
}
// Addition
SimpleRandomAccessIterator operator+(difference_type n) const {
return SimpleRandomAccessIterator(m_raw_data_ptr + n);
}
// Subtraction
SimpleRandomAccessIterator operator-(difference_type n) const {
return SimpleRandomAccessIterator(m_raw_data_ptr - n);
}
// Distance between iterators
difference_type operator-(const SimpleRandomAccessIterator& other) const {
return m_raw_data_ptr - other.m_raw_data_ptr;
}
// Equality comparison
bool operator==(const SimpleRandomAccessIterator& other) const {
return m_raw_data_ptr == other.m_raw_data_ptr;
}
// Inequality comparison
bool operator!=(const SimpleRandomAccessIterator& other) const {
return !(*this == other);
}
// Relational operators
bool operator<(const SimpleRandomAccessIterator& other) const {
return m_raw_data_ptr < other.m_raw_data_ptr;
}
bool operator<=(const SimpleRandomAccessIterator& other) const {
return m_raw_data_ptr <= other.m_raw_data_ptr;
}
bool operator>(const SimpleRandomAccessIterator& other) const {
return m_raw_data_ptr > other.m_raw_data_ptr;
}
bool operator>=(const SimpleRandomAccessIterator& other) const {
return m_raw_data_ptr >= other.m_raw_data_ptr;
}
// Friend operator+ for n + iterator
template <typename U>
friend SimpleRandomAccessIterator<U> operator+(
typename SimpleRandomAccessIterator<U>::difference_type n,
const SimpleRandomAccessIterator<U>& it
) {
return SimpleRandomAccessIterator<U>(it.m_raw_data_ptr + n);
}
};
static_assert(std::random_access_iterator<SimpleRandomAccessIterator<int>>);
int main() {
return 0;
}
References
- cpp iterators in depth by Braden Hitchcock.