Iterators I

If you know how to use a vector or a map, then you know that [], the subscript operator, can be used to directly access any element from the data structure.

For vectors, the subscript operator references elements by their index value (an integer greater than or equal to 0); and for maps, the subscript operator references elements by their unique key value (which need not have to be an integer).

Not all data structures support the subscript operator and some require you to use an alternative mechanism for accessing elements: iterators.

Iterators are a category of data types whose instances point to a particular element in a particular existing data structure.

Iterating through a vector with iterators

Here are three examples that implement the same function: vector_mean, a function for calculating the mean value of a vector of double:

This implementation uses a range-based for loop:

double vector_mean(const vector<double>& v) { double sum = 0.0; for (double e : v) sum += e; return sum / v.size(); }

This implementation uses an incrementing integer with the subscript operator:

double vector_mean(const vector<double>& v) { int i; double sum = 0.0; for (i = 0; i < v.size(); ++i) sum += v[i]; return sum / v.size(); }

This implementation uses iterators:

double vector_mean(const vector<double>& v) { vector<double>::const_iterator it; double sum = 0.0; for (it = v.begin(); it != v.end(); ++it) sum += *it; return sum / v.size(); }

Every standard C++ data structure class is not only a data type, it is also like a namespace, housing definitions for member data types. These member data types are accessible using ::, the scope resolution operator. In our iterator example, we are using the data type vector<double>::const_iterator to iterate over our vector of double.

Unlike other operators, the scope resolution operator does not primarily operate on variables (although it can), it operates on namespaces and data types. All qualified C++ data structures contain iterator and const_iterator as member data types. The reason we use const_iterator in our example is because the parameter to vector_mean is constant.

Iterators are so named because their primary purpose is to iterate over the elements from a source of data. All iterator types will, at the very least, will be incrementable and dereferencable.

In our example, the expression it = v.begin() means that our iterator variable will be set to point to the first element in the vector variable v. All standard C++ data structures include a begin() member function returning an iterator to the beginning of the elements.

The for loop stops when our iterator reaches the end of the vector's element, which can be found using the end() member function.

++it increments the iterator to point to the next element in our vector.

The operator * in the expression *it is known as the dereference operator. In this context, it means to retrieve the value of the element that is currently being referenced by our iterator.

An iterator may support additional functionality depending on the data structure that it's related to.

vector iterators support random-access, so you can use the addition or subtraction operators to jump to the desired element. For example, to set the 4th element in v to the value 7, you can use *(v.begin() + 3) = 7;. The equivalent expression using indexing would be: v[3] = 7;

Inserting and erasing vector elements

The insert and erase member functions for any standard C++ data structure requires the use of iterators. Let's consider a vector variable named v that contains the values { 9, 2, 33, 42, 100, 8 }. Here's an image of it:

Notice that v.end() returns an iterator pointing right past the last element.

The expression v.insert(v.begin() + 2, 77); will modify the v to contain { 9, 2, 77, 33, 42, 100, 8}.

What insert does is it creates a new element containing your desired value and places it at the position referenced by your iterator argument. Any existing element at that position, as well as all elements proceeding it, will be shifted over to the right to make room for the new element.

You may use insert to place a new element at the end of a vector. The expressions:

will all have the same effect, resulting in v containing {9, 2, 33, 42, 100, 8, 98}.

The erase member function does the opposite of insert. It removes an element pointed to by an iterator and shifts any proceeding elements over to the left in order to fill in the gap created by the removed element.

The expression v.erase(v.begin() + 2); changes v from { 9, 2, 33, 42, 100, 8 } to { 9, 2, 42, 100, 8 }.

You can also use v.end() as a base location, so v.erase(v.begin() + 2); in our example is equivalent to v.erase(v.end() - 4);.

You can also specify a range of elements to erase with 2 iterators. v.erase(v.begin() + 2, v.begin() + 4); will remove 3 elements. Here is a visual for erase: