C++ Associative Container and Iterator Validness

I used to believe that iterators will be invalidated after calling the member functions insert() or erase() of containers. Thus, I would adopt a conservative approach:

  1. Create a temporary container.
  2. Copy the elements which I would like to keep to the temporary container.
  3. Swap the container.

For example, to remove odd numbers from a set:

void remove_odds(std::set<unsigned> &c) {
  std::set<unsigned> result;
  for (std::set<unsigned>::iterator i = c.begin(); i != c.end(); ++i) {
    if (*i % 2 == 0) {
      result.insert(*i);
    }
  }
  c.swap(result);
}

However, this code is inefficient for two reasons:

  1. Elements copy overhead -- If the container is hold heavy elements, e.g. std::vector<std::string>, then copying elements will cost a lot of time and space.
  2. Temporary container construction overhead -- The overhead to allocate new element nodes and insert the elements to the container.

In addition, this conservative approach sounds like an overkill if only a little portion of elements are expected to be removed.

Fortunately, according to the latest C++14 draft (N3797), the associative containers (23.2.4 #9) guarantees that:

The insert() and emplace() members shall not affect the validity of iterators and references to container [1], and the erase() shall invalidate only iterators and references to the erased elements.

Besides, the unordered associative containers (23.2.5 #14) guarantees that:

The insert() and emplace() members shall not affect the validity of references to container elements, but may invalidate all iterators to the container [2]. The erase() members shall invalidate only iterators and references to the erased elements.

Thus, for associative containers (including std::set, std::map, std::multiset, and std::multimap) and unordered associative containers (including std::unorderd_set, std::unordered_map, std::unordered_multiset, std::unordered_multimap), we can directly erase the elements from the container:

void remove_odds_2(std::set<unsigned> &c) {
  for (std::set<unsigned>::iterator i = c.begin(); i != c.end(); ) {
    if (*i % 2 == 0) {
      ++i;  // Move to next element.
    } else {
      c.erase(i++);  // Erase current element and post increase iterator.
    }
  }
}

However, please be careful that following code remove_odds_3() is not equivalent to remove_odds_2() because the erase() function will invalidate the iterator to the erased element. According to 24.2.1 #5 and #10, any operations on the invalid iterator except the destruction (including ++i) will in result undefined behavior.

// XXX Incorrect code to demostrate undefined behavior
void remove_odds_3(std::set<unsigned> &c) {
  for (std::set<unsigned>::iterator i = c.begin(); i != c.end(); ++i) {
    if (*i % 2 != 0) {
      c.erase(i);  // erase() will invalidate the iterator it.
    }
  }
}

To complete our discussion, it is important to know that there is no such guarantee for sequence containers [3]. We should either fall back to the conservative copy-and-swap approach or restart from c.begin() after removing any elements.

[1]IMO, there is a typo in "shall not affect the validity of iterators and references to container". The "container" should be replaced with "elements."
[2]Please notice that the unordered associative containers do not always guarantee the iterator validness on insert(), since the iterator will be invalidated when the hash table is resized. Read 23.2.5 #15 for more details. However, this is not related to this post because we are focusing on the erase() function.
[3]As a special case, std::list gives extra guarantee on insert() and erase(). But this does not apply to sequence containers, such as std::vector and std::deque, in general.