I used to believe that iterators will be invalidated after calling the
member functions insert()
or erase()
of containers. Thus, I
would adopt a conservative approach:
- Create a temporary container.
- Copy the elements which I would like to keep to the temporary container.
- Swap the container.
For example, to remove odd numbers from a set:
void remove_odds(std::set<unsigned> &c) {
std::set<unsigned> result;
for (std::set<unsigned>::iterator i = c.begin(); i != c.end(); ++i) {
if (*i % 2 == 0) {
result.insert(*i);
}
}
c.swap(result);
}
However, this code is inefficient for two reasons:
- Elements copy overhead -- If the container is hold heavy elements, e.g.
std::vector<std::string>
, then copying elements will cost a lot of time and space. - Temporary container construction overhead -- The overhead to allocate new element nodes and insert the elements to the container.
In addition, this conservative approach sounds like an overkill if only a little portion of elements are expected to be removed.
Fortunately, according to the latest C++14 draft (N3797), the associative containers (23.2.4 #9) guarantees that:
Theinsert()
andemplace()
members shall not affect the validity of iterators and references to container [1], and theerase()
shall invalidate only iterators and references to the erased elements.
Besides, the unordered associative containers (23.2.5 #14) guarantees that:
Theinsert()
andemplace()
members shall not affect the validity of references to container elements, but may invalidate all iterators to the container [2]. Theerase()
members shall invalidate only iterators and references to the erased elements.
Thus, for associative containers (including std::set
, std::map
,
std::multiset
, and std::multimap
) and unordered associative
containers (including std::unorderd_set
, std::unordered_map
,
std::unordered_multiset
, std::unordered_multimap
), we can
directly erase the elements from the container:
void remove_odds_2(std::set<unsigned> &c) {
for (std::set<unsigned>::iterator i = c.begin(); i != c.end(); ) {
if (*i % 2 == 0) {
++i; // Move to next element.
} else {
c.erase(i++); // Erase current element and post increase iterator.
}
}
}
However, please be careful that following code remove_odds_3()
is not
equivalent to remove_odds_2()
because the erase()
function will
invalidate the iterator to the erased element. According to 24.2.1 #5 and #10,
any operations on the invalid iterator except the destruction (including
++i
) will in result undefined behavior.
// XXX Incorrect code to demostrate undefined behavior
void remove_odds_3(std::set<unsigned> &c) {
for (std::set<unsigned>::iterator i = c.begin(); i != c.end(); ++i) {
if (*i % 2 != 0) {
c.erase(i); // erase() will invalidate the iterator it.
}
}
}
To complete our discussion, it is important to know that there is no such
guarantee for sequence containers [3]. We should either fall back to
the conservative copy-and-swap approach or restart from c.begin()
after
removing any elements.
[1] | IMO, there is a typo in "shall not affect the validity of iterators and references to container". The "container" should be replaced with "elements." |
[2] | Please notice that the unordered associative containers do not always
guarantee the iterator validness on insert() , since the iterator
will be invalidated when the hash table is resized. Read 23.2.5 #15 for
more details. However, this is not related to this post because we are
focusing on the erase() function. |
[3] | As a special case, std::list gives extra guarantee on
insert() and erase() . But this does not apply to sequence
containers, such as std::vector and std::deque , in general. |