Dunce of the Week #1
If you unfamilar with "Dunce of the Week", please read my post explaining what DotW is and why I chose "dunce".
Please forgive me for the formatting of the code. I'm still trying to figure out how to get code to be formatted correctly on Blogger.
I want to thank reddit user caballist for his feedback on this item and for teaching me a couple things. I have incorporated much of his feedback when I updated this post.
Unsafe Usage of
IteratorsPlease forgive me for the formatting of the code. I'm still trying to figure out how to get code to be formatted correctly on Blogger.
I want to thank reddit user caballist for his feedback on this item and for teaching me a couple things. I have incorporated much of his feedback when I updated this post.
Q1) The following function (modified from its original version) was flagged by CppCheck as using
iterators in an unsafe manner. Can you identify what the problem
with the code is?
void ErrorMgr::Clear()
{
/* Clear only the clearable errors and alarms
*/
std::set<AppError>::iterator it;
{
mutexLock autoLock(m_mutexActiveAlarms);
for (it = m_setOfActiveAlarms.begin(); it !=
m_setOfActiveAlarms.end(); it++)
{
if
(it->GetClearable())
{
m_setOfActiveAlarms.erase(it);
}
}
}
{
mutexLock autoLock(m_mutexActiveErrors);
for (it = m_setOfActiveErrors.begin(); it !=
m_setOfActiveErrors.end(); it++)
{
if (it->GetClearable())
{
m_setOfActiveErrors.erase(it);
}
}
}
}
The important part of the definition for AppError is as follows:
class AppError
class AppError
{
// ...
public:
bool IsClearable() const;
// ...
};
Q2) How can the code be
fixed to use iterators in a safe manner.// ...
public:
bool IsClearable() const;
// ...
};
Q3) Let us assume that the
there is a need other places in our product to remove elements from sets based
on the value stored in the sets. What can we do to help us from
repeating the same mistake while (hopefully) at the same time making the code
easier to read?
Answers
A1) The call to erase removes the iterator from the
set. The iterator no longer points to any valid item – especially
not one in the set. (The iterator has been “invalidated” to use
the term of the C++ standards committee.) To then increment the
iterator (via the last term in the for loop statement) will result in undefined
behavior because the iterator is not valid. The iterator may
accidentally point to the “next” element in the set, but it does not have
to. (I suppose it did this with the STL library we were using with our product or the code never would have
passed our tests.)
A2) There are two simple ways to fixing the problem (each
block has a different solution):
void ErrorMgr::Clear()
{
// Clear only the clearable errors and alarms
std::set<AppError>::iterator it, itEnd;
// std::erase returns the next valid iterator in the set
{
mutexLock autoLock(m_mutexActiveAlarms);
it = m_setOfActiveAlarms.begin();
itEnd = m_setOfActiveAlarms.end();
while (it != itEnd)
{
if (it->IsClearable())
{
it = m_setOfActiveAlarms.erase(it);
}
else
{
++it;
}
}
}
/* Use a temporary iterator (testIt) to store the old value of the loop iterator.
Increment the loop iterator, then perform erase operations on the temporary
iterator if necessary. */
{
mutexLock autoLock(m_mutexActiveErrors);
std::set<AppError>::iterator testIt;
Increment the loop iterator, then perform erase operations on the temporary
iterator if necessary. */
{
mutexLock autoLock(m_mutexActiveErrors);
std::set<AppError>::iterator testIt;
it = m_setOfActiveErrors.begin();
itEnd = m_setOfActiveErrors.end();
while (it != itEnd)
{
testIt = it;
++it;
if (testIt->IsClearable())
++it;
if (testIt->IsClearable())
{
m_setOfActiveErrors.erase(testIt);
}
}
}
}
It is my opinion that the second strategy is preferable
because:
- it avoids the post-increment operator which I am trying to discourage our team from using (especially in iterators where it can lead to a performance hit).
- there is no else clause in the second strategy which in my opinion makes the logic of the algorithm easier to follow.
A3) Well if we’re going to have the same problem elsewhere
in the product, we should aim to remove duplicate code. We want to
pull out the algorithm and put it somewhere where it can be used again.
We don’t know what comparator member function we may use elsewhere so we
will have to pass that into the algorithm somehow.
I am going to follow the spirit of the standard algorithms (found
in <algorithm>)
as much as I can for my solution and name this algorithm set_erase_if which
will require that I pass in a reference to the set and a pointer to a set
element member function. The resulting code will
read:
void ErrorMgr::Clear()
{
// Clear only the clearable errors and alarms
{
mutexLock autoLock(m_mutexActiveAlarms);
set_erase_if(m_setOfActiveAlarms, &AppError::IsClearable);
set_erase_if(m_setOfActiveAlarms, &AppError::IsClearable);
}
{
mutexLock autoLock(m_mutexActiveErrors);
set_erase_if(m_setOfActiveErrors, &AppError::IsClearable);
mutexLock autoLock(m_mutexActiveErrors);
set_erase_if(m_setOfActiveErrors, &AppError::IsClearable);
}
}
The problem now is to write the code for this algorithm cleanly so
that it can be reused. Well, let us start off by factoring out the
common code. We can look later at how we can make it more
generic. The result of simply factoring out the common code
is:
void
set_erase_if(std::set<AppError>& theSet, bool (AppError::*compareMemFn)() const)
{
std::set<AppError>::iterator it =
theSet.begin();
std::set<AppError>::iterator itEnd = theSet.end();
std::set<AppError>::iterator itTest;
while (it != itEnd)
{
itTest = it;
++it;
if (((*itTest).*(compareMemFn))())
{
theSet.erase(itTest);
}
std::set<AppError>::iterator itEnd = theSet.end();
std::set<AppError>::iterator itTest;
while (it != itEnd)
{
itTest = it;
++it;
if (((*itTest).*(compareMemFn))())
{
theSet.erase(itTest);
}
}
}
}
This code compiles and testing (you are writing tests for you code, right?) shows that it does what we expect it
to do. But in its current state it only works for
std::set<AppError>. We can fix this by templatizing the key
value of the set. This leads us to the following
code:
template<class Key>
void set_erase_if(std::set<Key>& theSet, bool (Key::*compareMemFn)() const)
{
typename std::set<Key>::iterator it = theSet.begin();
typename std::set<Key>::iterator itEnd = theSet.end();
typename std::set<Key>::iterator itTest;
while (it != itEnd)
{
itTest = it;
++it;
if (((*itTest).*(compareMemFn))())
{
theSet.erase(itTest);
}
typename std::set<Key>::iterator itEnd = theSet.end();
typename std::set<Key>::iterator itTest;
while (it != itEnd)
{
itTest = it;
++it;
if (((*itTest).*(compareMemFn))())
{
theSet.erase(itTest);
}
}
}
}
We had to add ‘typename’ to the beginning of the declarations for the iterator because ‘iterator’ is now an inner type of a class used as a template parameter. Without the keyword, gcc will fail to properly compile the code.
We test and find out that everything still works.
J
We could stop here. But since we’re creating a new
algorithm, we should ask ourselves to see if we can make it more generic.
Here are some questions that pop to my mind:
- The algorithm runs over the entire set. What if I wanted to only cover a range?
- Why are we using a member function of type bool (Key::*memberFn)() const? Does the member function have to be ‘const’? How can we make it more generic?
- We limit ourselves to std::set. Is it possible to make this code generic enough to work with other associative containers?
I will tackle these one at a time. First off, let’s
figure out what we’d need to cover any range for a set. This new
function can be a function overload – keeping the same name as the
already defined function and offering new functionality while at the same time
still allowing users to use the previously defined function.
The STL library (which I am trying to mimic) uses start and end
iterators to define ranges. Most algorithms are generic enough
that you only have to pass in the iterators, not a reference to the container on
which you are operating. Unfortunately for us, ‘erase’ is a member
function of the set, not the iterator so we have to pass in a reference to
set. With this is mind, we can construct the following
function:
template<class Key>
void set_erase_if(std::set<Key>& theSet,
const typename std::set<Key>::iterator& rangeBegin,
const typename std::set<Key>::iterator& rangeEnd,
bool (Key::* compareMemFn)() const)
{
typename std::set<Key>::iterator itTest;
void set_erase_if(std::set<Key>& theSet,
const typename std::set<Key>::iterator& rangeBegin,
const typename std::set<Key>::iterator& rangeEnd,
bool (Key::* compareMemFn)() const)
{
typename std::set<Key>::iterator itTest;
typename
std::set<Key>::iterator it =
rangeBegin;
while (it != rangeEnd)
{
itTest = it;
++it; if (((*itTest).*(compareMemFn))())
{
theSet.erase(itTest);
}
}
}
while (it != rangeEnd)
{
itTest = it;
++it; if (((*itTest).*(compareMemFn))())
{
theSet.erase(itTest);
}
}
}
This compiles and tests we write for this pass. But now we realize we have duplicate code again and that the first function we wrote (the one without the range iterators) can be re-written to use the above function.
template<class Key>
void set_erase_if(std::set<Key>& theSet, bool (Key::*compareMemFn)() const)
{
set_erase_if(theSet, theSet.begin(), theSet.end(), compareMemFn);
}
void set_erase_if(std::set<Key>& theSet, bool (Key::*compareMemFn)() const)
{
set_erase_if(theSet, theSet.begin(), theSet.end(), compareMemFn);
}
Nice and succinct. Now our logic is in one place. And
we can remove elements by using either one of the following
lines:
set_erase_if(m_setOfActiveErrors,
&AppError::IsClearable);
set_erase_if(m_setOfActiveErrors, m_setOfActiveErrors.begin(), m_setOfActiveErrors.end(),
&AppError::IsClearable);
So on to the next improvement – making the algorithm work so it doesn’t have to rely on such a specific member function signature. A common way to make algorithms like this more generic is to pass in a functor instead of a specific member function. A common method for binding member functions (which can have different signatures) to functors is to use std::mem_fun defined in <functional>. The resulting functor’s call operator takes as its first argument a pointer to the class type. We need to know this because it will affect how we write our test condition. After some tinkering around we come up with a solution:
template<class Key, class RemoveComparisonFunctor>
void set_erase_if(std::set<Key>& theSet,
const typename std::set<Key>::iterator& rangeBegin,
const typename std::set<Key>::iterator& rangeEnd,
const RemoveComparisonFunctor& compare)
{
typename std::set<Key>::iterator itTest;
typename std::set<Key>::iterator it = rangeBegin;
while (it != rangeEnd)
{
itTest = it;
++it;
if (compare(&*itTest))
{
theSet.erase(itTest);
}
}
}
template<class Key, class RemoveComparisonFunctor>
void set_erase_if(std::set<Key >& theSet, const RemoveComparisonFunctor& compare)
{
set_erase_if(theSet, theSet.begin(), theSet.end(), compare);
}
void set_erase_if(std::set<Key>& theSet,
const typename std::set<Key>::iterator& rangeBegin,
const typename std::set<Key>::iterator& rangeEnd,
const RemoveComparisonFunctor& compare)
{
typename std::set<Key>::iterator itTest;
typename std::set<Key>::iterator it = rangeBegin;
while (it != rangeEnd)
{
itTest = it;
++it;
if (compare(&*itTest))
{
theSet.erase(itTest);
}
}
}
template<class Key, class RemoveComparisonFunctor>
void set_erase_if(std::set<Key >& theSet, const RemoveComparisonFunctor& compare)
{
set_erase_if(theSet, theSet.begin(), theSet.end(), compare);
}
We can now remove elements by using one of the following statements:
set_erase_if(m_setOfActiveErrors,
&AppError::IsClearable);
set_erase_if(m_setOfActiveErrors, m_setOfActiveErrors.begin(), m_setOfActiveErrors.end(),
&AppError::IsClearable);
set_erase_if(m_setOfActiveErrors, m_setOfActiveErrors.begin(), m_setOfActiveErrors.end(),
&AppError::IsClearable);
set_erase_if(m_setOfActiveErrors, mem_fun(&AppError::IsClearable));
set_erase_if(m_setOfActiveErrors, m_setOfActiveErrors.begin(), m_setOfActiveErrors.end(),
mem_fun(&AppError::IsClearable));
// Let’s pretend there’s a free function whose prototype is:
// bool isClearable(KasError* error);
set_erase_if(m_setOfActiveErrors, isClearable);
set_erase_if(m_setOfActiveErrors, m_setOfActiveErrors.begin(), m_setOfActiveErrors.end(),
isClearable);
Now can we get rid of anything? The answer is yes. The non-range based functions are extremely similar. It turns out that the newer one (with the comparison functor) will forward correctly to the member function range-based overload. We could also remove the member function range-based overload too since mem_fun will do its work for us. And I am currently debating whether or not to keep it. Ideally, I would remove it (since we can use mem_fun), but many people on the development team on which I work do not use <algorithm> at all. The additional call to mem_fun may tip them toward avoiding the use of algorithms. For now I will keep it the member function ranged-based overload, but I will be discussing the issue with the members of the team to see how they want to proceed. Afterall, it is important to work with others and developing a technically correct solution will not be helpful if you can't get the people you work with to use it. I will update this section once the team comes up with a consensus.
OK. Now on to the last point. We want
to make the functions handle different associative containers. It is pretty obvious that we can make the code generic enough to work with std::multiset, std::unordered_set (for those who have access to that C++11 container), or other set-like containers. The other major associative continer types are the map-like containers (std::map, std::multimap, std::unordered_map, etc...). The map-like containers present us with some pretty large complications:
Anyway, back to generalizing the code to work with different set-like containers. We can accomplish that by substituting std::set<Key> with a different template class. The resulting code is:
- Which iterator member should we use -- key or value (first or second)?
- How does the user easily specify which one they want to use?
Anyway, back to generalizing the code to work with different set-like containers. We can accomplish that by substituting std::set<Key> with a different template class. The resulting code is:
template<class SetLikeContainer>
void set_erase_if(SetLikeContainer& theSet,
const typename SetLikeContainer::iterator& rangeBegin,
const typename SetLikeContainer::iterator& rangeEnd,
bool (SetLikeContainer::key_type::* compareMemFn)() const)
{
typename SetLikeContainer::iterator itTest;
typename SetLikeContainer::iterator it = rangeBegin;
void set_erase_if(SetLikeContainer& theSet,
const typename SetLikeContainer::iterator& rangeBegin,
const typename SetLikeContainer::iterator& rangeEnd,
bool (SetLikeContainer::key_type::* compareMemFn)() const)
{
typename SetLikeContainer::iterator itTest;
typename SetLikeContainer::iterator it = rangeBegin;
while (it != rangeEnd)
{
itTest = it;
++it;
if (((*itTest).*(compareMemFn))())
{
theSet.erase(itTest);
}
}
}
{
itTest = it;
++it;
if (((*itTest).*(compareMemFn))())
{
theSet.erase(itTest);
}
}
}
template<class SetLikeContainer, class RemoveComparisonFunctor>
void set_erase_if(SetLikeContainer& theSet,
const typename SetLikeContainer::iterator& rangeBegin,
const typename SetLikeContainer::iterator& rangeEnd,
const RemoveComparisonFunctor& compare)
{
typename SetLikeContainer::iterator itTest;
typename SetLikeContainer::iterator it = rangeBegin;
while (it != rangeEnd)
{
itTest = it;
++it;
if (compare(&*itTest))
{
theSet.erase(itTest);
}
}
}
void set_erase_if(SetLikeContainer& theSet,
const typename SetLikeContainer::iterator& rangeBegin,
const typename SetLikeContainer::iterator& rangeEnd,
const RemoveComparisonFunctor& compare)
{
typename SetLikeContainer::iterator itTest;
typename SetLikeContainer::iterator it = rangeBegin;
while (it != rangeEnd)
{
itTest = it;
++it;
if (compare(&*itTest))
{
theSet.erase(itTest);
}
}
}
template<class SetLikeContainer, class RemoveComparisonFunctor>
void set_erase_if(SetLikeContainertheSet, const RemoveComparisonFunctor& compare)
{
set_erase_if(theSet, theSet.begin(), theSet.end(), compare);
}
void set_erase_if(SetLikeContainertheSet, const RemoveComparisonFunctor& compare)
{
set_erase_if(theSet, theSet.begin(), theSet.end(), compare);
}
That is about as generic as I can make this algorithm.
The code is pretty short and clear too. And the resulting code for the function
we were originally analyzing is now more clear. (Go ahead and
compare it again to the original at the top of the page. And yes, this identical to what we wrote at the beginning of A3; I just put it here to make comparing easier.)
void ErrorMgr::Clear()
{
// Clear only the clearable errors and alarms
{
mutexLock autoLock(m_mutexActiveAlarms);
set_erase_if(m_setOfActiveAlarms, &AppError::IsClearable);
set_erase_if(m_setOfActiveAlarms, &AppError::IsClearable);
}
{
mutexLock autoLock(m_mutexActiveErrors);
set_erase_if(m_setOfActiveErrors, &AppError::IsClearable);
mutexLock autoLock(m_mutexActiveErrors);
set_erase_if(m_setOfActiveErrors, &AppError::IsClearable);
}
}
Thank you for taking the time to finish this rather long
first installment of DotW.
std::set::erase doesn't return the next iterator because its iterators do not get invalidated from modifying the std::set.
ReplyDeletehttp://en.cppreference.com/w/cpp/container/set/erase
How interesting. The Dinkumware and MS STL (which is licensed from Dinkumware) has std::set::erase returning an iterator.
Deletehttp://www.dinkumware.com/manuals/?manual=compleat&page=set.html#set::erase
http://msdn.microsoft.com/en-us/library/8h4a3515(v=vs.100).aspx
That's interesting that they deviated from the standard.
Nice read. We have a copy of Scott Meyers' Effective STL in our shop that is frequently used for "Uh, how do I get rid of all those elements in that std::map without blowing everything up again?".
ReplyDelete