Monday 8 June 2009

The problem with Maps in C++

The following discussion is from More Exceptional C++ by Herb Sutter:

Question 1:
a: What's wrong with the following code? How would you correct it?

map<int,string>::iterator i = m.find( 13 );
if( i != m.end() )
{
  const_cast<int&>( i->first ) = 9999999;
}

b: To what extent are the problems fixed by writing the following instead?

map<int,string>::iterator i = m.find( 13 );
if( i != m.end() )
{
  string s = i->second;
  m.erase( i );
  m.insert( make_pair( 9999999, s ) );
}

Consider a map<int,string> named m that has the contents shown in the figure; each node within m is shown as a pair<const int,string>. I'm showing the internal structure as a binary tree because this is what all current standard library implementations actually use.

As the keys are inserted, the tree's structure is maintained and balanced such that a normal inorder traversal visits the keys in the usual less<int> ordering. So far, so good.

But now say that, through an iterator, we could arbitrarily change the second entry's key, using code that looks something like the following:

1. a) What's wrong with the following code? How would you correct it?

// Example: Wrong way to change a key in a map<int,string> m.
//
map<int,string>::iterator i = m.find( 13 );
if( i != m.end() )
{
  const_cast<int&>( i->first ) = 9999999; // oops!
}

Note that we have to cast away const to get this code to compile. The problem here is that the code interferes with the map's internal representation by changing the map's internals in a way that the map isn't expecting and can't deal with.
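The need for the cast follows from the map's element type. As a minimal sketch (the names IntStringMap and kKeyIsConst are made up for illustration, and C++11 or later is assumed), the following compile-time checks confirm that the key half of each element is declared const:

```cpp
#include <map>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical alias for the map type used in the question.
typedef std::map<int, std::string> IntStringMap;

// Each element is a pair whose first member is const int, not int.
static_assert(std::is_same<IntStringMap::value_type,
                           std::pair<const int, std::string> >::value,
              "map elements are pair<const Key, T>");

// first_type is the declared type of i->first, so this is true for any std::map.
constexpr bool kKeyIsConst =
    std::is_const<IntStringMap::value_type::first_type>::value;
```

Because i->first has type const int, a plain assignment to it is rejected by the compiler, which is why the example has to reach for const_cast.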

The const_cast example corrupts the map's internal structure (see the figure). An iterator traversal will no longer return the contents of the map in key order, as it should, and a search for key 144 will probably fail even though the key exists in the map. In general, the container is no longer in a consistent or usable state. Note that it is not feasible to require the map to automatically defend itself against such illicit usage, because it can't even detect this kind of change when it occurs. In that example, the change was made through a reference into the container, without calling any map member functions.

A better, but still insufficient, solution is to follow this discipline: To change a key, remove it and reinsert it. For example:

b) To what extent are the problems fixed by writing the following instead?

// Example: Better way to change a key in a map<int,string> m.
//
map<int,string>::iterator i = m.find( 13 );
if( i != m.end() )
{
  string s = i->second;
  m.erase( i );
  m.insert( make_pair( 9999999, s ) ); // OK
}

This is better, because it avoids any change to keys, even keys with mutable members that are significant in the ordering. It even works with our specific example. So this must be the solution, right?
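For the simple int-keyed case, the erase-then-reinsert discipline can be wrapped in a small helper. The sketch below is only an illustration: change_key and change_key_demo are hypothetical names, and the check that the new key is not already taken is an added assumption that the original snippet does not make (without it, a failed insert would silently drop the value).

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical helper: moves the value stored under old_key to new_key.
// Returns false and leaves m untouched if old_key is absent or new_key
// is already present (which includes old_key == new_key).
inline bool change_key(std::map<int, std::string>& m, int old_key, int new_key) {
    std::map<int, std::string>::iterator i = m.find(old_key);
    if (i == m.end() || m.count(new_key) != 0) {
        return false;
    }
    std::string s = i->second;             // copy the value before erasing
    m.erase(i);                            // removes the node; i is now invalid
    m.insert(std::make_pair(new_key, s));  // the map positions the new key itself
    return true;
}

// Small usage example with the keys from the discussion.
inline void change_key_demo() {
    std::map<int, std::string> m;
    m[13] = "thirteen";
    m[144] = "gross";
    assert(change_key(m, 13, 9999999));
    assert(m.find(13) == m.end());          // old key is gone
    assert(m[9999999] == "thirteen");       // value moved to the new key
    assert(m.find(144) != m.end());         // other keys can still be found
}
```

Since the key is never modified in place, the tree is rebalanced by the map's own erase and insert, and lookups such as the one for 144 keep working.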

Unfortunately, it's still not enough in the general case, because keys can still be changed while they are in the container. "What?" one might ask. "How can keys be changed while they're in the container, if we adopt the discipline of never changing key objects directly?" Here are two counterexamples:

Let's say the Key type has some externally available state that other code can get at, for example, a pointer to a shared buffer that can be modified by other parts of the system without going through the Key object. Let's also say that this externally available state participates in the comparison performed by Compare. Then making a change in the externally available state, even without the knowledge of the Key object and without the knowledge of the code that uses the associative container, can still change the relative ordering of keys. So in this case, even if the code owning the container tries to follow an erase-then-reinsert discipline, a key ordering change can happen at any time somewhere else and therefore without an erase-then-reinsert operation.

Consider a Key type of string and a Compare type that interprets the key as a file name and compares the contents of the files. It's obvious that even if the keys are never changed, the relative ordering of keys can still change if the files are modified by another process, or (if the file is shared on a network) even by a user on a different machine on the other side of the world.
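The first counterexample can be sketched in a few lines. Everything named here (SharedKey, CompareByExternal, SharedMap, and the two functions) is hypothetical and chosen only for illustration; the sketch assumes C++11 for std::is_sorted. The key object itself is never touched, yet a write to the shared buffer leaves the map's traversal order inconsistent with its own comparison.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical key: it only points at state that lives outside the map.
struct SharedKey {
    const int* external;
};

// Compare looks through the pointer, so ordering depends on external state.
struct CompareByExternal {
    bool operator()(const SharedKey& a, const SharedKey& b) const {
        return *a.external < *b.external;
    }
};

typedef std::map<SharedKey, std::string, CompareByExternal> SharedMap;

// True if an in-order traversal still visits the keys in Compare order.
inline bool traversal_is_ordered(const SharedMap& m) {
    return std::is_sorted(m.begin(), m.end(), m.value_comp());
}

// Returns true if a write to the buffer alone broke the ordering.
inline bool external_change_breaks_order() {
    int buffer[2] = {1, 2};
    SharedKey k0 = {&buffer[0]};
    SharedKey k1 = {&buffer[1]};
    SharedMap m;
    m.insert(std::make_pair(k0, std::string("first")));
    m.insert(std::make_pair(k1, std::string("second")));
    bool ordered_before = traversal_is_ordered(m);

    buffer[0] = 100;  // some other part of the system writes to the buffer
    bool ordered_after = traversal_is_ordered(m);

    return ordered_before && !ordered_after;
}
```

The sketch only iterates after the external write and deliberately avoids find or insert, since those would run the comparison against a tree that no longer matches it.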

For details, see Item 8 of More Exceptional C++ by Herb Sutter.


