
The solution for flexible element access is an iterator, an object whose job is to select the elements within a container and present them to the user of the iterator. As a class, an iterator also provides a level of abstraction, which you can use to separate the details of the container from the code that’s accessing that container. The container, via the iterator, is seen as a sequence. The iterator lets you traverse that sequence without worrying about the underlying structure—that is, whether it’s a vector, a linked list, a set, or something else. This gives you the flexibility to easily change the underlying data structure without disturbing the code in your program that traverses the container. Separating iteration from the control of the container traversed also allows having multiple iterators simultaneously.

From a design standpoint, all you really want is a sequence that can be manipulated to solve your problem. If a single type of sequence satisfied all your needs, there’d be no reason to have different kinds. You need a choice of containers for two reasons. First, containers provide different types of interfaces and external behavior. A stack has an interface and a behavior that is different from that of a queue, which is different from that of a set or a list. One of these might provide a more flexible solution to your problem than the others. Second, different containers have different efficiencies for certain operations. Compare a vector to a list, as an example. Both are simple sequences that can have nearly identical interfaces and external behaviors. But certain operations can have radically different costs. Randomly accessing elements in a vector is a constant-time operation; it takes the same amount of time regardless of the element you select. However, it is expensive to move through a linked list to randomly access an element, and it takes longer to find an element if it is farther down the list. On the other hand, if you want to insert an element in the middle of a sequence, it’s much cheaper in a list than in a vector. The efficiencies of these and other operations depend on the underlying structure of the sequence. In the design phase, you might start with a list and, when tuning for performance, change to a vector, or vice versa. Because of iterators, code that merely traverses sequences is insulated from changes in the underlying sequence implementation.

Remember that a container is only a storage cabinet in which to put objects. If that cabinet solves all your needs, it probably doesn’t really matter how it is implemented. If you’re working in a programming environment that has built-in overhead due to other factors, the cost difference between a vector and a linked list might not matter. You might need only one type of sequence. You can even imagine the "perfect" container abstraction, which can automatically change its underlying implementation according to the way it is used.

STL reference documentation

As in the previous chapter, you will notice that this chapter does not contain exhaustive documentation describing each of the member functions in each STL container. Although we describe the member functions we use, we’ve left the full descriptions to others. We recommend the online resources available for the Dinkumware, Silicon Graphics, and STLPort STL implementations.[91] 

A first look

Here’s an example using the set class template, a container that is modeled after a traditional mathematical set and does not accept duplicate values. This simple set was created to work with ints:

//: C07:Intset.cpp
// Simple use of STL set
#include <cassert>
#include <set>
using namespace std;

int main() {
  set<int> intset;
  for(int i = 0; i < 25; i++)
    for(int j = 0; j < 10; j++)
      // Try to insert duplicates:
      intset.insert(j);
  assert(intset.size() == 10);
} ///:~

The insert( ) member does all the work: it attempts to insert an element and ignores it if it’s already there. Often the only activities involved in using a set are simply insertion and testing to see whether it contains the element. You can also form a union, an intersection, or a difference of sets and test to see if one set is a subset of another. In this example, the values 0–9 are inserted into the set 25 times, but only the 10 unique instances are accepted.

Now consider taking the form of Intset.cpp and modifying it to display a list of the words used in a document. The solution becomes remarkably simple.

//: C07:WordSet.cpp
// Lists the unique words in a file, in sorted order
#include <algorithm>  // copy()
#include <fstream>
#include <iostream>
#include <iterator>
#include <set>
#include <string>
#include "../require.h"
using namespace std;

// const char* so that a string literal can be passed
void wordSet(const char* fileName) {
  ifstream source(fileName);
  assure(source, fileName);
  string word;
  set<string> words;
  // operator>> splits on whitespace; insert() ignores duplicates
  while(source >> word)
    words.insert(word);
  copy(words.begin(), words.end(),
    ostream_iterator<string>(cout, "\n"));
  cout << "Number of unique words: "
    << words.size() << endl;
}

int main(int argc, char* argv[]) {
  if(argc > 1)
    wordSet(argv[1]);
  else
    wordSet("WordSet.cpp");
} ///:~

The only substantive difference here is that string is used instead of int. The words are pulled from a file, but the other operations are similar to those in Intset.cpp. Not only does the output reveal that duplicates have been ignored, but because of the way set is implemented, the words are automatically sorted.

A set is an example of an associative container, one of the three categories of containers provided by the standard C++ library. The containers and their categories are summarized in the following table.

Category                      Containers
Simple Sequence Containers    vector, list, deque
Container Adapters            queue, stack, priority_queue
Associative Containers        set, map, multiset, multimap

All the containers in the standard library hold objects and expand their resources as needed. The key difference between one container and another is the way the objects are stored in memory and what operations are available to the user.

A vector, as you already know, is a linear sequence that allows rapid random access to its elements. However, it’s expensive to insert an element in the middle of a contiguous sequence like a vector, just as it is with an array, because every element after the insertion point must be moved. A deque (double-ended queue, pronounced "deck") also allows random access that’s nearly as fast as vector, but it’s significantly faster when it needs to allocate new storage, and you can easily add new elements at the front as well as the back of the sequence. A list is a doubly linked list, so it’s expensive to move around randomly but cheap to insert an element anywhere. Thus list, deque, and vector are similar in their basic functionality (they all hold linear sequences), but different in the cost of their activities. So for your first shot at a program, you could choose any one and experiment with the others only if you’re tuning for efficiency.