Like any STL container that must order its elements, the multiset template uses the less template by default to determine element ordering. This uses the contained classes’ operator<, but you can of course substitute your own comparison function.
Consider a simple class that contains one element that is used in the comparison and another that is not:
//: C07:MultiSet1.cpp
// Demonstration of multiset behavior
#include <algorithm>
#include <ctime>
#include <iostream>
#include <iterator>
#include <set>
using namespace std;
class X {
char c; // Used in comparison
int i; // Not used in comparison
// Don't need default constructor and operator=
X();
X& operator=(const X&);
// Usually need a copy-constructor (but the
// synthesized version works here)
public:
X(char cc, int ii) : c(cc), i(ii) {}
// Notice no operator== is required
friend bool operator<(const X& x, const X& y) {
return x.c < y.c;
}
friend ostream& operator<<(ostream& os, X x) {
return os << x.c << ":" << x.i;
}
};
class Xgen {
static int i;
// Number of characters to select from:
enum { span = 6 };
public:
Xgen() { srand(time(0)); }
X operator()() {
char c = 'A' + rand() % span;
return X(c, i++);
}
};
int Xgen::i = 0;
typedef multiset<X> Xmset;
typedef Xmset::const_iterator Xmit;
int main() {
Xmset mset;
// Fill it with X's:
generate_n(inserter(mset, mset.begin()),
25, Xgen());
// Initialize a regular set from mset:
set<X> unique(mset.begin(), mset.end());
copy(unique.begin(), unique.end(),
ostream_iterator<X>(cout, " "));
cout << "\n----\n";
// Iterate over the unique values:
for(set<X>::iterator i = unique.begin();
i != unique.end(); i++) {
pair<Xmit, Xmit> p = mset.equal_range(*i);
copy(p.first, p.second,
ostream_iterator<X>(cout, " "));
cout << endl;
}
} ///:~
In X, all the comparisons are made with the char c. The comparison is performed with operator<, which is all that is necessary for the multiset, since in this example the default less comparison object is used. The class Xgen randomly generates X objects, but the comparison value is restricted to the span from ‘A’ to ‘E’. In main( ), a multiset<X> is created and filled with 25 X objects using Xgen, guaranteeing that there will be duplicate keys. So that we know what the unique values are, a regular set<X> is created from the multiset (using the iterator, iterator constructor). These values are displayed, and then each one produces the equal_range( ) in the multiset (equal_range( ) has the same meaning here as it does with multimap: all the elements with matching keys). Each set of matching keys is then printed.
As a second example, a (possibly) more elegant version of WordCount.cpp can be created using multiset:
//: C07:MultiSetWordCount.cpp
// Count occurrences of words using a multiset
#include <fstream>
#include <iostream>
#include <iterator>
#include <set>
#include <string>
#include "../require.h"
using namespace std;
int main(int argc, char* argv[]) {
char* fname = "MultiSetWordCount.cpp";
if(argc > 1) fname = argv[1];
ifstream in(fname);
assure(in, fname);
multiset<string> wordmset;
string word;
while(in >> word)
wordmset.insert(word);
typedef multiset<string>::iterator MSit;
MSit it = wordmset.begin();
while(it != wordmset.end()) {
pair<MSit, MSit> p = wordmset.equal_range(*it);
int count = distance(p.first, p.second);
cout << *it << ": " << count << endl;
it = p.second; // Move to the next word
}
} ///:~
The setup in main( ) is identical to WordCount.cpp, but then each word is simply inserted into the multiset<string>. An iterator is created and initialized to the beginning of the multiset; dereferencing this iterator produces the current word. The equal_range( ) member function (not generic algorithm) produces the starting and ending iterators of the word that’s currently selected, and the algorithm distance( ) (defined in <iterator>) counts the number of elements in that range. The iterator it is then moved forward to the end of the range, which puts it at the next word. If you’re unfamiliar with the multiset, this code can seem more complex. The density of it and the lack of need for supporting classes such as Count has a lot of appeal.
In the end, is this really a "set," or should it be called something else? An alternative is the generic "bag" that is defined in some container libraries, since a bag holds anything at all without discrimination—including duplicate objects. This is close, but it doesn’t quite fit since a bag has no specification about how elements should be ordered. A multiset (which requires that all duplicate elements be adjacent to each other) is even more restrictive than the concept of a set, which could use a hashing function to order its elements, in which case they would not be in sorted order. Besides, if you wanted to store a bunch of objects without any special criteria, you’d probably just use a vector, deque, or list.
Combining STL containers
When using a thesaurus, you want to know all the words that are similar to a particular word. When you look up a word, then, you want a list of words as the result. Here, the "multi" containers (multimap or multiset) are not appropriate. The solution is to combine containers, which is easily done using the STL. Here, we need a tool that turns out to be a powerful general concept, which is a map of vector: