Overloading insertion and extraction

As we build our own classes, we may want to control how they are printed, and how they are read in. E.g., suppose we have our trusty dog class:

class dog {
  public: 
    string name;
    string color;
    string breed;
    int age;
};

While we could add a .print method, it would be really great if we could do this:

dog fido{"Fido", "Brown", "Corgi", 7};

cout << fido << endl; // Prints information about Fido to cout

Can we just teach C++ what << means for our class dog? Yes, in fact we can. We can overload the << operator for objects of our class’s type (just as we can overload functions, we can also overload operators). In order to do this, we need to know two things:

the answer is

ostream& operator<< (ostream& out, ...) {
  ...
  return out;
}

istream& operator>> (istream& in, ...&) {
  ...
  return in;
}

There are a few things that every overload of << and >> for input/output must do:

Thus, for our dog class, the << operator could look like this:

ostream& operator<< (ostream& out, dog d) {
  out << d.name << ", ";
  out << d.color << " " << d.breed << ", ";
  out << "age " << d.age;
  return out;
}

Note that out is not a typo: I am printing to the out ostream that was passed in as the first argument, NOT to cout. If you print to cout, then your program will work until you try to print to something other than cout. The version above will let you print to any ostream: file, stringstream, whatever. We don’t want to tie our printing down to just cout, especially when we can get the ability to print to anything basically for free.
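As a small sketch of why printing to out matters, the code below reuses the same operator to print into a stringstream. The helper describe is hypothetical (it is not part of these notes), and the exact output format is just the one chosen for this example.

```cpp
#include <iostream>
#include <sstream>
#include <string>
using namespace std; // the notes use unqualified names such as ostream

class dog {
  public:
    string name;
    string color;
    string breed;
    int age;
};

// Writes to whatever ostream was passed in, never directly to cout
ostream& operator<< (ostream& out, dog d) {
    out << d.name << ", " << d.color << " " << d.breed << ", age " << d.age;
    return out;
}

// Hypothetical helper: formats a dog as a string by printing to a stringstream
string describe(dog d) {
    ostringstream s;
    s << d;           // the same operator<< works on any ostream
    return s.str();
}
```

With this, describe(fido) for the Fido defined earlier gives the string "Fido, Brown Corgi, age 7", and cout << fido prints the same text to the screen.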

Input is more difficult, just because we have to account for the fact that the input might not be in the right format. If the input fails then we have to reset the variable we are reading into to its default value. For example, suppose we create a class for two-dimensional points:

class point2d {
  public:
    point2d() { x = y = 0; }

    float x,y;
};

To print a point is relatively easy:

ostream& operator<< (ostream& out, point2d p) {
  out << p.x << "," << p.y;
  return out;
}

This prints the point \( (x,y) \) as x,y, i.e., coordinates with a comma between. To read in a point, we have to check for the comma: if it’s present we ignore it, if it’s not present then we reset p to the “default” point, using the default constructor:

istream& operator>> (istream& in, point2d& p) {
  if((in >> p.x) && 
     in.peek() == ',' &&
     in.ignore(1) &&
     (in >> p.y)) {
    // Success: both coordinates were read
  }
  else  
    p = point2d{};

  return in;
}

Writing a correct extraction operator is more difficult than writing a correct insertion operator, just because there are many different ways that things can go wrong: wrong input format, stream ends mid-object, etc. The more complex the object is, the more complex its extraction will be.
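Here is a sketch of the point extraction in use, reading from a stringstream instead of cin. Two things here are assumptions that go beyond the version above: on a format error the operator also sets the stream's failbit (so that the caller can tell the read failed), and parse_point is a hypothetical helper added only for the demonstration.

```cpp
#include <iostream>
#include <sstream>
#include <string>
using namespace std;

class point2d {
  public:
    point2d() { x = y = 0; }

    float x, y;
};

istream& operator>> (istream& in, point2d& p) {
    if((in >> p.x) && in.peek() == ',' && in.ignore(1) && (in >> p.y)) {
        // Success: both coordinates were read
    }
    else {
        p = point2d{};              // reset to the default point
        in.setstate(ios::failbit);  // assumption: also mark the stream as failed
    }
    return in;
}

// Hypothetical helper: returns true if text holds a point in the form x,y
bool parse_point(string text, point2d& p) {
    istringstream in(text);
    return bool(in >> p);
}
```

Under these assumptions, parse_point("1.5,2.5", p) returns true and leaves p at (1.5, 2.5), while parse_point("1.5 2.5", p) returns false and resets p to (0, 0) because the comma is missing.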

Here I’ve overloaded the operators as functions, outside the class, which means that if they needed access to the class’s private members, they would have to be declared as friend functions.
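A minimal sketch of the friend case, assuming a hypothetical class account whose data members are private. The friend declaration inside the class is what lets the outside function read them.

```cpp
#include <iostream>
#include <sstream>
#include <string>
using namespace std;

// Hypothetical class with private data
class account {
  public:
    account(string o, int b) { owner = o; balance = b; }

    // Grants this one outside function access to the private members
    friend ostream& operator<< (ostream& out, account a);

  private:
    string owner;
    int balance;
};

// Defined outside the class, but allowed to use owner and balance
ostream& operator<< (ostream& out, account a) {
    out << a.owner << ": " << a.balance;
    return out;
}
```

Without the friend line, the function body would fail to compile, because owner and balance are private.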

Operator overloading

Overloading << and >> is just the beginning: it turns out that we can overload almost all the operators. The full list of overloadable operators is

Operators                            Description
+ - * / %                            Arithmetic
^ & | ~ << >>                        Bitwise (including shifts)
!                                    Negation
= += -= *= /= %= ^= &= |= >>= <<=    Assignment
< > == != <= >=                      Comparison
&& ||                                Logical
++ --                                Inc/Dec
,                                    Comma
->* ->                               Pointer-member
()                                   Function call
[]                                   Array element
conversion                           Cast
new delete                           Allocation/Deallocation

This is subject to a few caveats:

There are two ways you can overload an operator:

Overloading the assignment operator

The assignment operator can be overloaded, which means that when we write

a = b;

where a and b are of some user-defined type, we can control exactly what happens. Normally, this just copies all the members (public and private) of b into a, but maybe we want something different to happen. It may be that some classes have some internal data that should not be copied; if so, the overloaded = can avoid copying it.

The overloaded = must be defined as a method on the class. It takes the right-hand side by reference (const, since assigning from an object should not change it) and returns a reference:

dog& dog::operator= (const dog& other);

The purpose of this overload is to copy the contents of other into *this and then return *this. (Returning *this is necessary to make things like a = b = c; work.)

If you don’t overload =, you get a version for free that, as mentioned above, just copies all data members. If that’s what you want, then there’s really no point in overloading it.
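As a sketch of an assignment that deliberately leaves some data alone, consider a hypothetical class widget in which every object keeps its own serial number. The class and its members are made up for this illustration.

```cpp
#include <string>
using namespace std;

// Hypothetical class: the label is ordinary data, the serial number is not
class widget {
  public:
    widget(int s, string l) { serial = s; label = l; }

    widget& operator= (const widget& other) {
        label = other.label;  // copy the ordinary data
        // serial is deliberately NOT copied: each widget keeps its own
        return *this;         // returning *this makes a = b = c work
    }

    int serial;
    string label;
};
```

After a = b = c, all three widgets have c's label, but each still has the serial number it was constructed with.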

The copy constructor

The copy constructor is closely related to the assignment overload. Normally, if you define one you’ll define the other. While the assignment operator is used when we explicitly copy an object into another, the copy constructor is used implicitly, when C++ needs to make a copy in the background. For example, if we write a function

void breed(dog a, dog b);

and then pass it two dogs:

dog fido{...};
dog woofy{...};

breed(fido, woofy);

both fido and woofy are passed by copy, so C++ has to make copies of them. Normally, it does the same thing as the default assignment operator: it copies each data member of the original into the new object. You can change this if you want.

The copy constructor is just a constructor that takes another object of the class’s type by reference (it can’t take it by copy, because we are in the process of defining how to do that!). For dog, it looks like this:

class dog {
  public:
    dog(const dog& other) {
        // Copy other into *this
    }
};

Usually the copy-constructor and the overloaded assignment work in tandem, essentially doing the same thing, just in different situations.
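To see when the copy constructor runs, here is a sketch with a hypothetical class tracked whose copy constructor bumps a global counter. The names tracked, copies_made, take_by_copy and take_by_ref are all invented for this example.

```cpp
// Counts how many times the copy constructor has run
int copies_made = 0;

// Hypothetical class whose copy constructor announces itself
class tracked {
  public:
    tracked() { }
    tracked(const tracked& other) { ++copies_made; }
};

void take_by_copy(tracked t) { }   // the argument is copied into t
void take_by_ref(tracked& t) { }   // no copy: t refers to the caller's object
```

Calling take_by_copy(a) increases copies_made by one, calling take_by_ref(a) leaves it unchanged, and tracked b = a; increases it again, because initializing b from a is also a copy.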

The “big three”

Usually you’ll need to overload the assignment operator at the same time as two other things: the copy constructor and the destructor.

The normal reason for needing to overload these is because the class “owns” some resource that will not automatically copy itself. (E.g., a pointer to dynamic memory.) In that case, you need to overload all three:

Thus, these three together are known as the “big three”: if you find yourself writing one, you probably need to write the other two, as well.
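A sketch of the big three for a class that owns dynamic memory. The class int_buffer is hypothetical, and it is only one reasonable way to write these three functions, not the only one.

```cpp
// Hypothetical class that owns a dynamically allocated array of ints
class int_buffer {
  public:
    int_buffer(int n) {
        size = n;
        data = new int[n]();   // n ints, all set to 0
    }

    // Copy constructor: allocate our own array, then copy the contents
    int_buffer(const int_buffer& other) {
        size = other.size;
        data = new int[size];
        for(int i = 0; i < size; ++i)
            data[i] = other.data[i];
    }

    // Assignment: replace our array with a copy of the other one
    int_buffer& operator= (const int_buffer& other) {
        if(this != &other) {   // nothing to do for a = a
            int* fresh = new int[other.size];
            for(int i = 0; i < other.size; ++i)
                fresh[i] = other.data[i];

            delete[] data;     // release the array we used to own
            data = fresh;
            size = other.size;
        }
        return *this;
    }

    // Destructor: release the array when the object goes away
    ~int_buffer() { delete[] data; }

    int size;
    int* data;
};
```

If we relied on the default versions instead, a copy would share the same array as the original, so changing one would change the other and both destructors would try to delete the same memory.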

The increment/decrement operators

You can overload ++ and --, both the pre- and post-increment versions. The pre- versions look like this (for a class thing)

thing& operator++ (thing& t); // Function 
thing& operator-- (thing& t); // Function
thing& thing::operator++ ();  // Method
thing& thing::operator-- ();  // Method

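What about the post-increment version? These have two differences: they take an extra int parameter (a dummy, used only to tell the two versions apart), and they return the old value by copy instead of returning a reference: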

thing operator++ (thing& t, int i); // Function 
thing operator-- (thing& t, int i); // Function
thing thing::operator++ (int i);  // Method
thing thing::operator-- (int i);  // Method
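As a sketch, here are both increment versions written as methods of a hypothetical class counter that wraps a single int.

```cpp
// Hypothetical class wrapping a single int
class counter {
  public:
    counter(int v) { value = v; }

    // Pre-increment: change the object, then return it by reference
    counter& operator++ () {
        ++value;
        return *this;
    }

    // Post-increment: the unnamed int parameter is never used
    counter operator++ (int) {
        counter old = *this;  // remember the value before the change
        ++value;
        return old;           // return the OLD value, by copy
    }

    int value;
};
```

Starting from counter c(5), the expression c++ evaluates to a counter holding 5 and leaves c at 6. A following ++c evaluates to c itself, now holding 7.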

Arithmetic operators

You can overload any of the arithmetic operators + - * / %, including the unary (prefix) minus. Normally these are overloaded as functions rather than methods (so that they can take advantage of implicit conversions on both operands). They usually take their arguments by copy.

thing operator+ (thing a, thing b);
thing operator- (thing a, thing b);
thing operator- (thing a);           // Unary minus
thing operator* (thing a, thing b);
thing operator/ (thing a, thing b); 
thing operator% (thing a, thing b);

Note that % is normally only defined on int-like types (whole numbers).

We could use these to build a new numeric type, like complex, which supports all the usual arithmetic operations:

class complex {
  public:
    complex(float r, float i) { re = r; im = i; }
    complex(float r)          { re = r; im = 0; }
    float re, im;
};

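complex operator+ (complex a, complex b) {
    return complex{a.re + b.re, a.im + b.im};
}

complex operator- (complex a) {
    return complex{-a.re, -a.im};
}

complex operator- (complex a, complex b) {
    return a + -b;
}

complex operator* (complex a, complex b) {
    // (a.re + a.im i)(b.re + b.im i), using i*i = -1
    return complex{a.re * b.re - a.im * b.im,
                   a.re * b.im + a.im * b.re};
}

complex operator/ (complex a, complex b) {
    // Multiply top and bottom by the conjugate of b
    float d = b.re * b.re + b.im * b.im;
    return complex{(a.re * b.re + a.im * b.im) / d,
                   (a.im * b.re - a.re * b.im) / d};
}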
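The sketch below illustrates the point about implicit conversions on both operands. It uses a trimmed copy of the class, renamed cplx here only to avoid any clash with std::complex. Because operator+ is a function and cplx has a one-argument constructor, a plain number can appear on either side of the +.

```cpp
// Trimmed copy of the class above; the name cplx is an assumption of this sketch
class cplx {
  public:
    cplx(float r, float i) { re = r; im = i; }
    cplx(float r)          { re = r; im = 0; }
    float re, im;
};

cplx operator+ (cplx a, cplx b) {
    return cplx{a.re + b.re, a.im + b.im};
}

cplx operator* (cplx a, cplx b) {
    // (a.re + a.im i)(b.re + b.im i), using i*i = -1
    return cplx{a.re * b.re - a.im * b.im,
                a.re * b.im + a.im * b.re};
}
```

With a equal to 1 + 2i, both a + 2 and 2 + a give 3 + 2i, since the 2 is converted to a cplx in either position. If operator+ were a method, 2 + a would not compile, because the left operand of a method is never converted. As a check on the multiplication, (1 + 2i)(3 + 4i) comes out as -5 + 10i.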

Overloading conversion operators

We’ve already seen how we can define how to convert things into a user-defined type, by defining a constructor. Can we go the other way, and define how elements of a user-defined type are converted into other things? Yes, by overloading the cast operator. When we write something like

float x = 1.2;
int y = int(x); // convert x to int

we are performing a cast, an explicit conversion from float to int. The cast operator for a given target type must be written as a method, and it has no return type in front, because the target type is already part of its name:

thing::operator int ();  // Method (the return type, int, is implied)

This defines how instances of thing are converted into ints. After defining it, we can proceed to write

thing x = ...;
int y = x;
int z = x + 2; // Convert x to int and then add 2
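A minimal sketch of a conversion operator, using a hypothetical class percent that stores a whole number.

```cpp
// Hypothetical class: a percentage stored as a whole number
class percent {
  public:
    percent(int v) { value = v; }

    // Conversion to int: no return type is written, the name already says int
    operator int () { return value; }

    int value;
};
```

With percent x(40), writing int y = x; sets y to 40, and int z = x + 2; converts x to int first and then adds, giving 42.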

Overloading comparisons

Traditionally, if you’re going to overload the comparison operators, you’d do it by first overloading < with the actual comparison. All the other comparisons can be defined in terms of <. (For normal use, all of these should return bool.) Although we can write == in terms of < (a == b iff !(a < b || b < a)), it’s usually possible to write a more efficient comparison.

bool operator<  (thing a, thing b) { /* compare */ }
bool operator>  (thing a, thing b) { return b < a; }
bool operator<= (thing a, thing b) { return !(a > b); }
bool operator>= (thing a, thing b) { return !(a < b); }
bool operator== (thing a, thing b) { /* compare */ }
bool operator!= (thing a, thing b) { return !(a == b); }

That way we really only have to write one or two “custom” comparison operators; the rest can be written fairly easily.

Unlike assignment, the comparison operators have no default versions: if you don’t overload == and !=, you can’t compare objects of your class with them at all. (Since C++20 you can ask the compiler to write a == that compares all the members, by declaring it with = default.)
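Here is a sketch of the pattern for a concrete type. The class height is hypothetical: it stores feet and inches, and we assume inches is always less than 12 so that comparing feet first is correct. Only < and == do real work.

```cpp
// Hypothetical type: a height in feet and inches (assumes 0 <= inches < 12)
class height {
  public:
    height(int f, int i) { feet = f; inches = i; }
    int feet, inches;
};

// The two "custom" comparisons
bool operator<  (height a, height b) {
    if(a.feet != b.feet)
        return a.feet < b.feet;
    return a.inches < b.inches;
}
bool operator== (height a, height b) {
    return a.feet == b.feet && a.inches == b.inches;
}

// Everything else is defined in terms of those two
bool operator>  (height a, height b) { return b < a; }
bool operator<= (height a, height b) { return !(a > b); }
bool operator>= (height a, height b) { return !(a < b); }
bool operator!= (height a, height b) { return !(a == b); }
```

For example, height{5, 11} < height{6, 0} is true, and so is height{6, 0} >= height{5, 11}.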

Overloading NOT and bool conversion

If your class has some idea of being “good” or “bad”, often you’ll implement a conversion to bool, along with overloading the ! (logical NOT) operator, with the idea that true corresponds to “good” and false corresponds to “bad”. Thus, a user can test whether a given thing is good or bad by just doing

thing x = ...;
if(x) {
    // x must be good
}

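Input/output streams overload them to do just this. Conceptually, the stream classes contain methods like these (a stream counts as “good” here as long as no operation on it has failed):

operator bool () {   // Method of the stream classes
    return !fail();
}

bool operator! () {  // Method of the stream classes
    return fail();
}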

Whenever we write something like

while(cin >> i) {... 

we are taking advantage of this conversion.
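The same idea can be sketched for a class of our own. The class reading and the helper value_or are hypothetical. The conversion is marked explicit, which is an addition to what the notes describe: it still works in if and while conditions but prevents a reading from being silently used as a number.

```cpp
// Hypothetical class: a sensor reading that may be invalid
class reading {
  public:
    reading(float v, bool ok) { value = v; valid = ok; }

    // "Good" means valid. explicit: usable as a condition, but not as a number.
    explicit operator bool () { return valid; }
    bool operator! ()         { return !valid; }

    float value;
    bool valid;
};

// Hypothetical helper: the reading's value if it is good, else a fallback
float value_or(reading r, float fallback) {
    if(r)                 // uses the conversion to bool
        return r.value;
    return fallback;
}
```

A good reading of 21.5 passed to value_or comes back as 21.5, while a bad reading comes back as the fallback.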

Abstract data types

Classes, together with overloaded operators, give us the power to make new data types. Just like vector and string, we can create a new type to represent some collection of data. Sometimes a description of what a class does, as distinct from how it does it, is called an “abstract data type”. As an example, let’s implement a Set ADT. This is intended to act like a mathematical set, so we can:

Note that there are many ways we could implement sets. The ADT is just a description of what a Set can do, but sometimes we’ll refer to a particular version of the Set as an ADT. We’re going to build a version of Set that is built on a vector where all the elements are unique. Let’s think about how to implement all the above operations (bearing in mind that any operation which changes a set or constructs a new set must ensure that the elements are unique):

Our class will look something like this:

class set {
  public:
    // Constructor: constructs the empty set
    set() { } 

    int size()   { return data.size(); }
    bool empty() { return data.empty(); }

    bool contains(int x) {
        for(int i : data)
            if(i == x)
                return true;

        return false;
    }

    void insert(int x) {
        if(!contains(x))
            data.push_back(x);
    }

    void remove(int x) {
        for(int i = 0; i < data.size(); ++i)
            if(data.at(i) == x) {
                data.erase(data.begin() + i);
                break;
            }
    }

    // Named union_with because union is a reserved word in C++
    set union_with(set& other) {
        set result = *this; // Copy of this set       

        // Insert other data (to avoid duplicates)
        for(int i : other.data)
            result.insert(i); 

        return result;
    }

    set intersection(set& other) {
        set result; // Empty
        for(int i : data)
            if(other.contains(i))
                result.data.push_back(i);

        return result;
    }

    set difference(set& other) {
        set result = *this;
        for(int i : other.data)
            result.remove(i);

        return result;
    }

    set symmetric_diff(set& other) {
        set result = union_with(other);
        set common = intersection(other); // Named, so it can be passed by reference
        return result.difference(common);
    }

    friend ostream& operator<<(ostream&, set);

  private:
    vector<int> data;
};

ostream& operator<< (ostream& out, set x) {
    out << "{";
    for(int i = 0; i < x.size(); ++i) {
        if(i > 0)
            out << ",";  // Comma before every element except the first
        out << x.data.at(i);
    }
    out << "}";

    return out;
}

To make things easier, we can overload some operators for the set operations:

set operator| (set a, set b) {
    return a.union_with(b);
}

set operator& (set a, set b) {
    return a.intersection(b);
}

set operator- (set a, set b) {
    return a.difference(b);
}

set operator^ (set a, set b) {
    return a.symmetric_diff(b);
}

// Takes a by reference, so that the caller's set is the one that changes
set& operator+= (set& a, int x) {
    a.insert(x);
    return a;
}

And now we can write a quite natural set manipulation program:

int main() {
    set a; 
    a += 1; a += 2; a += 3;

    set b; 
    b += 3; b += 4; b += 5;

    cout << (a & b) << endl;

    return 0; 
}

Optional topics: const correctness. Using initializer_list for more natural set creation. Member initializer lists.