Overloading insertion and extraction
As we build our own classes, we may want to control how they are printed, and
how they are read in. E.g., suppose we have our trusty dog
class:
class dog {
  public:
    string name;
    string color;
    string breed;
    int age;
};
While we could add a .print
method, it would be really great if we could do
this:
dog fido{"Fido", "Brown", "Corgi", 7};
cout << fido << endl; // Prints information about Fido to cout
Can we just teach C++ what << means for our class dog? Yes, in fact we can.
We can overload the <<
operator for objects of our class’s type (just as
we can overload functions, we can also overload operators). In order to
do this, we need to know two things:
What is the name of the function that represents the << and >> operators? Once we have a function name, we can create our own function with the same name.
What are the parameters and return value of these functions? We need to know how many inputs it takes, and what it returns, if we want to overload it.
The answer is:
ostream& operator<< (ostream& out, ...) {
    ...
    return out;
}
istream& operator>> (istream& in, &...) {
    ...
    return in;
}
There are a few things that every overload of <<
and >>
for input/output
must do:
The return type must be either an ostream (for <<) or an istream (for >>), and it must be returned by reference.
The first parameter must be either an ostream (for <<) or an istream (for >>), and it must be passed by reference.
The second parameter is a value of whatever type we are trying to print/input. For an >> operator it must be passed by reference.
Both overloads must return the istream/ostream that they took in as the first parameter.
Thus, for our dog
class, the <<
operator could look like this:
ostream& operator<< (ostream& out, dog d) {
    out << d.name << ", ";
    out << d.color << " " << d.breed << ", ";  // color and breed, e.g. "Brown Corgi"
    out << "age " << d.age;
    return out;
}
Note that out is not a typo: I am printing to the out ostream that was passed in as the first argument, NOT to cout. If you print to cout, then your program will work, until you try to print to something other than cout. The version printed above will actually let you print to any ostream: file, stringstream, whatever. We don’t want to tie our printing down to just cout, especially when we can get the ability to print to anything basically for free.
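As a small sketch of "printing to something other than cout", the block below sends a dog into a string stream using an overload like the one above. The helper name describe is hypothetical and only there for illustration.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>
using namespace std;

class dog {
  public:
    string name;
    string color;
    string breed;
    int age;
};

// Writes to whatever ostream it is given, then hands that stream back.
ostream& operator<< (ostream& out, dog d) {
    out << d.name << ", " << d.color << " " << d.breed << ", age " << d.age;
    return out;
}

// Hypothetical helper: formats a dog as a string, without touching cout.
string describe(dog d) {
    ostringstream buffer;   // an ostream that writes into memory
    buffer << d;            // uses our operator<<
    return buffer.str();
}
```

Calling describe(dog{"Fido", "Brown", "Corgi", 7}) gives the string "Fido, Brown Corgi, age 7"; the very same operator<< also works with cout or a file stream.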
Input is more difficult, just because we have to account for the fact that the input might not be in the right format. If the input fails then we have to reset the variable we are reading into to its default value. For example, suppose we create a class for two-dimensional points:
class point2d {
  public:
    point2d() { x = y = 0; }
    float x,y;
};
To print a point is relatively easy:
ostream& operator<< (ostream& out, point2d p) {
    out << p.x << "," << p.y;
    return out;
}
This prints the point \( (x,y) \) as x,y, i.e., coordinates with a comma between. To read in a point, we have to check for the comma: if it’s present we ignore it, if it’s not present then we reset p to the “default” point, using the default constructor:
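istream& operator>> (istream& in, point2d& p) {
    if((in >> p.x) &&
       in.peek() == ',' &&
       in.ignore(1) &&
       (in >> p.y)) {
        // Read x, skipped the comma, and read y: nothing more to do
    }
    else
        p = point2d{};  // Wrong format: reset p to the default point
    return in;
}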
Writing a correct extraction operator is more difficult than writing a correct insertion operator, just because there are many different ways that things can go wrong: wrong input format, stream ends mid-object, etc. The more complex the object is, the more complex its extraction will be.
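To see a few of those failure cases in action, here is a sketch that feeds strings to an extraction operator for point2d. Two things are assumptions rather than part of the text above: the helper parse_point is hypothetical, and the call to setstate (which marks the stream as failed when the comma is missing) is an addition, so that callers can detect a bad point the same way they detect a bad int.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>
using namespace std;

class point2d {
  public:
    point2d() { x = y = 0; }
    float x, y;
};

istream& operator>> (istream& in, point2d& p) {
    if ((in >> p.x) && in.peek() == ',' && in.ignore(1) && (in >> p.y)) {
        // Both coordinates were read successfully
    } else {
        p = point2d{};               // reset to the default point
        in.setstate(ios::failbit);   // assumption: also flag the stream as failed
    }
    return in;
}

// Hypothetical helper: tries to read one point from a string.
bool parse_point(string text, point2d& p) {
    istringstream in(text);
    return bool(in >> p);   // true only if the extraction succeeded
}
```

With this version, parse_point("1.5,2", p) succeeds, while "1.5 2" (no comma) and "oops" (not a number) both fail and leave p at (0,0).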
Here I’ve overloaded the operators as functions, outside the class, which means that if they needed access to the class’s private members, they would have to be declared as friend functions.
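A minimal sketch of the friend case, using a hypothetical account class whose data members are private. The friend declaration inside the class is what lets the outside operator<< read them; the helper show is likewise made up for the example.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>
using namespace std;

class account {   // hypothetical class, for illustration only
  public:
    account(string o, int c) { owner = o; cents = c; }
    // Grants the non-member operator<< access to the private members.
    friend ostream& operator<< (ostream& out, account a);
  private:
    string owner;
    int cents;
};

ostream& operator<< (ostream& out, account a) {
    out << a.owner << ": " << a.cents << " cents";   // allowed: we are a friend
    return out;
}

// Hypothetical helper so the result can be inspected as a string.
string show(account a) {
    ostringstream buffer;
    buffer << a;
    return buffer.str();
}
```

Without the friend line, the body of operator<< would fail to compile, because owner and cents are private.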
Operator overloading
Overloading <<
and >>
is just the beginning: it turns out that we can
overload almost all the operators. The full list of overloadable operators
is
Operators | Description |
---|---|
+ - * / % | Arithmetic |
^ & \| ~ | Bitwise |
<< >> | Shift (also stream insertion/extraction) |
! | Negation |
= += -= *= /= %= ^= &= \|= >>= <<= | Assignment |
< > == != <= >= | Comparison |
&& \|\| | Logical |
++ -- | Inc/Dec |
, | Comma |
->* -> | Pointer-member |
() | Function call |
[] | Array element |
conversion | Cast |
new delete | Allocation/Deallocation |
This is subject to a few caveats:
You can only overload operators to customize their behavior on your own classes. You can’t add overloads on built-in types, or pointer types.
You can’t add new operators. Just writing operator<|> doesn’t magically create a new <|> operator.
C++ won’t stop you from completely screwing up the types on the operators. In particular, many of the operators should have particular return types, or require some/all arguments to be passed by reference, in order to work correctly.
There are two ways you can overload an operator:
As a normal function, as we did above with << and >>. In this case, you write a normal function whose name is operator OP, where OP is the operator you want to overload. (Note that, like a normal function, you must write a declaration for this operator before you use it.)
As a method on the class for which you are specializing the operator. You can only overload as a method if the first argument to the operator is of the class type. E.g., we can’t overload << and >> for output/input as methods, because their first arguments are the streams. (You can overload them as shift operators, however.)
Some operators must be overloaded as methods. These are the assignment operator, the function call operator, the array element operator, the -> operator, and the conversion (cast) operators.
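The two styles can be put side by side in a short sketch. The class vec2 below is hypothetical: it overloads [] as a method (as required), and overloads * both as a method (vector on the left) and as a normal function (number on the left, which a method cannot handle).

```cpp
#include <cassert>

class vec2 {   // hypothetical two-component vector
  public:
    vec2(float a, float b) { x = a; y = b; }

    // Method: the left operand is *this. operator[] has to be a method.
    float& operator[] (int i) { return i == 0 ? x : y; }

    // Method form of a binary operator: handles v * s
    vec2 operator* (float s) { return vec2{x * s, y * s}; }

    float x, y;
};

// Function form: needed for s * v, because the first operand is a float,
// not a vec2, so it cannot be written as a method of vec2.
vec2 operator* (float s, vec2 v) {
    return v * s;   // reuse the method version
}
```

After this, both v * 3.0f and 3.0f * v compile and give the same result, and v[1] = 5.0f assigns to the y component because operator[] returns a reference.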
Overloading the assignment operator
The assignment operator can be overloaded, which means that when we write
a = b;
where a and b are of some user-defined type, we can control exactly what
happens. Normally, this just copies all the members (public and private) of
b into a, but maybe we want something different to happen. It may be
that some classes have some internal data that should not be copied; if so,
the overloaded =
can avoid copying it.
The overloaded =
must be defined as a method on the class, and it must
take and return references:
dog& dog::operator= (dog& other);
The purpose of this overload is to copy the contents of other into *this and then return *this. (Returning *this is necessary to make things like
a = b = c;
work.)
If you don’t overload =, you get a version for free that, as mentioned above, just copies all data members. If that’s what you want, then there’s really no point in overloading it.
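As a sketch of "internal data that should not be copied", consider a hypothetical sensor class in which every object keeps its own id, and only the reading is transferred by assignment. The class and its members are invented for this example.

```cpp
#include <cassert>

class sensor {   // hypothetical class
  public:
    sensor(int i, double r) { id = i; reading = r; }

    sensor& operator= (sensor& other) {
        reading = other.reading;   // copy the data...
        // ...but deliberately leave id alone: it identifies *this* object
        return *this;              // lets a = b = c work
    }

    int id;
    double reading;
};
```

With sensors a, b and c, the statement a = b = c; first assigns c to b, then assigns the result (b itself) to a. Afterwards all three hold c's reading, but each still has its original id.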
The copy constructor
The copy constructor is closely related to the assignment overload. Normally, if you define one you’ll define the other. While the assignment operator is used when we explicitly copy an object into another, the copy constructor is used implicitly, when C++ needs to make a copy in the background. For example, if we write a function
void breed(dog a, dog b);
and then pass it two dogs:
dog fido{...};
dog woofy{...};
breed(fido, woofy);
both fido and woofy are passed by copy, so C++ has to make copies of them. Normally, it does much the same thing as the default assignment operator: it builds each new object by copying over the data members one by one, but you can change this if you want.
The copy constructor is just a constructor that takes another object of the class’s type by reference (can’t take it by copy, because we are in the process of defining how to do that!). For dog, it looks like this:
class dog {
  public:
    dog(dog& other) {
        // Copy other into *this
    }
};
Usually the copy-constructor and the overloaded assignment work in tandem, essentially doing the same thing, just in different situations.
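To make the "different situations" concrete, here is a sketch that counts how often each one runs. The class tracer, the two global counters, and the two helper functions are all hypothetical names introduced for the demonstration.

```cpp
#include <cassert>

int copies_made = 0;        // how many times the copy constructor ran
int assignments_made = 0;   // how many times operator= ran

class tracer {   // hypothetical class
  public:
    tracer() { payload = 0; }
    tracer(tracer& other) {               // copy constructor
        payload = other.payload;
        ++copies_made;
    }
    tracer& operator= (tracer& other) {   // assignment
        payload = other.payload;
        ++assignments_made;
        return *this;
    }
    int payload;
};

int take_by_copy(tracer t)       { return t.payload; }   // needs a copy
int take_by_reference(tracer& t) { return t.payload; }   // no copy made
```

Calling take_by_copy(a) runs the copy constructor, take_by_reference(a) runs neither, tracer c = a; runs the copy constructor (a new object is being created, even though an = sign appears), and b = a; on an existing b runs the assignment operator.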
The “big three”
Usually you’ll need to overload the assignment operator at the same time as two other things:
The destructor
The copy constructor
The normal reason for needing to overload these is because the class “owns” some resource that will not automatically copy itself. (E.g., a pointer to dynamic memory.) In that case, you need to overload all three:
The assignment operator is overloaded to copy the resource into an existing object.
The copy constructor is overloaded to copy the resource into a new object.
The destructor is overloaded to free the resource (e.g., delete the dynamic memory).
Thus, these three together are known as the “big three”: if you find yourself writing one, you probably need to write the other two, as well.
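A minimal sketch of all three working together, for a hypothetical int_buffer class that owns a dynamically allocated array. The class name and layout are assumptions made for the example, not something defined earlier.

```cpp
#include <cassert>

class int_buffer {   // hypothetical class that owns dynamic memory
  public:
    int_buffer(int size) {
        n = size;
        data = new int[n];
        for (int i = 0; i < n; ++i) data[i] = 0;
    }

    // Copy constructor: copy the resource into a NEW object.
    int_buffer(int_buffer& other) {
        n = other.n;
        data = new int[n];
        for (int i = 0; i < n; ++i) data[i] = other.data[i];
    }

    // Assignment: copy the resource into an EXISTING object.
    int_buffer& operator= (int_buffer& other) {
        if (this != &other) {              // a = a must not destroy a's data
            int* fresh = new int[other.n];
            for (int i = 0; i < other.n; ++i) fresh[i] = other.data[i];
            delete[] data;                 // release the old array
            data = fresh;
            n = other.n;
        }
        return *this;
    }

    // Destructor: free the resource.
    ~int_buffer() { delete[] data; }

    int n;
    int* data;
};
```

Without the copy constructor and assignment operator, two int_buffer objects would end up sharing one array after a copy, and both destructors would try to delete it.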
The increment/decrement operators
You can overload ++ and --, both the pre- and post-increment versions. The pre- versions look like this (for a class thing):
thing& operator++ (thing& t); // Function
thing& operator-- (thing& t); // Function
thing& thing::operator++ (); // Method
thing& thing::operator-- (); // Method
What about the post-increment versions? These have two differences:
They take an extra argument of type int. The value of this argument is always 0, it’s just there to distinguish them from the pre- versions.
They usually don’t return a reference. Instead they return, by copy, the value the object had before it was changed.
thing operator++ (thing& t, int i); // Function
thing operator-- (thing& t, int i); // Function
thing thing::operator++ (int i); // Method
thing thing::operator-- (int i); // Method
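Here is a sketch of both ++ versions written as methods on a hypothetical counter class. It shows why the post- version cannot return a reference: it has to hand back a snapshot of the old value.

```cpp
#include <cassert>

class counter {   // hypothetical class
  public:
    counter(int start) { value = start; }

    // Pre-increment (++c): change the object, return the object itself.
    counter& operator++ () {
        ++value;
        return *this;
    }

    // Post-increment (c++): the unnamed int only marks this as the post- version.
    counter operator++ (int) {
        counter old = *this;   // remember the value before the change
        ++value;
        return old;            // returned by copy: old is a local variable
    }

    int value;
};
```

Starting from counter c{5}, the expression c++ yields a counter holding 5 while c itself becomes 6; a following ++c makes c hold 7 and yields c itself.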
Arithmetic operators
You can overload any of the arithmetic operators + - * / %, including the unary (prefix) minus. Normally these are overloaded as functions rather than methods (so that they can take advantage of implicit conversions on both operands). They usually take their arguments by copy.
thing operator+ (thing a, thing b);
thing operator- (thing a, thing b);
thing operator- (thing a); // Unary minus
thing operator* (thing a, thing b);
thing operator/ (thing a, thing b);
thing operator% (thing a, thing b);
Note that % is normally only defined on int-like types (whole numbers).
We could use these to build a new numeric type, like complex, which supports all the usual arithmetic operations:
class complex {
  public:
    complex(float r, float i) { re = r; im = i; }
    complex(float r) { re = r; im = 0; }
    float re, im;
};
complex operator+ (complex a, complex b) {
    return complex{a.re + b.re, a.im + b.im};
}
complex operator- (complex a) {
    return complex{-a.re, -a.im};
}
complex operator- (complex a, complex b) {
    return a + -b;
}
// (a + bi)(c + di) = (ac - bd) + (ad + bc)i
complex operator* (complex a, complex b) {
    return complex{a.re * b.re - a.im * b.im,
                   a.re * b.im + a.im * b.re};
}
// Multiply top and bottom by the conjugate of b
complex operator/ (complex a, complex b) {
    float d = b.re * b.re + b.im * b.im;
    return complex{(a.re * b.re + a.im * b.im) / d,
                   (a.im * b.re - a.re * b.im) / d};
}
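The remark about implicit conversions on both operands can be checked with a short sketch. It repeats a cut-down complex class (only + and *, with multiplication following the usual rule (a + bi)(c + di) = (ac - bd) + (ad + bc)i) so that the block stands on its own. Names from the standard library are not pulled in with using namespace std here, to avoid any clash with the standard library's own complex type.

```cpp
#include <cassert>

class complex {
  public:
    complex(float r, float i) { re = r; im = i; }
    complex(float r) { re = r; im = 0; }   // lets a plain float become a complex
    float re, im;
};

// Written as functions, so EITHER operand may be converted from float.
complex operator+ (complex a, complex b) {
    return complex{a.re + b.re, a.im + b.im};
}

complex operator* (complex a, complex b) {
    return complex{a.re * b.re - a.im * b.im,
                   a.re * b.im + a.im * b.re};
}
```

With complex i{0.0f, 1.0f}, both i + 1.0f and 1.0f + i compile and give 1 + 1i, because the one-argument constructor converts the float on whichever side it appears. Had operator+ been a method, 1.0f + i would not compile. As a sanity check on the multiplication, i * i gives -1 + 0i.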
Overloading conversion operators
We’ve already seen how we can define how to convert things into a user-defined type, by defining a constructor. Can we go the other way, and define how elements of a user-defined type are converted into other things? Yes, by overloading the cast operator. When we write something like
float x = 1.2;
int y = int(x); // convert x to int
we are performing a cast, an explicit conversion from float to int. The cast operator for a given target type looks like this:
thing::operator int (); // Method
This defines how instances of thing are converted into ints. Note that no return type is written (it is implied by the name of the operator), and that a conversion operator can only be overloaded as a method, never as a normal function. After defining it, we can proceed to write
thing x = ...;
int y = x;
int z = x + 2; // Convert x to int and then add 2
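As a concrete sketch, here is a hypothetical fraction class with a conversion to double. The class and its member names are invented for the example.

```cpp
#include <cassert>

class fraction {   // hypothetical class
  public:
    fraction(int n, int d) { num = n; den = d; }

    // Conversion operator: no return type is written, it is always double.
    operator double () {
        return double(num) / den;
    }

    int num, den;
};
```

Given fraction f{3, 4}, writing double d = f; stores 0.75 in d, and f + 1 first converts f to double and then adds, giving 1.75.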
Overloading comparisons
Traditionally, if you’re going to overload the comparison operators, you’d do it by first overloading < with the actual comparison. All the other comparisons can be defined in terms of <. (For normal use, all of these should return bool.) Although we can write == in terms of < (a == b iff !(a < b || b < a)), it’s usually possible to write a more efficient comparison.
bool operator< (thing a, thing b) { /* compare */ }
bool operator> (thing a, thing b) { return b < a; }
bool operator<= (thing a, thing b) { return !(a > b); }
bool operator>= (thing a, thing b) { return !(a < b); }
bool operator== (thing a, thing b) { /* compare */ }
bool operator!= (thing a, thing b) { return !(a == b); }
That way we really only have to write one or two “custom” comparison operators; the rest can be written fairly easily.
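Unlike assignment, the comparison operators do not come with default versions: if you don’t overload == and != for your class, trying to compare two objects with them is a compile error. (Since C++20 you can ask the compiler for a == that just compares all the members, by declaring it with = default.)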
Overloading NOT and bool conversion
If your class has some idea of being “good” or “bad”, often you’ll implement a conversion to bool, along with overloading the ! (logical NOT) operator, with the idea that true corresponds to “good” and false corresponds to “bad”. Thus, a user can test whether a given thing is good or bad by just doing
thing x = ...;
if(x) {
    // x must be good
}
Input/output streams overload them to do just this:
// Simplified: the real versions live in a base class shared by all streams
ostream::operator bool () {
    return !fail();
}
bool ostream::operator! () {
    return fail();
}
Whenever we write something like
while(cin >> i) {...
we are taking advantage of this conversion.
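The same idea applied to a class of our own might look like the sketch below. The reading class and the helpers status and sum_all are hypothetical. The conversion is marked explicit, which is an addition to what the text describes: it means the conversion is used in conditions such as if and while, but an object is never silently treated as a number.

```cpp
#include <cassert>
#include <sstream>
#include <string>
using namespace std;

class reading {   // hypothetical class: a measurement that may be invalid
  public:
    reading(double v, bool valid) { value = v; ok = valid; }

    explicit operator bool () { return ok; }   // true means "good"
    bool operator! ()         { return !ok; }  // true means "bad"

    double value;
    bool ok;
};

// Uses the bool conversion in an if, just like testing a stream.
string status(reading r) {
    if (r)
        return "good";
    return "bad";
}

// The stream version of the same trick: loop until an extraction fails.
int sum_all(string text) {
    istringstream in(text);
    int i, total = 0;
    while (in >> i)
        total += i;
    return total;
}
```

Here status(reading{1.5, true}) is "good", and sum_all("1 2 3 x 4") is 6: the loop stops at x, because that extraction fails and the stream converts to false.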
Abstract data types
Classes, together with overloaded operators, give us the power to make new data types. Just like vector and string, we can create a new type to represent some collection of data. Sometimes a description of what a class does, as distinct from how it does it, is called an “abstract data type”.
As an example, let’s implement a Set ADT. This is intended to act like a
mathematical set, so we can:
Insert and remove elements from it
Test whether it’s empty
Get the size (cardinality) of the set
Test whether a specific element is in it
Given two sets, construct their union, intersection, difference, and symmetric difference.
Note that there are many ways we could implement sets. The ADT is just a
description of what a Set can do, but sometimes we’ll refer to a particular
version of the Set as an ADT. We’re going to build a version of Set that is
built on a vector
where all the elements are unique. Let’s think about how
to implement all the above operations (bearing in mind that any operation
which changes a set or constructs a new set must ensure that the elements are
unique):
To insert an element we first check to see if it is already in the set. If it is not, then we simply push_back it.
To remove an element we search the vector for it and then erase it. Because the elements are unique, we know that there will be at most one copy of it, so we can stop as soon as we find it.
To check whether a set is empty, we just call .empty() on the vector.
To get the size of the set, we just call .size() on the vector.
To check whether an element is in the set, we loop over all the elements in the set and see if we find it.
To construct the union of two sets, we loop over both sets and then insert both their elements into a new (initially empty) set. Because we assume that insert won’t add duplicate elements, this is safe.
To construct the intersection, we loop over one set, and then for each element, check to see if it is in the second set. If it is, we add it to a new, initially empty set (note that we don’t have to use insert here, because all the things we are inserting are coming from a single existing set, and thus are guaranteed to be unique).
To construct the set difference \(A \setminus B\) we loop over \(A\) and check to see if its elements are in \(B\) and add them only if they are not. Once again, it’s safe to not use insert here (just push_back).
To construct the symmetric difference, we can just construct the union, and then “subtract” (set difference) the intersection.
Our class will look something like this:
class set {
  public:
    // Constructor: constructs the empty set
    set() { }
    int size() { return data.size(); }
    bool empty() { return data.empty(); }
    bool contains(int x) {
        for(int i : data)
            if(i == x)
                return true;
        return false;
    }
    void insert(int x) {
        if(!contains(x))
            data.push_back(x);
    }
    void remove(int x) {
        for(int i = 0; i < data.size(); ++i)
            if(data.at(i) == x) {
                data.erase(data.begin() + i);
                break;
            }
    }
    // Named union_with because union is a reserved word in C++
    set union_with(set& other) {
        set result = *this; // Copy of this set
        // Insert other data (to avoid duplicates)
        for(int i : other.data)
            result.insert(i);
        return result;
    }
    set intersection(set& other) {
        set result; // Empty
        for(int i : data)
            if(other.contains(i))
                result.data.push_back(i);
        return result;
    }
    set difference(set& other) {
        set result; // Empty
        for(int i : data)
            if(!other.contains(i))
                result.data.push_back(i);
        return result;
    }
    set symmetric_diff(set& other) {
        set all = union_with(other);
        set common = intersection(other);
        return all.difference(common);
    }
    friend ostream& operator<<(ostream&, set);
  private:
    vector<int> data;
};
ostream& operator<< (ostream& out, set x) {
    out << "{";
    for(int i = 0; i < x.data.size(); ++i) {
        if(i > 0)
            out << ",";
        out << x.data.at(i);
    }
    out << "}";
    return out;
}
To make things easier, we can overload some operators for the set operations:
set operator| (set a, set b) {
    return a.union_with(b);
}
set operator& (set a, set b) {
    return a.intersection(b);
}
set operator- (set a, set b) {
    return a.difference(b);
}
set operator^ (set a, set b) {
    return a.symmetric_diff(b);
}
// Takes a by reference, so that the set on the left is the one modified
set& operator+= (set& a, int x) {
    a.insert(x);
    return a;
}
And now we can write a quite natural set manipulation program:
int main() {
    set a;
    a += 1; a += 2; a += 3;
    set b;
    b += 3; b += 4; b += 5;
    cout << (a & b) << endl;
    return 0;
}
Optional topics: const correctness. Using initializer_list for more natural set creation. Member initializer lists.