Review of last time
Multi-file projects consist of
Source files (.cpp), containing function and method definitions, which are compiled and then listed on the command-line to G++ to combine them into an executable.
Header files (.hpp), containing classes and function declarations, which are #include-d within source files that need to use those declarations.
You have to compile all the .cpp files together when you build your program:
g++ -o my_program file1.cpp file2.cpp ...
Header files, on the other hand, are used via #include:
#include "file1.hpp"
to bring in all the declarations in file1.hpp (which, presumably, correspond to
the functions defined in file1.cpp).
Function declarations and class definitions go in header files. Function definitions and method definitions (if there are any) go in source files.
Namespaces let us separate the names of things (functions, classes, variables)
into different “spaces” so that (for example) the names of things we create
won’t collide and cause problems with the names of things in the std
namespace. Instead of doing using namespace std; it’s better to think about
what we’re really going to use, where we’re going to use it, and then write
specific using declarations for those things (e.g., using std::cout; instead
of bringing in all of the std namespace).
The fully qualified name of anything is its name with the namespace it’s in
before it, separated by ::. E.g., the fully qualified name of cout is
std::cout. If you don’t want to use the fully qualified name, you can
always import a name:
using namespace whatever; brings in all the names in namespace whatever, so you don’t have to use the qualified name of anything (but you also can’t use any of those names for your own things!).
using whatever::thing; brings in just the name whatever::thing, so now you can use thing by itself.
Either of these can be put in a file (brings in the name for the rest of the file), or inside a function or method (brings in the name(s) just in that function or method). Neither form can be used this way directly inside a class definition.
You can define your own namespaces by doing
namespace mystuff {
    int f() {
        return 0;
    }
}
The fully qualified name of f is mystuff::f. Within the mystuff namespace,
you can refer to it just as f.
You can also define one namespace inside another:
namespace A {
    namespace B {
        int f() {
            return 1;
        }
    }
}
The fully qualified name of f is A::B::f.
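As a quick check of the naming rules above, here is a small sketch that repeats the two namespaces and calls each function by its qualified name. The functions g and call_both are made up for this example; they are not part of the original notes.

```cpp
namespace mystuff {
    int f() { return 0; }

    // Inside mystuff, the unqualified name f refers to mystuff::f
    int g() { return f(); }
}

namespace A {
    namespace B {
        int f() { return 1; }
    }
}

// Hypothetical helper: adds the results of the two different f functions
int call_both() {
    using A::B::f;              // Within this function, f now means A::B::f
    return mystuff::f() + f();  // mystuff::f still needs its qualified name
}
```

Because the using declaration sits inside call_both, it has no effect on any other function in the file.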
Exceptions
Exceptions are the way C++ handles errors, in two senses:
When your code does something wrong at runtime (e.g., if you do v.at(6) when the vector v only has 5 elements) the system creates or throws an exception. Exceptions can also be thrown for other runtime problems, such as running out of memory or (if the stream has been asked to report errors that way) failing to read from a file, etc. If you want to be able to make your program deal with these kinds of problems, you’ll have to deal with exceptions.
When something goes wrong and you want to signal an error, you can create and throw your own exceptions. Thus, your code can signal that particular things went wrong, and other parts of it can respond to those signals.
How exceptions work: history
In order to understand how exceptions work, you have to understand how errors
were indicated before exceptions were invented. Before, there were two ways
that functions signaled that an error had occurred: error codes and errno.
Error codes
With error codes, any function that could go wrong returned
a special code in that case. E.g., if you had a function open_file() which
opened a file, it might return a code ERROR_FILE_NO_EXIST if the file did not exist.
So after you opened a file, you’d have to check and see if an error code was
returned, and if so, deal with it in some way. For example:
do {
cout << "Enter filename: ";
string filename;
cin >> filename;
file f;
int code = open_file(filename, f);
if(code == ERROR_OK)
break;
} while(true);
// Now the file is open
(Note that there is no function open_file; this is just a hypothetical
example.)
Now suppose we want to write a larger function that uses open_file, but one
that isn’t really prepared to deal with any errors it might generate. This
function must now handle passing those error codes on up, in its return:
int read_in_file(string filename) {
file f;
int error = open_file(filename,f);
if(error == ERROR_OK) {
// Read file
return ERROR_OK;
}
else
return error;
}
Every function that wants to use open_file needs to have some code like
this, even though the function read_in_file and others like it don’t actually
care about the errors that open_file might produce. Even worse is if you
write a function that does more than one thing, each of which might produce
different errors:
int error1 = read_file(filename);
int error2 = write_file(filename);
if(error1 == ERROR_OK && error2 == ERROR_OK)
return ERROR_OK;
else
// Now what?
In the else case, what do we do if both error1 and error2 are not OK?
Which error do we return? Maybe the error in error2 was caused by the
fact that the read_file failed!
Using error codes requires us to duplicate the error-code-handling code literally everywhere, which is why many programmers would often just leave it out entirely: any errors just cause the entire program to crash, with no attempt made to figure out why they happened or if they could be corrected.
Another problem is that any function that uses error codes cannot use
return for any other purposes; you can’t return a value the way you normally
would, you’d have to use a reference (“out”) parameter.
errno
The other standard mechanism (used by the C standard library) for indicating
errors is with a global variable named errno. If errno is not 0 then an
error has occurred (the value indicates the exact error) in whatever function
you used last.
Using a global variable means that, in theory, functions that don’t care about errors don’t need to bother with them: when an error occurs, the variable is set and retains that value until the error is handled. It also means that functions can now use their return values for whatever they want. For simple situations, this is sufficient.
However, the problem with this mechanism is that errno only tells you
about any errors that occurred in the most recently called standard function.
In particular, if you call two functions, in a row, that can both generate
errors, then any errors in the second would totally replace any errors in the
first. Again, we have the problem of multiple errors. If you want to deal with
this correctly, every function that uses any facilities that could change
errno must check the value after every call that could change it. In fact,
we are back in the world of having to check every return value, except that
instead of checking return values, we’re checking errno:
do_something();
if(errno != 0)
// Handle error
do_something_else();
if(errno != 0)
// Handle error
And so forth.
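As a concrete instance of this pattern, the C library function strtol reports a value that is too large by setting errno to ERANGE. The sketch below wraps that check; the helper name parse_long is made up for this example. Note that errno has to be cleared by hand before the call, since a successful call does not reset it.

```cpp
#include <cerrno>
#include <cstdlib>

// Hypothetical helper: converts s to a long and returns true if strtol
// did not report an error through errno.
bool parse_long(const char* s, long& result) {
    errno = 0;                              // Clear any stale error first
    result = std::strtol(s, nullptr, 10);   // Sets errno to ERANGE on overflow
    return errno == 0;
}
```

Every caller still has to test the returned flag, which is exactly the repetitive checking described above.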
Exceptions
Exceptions were invented to solve all these problems:
Functions having to write repetitive code to deal with errors that they didn’t care about.
Error codes making return values unusable for any other purpose.
The key observation behind exceptions is that the location where an error
occurs is not necessarily the location where we want to handle it. In the
most extreme case, errors might occur anywhere in our program, but we might
want to handle all of them centrally in main. Usually, we’ll take a more
nuanced approach than that (e.g., allowing each component to handle errors
in some central place).
In order to understand how exceptions work, you have to understand how functions work, when they are called. E.g., suppose we have (using error codes)
void f() {
// We want to handle errors here
if(g() != ERROR_OK)
// Handle errors
}
int g() {
return h();
}
int h() {
// An error occurs here
return ERROR_YIKES;
}
The function call stack for this looks like this:
| |
|h |
|g |
|f |
|----|
Note that regardless of whether we are using error codes or exceptions, errors
can only be “passed off” to a function that is closer to the bottom of the
stack. Errors can only “bubble” to functions further down the call stack
(usually we think of functions like f as being “higher up” in the function
call hierarchy, which is just the call stack turned upside down).
What we want is for g to not have to care about any errors that might pass
through it, on their way down the stack, and that’s what exceptions give us:
with exceptions this code looks like
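#include <stdexcept>
using namespace std;
void f() {
    try {
        g();
    }
    catch (exception& e) {
        // Handle error
    }
}
void g() {
    h();
}
void h() {
    ...
    // runtime_error (from <stdexcept>) can be built from a message;
    // plain exception cannot, portably
    throw runtime_error{"Error!"};
}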
There are two elements present that deal with exceptions: try-catch and throw.
If you only want to catch exceptions, then you only need to worry about
try-catch, so that’s what we’ll deal with first.
(Note that a lot of code working with exceptions needs things from <stdexcept>
so we’ll include that.)
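Here is a self-contained sketch of the same three functions, reordered so that each is defined before it is used. It throws runtime_error, one of the standard types from <stdexcept> that accepts a message. Having f return the message it caught is an addition for this example, so that the effect can be observed.

```cpp
#include <stdexcept>
#include <string>
using namespace std;

void h() {
    // An error occurs here
    throw runtime_error{"Error!"};
}

void g() {
    h();    // No error-handling code needed: the exception passes through g
}

string f() {
    try {
        g();
    }
    catch (exception& e) {
        return e.what();    // We handle the error here
    }
    return "no error";
}
```

Compare g here with the error-code version: it no longer has to look at, or forward, anything.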
Catching exceptions
To catch an exception, we need two things: the code where the exception might
occur, and what to do if it does. try-catch gives us both. The try part
is the code where an error might occur. It says, “we are interested in
errors that occur here”. The catch part specifies what particular errors we
are interested in, and what to do if they occur.
If an error occurs in any function called in the try (or any function called
by any function in the try, etc.) then C++ will check to see if the type of
that exception matches the catch; if it does, then it will pass the actual
exception object to the catch-handler and run the code it contains. (After the
catch is finished, code continues with the first line of code outside it.)
E.g., in the following code, let’s trace the flow of control:
int main() {
    try {
        f();
    }
    catch(exception& e) {
        cout << e.what() << endl;
    }
}
void f() {
    g();
    h();
}
void g() {
    cout << "In g()" << endl;
}
void h() {
    cout << "In h()" << endl;
    throw runtime_error{"Error"};
}
exception is a type (declared in <exception>, which <stdexcept> builds on), like vector or string.
We’ll talk a lot more about it and its friends later, but for now, notice
two things:
When you construct one of the standard exception types from <stdexcept>, such as runtime_error, you give it a string. (Plain exception itself cannot portably be constructed from a string, so code that needs a message should throw one of those types instead.) Later on, using the .what() member function will give you back the string that was used to build the exception. (All of the standard exception types support .what(), so you can always use it to get a human-readable version of whatever error occurred.)
When we catch an exception, we always do it by reference. Not because we’re going to modify it, but for reasons we’ll see later…
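To make the flow of control easier to follow, here is a sketch of the same trace that records each step in a string instead of printing it. The names trace_log and run are made up for this example, and runtime_error is used because it can carry a message.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>
using namespace std;

ostringstream trace_log;    // Hypothetical log of the order of events

void g() {
    trace_log << "In g()\n";
}

void h() {
    trace_log << "In h()\n";
    throw runtime_error{"Error"};
}

void f() {
    g();
    h();
    trace_log << "After h()\n";    // Never reached: the throw leaves f at once
}

// Plays the role of main in the example above
string run() {
    trace_log.str("");
    try {
        f();
    }
    catch(exception& e) {
        trace_log << e.what() << "\n";
    }
    return trace_log.str();
}
```

The line after the call to h() in f never runs, because control jumps straight from the throw to the matching catch.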
Note that while <stdexcept> declares the standard exception types, it is not
required to use try-catch-throw. You can use try-catch without including
<stdexcept>, you just can’t use any of the standard exception types. But note
that an exception can be anything, of any type:
throw 1; // The exception is an int
throw "Hello"; // The exception is a C string (const char*), not a std::string
throw 1.34; // The exception is a double
vector<int> data = {1,2,3};
throw data; // The exception is a vector<int>
In general, however, it’s better to use the standard exception types, or at least to build off of them, as other people’s code will expect yours to work that way.
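The sketch below shows that each of these throws is matched by a catch clause naming exactly the type that was thrown. The helper name thrown_type is made up for this example. Note in particular that a string literal arrives as a const char* and 1.34 arrives as a double.

```cpp
#include <string>
#include <vector>
using namespace std;

// Hypothetical helper: throws a value whose type depends on `which`, and
// reports the type of the catch clause that received it.
string thrown_type(int which) {
    string result = "nothing";
    try {
        if(which == 0) throw 1;
        if(which == 1) throw "Hello";
        if(which == 2) throw 1.34;
        if(which == 3) throw vector<int>{1, 2, 3};
    }
    catch(int)          { result = "int"; }
    catch(const char*)  { result = "const char*"; }
    catch(double)       { result = "double"; }
    catch(vector<int>&) { result = "vector<int>"; }
    return result;
}
```

None of these handlers would be reached by code that only catches exception&, which is one reason to prefer the standard types.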
Note that an exception is caught by the “nearest” catch that matches it.
For example
int main() {
try {
f();
}
catch(exception& e) {
cout << "Main" << endl;
}
}
void f() {
try {
g();
}
catch(exception& e) {
cout << "f" << endl;
}
}
void g() {
throw runtime_error{"Error"};
}
What will this print?
The standard exception hierarchy
Although we could just throw strings for every kind of exception, containing a description of what went wrong, that’s not very useful from the computer’s perspective. Remember that we don’t just want to print out a message saying “Error whatever occurred”, we want to respond to the error, maybe do something to correct it and try again. If we’re going to do that, the exception needs to tell the computer (not just us humans) some information about what went wrong. Consequently, there are various standard exception types for various situations:
Trying to access an element of a vector or string that is after the end, or before the beginning.
Trying to resize a vector to a negative size
Trying to convert the string "abc" to an integer
Trying to divide by 0 (C++ does not throw for this by itself; a function has to check for it and throw).
etc.
If we think about the different kinds of errors, we can begin to group them together:
Accessing something out of bounds in a container
Using an invalid size for a container (either when creating it, or when resizing it).
Errors related to passing an invalid parameter to a function (e.g., stoi expects a string that can be converted to a number). We’re not talking about passing a value of the wrong type; that will be caught at compile time. Rather, sometimes there are values of the right type, that are nonetheless not valid for that particular function.
Errors related to arithmetic
And all of these can be grouped together under the banner of “exceptions”. We would like for these logical groupings to be somehow represented in code. For example, maybe I care about all “logic errors”, those that come from using a value that is too big or too small, but I don’t particularly care whether it’s an out-of-bounds error on a container, or something else. What we want is a way to group the standard exception types into a hierarchy, so that we can say
A logic error is an exception
An out-of-bounds error is a logic error
A vector-out-of-bounds error is an out-of-bounds error
And so forth.
We want the freedom to be specific, or general, when talking about exception types.
Another way to think about this is in the example we looked at previously,
with the dog class. Suppose we add an animal class. We would expect that
dog would be related to animal in some way, to reflect the fact that
every dog is also an animal. In fact, we want a class hierarchy, so
we can say that corgis are dogs, dogs are animals, animals are organisms, and
so forth. Then, we would know that whatever an animal can do, a dog (or a
corgi) can also do. C++ lets us express this through class inheritance.
We’re going to talk a lot about inheritance later, but for now, all you need to
know is that it’s possible, and that the standard exception types use it
to reflect the connections between the different types.
The standard exception hierarchy looks like this:
exception
 |
 +---- logic_error
 |      +-------------- invalid_argument
 |      +-------------- domain_error
 |      +-------------- length_error
 |      +-------------- out_of_range
 |
 +---- runtime_error
 |      +-------------- range_error
 |      +-------------- overflow_error
 |      +-------------- underflow_error
 |      +-------------- system_error
 |                       +-------------- ios_base::failure
 |
 +---- bad_alloc
        +-------------- bad_array_new_length
plus a few other specialized exception classes for particular situations.
In order, we have
exception – the base of all exceptions. The only thing this provides is the .what() method, which returns a description of the error. Plain exception cannot portably be constructed with a message of your own; the classes below it take a string describing the error (e.g., runtime_error{"Computer exploded"}), which you can later access through .what().
logic_error – base class for logic errors, errors that reflect “violations to the internal logic of the program”.
invalid_argument – exception class for errors related to passing an invalid argument to a function (e.g., passing a string "abc" to a function that expects a string representing a number).
domain_error – a general-purpose class for errors relating to the mathematical domain of a function. For example, passing a negative value to the sqrt function is a domain error.
length_error – class for errors related to the length (size) of containers. Trying to create a negative-sized container would be an example. If we had a version of vector that could only have its size set to a multiple of 10, then presumably it would throw length_error if we tried to resize it to size 11.
out_of_range – class for errors related to trying to access particular elements of a container. If you try to do v.at(-1) or v.at(v.size()) this will be thrown. In more general terms, if you try to access an element of a container that does not exist, out_of_range will be thrown.
runtime_error – a class for errors that occur at runtime, due to events outside the program’s control. The difference between this and logic_error is that logic errors are things that, in theory, your program should have checked for. For example, if you accessed v.at(i) you should check to make sure i is not negative. Runtime errors are things you could not detect: in fact, the exception is often how you detect it! Examples include things like trying to open a file (that does not exist) or passing too big a value to a math function.
range_error – a class that is vaguely defined as concerning the mathematical range of a function. For example, if a function is defined to return an unsigned int but its arguments would require it to return a negative value.
underflow_error – a class that is used for arithmetic underflow situations, situations where we need to represent a number smaller than what the type can represent. For example, with float there is some “smallest number” that can be represented; a function that needed to deal with floats smaller than this might throw underflow_error.
overflow_error – a corresponding class for arithmetic overflow, trying to create or work with values too big for the type.
system_error – a catch-all class for “system errors”: file errors, etc. This adds an extra member to contain an error code, accessible via .code(), because many system functions would return an error code. (Note that this is defined inside the header <system_error>.)
ios_base::failure – this is a specialized class which is defined inside the class ios_base (yes you can have classes inside classes!), hence the weird name. This exception is used for errors relating to input and output. (Since C++11 it derives from system_error.)
bad_alloc – is thrown when you try to do new and there isn’t enough memory to create the object.
bad_array_new_length – is thrown when you try to use new to create an array and the length of the array is invalid for some reason (negative, bigger than memory, etc.).
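A few of the “is a” relationships in this hierarchy can be checked directly by the compiler. The sketch below uses the standard std::is_base_of trait for that, and then confirms at runtime that vector’s at() throws out_of_range for a missing element. The helper name at_throws is made up for this example.

```cpp
#include <cstddef>
#include <new>
#include <stdexcept>
#include <type_traits>
#include <vector>
using namespace std;

// Compile-time checks of some relationships from the hierarchy above
static_assert(is_base_of<exception, logic_error>::value,
              "a logic_error is an exception");
static_assert(is_base_of<logic_error, out_of_range>::value,
              "an out_of_range is a logic_error");
static_assert(is_base_of<runtime_error, overflow_error>::value,
              "an overflow_error is a runtime_error");
static_assert(is_base_of<bad_alloc, bad_array_new_length>::value,
              "a bad_array_new_length is a bad_alloc");

// Hypothetical helper: reports whether v.at(i) throws out_of_range
bool at_throws(const vector<int>& v, size_t i) {
    try {
        (void)v.at(i);
    }
    catch(out_of_range&) {
        return true;
    }
    return false;
}
```

If any of the static_assert lines were wrong, the file would fail to compile rather than fail at runtime.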
You can catch exceptions of any of these types. Note that whenever you catch an exception you want to catch it by reference. The reason for this is something we’ll talk about later, but suffice it to say, if you write
catch(exception& e) {
}
then it will catch exceptions not just of type exception but of any type
that is below it in the hierarchy, and e will refer to the actual exception object that was thrown. If, however, we write
catch(exception e) {
}
then exceptions of the more specific types are still caught, but e is only a copy of the exception part of the object: everything that belonged to the more specific type is sliced off. (References have a bit of a “magical” behavior in that they can refer to things of classes lower down on the hierarchy. We’ll see why this happens later on.)
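The difference between the two forms can be observed with dynamic_cast, which asks whether the caught object is really a runtime_error. This is a sketch with made-up function names; a handler taking exception by value does still catch the runtime_error, but what it receives is a copy of only the exception part, so the more specific type is gone. Some compilers warn about the by-value handler, which is rather the point.

```cpp
#include <stdexcept>
using namespace std;

// Caught by reference: e refers to the original runtime_error object
bool is_runtime_error_by_reference() {
    bool result = false;
    try {
        throw runtime_error{"Oops"};
    }
    catch(exception& e) {
        result = dynamic_cast<runtime_error*>(&e) != nullptr;
    }
    return result;
}

// Caught by value: e is a plain exception copied from the original
bool is_runtime_error_by_value() {
    bool result = false;
    try {
        throw runtime_error{"Oops"};
    }
    catch(exception e) {
        result = dynamic_cast<runtime_error*>(&e) != nullptr;
    }
    return result;
}
```

Whether .what() still returns the original message after such a copy depends on the standard library, so the sketch checks the type rather than the text.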
For example, suppose we do
try {
g();
}
catch(logic_error& e) {
cout << "Hello";
}
...
void g() {
throw invalid_argument{"Bad argument"};
}
This will be caught by the catch clause because an invalid_argument
exception is a kind of logic_error. This is the case even if a higher-up
catch would have matched more specifically:
int main() {
try {
f();
}
catch(invalid_argument& e) {
cout << "Main" << endl;
}
}
void f() {
try {
throw invalid_argument{"Oops!"};
}
catch(logic_error& e) {
cout << "f()" << endl;
}
}
This will print out f() because the first catch in line to see the
exception is the one in f, and it can handle invalid_argument exceptions.
The fact that the catch in main might be a better “match” makes no
difference: the exception never makes it that far.
This also illustrates another facet of exception handling: after we exit the
catch clause in f(), the exception itself goes away; it evaporates. From
the perspective of main, no exceptions occurred, because f() handled it.
Catching multiple exception types
Because the same code might throw many different types of exceptions, and
because we might want to respond to more than one type, we can use multiple
catch clauses to specify which types we are interested in:
try {
// stuff
}
catch(out_of_range& e) {
}
catch(logic_error& e) {
}
catch(exception& e) {
}
Note that the order in which I listed them is important: C++ will try them in
the order they are listed, so you want to put the most specific exception type
first. If I had put exception first, then everything that derives from
exception would be caught by it, too, and none of the other clauses would
ever be touched.
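The following sketch shows the ordering rule in action: each thrown exception is handled by the first clause, reading top to bottom, whose type it matches. The helper name which_handler is made up for this example.

```cpp
#include <stdexcept>
#include <string>
using namespace std;

// Hypothetical helper: throws one of three exceptions and reports which
// catch clause handled it.
string which_handler(int which) {
    string result;
    try {
        if(which == 0) throw out_of_range{"a"};
        if(which == 1) throw invalid_argument{"b"};
        throw runtime_error{"c"};
    }
    catch(out_of_range&) { result = "out_of_range"; }   // Most specific first
    catch(logic_error&)  { result = "logic_error"; }    // Other logic errors
    catch(exception&)    { result = "exception"; }      // Everything else
    return result;
}
```

An invalid_argument has no clause of its own here, so it falls to the logic_error clause; a runtime_error is not a logic_error, so it falls through to the final clause.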
If you want to catch all kinds of exceptions, and don’t care at all about what they are, you can write
...
catch(...) {
}
Obviously, this must be last in the list of catch clauses, because it will
catch anything. Also note that because we don’t give a variable here, you
get no information about the exception that occurred; all you know is that
it happened.
Generally, you should avoid writing catch(...) precisely because it throws
away information about what error occurred. The only time it is acceptable to
write catch(...) is maybe in your main(), where you can use it to
“clean up” before your program quits:
int main() {
try {
// Do stuff here
}
catch(...) {
// Something bad happened here, clean up for the program.
// ...
return 1;
}
return 0;
}
Here’s a larger example, where we have multiple catch clauses, at multiple levels:
int main() {
try {
f();
}
catch(invalid_argument& e) {
cout << "Invalid argument!" << endl;
}
catch(runtime_error& e) {
cout << "Runtime error!" << endl;
}
catch(...) {
cout << "Something else!" << endl;
}
}
void f() {
try {
g();
}
catch(domain_error& e) {
cout << "Domain error" << endl;
}
catch(logic_error& e) {
cout << "Logic error" << endl;
}
}
void g() {
throw ___;
}
What will be printed if g() throws invalid_argument?
What will be printed if g() throws out_of_range?
What will be printed if g() throws underflow_error?
What will be printed if g() throws bad_alloc?
Creating our own exception classes
As mentioned, you can throw anything as an exception, so that means we can create our own exception classes and throw them:
class potato {
public:
potato(int i) { number = i; }
int number;
};
...
throw potato{1};
...
catch(potato& p) {
cout << "Caught a potato!" << endl;
}
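This will work, but it doesn’t really play nice with the standard exceptions,
or code that is written to deal with them. Instead, we should make our classes
be part of the same hierarchy. In other words, we need to say that potato
is an exception. Since we also want a potato to carry a message, we’ll build
it on runtime_error, which is itself an exception:
#include <stdexcept>
#include <string>
class potato : public std::runtime_error {
  public:
    // Pass the message on to runtime_error, which stores it for .what()
    potato(const std::string& message, int i) : std::runtime_error{message} { number = i; }
    int number;
};
The bit after the class name, : public std::runtime_error, says that the class
potato extends or inherits from the class std::runtime_error. Effectively,
we are saying that every potato is also a runtime_error, and therefore also an
exception! Constructors are not inherited automatically, so potato’s
constructor hands the message to runtime_error’s constructor itself. With that
in place, a potato can be constructed with a message:
throw potato{"Incoming!", 1};
which can be accessed via .what():
catch(potato& p) {
    cout << p.what() << endl;
}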
Also note that if we catch exception& e, it will now also catch potatos,
although at that point we can’t tell the difference between any old
exception and a potato. (Remember, if you catch(exception& e) you are
saying you don’t care what particular kind of exception it is; if you want
to know about particular exceptions, you have to catch them specifically!)
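Here is a self-contained sketch of both ways of catching a potato. It assumes potato is built on std::runtime_error so that it can carry a message; the two helper functions are made up for this example.

```cpp
#include <stdexcept>
#include <string>

// Assumption for this sketch: potato extends runtime_error and forwards
// its message to runtime_error's constructor.
class potato : public std::runtime_error {
  public:
    potato(const std::string& message, int i)
        : std::runtime_error{message}, number{i} { }
    int number;
};

// Catching specifically: we can use potato's own members
int caught_number() {
    int result = -1;
    try {
        throw potato{"Incoming!", 1};
    }
    catch(potato& p) {
        result = p.number;
    }
    return result;
}

// Catching generally: only what every exception offers is available
std::string caught_message() {
    std::string result;
    try {
        throw potato{"Incoming!", 1};
    }
    catch(std::exception& e) {
        result = e.what();    // e.number would not compile here
    }
    return result;
}
```

The general handler still receives the message through .what(), but it has no way to reach number without catching potato itself.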
Ideally, when you create your own exception classes, you should think about
where they fit into the exception hierarchy, and make them extend the
appropriate exception class, and not just always use exception. For example,
let’s write a function that takes a string of numeric digits (e.g., "123")
and converts it to an int (this already exists in the standard function
stoi(), which is defined in <string>).
unsigned int string_to_int(string s) {
unsigned int value = 0;
unsigned int p10 = 1;
for(int i = s.length() - 1; i >= 0; i--) {
char d = s.at(i);
if(d >= '0' && d <= '9') {
value += p10 * (d - '0');
p10 *= 10;
}
else
throw ???;
}
return value;
}
(I’m using unsigned int because we’re not taking into account any minus that
might be before the number.)
This works by looping over the characters in s in reverse order. We do it
in reverse because the low digits (ones, tens, etc.) are on the end of the
string while the higher digits (hundreds, thousands) are at the beginning.
For each digit, we need to multiply it by its corresponding power of 10. There
is a function in <cmath> named pow() that will compute powers for us, but
it’s not the most efficient way to do it. If we think about it, we don’t need to
compute any power of 10, rather, given a power of 10 \(10^p\) we need to
compute \(10^{p+1}\), which we can do by just multiplying by 10. So we
keep track of the “current” power of ten in p10 and multiply it by 10 each
time through the loop.
What kind of exception should this throw? The problem is that we’ve received
an invalid argument to this function: a string containing non-numeric
characters, when only numeric characters are allowed. So we should throw
invalid_argument. Is this the only exception we could throw? No; we could also
look for situations where the number represented by the string was too big
for an unsigned int and throw an overflow_error in that case. (How would we
detect that? Unsigned arithmetic silently wraps around when it overflows, and
the wrapped result is not reliably smaller than what we started with, so we
have to check before doing the arithmetic: adding digit * p10 to value is safe
only if p10 <= (max - value) / digit, where max is the largest unsigned int,
and multiplying p10 by 10 is safe only if p10 <= max / 10.) We might also
throw invalid_argument if the input string is empty. Maybe we could look for
a - at the beginning of the string and throw a range_error, indicating that
this function’s range does not include negative numbers. A well-designed
function will use exceptions to signal errors in every situation that it has
control over; other problems (e.g., caused by functions that it uses) will
just be passed on unchanged, thanks to the exception “bubbling” mechanism.
unsigned int string_to_int(string s) {
    if(s.empty())
        throw invalid_argument{"Empty string"};
    else if(s.front() == '-')
        throw range_error{"Negative value"};
    // numeric_limits comes from <limits>
    const unsigned int max = numeric_limits<unsigned int>::max();
    unsigned int value = 0;
    unsigned int p10 = 1;
    bool p10_too_big = false; // True once the next power of 10 no longer fits
    for(int i = s.length() - 1; i >= 0; i--) {
        char d = s.at(i);
        if(d < '0' || d > '9')
            throw invalid_argument{"Non-numeric character"};
        unsigned int digit = d - '0';
        // Would value + digit * p10 exceed max? (Leading zeros never overflow.)
        if(digit != 0 && (p10_too_big || p10 > (max - value) / digit))
            throw overflow_error{"Input number too big for unsigned int"};
        value += p10 * digit;
        if(p10 > max / 10)
            p10_too_big = true;
        else
            p10 *= 10;
    }
    return value;
}
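For comparison, the standard stoi() mentioned earlier makes its own choices from the hierarchy: it throws invalid_argument when no conversion can be performed, and out_of_range (not overflow_error) when the value does not fit in an int. The helper name stoi_outcome in this sketch is made up.

```cpp
#include <stdexcept>
#include <string>
using namespace std;

// Hypothetical helper: names the exception, if any, that stoi throws for s
string stoi_outcome(const string& s) {
    string result = "ok";
    try {
        stoi(s);
    }
    catch(invalid_argument&) { result = "invalid_argument"; }
    catch(out_of_range&)     { result = "out_of_range"; }
    return result;
}
```

Both of those types are logic errors, so a single catch(logic_error& e) would handle either case.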
Assertions
Assertions are part of a practice known as “defensive programming”; the idea is to intentionally make our program crash, as soon as we detect a mistake that we’ve made. Let’s take the area functions above as an example:
#include "area.h"
float rectangle_area(float width, float height) {
return width * height;
}
float triangle_area(float width, float height) {
return width * height / 2.0;
}
What happens if the width or height is negative? Well, that should never happen. We should have checked the width/height before we passed it to the function. If the width/height should really, really never be negative, such that if it is negative, it means something has gone terribly wrong and there’s no way to recover, then we can write an assertion to express that:
#include <cassert>
#include "area.h"
float rectangle_area(float width, float height) {
assert(width >= 0 && height >= 0);
return width * height;
}
float triangle_area(float width, float height) {
assert(width >= 0 && height >= 0);
return width * height / 2.0;
}
The statement assert(condition) says that condition must be true at
this point in the program; if it is false, then something has gone terribly
wrong and the program should stop immediately. If condition is false then
the program will print an error message (giving at least the file name and
line number of the failed assertion) and stop immediately.
Assertions are useful for expressing preconditions, things that must be true in order for the code to work correctly. Exceptions are used in situations where it might be possible to recover from the error, so we want to give the program a chance to do that. Assertions are for situations where things are so screwed up that the program must stop. Another way to put it is, ideally, your program should never trigger an assertion; your goal should be to write other code to ensure that the condition is always true, no matter what happens.
Another difference between assertions and exceptions is that assertions can be turned off. Suppose you’ve gone over your code with a fine-toothed comb and you’ve got everything set up so that no matter what happens, no assertion will ever be triggered. Testing all those assertions still takes a bit of time, so you can disable all assertions in your program, making them do nothing at all, by adding #define NDEBUG before <cassert> is included:
#define NDEBUG
#include <cassert>
#include "area.h"
float rectangle_area(float width, float height) {
assert(width >= 0 && height >= 0);
return width * height;
}
float triangle_area(float width, float height) {
assert(width >= 0 && height >= 0);
return width * height / 2.0;
}
NDEBUG means “no debugging”, which turns off all assertions. (You can also do
this from the compiler command-line, so you don’t have to modify your files
at all; with G++, pass -DNDEBUG.)
Again, assertions are designed to capture programmer errors, your own mistakes, while exceptions are intended to capture user errors (e.g., in input) or system errors (out of memory, file not found, etc.). If you’re trying to catch your own mistakes, you don’t want the program trying to intercept those mistakes and fix them: you just want it to crash with some indication of where the mistake is.
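One way to combine the two tools, sketched here under the assumption that some widths and heights come from a user: the inner function keeps its assertion, while a wrapper (the hypothetical area_from_user_input) checks the input and throws an exception that the caller can recover from.

```cpp
#include <cassert>
#include <stdexcept>

// Internal function: callers inside the program promise valid values, so
// a negative value here means a programmer mistake.
float rectangle_area(float width, float height) {
    assert(width >= 0 && height >= 0);
    return width * height;
}

// Hypothetical wrapper for values that come from a user: bad input is
// reported with an exception, so the caller can respond (e.g., ask again).
float area_from_user_input(float width, float height) {
    if(width < 0 || height < 0)
        throw std::invalid_argument{"Width and height must be non-negative"};
    return rectangle_area(width, height);
}
```

With this split, the assertion in rectangle_area should never fire for user mistakes; if it does fire, some code path skipped the check.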