Review of last time

Multi-file projects consist of

Source files (.cpp), containing function and method definitions, which are compiled and then listed on the command-line to G++ to combine them into an executable.
Header files (.hpp), containing classes and function declarations, which are #include-d within source files that need to use those declarations.

You have to compile all the .cpp files together when you build your program:

g++ -o my_program file1.cpp file2.cpp ...

Header files, on the other hand, are used via #include:

#include "file1.hpp"

to bring in all the declarations in file1.hpp (which presumably correspond to the functions defined in file1.cpp).

Function declarations and class definitions go in header files. Function definitions and method definitions (if there are any) go in source files.

Namespaces let us separate the names of things (functions, classes, variables) into different “spaces” so that (for example) the names of things we create won’t collide and cause problems with the names of things in the std namespace. Instead of doing using namespace std; it’s better to think about what we’re really going to use, where we’re going to use it, and then write specific using declarations for those things (e.g., using std::cout; instead of bringing in all of the std namespace.)

The fully qualified name of anything is its name with the namespace it’s in before it, separated by ::. E.g., the fully qualified name of cout is std::cout. If you don’t want to use the fully qualified name, you can always import a name:

using namespace whatever; brings in all the names in namespace whatever, so you don’t have to use the qualified name of anything (but you also can’t use any of those names for your own things!).
using whatever::thing; brings in just the name whatever::thing, so now you can use thing by itself.

Either of these can be put at the top level of a file (brings in the name for the rest of the file), or inside a function or method body (brings in the name(s) just in that function or method). Neither can be used this way directly inside a class definition; if you want the names in just one class, put the using inside its methods.

You can define your own namespaces by doing

namespace mystuff {
  int f() {
    return 0;
  }
}

The fully qualified name of f is mystuff::f. Within the mystuff namespace, you can refer to it just as f.

You can also define one namespace inside another:

namespace A {
  namespace B {
    int f() {
      return 1;
    }
  }
}

The fully qualified name of f is A::B::f.
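As a minimal sketch of the points above, here is a nested namespace together with a function-local using declaration. The names geometry, detail, sides, label and describe are made up for this illustration.

```cpp
#include <string>

namespace geometry {
  int sides() {
    return 4;
  }

  namespace detail {
    std::string label() {
      return "square";
    }
  }

  // Inside geometry, sides() needs no qualification
  int doubled_sides() {
    return 2 * sides();
  }
}

std::string describe() {
  // This using declaration only applies inside describe()
  using geometry::detail::label;
  return label() + " with " + std::to_string(geometry::sides()) + " sides";
}
```

Outside of describe(), label still has to be written as geometry::detail::label.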

Exceptions

Exceptions are the way C++ handles errors, in two senses:

When your code does something wrong at runtime (e.g., if you do v.at(6) when the vector v only has 5 elements) the system creates, or throws, an exception. Exceptions can also be thrown for problems outside your code, such as running out of memory or (when a stream is set up to report failures this way) failing to read from a file. If you want to be able to make your program deal with these kinds of problems, you’ll have to deal with exceptions.
When something goes wrong and you want to signal an error, you can create and throw your own exceptions. Thus, your code can signal that particular things went wrong, and other parts of it can respond to those signals.

How exceptions work: history

In order to understand how exceptions work, you have to understand how errors were indicated before exceptions were invented. Before, there were two ways that functions signaled that an error had occurred: error codes and errno.

Error codes

With error codes, any function that could go wrong returned a special code in that case. E.g., if you had a function open_file() which opened a file, it might return a code ERROR_FILE_NO_EXIST if the file did not exist. So after you opened a file, you’d have to check and see if an error code was returned, and if so, deal with it in some way. For example:

do {
    cout << "Enter filename: ";
    string filename;
    cin >> filename;
    file f;
    int code = open_file(filename, f);
    if(code == ERROR_OK)
        break;
} while(true);
// Now the file is open

(Note that there is no function open_file; this is just a hypothetical example.)

Now suppose we want to write a larger function that uses open_file, but one that isn’t really prepared to deal with any errors it might generate. This function must now handle passing those error codes on up, in its return:

int read_in_file(string filename) {
    file f;
    int error = open_file(filename,f);
    if(error == ERROR_OK) {
        // Read file 
        return ERROR_OK;
    }
    else
        return error;
}

Every function that wants to use open_file needs to have some code like this, even though the function read_in_file and others like it don’t actually care about the errors that open_file might produce. Even worse is if you write a function that does more than one thing, each of which might produce different errors:

int error1 = read_file(filename);
int error2 = write_file(filename);

if(error1 == ERROR_OK && error2 == ERROR_OK)
    return ERROR_OK;
else
    // Now what?

In the else case, what do we do if both error1 and error2 are not OK? Which error do we return? Maybe the error in error2 was caused by the fact that the read_file failed!

Using error codes requires us to duplicate the error-code-handling code literally everywhere, which is why many programmers would often just leave it out entirely: any errors just cause the entire program to crash, with no attempt made to figure out why they happened or if they could be corrected.

Another problem is that any function that uses error codes cannot use return for any other purposes; you can’t return a value the way you normally would, you’d have to use a reference (“out”) parameter.
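To make both problems concrete, here is a small sketch in the error-code style. The error codes and the functions digit_value and sum_digits are hypothetical, invented for this illustration. Note how the real result has to travel through a reference parameter, and how sum_digits has to check and forward an error it has no interest in.

```cpp
#include <string>

// Hypothetical error codes, invented for this sketch
const int ERROR_OK = 0;
const int ERROR_NOT_A_DIGIT = 1;

// The return value is taken up by the error code, so the real
// result has to come back through the reference parameter `out`.
int digit_value(char c, int& out) {
    if(c < '0' || c > '9')
        return ERROR_NOT_A_DIGIT;
    out = c - '0';
    return ERROR_OK;
}

// A caller that does not care about the error still has to check
// for it and pass it along.
int sum_digits(const std::string& s, int& out) {
    int total = 0;
    for(char c : s) {
        int d = 0;
        int error = digit_value(c, d);
        if(error != ERROR_OK)
            return error;
        total += d;
    }
    out = total;
    return ERROR_OK;
}
```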

errno

The other standard mechanism (used by the C standard library) for indicating errors is with a global variable named errno. If errno is not 0 then an error has occurred (the value indicates the exact error) in whatever function you used last.

Using a global variable means that, in theory, functions that don’t care about errors don’t need to bother with them: when an error occurs, the variable is set and retains that value until the error is handled. It also means that functions can now use their return values for whatever they want. For simple situations, this is sufficient.

However, the problem with this mechanism is that errno only tells you about errors that occurred in the most recently called standard function. In particular, if you call two functions in a row that can both generate errors, then any errors in the second would totally replace any errors in the first. Again, we have the problem of multiple errors. If you want to deal with this correctly, every function that uses any facilities that could change errno must check the value after every call that could change it. In fact, we are back in the world of having to check every return value, except that instead of checking return values, we’re checking errno:

do_something();
if(errno != 0)
    // Handle error
do_something_else();
if(errno != 0)
    // Handle error

And so forth.
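For a concrete instance of this pattern, the C library function strtol reports an out-of-range conversion by setting errno to ERANGE. The wrapper parse_long below is a hypothetical helper written for this sketch; notice that errno has to be cleared first, because a stale value from some earlier call would otherwise be mistaken for a new error.

```cpp
#include <cerrno>
#include <cstdlib>

// Converts `text` to a long, returning false if the value did not fit.
bool parse_long(const char* text, long& out) {
    errno = 0;                                   // Clear any stale error
    long value = std::strtol(text, nullptr, 10);
    if(errno == ERANGE)                          // Check right after the call
        return false;
    out = value;
    return true;
}
```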

Exceptions

Exceptions were invented to solve all these problems:

Functions having to write repetitive code to deal with errors that they didn’t care about.
Error codes making return values unusable for any other purpose.

The key observation behind exceptions is that the location where an error occurs is not necessarily the location where we want to handle it. In the most extreme case, errors might occur anywhere in our program, but we might want to handle all of them centrally in main. Usually, we’ll take a more nuanced approach than that (e.g., allowing each component to handle errors in some central place).

In order to understand how exceptions work, you have to understand how functions work, when they are called. E.g., suppose we have (using error codes)

void f() {
    // We want to handle errors here
    if(g() != ERROR_OK)
        // Handle errors
}

int g() {
    return h();
}

int h() {
    // An error occurs here
    return ERROR_YIKES;
}

The function call stack for this looks like this:

|    |
|h   |
|g   |
|f   |
|----|

Note that regardless of whether we are using error codes or exceptions, errors can only be “passed off” to a function that is closer to the bottom of the stack. Errors can only “bubble” to functions further down the call stack (usually we think of functions like f as being “higher up” in the function call hierarchy, which is just the call stack turned upside down).

What we want is for g to not have to care about any errors that might pass through it, on their way down the stack, and that’s what exceptions give us: with exceptions this code looks like

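#include <stdexcept>
using namespace std;

void g();   // Declared up front so that f can call it
void h();

void f() {
    try {
        g();
    }
    catch (exception& e) {
        // Handle error
    }
}

void g() {
    h();
}

void h() {
    // An error occurs here
    throw runtime_error{"Error!"};   // runtime_error carries a message
}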

There are two elements present that deal with exceptions: try-catch and throw. If you only want to catch exceptions, then you only need to worry about try-catch, so that’s what we’ll deal with first.

(Note that a lot of code working with exceptions needs things from <stdexcept> so we’ll include that.)

Catching exceptions

To catch an exception, we need two things: the code where the exception might occur, and what to do if it does. try-catch gives us both. The try part is the code where an error might occur. It says, “we are interested in errors that occur here”. The catch part specifies what particular errors we are interested in, and what to do if they occur.

If an error occurs in any function called in the try (or any function called by any function in the try, etc.) then C++ will check to see if the type of that exception matches the catch; if it does, then it will pass the actual exception object to the catch-handler and run the code it contains. (After the catch is finished, code continues with the first line of code outside it.)
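As a small sketch of this, vector's at() throws std::out_of_range when the index does not exist, and a try-catch around the call can turn that into a fallback value. The function element_or is a hypothetical helper made up for this example.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns the element at `index`, or `fallback` if the index does not exist.
int element_or(const std::vector<int>& v, std::size_t index, int fallback) {
    try {
        return v.at(index);   // at() throws std::out_of_range for a bad index
    }
    catch(const std::out_of_range&) {
        return fallback;      // Runs only if the exception was thrown
    }
}
```

With a vector holding 1, 2, 3, element_or(v, 1, -1) gives 2, while element_or(v, 6, -1) gives -1 because the handler runs.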

E.g., in the following code, let’s trace the flow of control:

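#include <iostream>
#include <stdexcept>
using namespace std;

void f();   // Defined below
void g();
void h();

int main() {
    try {
        f();
    }
    catch(exception& e) {
        cout << e.what() << endl;
    }
}

void f() {
    g();
    h();
}

void g() {
    cout << "In g()" << endl;
}

void h() {
    cout << "In h()" << endl;
    throw runtime_error{"Error"};
}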

exception is a type (declared in <exception>), like vector or string, and runtime_error (declared in <stdexcept>) is one of the more specific exception types built on top of it. We’ll talk a lot more about them later, but for now, notice three things:

When you construct a runtime_error you give it a string. (The base type exception cannot be constructed from a string; only the more specific types in <stdexcept> can.)
Later on, using the .what() member function will give you back the string that was used to build the exception. (All of the standard exception types support .what(), so you can always use it to get a human-readable version of whatever error occurred.)
When we catch an exception, we always do it by reference. Not because we’re going to modify it, but for reasons we’ll see later…

Note that while <stdexcept> declares the standard exception types, it is not required in order to use try-catch-throw. You can use try-catch without including <stdexcept>, you just can’t use any of the standard exception types. But note that an exception can be anything, of any type:

throw 1;        // The exception is an int
throw "Hello";  // The exception is a C-style string (const char*)
throw 1.34;     // The exception is a double

vector<int> data = {1,2,3};
throw data;     // The exception is a vector<int>

In general, however, it’s better to use the standard exception types, or at least to build off of them, as other people’s code will expect yours to work that way.
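To see that the catch has to name the exact type that was thrown, here is a minimal sketch. The function thrown_type is hypothetical; it throws a different kind of value depending on its argument and reports which handler caught it. Note that a string literal is caught as const char* and a literal like 1.34 as double.

```cpp
#include <string>

// Throws a value whose type depends on `which`, and reports
// which handler caught it.
std::string thrown_type(int which) {
    try {
        if(which == 0) throw 1;
        if(which == 1) throw "Hello";
        if(which == 2) throw 1.34;
        return "nothing thrown";
    }
    catch(int)         { return "int"; }
    catch(const char*) { return "const char*"; }
    catch(double)      { return "double"; }
}
```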

Note that an exception is caught by the “nearest” catch that matches it. For example

int main() {
  try {
    f();
  }
  catch(exception& e) {
    cout << "Main" << endl;
  }
} 

void f() {
  try {
    g();
  }
  catch(exception& e) {
    cout << "f" << endl;
  }
}

void g() {
  throw runtime_error{"Error"};
}

What will this print?

The standard exception hierarchy

Although we could just throw strings for every kind of exception, containing a description of what went wrong, that’s not very useful from the computer’s perspective. Remember that we don’t just want to print out a message saying “Error whatever occurred”, we want to respond to the error, maybe do something to correct it and try again. If we’re going to do that, the exception needs to tell the computer (not just us humans) some information about what went wrong. Consequently, there are various standard exception types for various situations:

Trying to access an element of a vector or string that is after the end, or before the beginning.
Trying to resize a vector to a negative size
Trying to convert the string "abc" to an integer
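Trying to divide by 0 (built-in division does not check for this, so a function has to detect it and throw an exception itself).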
etc.

If we think about the different kinds of errors, we can begin to group them together:

Accessing something out of bounds in a container
Using an invalid size for a container (either when creating it, or when resizing it).
Errors related to passing an invalid parameter to a function (e.g., stoi expects a string that can be converted to a number). We’re not talking about passing a value of the wrong type; that will be caught at compile time. Rather, sometimes there are values of the right type, that are nonetheless not valid for that particular function.
Errors related to arithmetic

And all of these can be grouped together under the banner of “exceptions”. We would like for these logical groupings to be somehow represented in code. For example, maybe I care about all “logic errors”, those that come from using a value that is too big or too small, but I don’t particularly care whether it’s an out-of-bounds error on a container, or something else. What we want is a way to group the standard exception types into a hierarchy, so that we can say

A logic error is an exception
An out-of-bounds error is a logic error
A vector-out-of-bounds error is an out-of-bounds error
And so forth.

We want the freedom to be specific, or general, when talking about exception types.

Another way to think about this is in the example we looked at previously, with the dog class. Suppose we add an animal class. We would expect that dog would be related to animal in some way, to reflect the fact that every dog is also an animal. In fact, we want a class hierarchy, so we can say that corgis are dogs, dogs are animals, animals are organisms, and so forth. Then, we would know that whatever an animal can do, a dog (or a corgi) can also do. C++ lets us express this through class inheritance. We’re going to talk a lot about inheritance later, but for now, all you need to know is that it’s possible, and that the standard exception types use it to reflect the connections between the different types.

The standard exception hierarchy looks like this:

exception
|
+---- logic_error
|     +-------------- invalid_argument
|     +-------------- domain_error
|     +-------------- length_error
|     +-------------- out_of_range
|
+---- runtime_error
|     +-------------- range_error
|     +-------------- overflow_error
|     +-------------- underflow_error
|     +-------------- system_error
|                     +--------------- ios_base::failure
+---- bad_alloc
      +-------------- bad_array_new_length

plus a few other specialized exception classes for particular situations.

In order, we have

exception – the base of all exceptions. The only thing this provides is the .what() method, which returns a string describing the error. exception itself cannot be constructed with a message; the classes below it add that ability (e.g., runtime_error{"Computer exploded"}).
logic_error – base class for logic errors, errors that reflect “violations to the internal logic of the program”.
invalid_argument – exception class for errors related to passing an invalid argument to a function (e.g., passing a string "abc" to a function that expects a string representing a number).
domain_error – a general-purpose class for errors relating to the mathematical domain of a function. For example, passing a negative value to a square root function is a domain error (although the standard sqrt itself does not throw).
length_error – class for errors related to the length (size) of containers. Trying to create a negative-sized container would be an example. If we had a version of vector that could only have its size set to a multiple of 10, then presumably it would throw length_error if we tried to resize it to size 11.
out_of_range – class for errors related to trying to access particular elements of a container. If you try to do v.at(-1) or v.at(v.size()) this will be thrown. In more general terms, if you try to access an element of a container that does not exist, out_of_range will be thrown.
runtime_error – a class for errors that occur at runtime, due to events outside the program’s control. The difference between this and logic_error is that logic errors are things that, in theory, your program should have checked for. For example, if you accessed v.at(i) you should check to make sure i is not negative. Runtime errors are things you could not detect: in fact, the exception is often how you detect it! Examples include things like trying to open a file (that does not exist) or passing too big a value to a math function.
range_error – a class that is vaguely defined as concerning the mathematical range of a function. For example, if a function is defined to return an unsigned int but its arguments would require it to return a negative value.
underflow_error – a class that is used for arithmetic underflow situations, situations where we need to represent a number smaller than what the type can represent. For example, with float there is some “smallest number” that can be represented; a function that needed to deal with floats smaller than this might throw underflow_error.
overflow_error – a corresponding class for arithmetic overflow, trying to create or work with values too big for the type.
system_error – a catch-all class for “system errors”: file errors, etc. This adds an extra member to contain an error code, accessible via .code(), because many system functions would return an error code. (Note that this is defined inside the header <system_error>)
ios_base::failure – this is a specialized class which is defined inside the class ios_base (yes you can have classes inside classes!), hence the weird name. This exception is used for errors relating to input and output.
bad_alloc – is thrown when you try to do new and there isn’t enough memory to create the object.
bad_array_new_length – is thrown when you try to use new to create an array and the length of the array is invalid for some reason (negative, bigger than memory, etc.).

You can catch exceptions of any of these types. Note that whenever you catch an exception you want to catch it by reference. The reason for this is something we’ll talk about later, but suffice it to say, if you write

catch(exception& e) {

}

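then it will catch exceptions not just of type exception but of any type that is below it in the hierarchy, and e will refer to the original exception object. If, however, we write

catch(exception e) {

}

then exceptions of types below exception are still caught, but e is a copy containing only the exception part of the original object; anything specific to the more derived type is lost (this is called slicing). (References have a bit of a “magical” behavior in that they can refer to things of classes lower down on the hierarchy. We’ll see why this happens later on.)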
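One way to observe the difference between the two forms is to ask, inside the handler, what type the caught object really is. The sketch below does this with typeid; the two function names are made up for this illustration. It is only meant to show the effect, not something you would normally write in real code.

```cpp
#include <exception>
#include <stdexcept>
#include <typeinfo>

// Catch by reference: e refers to the original runtime_error object.
bool seen_as_runtime_error_by_reference() {
    try {
        throw std::runtime_error{"boom"};
    }
    catch(std::exception& e) {
        return typeid(e) == typeid(std::runtime_error);
    }
}

// Catch by value: e is a copy that is only a plain std::exception.
bool seen_as_runtime_error_by_value() {
    try {
        throw std::runtime_error{"boom"};
    }
    catch(std::exception e) {
        return typeid(e) == typeid(std::runtime_error);
    }
}
```

The first function returns true and the second returns false: both handlers run, but only the reference still sees the runtime_error that was thrown.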

For example, suppose we do

try {
  g();
}
catch(logic_error& e) {
  cout << "Hello";
}
...
void g() {
  throw invalid_argument{"Bad argument"};
}

This will be caught by the catch clause because an invalid_argument exception is a kind of logic_error. This is the case even if a higher-up catch would have matched more specifically:

int main() {
  try {
    f();
  }
  catch(invalid_argument& e) {
    cout << "Main" << endl;
  }
}

void f() {
  try {
    throw invalid_argument{"Oops!"};
  }
  catch(logic_error& e) {
    cout << "f()" << endl;
  }
}

This will print out f() because the first catch in line to see the exception is the one in f, and it can handle invalid_argument exceptions. The fact that the catch in main might be a better “match” makes no difference: the exception never makes it that far.

This also illustrates another facet of exception handling: after we exit the catch clause in f(), the exception itself goes away; it evaporates. From the perspective of main, no exceptions occurred, because f() handled it.

Catching multiple exception types

Because the same code might throw many different types of exceptions, and because we might want to respond to more than one type, we can use multiple catch clauses to specify which types we are interested in:

try {
    // stuff
}
catch(out_of_range& e) {

}
catch(logic_error& e) {

}
catch(exception& e) {

}

Note that the order in which I listed them is important: C++ will try them in the order they are listed, so you want to put the most specific exception type first. If I had put exception first, then everything that derives from exception would be caught by it, too, and none of the other clauses would ever be touched.
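Here is a small sketch of that ordering in action. The function which_handler is hypothetical; its scenario parameter just selects which exception gets thrown, and the return value names the clause that caught it.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

std::string which_handler(int scenario) {
    try {
        if(scenario == 0) {
            std::vector<int> v(3);
            v.at(10) = 1;                               // throws out_of_range
        }
        else if(scenario == 1)
            throw std::invalid_argument{"bad value"};   // a logic_error
        else if(scenario == 2)
            throw std::runtime_error{"disk on fire"};   // not a logic_error
        return "no exception";
    }
    // Most specific type first, most general last
    catch(std::out_of_range&) { return "out_of_range"; }
    catch(std::logic_error&)  { return "logic_error"; }
    catch(std::exception&)    { return "exception"; }
}
```

An invalid_argument skips the out_of_range clause (it is not one) and lands in the logic_error clause; a runtime_error skips both and lands in the exception clause.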

If you want to catch all kinds of exceptions, and don’t care at all about what they are, you can write

...
catch(...) {

}

Obviously, this must be last in the list of catch clauses, because it will catch anything. Also note that because we don’t give a variable here, you get no information about the exception that occurred; all you know is that it happened.

Generally, you should avoid writing catch(...) precisely because it throws away information about what error occurred. The only time it is acceptable to write catch(...) is maybe in your main(), where you can use it to “clean up” before your program quits:

int main() {
    try {
        // Do stuff here
    }
    catch(...) {
        // Something bad happened here, clean up for the program.
        // ... 
        return 1;
    }

    return 0;
}
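One possible refinement, sketched here under the assumption that you would still like a message when one is available, is to put a catch for exception& ahead of the catch(...). The function run is a hypothetical stand-in for the body of main(); its mode parameter only selects what goes wrong.

```cpp
#include <iostream>
#include <stdexcept>

// Stand-in for the body of main(); `mode` just selects what goes wrong.
int run(int mode) {
    try {
        if(mode == 1) throw std::runtime_error{"standard exception"};
        if(mode == 2) throw 42;    // Not derived from std::exception
        return 0;
    }
    catch(std::exception& e) {
        // Standard exceptions can at least tell us what happened
        std::cerr << "Error: " << e.what() << std::endl;
        return 1;
    }
    catch(...) {
        // Anything else: we only know that something was thrown
        std::cerr << "Unknown error" << std::endl;
        return 2;
    }
}
```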

Here’s a larger example, where we have multiple catch clauses, at multiple levels:

int main() {
  try {
    f();
  }
  catch(invalid_argument& e) {
    cout << "Invalid argument!" << endl;
  }
  catch(runtime_error& e) {
    cout << "Runtime error!" << endl;
  }
  catch(...) {
    cout << "Something else!" << endl;
  }
}

void f() {
  try {
    g();
  }
  catch(domain_error& e) {
    cout << "Domain error" << endl;
  }
  catch(logic_error& e) {
    cout << "Logic error" << endl;
  }
}

void g() {
  throw ___;
}

What will be printed if g() throws invalid_argument?
What will be printed if g() throws out_of_range?
What will be printed if g() throws underflow_error?
What will be printed if g() throws bad_alloc?

Creating our own exception classes

As mentioned, you can throw anything as an exception, so that means we can create our own exception classes and throw them:

class potato {
  public:
    potato(int i) { number = i; }

    int number;
};

...
throw potato{1};
...
catch(potato& p) {
    cout << "Caught a potato!" << endl;
}

This will work, but it doesn’t really play nice with the standard exceptions, or code that is written to deal with them. Instead, we should make our classes be part of the same hierarchy. In other words, we need to say that potato is an exception:

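#include <stdexcept>
#include <string>

class potato : public std::runtime_error {
  public:
    // Pass the message on to runtime_error, which stores it for .what()
    potato(const std::string& message, int i = 0)
      : std::runtime_error{message}, number{i} { }

    int number;
};

The bit after the class name, : public std::runtime_error, says that the class potato extends or inherits from the class std::runtime_error, which in turn inherits from std::exception. Effectively, we are saying that every potato is also a runtime_error, and therefore also an exception! (We inherit from runtime_error rather than directly from exception because exception itself has no way to store a message.) Constructors are not inherited automatically, so potato’s constructor hands the message on to runtime_error’s. Now a potato can be constructed with a message:

throw potato{"Incoming!"};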

which can be accessed via .what():

catch(potato& p) {
    cout << p.what() << endl;
}

Also note that if we catch exception& e, it will now also catch potatoes, although at that point we can’t tell the difference between any old exception and a potato. (Remember, if you catch(exception& e) you are saying you don’t care what particular kind of exception it is; if you want to know about particular exceptions, you have to catch them specifically!)
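As a further sketch of a custom type being caught through a more general handler, here is an exception class that extends out_of_range and carries an extra piece of data. The names bad_temperature, check_kelvin and describe are hypothetical, made up for this example.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical exception type for this sketch
class bad_temperature : public std::out_of_range {
  public:
    explicit bad_temperature(double t)
      : std::out_of_range{"Temperature below absolute zero"}, kelvin{t} { }

    double kelvin;   // The offending value, for handlers that want it
};

double check_kelvin(double t) {
    if(t < 0)
        throw bad_temperature{t};
    return t;
}

std::string describe(double t) {
    try {
        check_kelvin(t);
        return "ok";
    }
    // A bad_temperature is an out_of_range, which is a logic_error
    catch(std::logic_error& e) {
        return e.what();
    }
}
```

The handler in describe() never mentions bad_temperature, yet it still catches it and can read the message; only a handler written as catch(bad_temperature& e) could also read e.kelvin.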

Ideally, when you create your own exception classes, you should think about where they fit in to the exception hierarchy, and make them extend the appropriate exception class, and not just always use exception. For example, let’s write a function that takes a string of numeric digits (e.g., "123") and converts it to an int (this already exists in the standard function stoi(), which is defined in <string>).

unsigned int string_to_int(string s) {
    unsigned int value = 0;
    unsigned int p10 = 1;
    for(int i = s.length() - 1; i >= 0; i--) {
        char d = s.at(i);
        if(d >= '0' && d <= '9') {
            value += p10 * (d - '0');
            p10 *= 10;
        }
        else
            throw ???; 
    }

    return value;
}

(I’m using unsigned int because we’re not taking into account any minus that might be before the number.)

This works by looping over the characters in s in reverse order. We do it in reverse because the low digits (ones, tens, etc.) are on the end of the string while the higher digits (hundreds, thousands) are at the beginning. For each digit, we need to multiply it by its corresponding power of 10. There is a function in <cmath> named pow() that will compute powers for us, but it’s not the most efficient way to do it. If we think about it, we don’t need to compute any power of 10, rather, given a power of 10 \(10^p\) we need to compute \(10^{p+1}\), which we can do by just multiplying by 10. So we keep track of the “current” power of 10 in p10 and multiply it by 10 each time through the loop.

What kind of exception should this throw? The problem is that we’ve received an invalid argument to this function: a string containing non-numeric characters, when only numeric characters are allowed. So we should throw invalid_argument. Is this the only exception we could throw? No; we could also look for situations where the number represented by the string is too big for an unsigned int and throw an overflow_error in that case. (How would we detect that? Unsigned arithmetic silently wraps around on overflow, so we have to check before we multiply or add: p10 * 10 would overflow exactly when p10 is greater than the largest unsigned int divided by 10, and adding p10 * digit to value would overflow exactly when digit is greater than (largest - value) / p10.) We might also throw invalid_argument if the input string is empty. Maybe we could look for a - at the beginning of the string and throw a range_error, indicating that this function’s range does not include negative numbers. A well-designed function will use exceptions to signal errors in every situation that it has control over; other problems (e.g., caused by functions that it uses) will just be passed on unchanged, thanks to the exception “bubbling” mechanism.

unsigned int string_to_int(string s) {
    if(s.empty())
        throw invalid_argument{"Empty string"};
    else if(s.front() == '-')
        throw range_error{"Negative value"};

    // Largest representable value (numeric_limits is in <limits>)
    const unsigned int max_value = numeric_limits<unsigned int>::max();

    unsigned int value = 0;
    unsigned int p10 = 1;
    bool p10_too_big = false;  // True once the next power of 10 no longer fits
    for(int i = s.length() - 1; i >= 0; i--) {
        char d = s.at(i);
        if(d >= '0' && d <= '9') {
            unsigned int digit = d - '0';

            // A nonzero digit overflows if its power of 10 does not fit,
            // or if adding p10 * digit would take value past max_value
            if(digit != 0 && (p10_too_big || digit > (max_value - value) / p10))
                throw overflow_error{"Input number too big for uint"};

            value += p10 * digit;

            // Move on to the next power of 10, if it fits
            if(p10 > max_value / 10)
                p10_too_big = true;
            else
                p10 *= 10;
        }
        else
            throw invalid_argument{"Non-numeric character"};
    }

    return value;
}
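For comparison, the standard stoi() signals its own failures in a similar way: it throws invalid_argument when no conversion can be performed and out_of_range when the value does not fit in an int. The helper classify below is a hypothetical function written for this sketch that reports which of those happened.

```cpp
#include <stdexcept>
#include <string>

// Tries to convert `s` with std::stoi and reports the outcome.
std::string classify(const std::string& s) {
    try {
        std::stoi(s);
        return "ok";
    }
    catch(std::invalid_argument&) {   // No digits to convert, e.g. "abc"
        return "invalid_argument";
    }
    catch(std::out_of_range&) {       // Value does not fit in an int
        return "out_of_range";
    }
}
```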

Assertions

Assertions are part of a practice known as “defensive programming”; the idea is to intentionally make our program crash, as soon as we detect a mistake that we’ve made. Let’s take the area functions above as an example:

#include "area.h"

float rectangle_area(float width, float height) {
  return width * height;
}

float triangle_area(float width, float height) {
  return width * height / 2.0;
}

What happens if the width or height is negative? Well, that should never happen. We should have checked the width/height before we passed it to the function. If the width/height should really, really never be negative, such that if it is negative, it means something has gone terribly wrong and there’s no way to recover, then we can write an assertion to express that:

#include <cassert>
#include "area.h"

float rectangle_area(float width, float height) {
  assert(width >= 0 && height >= 0);

  return width * height;
}

float triangle_area(float width, float height) {
  assert(width >= 0 && height >= 0);

  return width * height / 2.0;
}

The statement assert(condition) says that condition must be true at this point in the program; if it is false, then something has gone terribly wrong and the program should stop immediately. If condition == false then the program will print an error message (giving at least the file name and line number of the failed assertion) and stop immediately.

Assertions are useful for expressing preconditions, things that must be true in order for the code to work correctly. Exceptions are used in situations where it might be possible to recover from the error, so we want to give the program a chance to do that. Assertions are for situations where things are so screwed up that the program must stop. Another way to put it is, ideally, your program should never trigger an assertion; your goal should be to write other code to ensure that the condition is always true, no matter what happens.
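One way the two mechanisms might be combined is sketched below: an exception guards the boundary where outside input arrives, and an assertion guards the internal function that should only ever see validated values. The function area_from_input is hypothetical, made up for this example; stof() is the standard string-to-float conversion, which throws invalid_argument for input like "abc".

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Internal helper: callers in our own code must already have validated
// the dimensions, so a negative value here is a programmer error.
float rectangle_area(float width, float height) {
    assert(width >= 0 && height >= 0);
    return width * height;
}

// Boundary with the outside world: bad user input is expected to happen,
// so it is reported with an exception that the caller can recover from.
float area_from_input(const std::string& w, const std::string& h) {
    float width = std::stof(w);
    float height = std::stof(h);
    if(width < 0 || height < 0)
        throw std::invalid_argument{"Dimensions must be non-negative"};
    return rectangle_area(width, height);
}
```

If the check in area_from_input is ever removed by mistake, the assertion in rectangle_area is what will point at the bug.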

Another difference between assertions and exceptions is that assertions can be turned off. Suppose you’ve gone over your code with a fine-toothed comb and you’ve got everything set up so that no matter what happens, no assertion will ever be triggered. Testing all those assertions still takes a bit of time, so you can disable all assertions in your program, making them do nothing at all, by adding #define NDEBUG before the #include <cassert>:

#define NDEBUG
#include <cassert>
#include "area.h"

float rectangle_area(float width, float height) {
  assert(width >= 0 && height >= 0);

  return width * height;
}

float triangle_area(float width, float height) {
  assert(width >= 0 && height >= 0);

  return width * height / 2.0;
}

NDEBUG means “no debugging”; defining it turns off all assertions. (You can also do this from the compiler command line, e.g. by passing -DNDEBUG to g++, so you don’t have to modify your files at all.)

Again, assertions are designed to capture programmer errors, your own mistakes, while exceptions are intended to capture user errors (e.g., in input) or system errors (out of memory, file not found, etc.). If you’re trying to catch your own mistakes, you don’t want the program trying to intercept those mistakes and fix them: you just want it to crash with some indication of where the mistake is.