Review of last time

Logging in to the server. Hostname, port, username, password.

Splitting strings into words. States and inputs, tabular form, state machine diagram.

Implementing string splitting using strings and vectors

The string class

#include <string>
using std::string;

string s = "Andy";

If you declare a string variable without initializing it, it defaults to the empty string:

string e; // e is the empty string

You can initialize a string object from a char* literal. The basic string operations are:

Operation Description
s.at(i), s[i] Access individual characters by index (char)
s.front(), s.back() Access first/last characters
s.length(), s.size() Length of string
s.empty() True if length is 0
s.substr(i) Extract substring starting at i, to the end
s.substr(i,l) Extract substring, starting at i, of length l
s1 + s2 Concatenate two strings to form a new one
s1 < s2, etc. Compare strings alphabetically
s.c_str() Get a char* representation of a string
s.find(c), s.find(s2) Get position of the first occurrence of char c or string s2 in s
s.rfind(c), s.rfind(s2) Get position of last occurrence


s.push_back(c); Add a character c to the end of the string
s.pop_back(); Delete the last character of the string
s.clear(); Delete all characters from string (len. = 0)
s1 += s2;, s1.append(s2); Append s2 to the end of s1, modifying s1
s1.insert(i,s2); Insert s2 into s1, before position i
s.erase(i,l); Erase characters starting at i, of length l
s1.replace(i,l,s2); Replace l characters of s1, starting at i, with s2
s1 = s2; Copy all of s2 into s1

find and rfind will return the special value string::npos if the thing you are searching for cannot be found at all.
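
For example, here’s a quick sketch (an illustration, not from the original notes) of using find together with npos:

#include <iostream>
#include <string>
using namespace std;

int main() {
    string s = "hello world";

    size_t i = s.find(' ');                  // position of the first space
    if (i != string::npos)
        cout << s.substr(0, i) << endl;      // prints "hello"

    if (s.find('z') == string::npos)
        cout << "no 'z' in the string" << endl;
    return 0;
}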

Note that strings do not have a nul character at the end! This means that the length of a string is strictly the number of characters in it, and the last character (s.at(s.length() - 1)) is not necessarily nul. (In fact, strings can have nul characters anywhere inside them!)

To read a single word from cin, you can use

string w;
cin >> w; 

This reads a single word because the normal behavior of >> is to skip leading whitespace and then stop at the first whitespace character it sees.

To read an entire line, use

string l;
getline(cin, l);

You can, of course, print strings normally:

string name = "Andy";
cout << "Hello, " << name << endl;

For our purpose, we’re going to read in an entire line (using getline) and then process it one character at a time. We can do the latter using a loop:

for(unsigned i = 0; i < s.length(); ++i)
  // Use s[i] ...

Or we can use the fancy ranged-for-loop:

for(char c : s)
  // Use c ...

Vectors

You can think of a vector as a version of a string that stores any kind of element, not just characters. The operations supported by vectors are somewhat more limited than those supported by strings, just because vectors cannot make any assumptions about what kind of data they are holding.

vector<int>    vi;                                     // empty vector of ints
vector<char>   vc1(10);                                // Vector of 10 chars
vector<char>   vc2(10, 'x');                           // Vector of 10 'x' chars
vector<string> names = {"Bruce", "Richard", "Alfred"}; // vector of three strings

Vectors manage their own storage, so you don’t need to delete them.

To access the elements of a vector by index, use either .at(i) (which is bounds-checked and throws an exception for an invalid index) or [i] (which is unchecked).
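
For example, a quick sketch:

#include <iostream>
#include <vector>
using namespace std;

int main() {
    vector<int> v = {5, 6, 7};

    cout << v.at(0) << endl; // 5  (bounds-checked: throws if the index is invalid)
    cout << v[1]    << endl; // 6  (unchecked, like an array access)
    v[2] = 8;                // elements can be assigned through [] or .at() as well
    return 0;
}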

The shortcut methods v.front() and v.back() provide easy access to the first and last elements. Be careful: unlike .at(i), these are not checked, so calling them on an empty vector is undefined behavior.

Other vector operations:

Operation Description
v.size() Size of the vector (number of elements)
v.empty() True if size is 0
v.clear() Reset size to 0, deleting all elements
v1 == v2 True if v1 and v2 are identical
v.push_back(e); Add a new element to the end of the vector
v.pop_back(); Erase the last element of the vector
v1 = v2; Copy all elements from v2 into v1

Note that unlike arrays, vectors support copying, via assignment. This means that you can pass vectors as parameters to functions, and return them from functions as well. This makes vectors much easier to work with than arrays, especially since you don’t need to worry about dynamically allocating them. The fact that vectors can grow, via push_back, makes them even more useful; e.g., if you want to read some number of ints in from the user, and you don’t know how many, you can just do

vector<int> data;
int x;
while(cin >> x)
  data.push_back(x);

It’s possible to insert/erase elements in the middle of a vector, but doing so requires the use of an iterator. Here’s an example to get you started:

vector<int> vs = {1, 2, 3, 4};
vs.insert(vs.begin() + 2, 100); // vs = { 1, 2, 100, 3, 4 }
vs.erase(vs.begin() + 1);       // vs = { 1, 100, 3, 4 }

You can also insert the contents of one vector in the middle of another.
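
For example, a small sketch of the range-insert form, which copies the elements of b into a at a given position:

#include <vector>
using std::vector;

int main() {
    vector<int> a = {1, 2, 5, 6};
    vector<int> b = {3, 4};

    // Insert all of b into a, before position 2: a becomes {1, 2, 3, 4, 5, 6}
    a.insert(a.begin() + 2, b.begin(), b.end());
    return 0;
}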

Splitting strings

Our table for states/inputs looks like this:

         Letter                            Space
WORD     Add to cur. word; stay in WORD    Finish word; switch to SPACE
SPACE    Start new word; switch to WORD    Ignore; stay in SPACE

Implementing this as a function, it will take a string as its input, and return a vector<string> as its output:

vector<string> split(string input) {
  string word;            // Current word
  vector<string> output;  // List of words

  const int SPACE = 0;
  const int WORD = 1;
  int state = SPACE;

  for(char c : input) {

  }

  return output;
}

All of our work needs to be done inside the for loop. The current character is c, and the current state is state. Since we have four table entries, we’ll have an if-else for each of the four possibilities:

  for(char c : input) {
    if(state == WORD && c != ' ') {

    }
    else if(state == WORD && c == ' ') {

    }
    else if(state == SPACE && c != ' ') {

    }
    else { // state == SPACE && c == ' '

    }
  }

There’s one last thing we have to think about, and that’s what happens at the end of the input string: if the string ends while we are in the WORD state, the final word has been built up in word but never added to output, so we have to push it onto output after the loop finishes.

Trace through this on the input string "get the rock". Run example.
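
For reference, here is one possible way the filled-in function could look (a sketch; the version developed in class may differ in details):

#include <string>
#include <vector>
using std::string;
using std::vector;

vector<string> split(string input) {
  string word;            // Current word
  vector<string> output;  // List of words

  const int SPACE = 0;
  const int WORD = 1;
  int state = SPACE;

  for(char c : input) {
    if(state == WORD && c != ' ') {
      word.push_back(c);        // Add to current word, stay in WORD
    }
    else if(state == WORD && c == ' ') {
      output.push_back(word);   // Finish word
      word.clear();
      state = SPACE;            // Switch to SPACE
    }
    else if(state == SPACE && c != ' ') {
      word.push_back(c);        // Start new word
      state = WORD;             // Switch to WORD
    }
    else { // state == SPACE && c == ' '
      // Ignore, stay in SPACE
    }
  }

  if(state == WORD)
    output.push_back(word);     // Don't forget the final word!

  return output;
}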

References and pointers

A reference is just another name for an existing variable or location:

int  x = 1; // x is a variable, OK
int& y = x; // y is another name for x

Both x and y refer to the same thing. It’s impossible to distinguish them, because they are just different names for the same object. Any changes to x will be reflected in y, and vice versa.

Because a reference is an “alias” for something, you must initialize a reference variable with another variable/location (or another reference):

int& z = 1; // ERROR
int& q;     // ERROR

(The C++ name for things that you can get a reference to is “lvalue”, as in, values that can be on the left side of an assignment. Temporary objects are rvalues.)

You can, however, get a reference to an element of an array or vector (or string):

int         arr[] = {1, 2, 3};
vector<int> vec   = {5, 6, 7};

int& y = arr[1]; // OK, another name for arr[1]
int& z = vec[2]; // Also OK

With vectors, this can be a little dangerous, because the size of a vector can change, and thus there’s the possibility that the thing the reference is referring to might disappear. But as long as you are careful, everything is OK.

Functions can take references as parameters, in which case the formal parameter becomes “another name” for whatever argument is used when the function is called. Similarly, functions can return references (as long as the thing referred to still exists after the function exits!)

References can cause some weird behavior: Look at this function; can you see any way in which it might print out a 2, instead of a 1?

void f(int& a, int& b) {
  a = 1;
  b = 2;
  cout << a << endl;
}
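
One way this can happen (a sketch): pass the same variable as both arguments, so that a and b become two names for the same int.

#include <iostream>
using namespace std;

void f(int& a, int& b) {
  a = 1;
  b = 2;
  cout << a << endl;
}

int main() {
  int x = 0;
  f(x, x);   // a and b both refer to x, so b = 2 also changes a: prints 2
  return 0;
}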

References have some limitations: once initialized, a reference can never be changed to refer to something else, and there is no such thing as a “null” reference that refers to nothing at all.

Pointers

Pointers remove these limitations, but add some complexity in doing so. A pointer is like a reference that can change what it refers to. However, the simplicity of references comes from the fact that they cannot change; hence, there are no operations on the reference itself. Anything you do to a reference variable is transparently done to the thing on the other end of the reference. Because pointers can change, we now need two different syntaxes: one for manipulating the pointer itself, and one for manipulating the thing it points to.

In other words, while references have no “identity” (they don’t exist on their own and are just aliases for other things), pointers are objects in their own right, and thus we need some way of manipulating them, as opposed to the thing they point to. All of the complexity of pointers springs from this duality: we always have to be clear about what we are doing; are we manipulating the pointer, or the object it points to?

Semantically, pointers work by storing addresses. Every (lvalue) object in our program exists somewhere in the computer’s memory; every location in the computer’s memory has an address, a number. A pointer stores an address of another thing. “Dereferencing” the pointer means going to the address it contains.

Pointer syntax:

Type Description
T* “Pointer to T”
T& “Reference to T”

Expression Description
*p Get object pointed to by p (look at the addr. in p)
&v Get pointer to object v (get the addr. of v)

The connection between the two is that they undo each other: if p == &v, then *p is exactly v. Dereferencing the address of an object gets you the object itself.

Thus, for expressions, * and & are kind of like opposites. & adds a layer of “pointer to-ness” while * removes a layer.

Although pointers are often introduced with dynamic memory, you don’t actually have to do that. Just like with a reference, you can get a pointer to any lvalue:

int  x = 1;
int* y = &x; // y points to x

x++;         // Increments x
(*y)++;      // Also increments x

(Note that ++ has higher precedence than *, so the parentheses are necessary!)

As with references, you can get a pointer to an element of an array, vector, or string:

int         arr[] = {1, 2, 3};
vector<int> vec   = {2, 3, 4};
string      str   = "Hello";

int*  ap = &arr[1]; 
int*  vp = &vec.front(); // Same as vec.at(0)
char* cp = &str[3];

Once again, with strings and vectors, because elements can be removed, you have to be careful to make sure that the pointers still point to something that exists! A pointer that points to a non-existent object is called dangling and using it results in undefined behavior!

If we write ap == vp, will this be true or false? What are we asking? We are asking whether ap and vp point to the same object; we are not asking anything about the values pointed to by ap and vp. Similarly, if I do ap = vp;, what happens? Do any of the values in the array or vector change? No: only the pointer ap changes what it is pointing to. The most important thing about pointers is to get yourself clear on when we are talking about the pointers themselves, and when we are talking about the things pointed to. If you have some pointer variables, and there aren’t any *s in the expression, then you are talking about the pointers. If there are *s then you are probably talking about the objects pointed to.
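
A small sketch of the difference, reusing the arr/vec example from above:

#include <iostream>
#include <vector>
using namespace std;

int main() {
    int         arr[] = {1, 2, 3};
    vector<int> vec   = {2, 3, 4};

    int* ap = &arr[1];
    int* vp = &vec.front();

    cout << (ap == vp) << endl; // 0 (false): they point to different objects
    ap = vp;                    // only the pointer ap changes; arr and vec are untouched
    *ap = 10;                   // now we change the pointed-to object: vec[0] becomes 10
    cout << vec[0] << endl;     // 10
    return 0;
}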

Pointers to array/vector elements

If we have a pointer into an array, vector, or string, we can do some interesting things with it:

vector<int> vec = {1, 2, 3, 1};
int* p1 = &vec[0];
int* p2 = &vec[3];

Consider the following:

 p1 == p2     // True or false?
*p1 == *p2    // True or false?
 p1 <  p2     // True or false?

p1 + 3        // What does this mean?
*(p1 + 3)     // What about this?
p1[3]         // This?
p1+3 == p2    // True or false?
p1++;         // What happens if we do this?

Pointers into an array can have arithmetic done on them (adding/subtracting integers), which causes them to move around within the array. Adding 1 moves the pointer one element forward within the array. Comparisons between pointers effectively compare the indexes within the array that they point to. Note that none of the above operations change the elements in the array!
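
Here is a sketch that spells out the answers to the questions above as assertions:

#include <cassert>
#include <vector>
using std::vector;

int main() {
    vector<int> vec = {1, 2, 3, 1};
    int* p1 = &vec[0];
    int* p2 = &vec[3];

    assert(p1 != p2);        // different elements, so different addresses
    assert(*p1 == *p2);      // but both elements contain 1
    assert(p1 < p2);         // p1 points to an earlier element

    assert(p1 + 3 == p2);    // p1 + 3 points three elements forward
    assert(*(p1 + 3) == 1);  // dereferencing it gives vec[3]
    assert(p1[3] == 1);      // p1[3] is just another way to write *(p1 + 3)

    p1++;                    // p1 now points to vec[1]; the vector itself is unchanged
    assert(*p1 == 2);
    return 0;
}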

Pointers to structure and class instances

There are a few extra things we can do with pointers if we have a struct or a class. Here’s an example struct:

struct thing {
  int a;
  char* b;
  string s;
};
char c = '?';
thing t1 = {1, &c, "Hello"};
thing* tp = &t1;

How do I refer to each of the members of t1, through the pointer tp?

(*tp).a     // == 1
(*tp).b     // == pointer to c
(*tp).s     // == "Hello"

This is a bit cumbersome to type out, so there’s a shortcut:

tp->a
tp->b
tp->s

What if I want to access the actual char pointed to by tp->b?

*((*tp).b)   // == '?'
*(tp->b)     // == '?'

We can also get a pointer to a member of the structure:

int* ip  = &(t1.a);
int* ip2 = &(tp->a); // Same thing: both point to t1.a

Finally, nothing stops us from having multiple pointers to the same structure:

thing* tp2 = tp; 

tp == tp2    // True
tp->a = 2;   // Also changes tp2->a

All of this applies equally to classes, and to methods. If we have a class

class thing {
  public:
    void run() {
      cout << "WHATUP" << endl;
    }
};

and then we have a pointer to an instance of that class

thing* t = ... ;

Then we can use either (*t).run() or t->run() to dereference t and call the run method:

(*t).run();
t->run();    // Same thing

This semester, we’ll use a lot of pointers, but they will usually be pointers to structs or classes, so we’ll use -> a lot. We’ll almost never use &, because we generally won’t need to get pointers to existing objects. We’ll get our pointers by creating objects dynamically, via new. (Remember that new T returns a T*, a pointer to a T.)

The null pointer

nullptr is the value you should use for the null pointer (the pointer that points to nothing at all), not NULL. The null pointer is unique: no other, non-null pointer is == to it, and it is == only to other null pointers. Thus, you can use nullptr to signal that a pointer doesn’t point to anything at all. This gives pointers a bit of power that normal “values” don’t have. E.g., if we have an int variable, there is no way to say that it contains “nothing”; it always contains some int value. But if we have an int pointer, then there are two possibilities: it points to some int, or it is nullptr and points to nothing at all.
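
A quick sketch of using nullptr this way:

#include <iostream>
using namespace std;

int main() {
    int  x = 5;
    int* p = nullptr;             // p points to nothing at all

    if (p == nullptr)
        cout << "p doesn't point to anything yet" << endl;

    p = &x;                       // now p points to x
    if (p != nullptr)
        cout << *p << endl;       // safe to dereference: prints 5
    return 0;
}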

Dynamic memory allocation

I’ve intentionally separated pointers from dynamic memory because students often assume that whenever you have pointers you must also have dynamic allocation somewhere, but this is not the case. You can get a pointer to anything that “hangs around”. Dynamically allocated objects are particularly convenient for this, because they hang around until we delete them, but it’s not required to use them. Dynamic allocation is useful when we don’t know what or how many objects we will need (or how long they need to live) until our program is running.

Stack vs. heap. Activation records. Heap allocation. Lifetime of objects.

new finds space on the heap, big enough for an object of the given type, and then gives you a pointer to that space. (The array-new finds enough space for multiple objects of the same type.) Because the object is allocated on the heap, it will remain “alive” (taking up space) until we delete it.
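
A minimal sketch of the syntax:

int main() {
    int* p = new int(42);   // reserve space for one int on the heap
    *p = 43;                // use the object through the pointer

    int* a = new int[10];   // array-new: space for 10 ints
    a[0] = 1;

    delete p;               // the object's lifetime ends here
    delete[] a;             // array-new must be matched with delete[]
    return 0;
}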

A common mistake (coming from the assumption that pointers have to be used with dynamic allocation) is to write something like this:

int x = 12;
...

// Now we want a pointer to x
int* p = new int();
p = &x;
// carry on with p

What’s wrong with this? We create a new (dynamic) int, but then immediately discard the pointer to it, replacing it with a pointer to x. This means that we have no way to refer to the heap memory we just reserved, and hence no way to delete it. That memory is lost to our program until our program exits; we have created a memory leak. The correct way to do this is

int* p = &x;
// carry on with p

If the thing you want to get a pointer to already exists, then just initialize your pointer to that; there’s no need to allocate any space. Only use new when you want to reserve some additional space on the heap.

Similarly, just because we are done with p does not mean we need to delete it. In this case, because the thing pointed to by p was not created via new, it should not be destroyed via delete.

An object can only be delete-d once; we saw last time that trying to delete something more than once crashes your program with a “double-free” error. This means that if we allocate something, we need to decide what part of our program “owns” it, and that part is responsible for delete-ing it. A pointer that points to an object that it will eventually delete is called an “owning” pointer. It “owns” the object on the other end, and somewhere, that object will be deleted via that pointer. There should only ever be one owning pointer to an object; as more than one would mean that the object would be deleted more than once. There can, of course, be any number of non-owning pointers to an object, because those won’t delete it when they are done.
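
A small sketch of the idea (the names owner and observer are just for illustration):

#include <iostream>
using namespace std;

struct node { int value; };

int main() {
    node* owner    = new node{42}; // owning pointer: will eventually delete the object
    node* observer = owner;        // non-owning pointer: just looks at the same object

    cout << observer->value << endl; // fine, the object is still alive

    delete owner;        // only the owner deletes; observer must never be used after this
    observer = nullptr;  // make it obvious that observer no longer points anywhere
    return 0;
}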

Assignment 1

The first assignment is a kind of C++/CSci 123 review, but also hopefully gets you thinking about the kind of issues that will be important to us. The assignment is to implement an ordered array. A lot of the operations will be similar to those on the bag structure we implemented earlier, but a few are different.

Because an ordered array has a size, a capacity, and its elements, you will probably need three data members: size, capacity, and a dynamic array of elements (you could also use a vector).

The description of how the ordered array works says that the special value -2147483648 cannot be stored in the array. E.g., if you try to insert it, nothing happens, exists(-2147483648) always returns false, etc. This seems arbitrary, but it’s done so that if you want, you can use -2147483648 as a “special” value internally. E.g., if you store the contents of the ordered array in a (dynamic) array, you could implement remove(e) by finding the location of e and then simply putting a -2147483648 there, to mark it as “deleted”. This method will require you to implement every other method to ignore -2147483648 entries. If you use this method, you don’t need to store the size: the size of the array is just the number of entries that are != -2147483648.

Alternatively, you could implement it the way we implemented the bag, by using the front portion of the array, starting at index 0, for the “used” elements, and the second half for unused elements. Using this method, you must store the size of the array, and remove must shift array elements around (we cannot simply swap to the end, because that would un-sort the array!).
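
A rough sketch of the shifting idea, using a vector<int> to stand in for the array (the name remove_sorted and the use of a free function are just for illustration; they are not the assignment’s required interface):

#include <vector>
using std::vector;

void remove_sorted(vector<int>& data, int e) {
    // Find the position of e (a linear search; binary search also works, since data is sorted).
    for (unsigned i = 0; i < data.size(); ++i) {
        if (data[i] == e) {
            // Shift everything after position i one slot to the left...
            for (unsigned j = i; j + 1 < data.size(); ++j)
                data[j] = data[j + 1];
            data.pop_back(); // ...and drop the now-duplicated last element.
            return;
        }
    }
    // e not found: nothing to do.
}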

The assignment asks you to think about how much time the various operations will take, relative to the size of the array. E.g., as the array grows, which operations will get slower? Which will take the same amount of time? We’ll investigate these ideas further in the next few lectures.

Analyzing algorithms

We’re going to talk about analyzing algorithms: how can we quantify the time taken (or memory used) by a given algorithm? We could run it and measure it, but that doesn’t give us enough information: it only tells us how fast it runs, on one particular input, on one particular computer and compiler. What we want is a more general way of describing the behavior of an algorithm, not in terms of “seconds” or “bytes”, but more abstractly.

In fact, what we are going to consider is not “how long” an algorithm takes, but rather the growth rate of its runtime as the size of the input increases. E.g., for one algorithm, doubling the size of the input might double the runtime; for another, it might quadruple it. Clearly, the former is better than the latter, even though we don’t know anything about the actual runtimes.

Before we can do that, we need to review some math:

Review of summation notation

$$\sum_{i=j}^k f(i)$$

This means: add up all the values of \(f(i)\) for every \(i\) from \(j\) to \(k\) (including \(i = k\)). In other words,

$$\sum_{i=j}^k f(i) = f(j) + f(j+1) + \ldots + f(k-1) + f(k)$$

The simplest summation is just

$$\sum_{i=j}^k 1 = 1 + k - j$$

Because it’s just a big sum, there are some things we can do with it:

Factoring

What is

$$\sum_{i=j}^k 2 f(i)$$

This sum works out to

$$\sum_{i=j}^k 2 f(i) = 2 f(j) + 2 f(j+1) + \ldots + 2 f(k-1) + 2 f(k)$$

which we could factor into

$$2 ( f(j) + f(j+1) + \ldots + f(k-1) + f(k) )$$

but the thing inside the parens is our original sum, so we can say that

$$\sum_{i=j}^k 2 f(i) = 2 \sum_{i=j}^k f(i)$$

That is, for any constant c (and by “constant” we mean anything that does not depend on i), we can factor

$$\sum_{i=j}^k c f(i) = c \sum_{i=j}^k f(i)$$

Splitting

What is

$$\sum_{i=j}^k \left( f(i) + g(i) \right)$$

The sum works out to

$$f(j) + g(j) + f(j+1) + g(j+1) + \ldots + f(k-1) + g(k-1) + f(k) + g(k)$$

but we can rearrange the terms into

$$f(j) + f(j+1) + \ldots + f(k-1) + f(k) + g(j) + g(j+1) + \ldots + g(k-1) + g(k)$$

and this is just

$$(\sum_{i=j}^k f(i)) + (\sum_{i=j}^k g(i))$$

In other words, if the body of a summation is itself a sum, we can split it up into multiple summations.

Removing terms

Finally, we can remove a number of terms from a summation by adjusting the ends, either as

$$\sum_{i=j}^k f(i) = f(j) + \sum_{i=j+1}^k f(i)$$

or as

$$\sum_{i=j}^k f(i) = f(k) + \sum_{i=j}^{k-1} f(i)$$

Of course, we could remove more than one term if that was useful, or remove terms from both the beginning and the end. We just have to make sure we adjust the endpoints properly!

Change of variable

Sometimes it’s useful to adjust the endpoints themselves, by pushing them up or down. In order to do this, we have to replace the summation variable \(i\) with a suitably adjusted \(i’\). For example,

$$\sum_{i=0}^k f(i) = \sum_{i’=1}^{k+1} f(i’ - 1)$$

where \(i = i’ - 1\). (If the presence of the final \(k+1\) term in the new sum was problematic, we could use the term-removal technique to strip it off.)

Basics of complexity analysis

Example 1: largest element

int largest(vector<int> values) {
    assert(values.size() > 0);

    int l = values[0];
    for(int i = 1; i < values.size(); i++)
        l = values[i] > l ? values[i] : l;

    return l;
}

We’re first going to do a fine-grained analysis of the work that this function does, in terms of the number of operations. We’re then going to take that and derive an asymptotic bound for the work, as the size of the input (i.e., values.size()) gets large.

How long does this take to run, for a vector of a given size()? Well, that largely depends on things like how long it takes to compare two ints, to assign to an int variable, to increment an int, and to call values.size().

Note that none of these things depend on the size of the vector. Each of them will take some constant, but unknown, amount of time, regardless of how big a vector we give it, so we can represent them as constants: \(C\) for a comparison, \(A\) for an assignment, \(I\) for an increment, and \(S\) for a call to size().

Now, if values.size() == 1, how long will the program take to run? We only do the work outside the loop, since the loop body never executes.

Giving us a total time of \(2C + 2A + S\). If values.size()\(= n\), then we can figure out the total time taken by looking at how much work is involved in each iteration of the loop: a comparison and an assignment in the body, an increment of i, and another loop test (a call to size() plus a comparison).

So we have a base time of \(2C + 2A + S\), plus a per-iteration time of \(2C + A + I + S\) giving us a total time of

$$2C + 2A + S + n(2C + A + I + S)$$

Note that for all but the smallest vectors, the second term is going to dominate the sum, so much so that we could really say that the total time is approximately proportional to \(n\), the size of the vector. A function like this one, which takes time proportional to the size of its input for “big enough” inputs, is generally said to run in “linear time”, meaning that the runtime grows linearly with the size of the input: double the size of the input, and the runtime (roughly) doubles, too.

Note that this classification (linear time) is true regardless of what computer we run this function on. Different computers will have different constants, thus it makes no sense to ask, in the abstract, how many seconds will largest take to run for some input, but we can abstractly ask, how fast will its runtime increase, relative to an increase in the size of its input? In this case, there is a linear correlation.

We generally care more about this kind of classification of an algorithm, than the raw “run time” sum we computed above. This allows us to classify algorithms by how fast their run time grows, as the input size increases. If algorithm A grows more slowly than algorithm B we can say that A is “better” than B.

Example 2: finding an element

bool find(vector<int> values, int v) {
    for(int i = 0; i < values.size(); ++i)
        if(values[i] == v)
            return true;

    return false;
}

We can analyze the work of this function in terms of increments, equality comparisons, and calls to values.size() (there are no simple assignments in this function). Here, however, we run into a problem: the number of times that the loop runs depends on the values inside the vector. That is, even for two vectors with the same size, the function might take dramatically different amounts of time. In this case, instead of just asking for the work of the function, we analyze the best and worst cases. (The function from the previous example has its best and worst cases identical, so there’s no point in looking at them separately.)

In order to determine what the best/worst cases are, we need to figure out how to “rig” the contents of the vector so that a) the loop exits as early as possible (best case) and b) the loop runs all the way to its natural end (worst case). (Note that the one thing we cannot do is choose a particular value for \(n\), the size of the vector. You cannot simply say “the best case is when the vector is empty”, as that would be the best case for any algorithm, and thus doesn’t give us any useful information. Remember: you can never simply pick a value for \(n\)!)

So our best case occurs when the first element of the vector == v, and the worst case occurs when none of the elements of the vector == v. Note that the worst case always occurs when the function returns false.

                        Best case   Worst case
Calls to size(): S      1           \(n+1\)
Increments: I           0           \(n\)
Comparisons (==): C     1           \(n\)

So the total work of the function would be

$$S + C$$

in the best case and

$$S(n+1) + nI + nC$$

in the worst case.

Often you’ll need to determine whether a function has different best/worst cases, or whether they are the same. Look for loops that have the possibility of an early exit, either by return or by break.

Example 3: quadratic time

As an example, take a look at this function, which tries to determine whether any pair of elements, one from each of two vectors, multiplies to equal a particular value:

bool has_product(vector<int> a, vector<int> b, int p) {

    for(int v1 : a)
        for(int v2 : b) 
            if(v1 * v2 == p)
              return true;                

    return false;
}

Once again, this function has both best and worst cases. The best case is simply when the first elements of both vectors multiply to p. The worst case is more interesting, and occurs when the product p cannot be formed by multiplying any two elements. Assuming that a and b have the same size, \(n\): the inner loop runs \(n\) times for each of the \(n\) iterations of the outer loop, so in the worst case we do roughly \(n^2\) multiplications and comparisons.

The runtime of this algorithm increases proportionally to the square of the size of its input. That is, if we double the size of the vectors, the runtime will roughly quadruple. This is much worse than a linear-time algorithm, where doubling the input size only doubles the runtime.

An algorithm like this is said to run in polynomial time, indicating that its runtime is proportional to \(n^p\) for some \(p \ge 2\), where \(n\) is the size of its input. Obviously, \(n^3\) is worse than \(n^2\), and so forth.

Note that our analysis assumes that the input is “large”. For any problem, there will be some small inputs where a theoretically worse algorithm may perform better. The reason why is in the word “proportionally”. If the runtime is “proportional” to \(n^2\) then that means there is some unknown constant \(k\) such that the runtime \(t \approx k n^2\). But \(k\) varies depending on the implementation of the algorithm, so it’s possible for a polynomial algorithm to have a small \(k\), while the corresponding linear algorithm has a larger \(k\). In this case, the “slower” polynomial algorithm will be faster, for inputs up to some size. But there is always going to be a cutoff, a point where the linear version is always faster.

Big-O Notation

We refer to the function that an algorithm’s runtime is proportional to as its order and say that an algorithm is \(O(f(n))\) to mean that its order is \(f(n)\) (i.e., is roughly proportional to \(f(n)\)). This is colloquially known as “big-O” notation.

An algorithm is of order \(O(f(n))\) if there exist constants \(k\) and \(n_0\) such that for any input of size \(n > n_0\), the runtime of the algorithm is no more than \(k f(n)\).

Or, mathematically, a function \(f(n)\ \in O(g(n))\) iff

There exist \(k > 0\) and \(n_0\) such that for all \(n > n_0\):

$$f(n) \le k g(n)$$

(\(n_0\) captures the notion of our analysis applying to “big enough” inputs. Only if the size of the input is larger than some minimum does the proportionality apply. Similarly, \(k\) captures the proportionality.) We are assuming that both \(f(n)\) and \(g(n)\) are strictly positive.

If you want a calculus-style definition, we can also use a limit:

$$f(n) \in O(g(n))\; \text{iff}\; \lim_{n \rightarrow \infty} \frac{f(n)}{g(n)} \in [0, \infty)$$

(That is, the limit must exist and be finite.)
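
For example, using the limit form (the same fact is shown with \(k\) and \(n_0\) below):

$$\lim_{n \rightarrow \infty} \frac{100 n + n^4}{n^4} = 1$$

The limit is finite, so \(100 n + n^4 \in O(n^4)\).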

Intuitively, you should think of \(f(n) \in O(g(n))\) as meaning “\(f(n) \le K g(n)\) for some K and really big n”. (There are other “big-letter” notations which correspond to ≥, <, equal-to, etc.)

Summations of Big-O

There is a summation identity that will make working with big-O notation much easier: if \(f(i) \in O(i^p)\), then

$$\sum_{i=1}^n f(i) \in O(n^{p+1})$$

This basically says that if you sum up a function which is \(O(n^p)\) then the resulting function will be \(O(n^{p+1})\). \(n^2\) becomes \(n^3\) and so forth.
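
For example, summing up the first \(n\) integers (which are \(O(i^1)\)):

$$\sum_{i=1}^n i = \frac{n(n+1)}{2} \in O(n^2)$$

which matches the identity with \(p = 1\).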

How Big-O works

Analyze how big-O works: what can we conclude from the definition of \(O\)?

Well, for one thing:

$$a_1 n^p + a_2 n^{p-1} + \ldots + a_p n^{1} + a_{p+1} \in O(n^p)$$

that is, we can drop all but the highest degree term in a polynomial.

For example, suppose we want to show that

$$100 n + n^4 \in O(n^4)$$

To show that this is true, all we have to do is figure out appropriate \(k\) and \(n_0\) to make the definition true. If we let \(k = 100, n_0 = 2\) then we have

$$100 n + n^4 \le 100 n^4 \quad \text{for all}\; n > n_0$$

Note that if we try to show that \(100 n + n^4 \in O(n^2)\) we will fail; there is no \(k, n_0\) that will make the definition true.

Some other properties:

$$c f(n) \in O(f(n)) \quad\mathrm{if}\quad c \ne 0$$

This says that we can ignore constant multiples (makes sense, because we can always fold them into \(k\)).

$$\text{if}\;f(n) \in O(F(n)), \quad g(n) \in O(G(n))$$ $$\text{then}\; f(n) + g(n) \in O(F(n) + G(n))$$

If we add two functions together, then the order of the sum is the sum of their orders; but because in a sum we can drop all but the fastest-growing term, we can rewrite this as

$$f(n) + g(n) \in O(\max(F(n), G(n)))$$

On the other hand, if we take a product:

$$\text{if}\;f(n) \in O(F(n)), \quad g(n) \in O(G(n))$$ $$\text{then}\; f(n) g(n) \in O(F(n) G(n))$$

Note that this means that if we have some algorithm which we know is of order \(O(F(n))\), and we run it \(n\) times, then the result is of order \(O(n F(n))\).
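
A quick sketch of this rule (the functions here are hypothetical, just for illustration): sum_all does \(O(n)\) work per call, and quadratic calls it \(n\) times, so quadratic is \(O(n \cdot n) = O(n^2)\).

#include <vector>
using std::vector;

// One call does O(n) work: it walks the whole vector once.
long sum_all(const vector<int>& v) {
    long total = 0;
    for (int x : v)     // n iterations
        total += x;
    return total;
}

long quadratic(const vector<int>& v) {
    long total = 0;
    for (unsigned i = 0; i < v.size(); ++i)  // runs n times...
        total += sum_all(v);                 // ...and each call is O(n)
    return total;
}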

Because big-O is a kind of generalized \(\le\), there is something like the transitive property for it:

$$\mathrm{if}\quad f(n) \in O(g(n))\quad\mathrm{and}\quad g(n) \in O(h(n))\quad\mathrm{then}\quad f(n) \in O(h(n))$$

Finally, adding a constant has no effect on a function’s order (because a constant is the lowest order, and hence always drops off).

$$f(n) \pm c \in O(f(n))$$

Question: which grows faster, \(n \log n\) or \(n^2\)? The multiplication rule means that we can factor out a common \(n\), thus we are left with the question of which grows faster, \(n\) or \(\log n\). This should hopefully be easier.

Which grows faster, \(n^{100}\) or \(2^n\)?

Complexity Classes

Complexity class Name
\(O(1)\) Constant
\(O(\log n)\) Logarithmic
\(O(n)\) Linear
\(O(n \log n)\) N-log-N (“linearithmic”)
\(O(n^2)\) Quadratic
\(O(n^3)\) Cubic
\(O(n^p)\) Polynomial
\(O(2^n)\) Exponential
\(O(n!)\) Factorial