Basic I/O, and the string and float types

Review of last time

Write up the operator table, go over any that we didn’t get to last time.

| Operator(s) | Name | Arity | Pre/Post | Assoc. |
|---|---|---|---|---|
| `- +` | Unary minus/plus | Unary | Pre. | N/A |
| `!` | Logical NOT | Unary | Pre. | N/A |
| `* / %` | Times, div, mod | Bin. | | Left |
| `+ -` | Plus, minus | Bin. | | Left |
| `<< >>` | Ins/Ext, Shift | Bin. | | Left |
| `< > <= >=` | Comparison | Bin. | | Left |
| `!= ==` | (In)equality | Bin. | | Left |
| `&&` | Logical AND | Bin. | | Left |
| `\|\|` | Logical OR | Bin. | | Left |

Assignment operators

The simplest assignment operator is =:

int x;
x = 1;      // Value stored in x becomes 1
x = x + 1;  // Value stored in x becomes 2

int y;
cin >> y;
x = y; // Value stored in x becomes whatever the user entered

Operations like x = x + 1 are common enough that there are shortcut assignment operators for them: x += 1 is exactly the same as x = x + 1. The shortcut operators are

x += y;    // Same as x = x + y
x -= y;    // x = x - y
x *= y;    // x = x * y
x /= y;    // x = x / y

Like all assignment operators, these are right associative and bind more loosely than the arithmetic operators, so the entire right-hand side is computed before the assignment is performed. This means that

x -= 1 + 2;

is interpreted as x = x - (1 + 2) and not as x = x - 1 + 2. Don’t forget the implicit parentheses!

x += 1 and x -= 1 are common enough that there are even shortcuts available for them:

++x; // Value stored in x increases by one
--x; // Value stored in x decreases by one

These are called the increment and decrement operators. They are available in both prefix (++x) and postfix (x++) forms, although the difference is only apparent when they are used in larger expressions, not as a single statement. If you’re curious, try running this code:

int x = 1;
cout << "++x = " << ++x << endl;
cout << "x   = " <<   x << endl;

int y = 1;
cout << "y++ = " << y++ << endl;
cout << "y   = " <<   y << endl;

and see what it prints.

Today

Some new types, more than just int: float, string, and char.
Lower-level I/O: get() for reading single characters, getline() for reading an entire line, ignore() for skipping over some characters.
Operations on strings (string-style statements and expressions)

More about boxes (variables)

The proper name for a box is “variable”. It’s unfortunate that we’ve “borrowed” the name from algebra, because just as = in C++ is very different from = in algebra, a “variable” in C++ is very different from an algebraic variable. As we’ve seen, a variable holds a value, but that value can change over the lifetime of our program. (As opposed to algebra, where its value may be unknown, but is always the same in a particular system of equations.)

One thing C++ does have in common with algebra is that the type of a variable remains the same over its lifetime. Thus, we can talk about the type of a variable/box or the type of the value within it interchangeably; they are guaranteed to be the same.

Speaking of the “lifetime” of a variable, how long does a variable live? (Note, as we’ll see later, this isn’t necessarily the same question as how long the value in it lives!) In this program

int main() {
  // ...some stuff
  int a;
  // ...more stuff
  return 0;
}

it is an error to try to access a in the “some stuff” section. We call the region of the program’s source code where the name of a variable can be used the scope of that variable. There’s a very basic rule that governs the scope of variables:

The scope of a variable extends from its declaration to the end of the current block.

Blocks

What’s a block? A block is a kind of “compound statement” consisting of zero-or-more statements enclosed in curly-braces. This means that the “body” of main is in fact a block, but we can use a block anywhere where a single statement is expected:

int main() {
  cout << "Hello";
  {
    cout << "World";
  }
  cout << "Goodbye";
  return 0;
}

Blocks don’t change the order in which things happen; we still move from one statement to the next, moving inside and out of blocks as necessary. Note that the empty block is a thing, and it’s sometimes useful:

{ }

You can think of the empty block as a statement that doesn’t do anything. Sometimes C++ will tell you you must put a statement somewhere; if you don’t want anything to happen, you can use an empty block.

Variable scoping and blocks

This means that variables are actually local to the block that contains their declaration. That is, a variable’s box is accessible from the point in the program where we say int a (or whatever) to the closing curly brace that ends the block it’s in. For example, in this program:

int main() {
  int a = 1;
  {
    int b = 2;
    ...
  }
  int b = 3;
  ...
  return 0;
}

Normally it’s an error to have two variables with the same name in the same scope, but here the two b’s are in different scopes. Let’s see how this works out (annotate the text with where each scope starts and ends).

In particular, the “inner b”’s scope ends before the outer b comes into existence, so there is no conflict.

There’s one more subtle feature of variables and scopes. If we do this, it’s an error (only one declaration of a name per-scope allowed):

{
  int a = 5;
  ...
  int a = 2;
  ...
}

but if we wrap the second declaration in a block, there’s no error!

{
  int a = 5;
  ...
  {
    int a = 2;
  }
  ... 
}

What’s going on here? Let’s add some couts and see what things look like inside:

#include <iostream>
using namespace std;

int main() {
  int a = 5;

  cout << "a outside = " << a << endl;
  {
    int a = 2;
    cout << "a inside = " << a << endl;
  }
  cout << "a outside = " << a << endl;

  return 0;
}

What’s happening is that the inner declaration of a “shadows” the outer one. This means that the outer a is temporarily hidden, a new box named a is created, and within the inner block, a refers to this new box. When we reach the end of this block, the new a goes away and the old a (which never really went anywhere) is available again.

It’s considered very bad form to intentionally shadow names like this, because it’s confusing. You can’t tell at a glance what a particular name is referring to.

The important thing to remember is that variables “die” at the end of their enclosing blocks. Right now we’re not using blocks, but later on we will, so you’ll need to remember that if you create a variable within a block, you won’t be able to use it outside the block.

When we declare a variable inside a function definition or block we call it a local variable and say that it has local scope. Remember that local scopes (except for nested blocks) are independent of each other.

There is another kind of scope that lets us create variables which are shared by all definitions, whose scope in fact extends over the entire file. These are global variables.

To create a global variable, simply put its declaration outside of main, before it:

int x = 1;

int main() {
  cout << x; // x is accessible here

  return 0;
}

In fact, for a global variable, there is a single box labeled x for the entire program. (If you create an x within a function, it will shadow the global x, just as a variable in an inner block shadows one in an outer block.) Global variables can be used not just by main but by other definitions that we create.

What is the scope of a global variable? It extends from the point of its declaration to the end of the file. (Later on, when we look at multi-file programs, we’ll see that this means that global variables from one file are not normally accessible from other files.) Note that this means, just like local variables, that you must declare a global variable before you use it anywhere.

What is the lifetime of a global variable’s value? It is the entire duration of your program: global variables come into existence even before main starts, and don’t go away until after main returns. This is necessary because otherwise it would not be safe for main to use global variables. E.g., imagine if we had

int version = 113;

int main() {
  cout << "My Program (c) " << version << endl;
}

If version is not properly set up before main starts, it would not be safe for main to print it out!

To summarize, we now have two kinds of scopes:

Local. Begins at declaration, ends at the end of the block.
Global. Begins at declaration, ends at the end of the file.

More types of boxes

Just to reiterate, what does this do:

int weight = 2;

(Creates a box called weight, that can hold an int, and sets the value in it to 2.)

The only two types we’ve seen so far are int (for integers) and bool (for true/false). Here is the next type: float is the type of “floating point values”, i.e., numbers that have a decimal point in them. When we write 1 in our program, C++ knows that it’s an int, but if we write 1.0 then C++ knows that it’s a float.

If we want our BMI program to not round things off, all we need to do is change the boxes to floats. Note that we don’t change the int before main, because that determines what kind of output main will give.

(Demonstrate)

Note that we can also enter values with a decimal point now, when prompted for height/weight. cin now knows that we are expecting a floating point value, and will accept “1.0” as an input. (If we were extracting into an int, it would read 1 and then stop.)

Question: what happens if we leave height/weight as int, and just change bmi to being a float? Is that sufficient to make the result into a decimal value?

(Demonstrate)

No, in fact, it is not. This is because the calculation for BMI is still composed entirely of ints, so C++ will still round it off. It doesn’t know where the result of that calculation will be stored until after it has completed it and rounded off. If we want to make the result actually be a float, we have to ensure that the intermediate results are floats also.

One easy way to make this happen is to write

703.0 * weight / (height * height)

Just adding a decimal point to the constant causes everything else to be upgraded to a float.

This is the result of something called “promotion”. C++ has a hierarchy of number types. For now, all you need to know is that if you mix floats and ints, the result will be a float. For example, 3 / 2 is totally int and will be rounded off; 3 / 2.0 has one float in it, so both sides will be implicitly “promoted”, as if you had written 3.0 / 2.0.

int is promoted to float because float is “bigger”. int can only hold whole numbers, but float can hold both whole numbers and fractions. So we can always store an int-value in a float box without losing anything, but the opposite is not necessarily true.

Some weirdness with floats

The name “floating point” comes from how floats are stored in the computer: as a certain number of significant digits (after around 7, errors start creeping in), and then, separately, the position of the decimal point as an exponent. This means that we can represent very large or very small numbers just by moving the decimal point very far to the right or the left (demonstrate), but we can’t really store more than 7-8 actual digits before information starts to get lost.

E.g., try doing this:

float f = 1.234567891233455;
cout << f;

you’ll find that it’s been “shortened” to 1.23457; the remaining digits were lost. C++ will always chop off the least significant digits, because they represent the part of the number closest to 0. Notice that this happens when we try to add a very small and very large number together:

float f = 123456.0 + 0.0000123456;
cout << f;

The output is 123456; the smaller digits were discarded completely.

There are other quirks to dealing with floats, but they are beyond the scope (ha!) of this class. If you’re going to use them in some kind of scientific application, you’ll want to be aware of all their weird quirks and limitations.

Expressions vs. Statements

We’ve seen statements and expressions, so let’s take a look at how things work when they are combined:

cout << 1 + 2;

1 + 2 is an expression, and the whole thing is a statement, but what are the steps within the statement that will be taken?

Evaluate the expression 1 + 2 to get 3.
Execute the statement cout << 3.

It’s important to distinguish expressions from statements, because expressions actually come up a lot, probably even more than statements!

What kinds of expressions are there? Here are the ones we’ve seen so far:

"text" – A string literal is an expression, that evaluates to itself. This is true of any kind of literal. 1 is an int literal that evaluates to itself.
(..) – Any expression in parentheses is also an expression. This works like in arithmetic, to override the normal precedence rules about what gets done first.
Variables – The name of a variable is itself a kind of expression, which evaluates to the value stored in the variable.
1 + 2 – We’ve seen all the operators, and you can use all of them to make larger expressions.
int(1.2) – This is a conversion. Normally, if we try to do something like
```
 int x = 1.2;
```
C++ will warn us about the fact that this will throw away information. But we can tell it that we intentionally want to do this by writing a conversion: int(1.2) means convert 1.2 to an int by dropping the fractional part, giving 1. (Note that this chops rather than rounds to the nearest whole number: int(1.9) is also 1.) This is probably the most common conversion you’ll use.

The interesting thing about expressions is how we can combine them to build larger expressions. With statements, the only ways we have of putting them together is to string them together in a straight line, or put some inside a block. Either way, they still get executed from beginning to end. With an expression, we can build it out of subexpressions. This means that in something like _ + _, both the left and right sides can be other expressions. We can build quite complex expressions by starting with simple expressions (literals, variables) and then joining them with operators into larger and larger expressions.

The only reason I mention this is so that you don’t think the expressions I show you are somehow special. Pretty much any place where I use a variable name, or a literal value, you could put in an expression of the same type.

Expressions have types, just like variables. The type of an expression generally depends on the operation, and on the inputs to it. For example, we’ve seen that the type of 1 / 2 is int, but the type of 1.0 / 2 is float. The conversion expressions can be used to manually convert between types, and some conversions are done in the background, implicitly (as in float a = 1;) But sometimes the types don’t make sense, and that’s when you’ll get an error.

Strings and characters

We mentioned that "..." is called a string literal and that C++ just stores its contents verbatim (hence the name, literal). There’s not much we can do with string literals, besides print them; to do more, we need the string type. This type is not built in; we have to bring it in via

#include <string>

This lets us do

string name = "Your Name Here";

i.e., string is another kind of variable (box) we can create. What can we do with strings? (Note that I don’t expect you to memorize all these, though you should write them down. I’m just trying to show you all the interesting things you can do with strings.) (All of these should be drawn out, using the example name = "Andy Clifton").

Expression: We can concatenate them together:
```
 string first = "Andy";
 string last = "Clifton";
 string name = first + " " + last;
```
(Unfortunately, due to a quirk in the language, we can’t write just "Andy" + " " + "Clifton"; at least one of the things involved must be a “real” string, either by being a string-typed box, or by using string("Andy").)

Note that concatenation is an expression; it doesn’t change the contents of first or last, it doesn’t actually do anything with the resulting string unless you put it inside a statement (or variable declaration, like above). Expressions don’t do anything unless you put them inside a statement (you can almost always put an expression inside a cout << statement to see what value it would compute).
Statement: We can append some text to the end of an existing string:
```
 string name = "Andy";
 name.append(" Clifton");
 // name = "Andy Clifton";
```
Note that while concatenation is an expression (it gives back a string), appending is a statement (it does something, modifies an existing string). If you want to, you can think of a.append(b) as being like
```
 a = a + b;
```
There’s no statement to “prepend” some text onto the beginning of a string, but we can fake it with +:
```
 name = "Mr. or Ms. " + name;
```
Expression: We can get the length of a string:
```
 string name = "Andy Clifton";
 cout << name.length(); // Prints 12
```
(What is name.length()? an expression.)

The type of .length() is int (or something int-like)

The length of a string is the number of characters in it. Note that spaces count as characters!

Note that the length of a string is an int; so what goes in (a string) is not necessarily the same as what comes out (an int).

(.size() is a synonym for .length().)
Expression: We can search for some text:
```
 string name = "Andy Clifton";
 cout << name.find("on");
```
Positions within a string start at 0, so this will print 10. If the text can’t be found at all, .find will return a value (much!) larger than the length of the string.

(Draw out the string and number the positions.)

find searches starting at the beginning, and gives you the position of the first match. There’s also rfind, which gives you the position of the last match.

A variant of find/rfind lets you tell C++ where to start searching; useful if you don’t want to start at the very beginning or end:
```
 int x = name.find("on", 5); // Start at position 5
```
Note that you can also use find/rfind to search for single characters.

Like .length(), .find() and .rfind() give back ints.

If the thing you’re looking for cannot be found in the string, then find and rfind both give back the magical value string::npos, which is short for “no position”. You can use == or != to compare find’s result to this; e.g., this expression is true when "hello" was found:
```
 s.find("hello") != string::npos
```
Statement: Now that I’ve mentioned how positions in strings work, we can insert another string at the specified position:
```
 name.insert(0, "Instructor ");
```
Statement: We can replace a portion of a string with another string:
```
 name.replace(0,4,"Andylicious"); // My nickname
```
(The replacement starts at position 0, and is of length 4.)
Statement: We can delete a portion of a string completely, by giving a starting position and a length.
```
 name.erase(3,4); // Erase 4 characters, starting at pos. 3
```
Expression: Finally, we can extract a “substring”, a portion of a string, by giving a starting position and a length:
```
 string name = "Andy Clifton";
 string last = name.substr(5,7); // last = "Clifton"
```
Note that substr takes a starting position and a length. You might expect that it would take a starting and ending position, but it doesn’t.
Expression: A variant of substr only takes the starting position, and automatically goes to the end of the string. Thus, this is equivalent to the previous:
```
 string last = name.substr(5);
```
substr always gives you back a string.
Expression: If you need to, you can construct a string that consists of just one character, repeated some number of times:
```
 string(10, '*')
```
will give us the string "**********"

Expression: Finally, all of the comparison operators work on strings:

 string s1 = "Hello";
 string s2 = "Goodbye";

 bool result = s1 < s2; // Is this true or false? What about ==, != , etc.

You might expect that string(10) would convert the int 10 into the string "10" but that doesn’t work. You have to use to_string(10) to convert ints (and floats, and doubles) into strings.

For a complete list of all the things strings can do, see http://www.cplusplus.com/reference/string/string/.

Although I said that string literals treat everything between the quotation marks literally, that’s not strictly true. A number of escapes are allowed to encode special characters. For example, if you want to write a string that has double-quotes in it, you’d do

string line = "The word is \"potato\"!";

Within a string literal, \" will not end the literal, but rather will insert a single double-quote. The backslash (NOT a forward slash) “escapes” the double-quote. Other escapes include:

"\n" – The equivalent to endl, but can be placed anywhere in the string.
"\\" – An actual backslash
"\t" – A tab
"\a" – A “bell” (if printed, either a tone will sound or something else will be done to get your attention).
"\b" – Backspace, deletes the character before it.
"\"" – An actual double quote
"\'" – A single-quote

One final note: the empty string "" is a thing! It has length 0, and contains no characters.

Converting between strings and ints

Suppose we have a string that consists of a number, e.g., "12". How can we read this into an int? We’ll see a more general way later, but for now, we can use stoi:

int i = stoi("23"); // i = 23

Expression: stoi(s) gives the int that is represented by the numeric digits in s.

stoi is part of #include <string>.

The string given must at least start with something that looks like a number. E.g., stoi("101 dalmatians") will evaluate to 101. If no number can be recognized, stoi will produce an error.

We can convert an int (or float, double, bool, etc.) to a string using to_string:

string s = to_string(12); // s == "12"

What are strings made of? Characters

The individual letters, numbers, and symbols that make up a string are called characters, and there’s a type for them, too. It’s char:

char c = 65; // c = capital A

As this shows, chars can be numbers, but they can also be symbols. A “char literal” is the symbol form, between single quotes. This is equivalent to the above:

char c = 'A';

A single symbol, written between single-quotes, is a char literal, whose value is the ASCII value of that symbol. A semi-complete ASCII chart can be found here: https://en.wikipedia.org/wiki/ASCII#Printable_characters.

(Note that all of the above string escapes can also be used as single-character literals: \n is the character corresponding to pressing Enter. The escape \' is allowed so you can represent the single-quote character.)

If you print a char, you’ll get the symbol version, not the numeric version:

char c = 65;
cout << c;

This will print A. If you want to see the numeric code, you have to convert it to an int:

cout << (int)('A'); // Prints 65

Similarly, since characters are numbers, you can do math on them, provided you remember to convert them back into characters:

cout << (char)('A' + 1); // Prints B

You can compare characters using ==, <, etc. although for < and friends, the order is a little weird:

'A' < 'Z' < 'a' < 'z'
'0' < '9'

That is, uppercase and lowercase letters are “in order”, but all uppercase characters are “less than” all lowercase letters. Similarly, numeric characters compare “in order”.

If we have a string, we can use at() to access the individual characters at particular positions:

string name = "Andy Clifton";
cout << name.at(5); // Prints 'C'

Expression: s.at(n) gives the char in the string s at position (int) n. While s.at is an expression, it can also be used on the left side of an assignment statement, which lets us replace a single character:

name.at(5) = 'B'; // my cousin, Andy Blifton

How does this work, doesn’t the thing on the left of an = have to be a variable (i.e., a box in which to store something)? Well yes, but at() actually looks inside the string and gives us the char-sized box that’s holding the particular character we specify.

Now, suppose I had a string whose contents I did not know, and I want to find the word “peach” in it and replace the lower-case ‘p’ with an upper-case one. How would I do this?

string text = ...;
???

Obviously, we’re going to have to use text.at(...):

string text = ...;
text.at(...) = 'P';

what should we give to at? (Use find to find the string we are looking for).

The dot operator

The dot before .at is actually another operator:

name.substr(0,5);
name.length();
name.erase(0,3).append("Potato");

The dot operator is special in that its left operand must be a variable (box), and its right operand must name a member of that box. String-type boxes have members substr, length, append, etc. The dot operator associates to the left, so

name.erase(0,3).append("Potato"); // means
(name.erase(0,3)).append("Potato");

Note that this works because name.erase() evaluates to name, which is still a variable (box).

IO for strings and characters

How can we read a single character from cin? If you do

char c;
cin >> c;

it will read a single character (not its numeric code). But note that cin’s normal behavior, where it will skip over spaces, will still apply. What if you want to read the next character, regardless of whether it’s a letter, a symbol, a space, or whatever (even Enter is a character)? For this you need get:

char c;
cin.get(c);

At this point, you can press any key that will generate a character (some, like Ctrl, do not, while others, like Insert, generate more than one!) and it will be stored in c.

We can use this to construct a program which reads a single character and then prints out its code:

#include <iostream>
using namespace std;

int main() {
  cout << "Enter a character: ";

  char c;
  cin.get(c);

  cout << "The ASCII code for '" << c << "' is " << (int)(c) << endl;

  return 0;
}

There is some surprising behavior around get, however. Suppose I do this:

int i;
cin >> i;
char c;
cin.get(c);

What happens if we type something like “221B Baker St.” at the first prompt?

cin uses what’s called buffered input, which means that it reads what you type into a buffer. Subsequent accesses will continue to use the buffer, until it is empty; then it will prompt you to enter more text. So in this case, i will be set to 221 but we will not be prompted to enter a character for c! It will be set to the next character in the buffer, which is ‘B’.

If you want to throw away whatever is left in the buffer (e.g., to make sure that the get prompts the user and reads a single character) use ignore:

int i;
cin >> i;
cin.ignore(10000, '\n');
char c;
cin.get(c);

What this says to do is ignore up to 10000 characters from the buffer, but stop if we see a \n. The \n will be left over after we press Enter after typing the int, so this will remove it and then be ready to receive new input.

Another thing we can do with cin is “peek” one character ahead. We do this by reading it (with get) and then putting it back:

char c;
cin.get(c);
// Examine c...
cin.putback(c);

What if we want to read an entire line of text, all the way up to the point where the user pressed Enter, including spaces? We can do this with getline, which comes from #include <string>:

string s;
getline(cin,s);

This will store the entire line entered by the user (not including the Enter character at the end, however) into s, where we can then use all our string manipulation tricks on it.

Note that the above about throwing away whatever is left in the buffer applies. If you do:

int i; 
string s;
cin >> i;

getline(cin, s);

then C++ will read an int, leave the \n in the buffer, and then when getline looks at it, it will see (and “get”) a blank line! If you want getline to wait for a full line, again, we have to ignore:

int i; 
string s;
cin >> i;
cin.ignore(10000, '\n');
getline(cin, s);