Let’s look at an example, to motivate what we’re going to talk about today. Suppose we want to build an application that deals with employee records. What kind of information will we need to maintain about employees?
ID number
Name
Years of service
Position
Salary
We can store this information in several parallel vectors:
vector<int> employee_id;
vector<string> employee_name;
vector<int> employee_years;
vector<string> employee_position;
vector<float> employee_salary;
Here, a particular employee is identified by their row number (different from their ID number). E.g., we might have
employee_id.at(0) = 1234;
employee_name.at(0) = "John Smith";
employee_years.at(0) = 12;
employee_position.at(0) = "CEO";
employee_salary.at(0) = 140000.00;
When we want all the information about a particular employee, we look at the same row in all these vectors. If we want to, we can write some functions to manipulate a single employee in various ways:
int find_employee_by_id(int id) {
    for(int i = 0; i < employee_id.size(); i++)
        if(employee_id.at(i) == id)
            return i;
    return -1;
}

void promote_employee(int emp, string new_position, float new_salary) {
    employee_position.at(emp) = new_position;
    employee_salary.at(emp) = new_salary;
}

void fire_employee(int emp) {
    employee_id.erase(employee_id.begin() + emp);
    employee_name.erase(employee_name.begin() + emp);
    employee_years.erase(employee_years.begin() + emp);
    employee_position.erase(employee_position.begin() + emp);
    employee_salary.erase(employee_salary.begin() + emp);
}
// etc.
Having to treat every attribute of each employee in a separate vector is
cumbersome. What we really want is a way to group all the information that
makes up a single employee into one “object”. Then, we can just create a single
vector of employee objects and everything will be in the right place. The
idea of putting things together that belong together is called the principle
of encapsulation.
We’re going to build an entirely new type of our own, like int or float or
vector or string. This means that anything we could do with those types,
we can do with ours: store them in a vector, create them dynamically via
new, pass them to functions as parameters, and return them from functions.
Here’s how we start:
class employee {
public:
// ...
};
Breaking this down:
class – A “class” is a new type that is built up as a collection of values.
employee – This is the name for the new class. From this point forward, employee will be the name of a new type.
{ ... }; – The body of a class is a block, with a semicolon at the end.
public: – This says that whatever follows will be “public”, meaning exposed to the rest of the program, outside of the employee class. Later, we’ll see how we can “hide” information inside a class, so that only the class itself can access it.
A class is a definition. This means it goes outside of main and outside of other definitions, and you can’t use an employee until after you’ve defined it. Thus, we usually put class definitions at the top of a file, before the function declarations (because functions might take employees as arguments, or return them, so the type needs to be defined before that).
Thus, the overall structure of our file now looks like this:
/* opening comment */
#includes
...
using namespace std;
// class definitions...
// function declarations...
int main() {
...
}
// function definitions
To describe what makes up an employee, we just list the various values, as if they were variables:
class employee {
public:
    int id;
    string name;
    int years_of_service;
    string position;
    float salary;
};
These variable-like things are called data members of the class. Every object
that is an employee is guaranteed to have all of these values associated with
it.
We can create an employee-variable just like any other type:
employee jsmith;
but how do we access the values inside it? We use the dot (.) operator:
jsmith.name = "John Smith";
If thing is the name of a class instance, and whatever is the name of a
data member of that class, then thing.whatever is the name of thing’s
particular whatever. Remember that every instance “owns” its own values,
so a different instance will have a distinct whatever.
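As a small sketch of the point that each instance owns its own values, the following creates two employees and changes only one of them. The second instance adoe and the helper salary_gap are made up for this illustration; they are not part of the running example.

```cpp
#include <string>
using namespace std;

class employee {
public:
    int id;
    string name;
    int years_of_service;
    string position;
    float salary;
};

// Hypothetical helper: builds two separate instances and returns the
// difference between their salaries.
float salary_gap() {
    employee jsmith;
    employee adoe;               // a second, independent instance

    jsmith.salary = 140000.00f;  // jsmith's own salary
    adoe.salary = 90000.00f;     // adoe's own salary

    adoe.salary = 95000.00f;     // changes adoe only; jsmith is untouched

    return jsmith.salary - adoe.salary;
}
```

Calling salary_gap() should give 45000, since the second assignment to adoe.salary never touches jsmith.salary.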
Now we have created the new type employee. What can we do with it?
We can create individual employee objects (statically):
employee jsmith;
jsmith.id = 1234;
jsmith.name = "John Smith";
jsmith.years_of_service = 12;
jsmith.position = "CEO";
jsmith.salary = 140000.00;
Initially, the employee object jsmith is created with default values: name and position will be the empty string. The numeric members (id, years_of_service and salary) are only guaranteed to be 0 if jsmith is a global variable or if we write employee jsmith{}; for a plain local variable they start out with unspecified values, so we should not read them before assigning them. We can then fill in the details after creating it. Because this is often inconvenient, we can also fill in the details when we create the object, like so:
employee jsmith{ 1234, "John Smith", 12, "CEO", 140000.00 };
Note that the order of values here must correspond exactly with the order in which we listed them in the class definition, whereas if we assign each data member individually (as above) we can do so in any order.
Another way to create an employee and store it in an employee variable is to create the object, and then assign it:
employee jsmith;
jsmith = employee{ 1234, "John Smith", 12, "CEO", 140000.00 };
For any class thing, you can create a standalone thing by doing
thing{ member values... }
Note that the result is a value; this doesn’t store it anywhere, it just creates a value with the given attributes.
We can also dynamically create employees:
employee* jsmith = new employee();
jsmith->id = 1234;
jsmith->name = "John Smith";
jsmith->years_of_service = 12;
jsmith->position = "CEO";
jsmith->salary = 140000.00;
or
employee* jsmith = new employee{ 1234, "John Smith", 12, "CEO", 140000.00 };
(Note the similarity with the above: employee{...} creates a new employee object and evaluates to an employee, statically allocated, while new employee{...} creates a dynamically allocated employee and returns an employee*.)
We can store employees in a vector:
vector<employee> employees;
Assuming this is a global variable, we can use it as the collection of all employees in the system.
We can write functions that manipulate and take or return employees:
int find_employee_by_id(int id) {
    for(int i = 0; i < employees.size(); i++)
        if(employees.at(i).id == id)
            return i;
    return -1;
}

void promote_employee(employee& emp, string new_position, float new_salary) {
    emp.position = new_position;
    emp.salary = new_salary;
}

void fire_employee(int emp) {
    employees.erase(employees.begin() + emp);
}
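To see several of these pieces working together, here is a minimal sketch that fills the global employees vector using both a standalone employee{...} value and a dynamically allocated one, then looks an employee up by ID. The helper hire_two and the second employee's details are invented for the example.

```cpp
#include <cstddef>
#include <string>
#include <vector>
using namespace std;

class employee {
public:
    int id;
    string name;
    int years_of_service;
    string position;
    float salary;
};

vector<employee> employees;  // global collection of all employees

// Returns the row (index) of the employee with the given ID, or -1.
int find_employee_by_id(int id) {
    for (size_t i = 0; i < employees.size(); i++)
        if (employees.at(i).id == id)
            return static_cast<int>(i);
    return -1;
}

// Hypothetical helper: resets the collection and adds two employees.
void hire_two() {
    employees.clear();

    // A standalone value, copied straight into the vector.
    employees.push_back(employee{1234, "John Smith", 12, "CEO", 140000.00f});

    // A dynamically allocated employee. The vector stores its own copy,
    // so the original can be deleted afterwards.
    employee* temp = new employee{5678, "Ada Doe", 3, "Engineer", 95000.00f};
    employees.push_back(*temp);
    delete temp;
}
```

After hire_two(), find_employee_by_id(5678) should return row 1, and an ID that was never added should return -1.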
Classes vs. Instances
employee is the class. It doesn’t actually have any data associated with it
at runtime, which means that things like this:
employee.salary
make no sense. It makes no sense to ask, ‘what is the salary of employee’,
only to ask, ‘what is the salary of a particular employee?’. We call
particular employees (or, more generally, particular objects belonging
to some class) the instances of a class. E.g., in the following,
employee jsmith;
we would say that jsmith is an instance of employee. Instances have
data associated with them (the data members that we gave in the class definition).
To summarize, when we say
class whatever { ... };
we are defining a new class (type), but to create instances of it, we still need to do
whatever thing;
to create a whatever variable named thing.
Functional abstraction vs. data abstraction
We mentioned before that functions are a kind of abstraction; they allow us
to give a name to a sequence of statements, and thus “abstract” over the
details of how some operation is done. When I use promote_employee I just
give it the employee, position, and salary; I don’t have to worry about how it
works. Similarly, when I am writing promote_employee, I don’t have to worry
about where the employee, or position, or salary came from, I just use them.
Classes provide another kind of abstraction, abstraction over data. Now that
I’ve defined employee, I don’t need to care about what goes into an employee.
I can just pass them around and manipulate them. In particular, suppose we wanted
to add another value to every employee, benefits. We can just add it to
employee:
class employee {
public:
    int id;
    string name;
    int years_of_service;
    string position;
    float salary;
    float benefits;
};
and now every employee gets that data member. Furthermore, any functions we’ve
written to deal with employees don’t need to worry about benefits unless they
have a particular interest in that value. E.g., when we fire_employee we
don’t need to do anything extra (in the parallel vectors version, we’d have
another vector for benefits, and we’d have to remember to modify the
fire_employee function to deal with it as well).
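A rough sketch of this: below, fire_employee is exactly the one-line version from before and needs no change after benefits is added, while a new function that does care about benefits can read it like any other data member. The function total_compensation is a made-up example, not something defined earlier.

```cpp
#include <string>
#include <vector>
using namespace std;

class employee {
public:
    int id;
    string name;
    int years_of_service;
    string position;
    float salary;
    float benefits;  // the newly added data member
};

vector<employee> employees;

// Unchanged: removing an employee removes all of its data members at once,
// including benefits.
void fire_employee(int emp) {
    employees.erase(employees.begin() + emp);
}

// Hypothetical function with "a particular interest" in benefits.
float total_compensation(const employee& e) {
    return e.salary + e.benefits;
}
```

In the parallel-vectors version, the equivalent change would have meant a sixth vector and a sixth erase call inside fire_employee.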
Whereas functions let us abstract over behavior, classes let us abstract over data. If there are only two things you really learn in this class, make them be functions and classes. At the highest level, programming is nothing more than creating and working with abstractions.
A good way to think about this is that when we do
employee jsmith;
we are creating a box named jsmith which contains within it sub-boxes named
id, name, years_of_service, etc.
Examples
Let’s look at some more examples of classes at work.
A class for dogs. For some reason, let’s suppose we need to represent information about dogs. Maybe we’re making an obedience training application or something. The first question is, what kind of information do we want to associate with each dog? Remember that every instance of a class gets all the data members, so we need to know what kind of values every dog will have, so that we can define the members of the class.
class dog {
public:
    string name;
    string breed;
    int age;
    string color;
    char gender; // 'm' or 'f'
};
A class for dog owners. Makes sense that we’d need this next, right?
class owner {
public:
    string name;
    string address;
    vector<dog> dogs_owned;
};
Here, we have a class that contains a vector of instances of another class! After we create a dog owner, we can add dogs to him/her:
owner jsmith;
jsmith.name = "Jane Smith";
jsmith.dogs_owned.push_back(dog{"Fido", "Welsh Corgi", 5, "brown", 'm'});
If we look at jsmith now, we’ll find:
jsmith.dogs_owned.size() == 1;
jsmith.dogs_owned.at(0).name == "Fido";
// etc.
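Because dogs_owned is an ordinary vector, functions can walk through an owner's dogs the same way they would walk through any other vector. The sketch below assumes the dog and owner classes above; the helper oldest_dog_age and the second dog are invented for illustration.

```cpp
#include <string>
#include <vector>
using namespace std;

class dog {
public:
    string name;
    string breed;
    int age;
    string color;
    char gender;  // 'm' or 'f'
};

class owner {
public:
    string name;
    string address;
    vector<dog> dogs_owned;
};

// Hypothetical helper: age of the owner's oldest dog, or -1 if the owner
// has no dogs.
int oldest_dog_age(const owner& o) {
    int oldest = -1;
    for (const dog& d : o.dogs_owned)  // each d is one dog inside the owner
        if (d.age > oldest)
            oldest = d.age;
    return oldest;
}
```

Note the two levels of dot access at work here: o.dogs_owned picks the vector out of the owner, and d.age picks the age out of one dog in that vector.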
A larger example
Let’s write a class to represent polynomials, e.g.,
$$3 + 2x - 5x^2 + 3x^4$$
We can represent a polynomial as a list of its coefficients, one for each power of \(x\). E.g., the above could be represented as the vector
vector<float> p = {3, 2, -5, 0, 3};
In the polynomial above, there is no term for \(x^3\), so it has a coefficient of 0 (i.e., the term is \(0 x^3\)). As a class, this will just be
class polynomial {
public:
    vector<float> coeffs;
};
What kind of operations can we perform on a polynomial?
We can create a polynomial with a specific degree, although all the coefficients will be 0 to start with.
We can multiply it by a constant value. This has the effect of multiplying every coefficient by that constant.
We can multiply it by a power of \(x\), say \(x^k\). This has the effect of inserting \(k\) 0s at the beginning of the vector (i.e., shifting the coefficients up).
We can ask for the degree of a polynomial. This is the highest power of \(x\) that does not have a 0 coefficient. Note that a vector such as
{1, 2, 0, 0}
has degree 1, because it represents \(1 + 2x\). In fact, we should chop off those higher 0s so that we can just use the size of the vector, which leads us to the next operation.
We can “normalize” a polynomial. This just means chopping off any 0s at the end, so that the above would become
{1, 2}
Then, the degree is just .size() - 1.
We can add two polynomials together. This is done by matching up coefficients with the same powers, and adding them together. E.g.,
$$(1 + 2x + 3x^2) + (2 - 4x - 5x^2) = 3 - 2x - 2x^2$$
We can similarly subtract two polynomials. If the polynomials do not have the same degree, then the shorter one is extended with 0s.
We can multiply polynomials, but that’s complicated, so we’ll skip it.
We can evaluate a polynomial, by plugging in a float for \(x\) and seeing what value we get out.
We can print a polynomial, as (for example)
1 + 2x + 3x^2 + 4x^3
Let’s write functions for some of these:
#include <algorithm> // max
#include <cmath>     // pow
#include <iostream>
#include <vector>
using namespace std;

// The polynomial class from above is assumed to be defined here.

polynomial create(int degree) {
    polynomial output;
    output.coeffs.resize(degree + 1, 0);
    return output;
}

// Chop off trailing 0 coefficients. Defined before degree(), which calls it.
void normalize(polynomial& p) {
    int i;
    for(i = p.coeffs.size() - 1; i >= 0; i--)
        if(p.coeffs.at(i) != 0)
            break;
    p.coeffs.resize(i + 1);
}

// p is passed by value, so normalizing it does not change the caller's copy.
int degree(polynomial p) {
    normalize(p);
    return p.coeffs.size() - 1;
}

void shift(polynomial& p, int powers) {
    p.coeffs.insert(p.coeffs.begin(), powers, 0.0f);
}

polynomial multiply(polynomial p, float s) {
    normalize(p); // make p's size match the size of output
    polynomial output = create(degree(p));
    for(int i = 0; i < p.coeffs.size(); i++)
        output.coeffs.at(i) = p.coeffs.at(i) * s;
    return output;
}

polynomial add(polynomial a, polynomial b) {
    int larger = max(degree(a), degree(b));
    polynomial output = create(larger);
    for(int i = 0; i <= larger; i++) { // <= so the highest power is included
        float x = i < a.coeffs.size() ? a.coeffs.at(i) : 0;
        float y = i < b.coeffs.size() ? b.coeffs.at(i) : 0;
        output.coeffs.at(i) = x + y;
    }
    return output;
}

float evaluate(polynomial p, float x) {
    float output = 0; // must start at 0 before accumulating
    for(int i = 0; i <= degree(p); i++)
        output += p.coeffs.at(i) * pow(x,i);
    return output;
}

void print(polynomial p) {
    for(int i = 0; i <= degree(p); i++) {
        if(i > 0)
            cout << " + "; // separator between terms
        if(i == 0)
            cout << p.coeffs.at(i);
        else if(i == 1)
            cout << p.coeffs.at(i) << "x";
        else
            cout << p.coeffs.at(i) << "x^" << i;
    }
    cout << endl;
}
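Subtraction was listed among the operations but not written out. One possible sketch, following the same pattern as add (match up coefficients by power, treat missing ones as 0), is below. It is self-contained rather than built on the helpers above, and the name subtract and the choice to strip trailing zeros from the result are assumptions made for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>
using namespace std;

class polynomial {
public:
    vector<float> coeffs;
};

// Sketch of a - b. The shorter polynomial is treated as if it were
// extended with 0 coefficients.
polynomial subtract(const polynomial& a, const polynomial& b) {
    size_t n = max(a.coeffs.size(), b.coeffs.size());
    polynomial output;
    output.coeffs.resize(n, 0);

    for (size_t i = 0; i < n; i++) {
        float x = i < a.coeffs.size() ? a.coeffs.at(i) : 0;
        float y = i < b.coeffs.size() ? b.coeffs.at(i) : 0;
        output.coeffs.at(i) = x - y;
    }

    // Subtraction can cancel the highest terms, so drop trailing zeros
    // to keep the result normalized.
    while (!output.coeffs.empty() && output.coeffs.back() == 0)
        output.coeffs.pop_back();

    return output;
}
```

For example, subtracting \(2 - 4x + 3x^2\) from \(1 + 2x + 3x^2\) cancels the \(x^2\) terms and leaves \(-1 + 6x\), a polynomial of lower degree than either input. That cancellation is why the normalizing step matters more here than in add.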
And now we can use these to do some interesting things:
int main() {
    polynomial a = create(3), b = create(4);

    // a = 1 - 2x + 3x^2
    a.coeffs.at(0) = 1;
    a.coeffs.at(1) = -2;
    a.coeffs.at(2) = 3;

    // b = 2x + 4x^2 + 8x^3
    b.coeffs.at(0) = 0;
    b.coeffs.at(1) = 2;
    b.coeffs.at(2) = 4;
    b.coeffs.at(3) = 8;

    polynomial c = add(a,b);
    shift(c,1);
    print(c);

    return 0;
}
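Working this example through by hand gives a useful check on what the program should print, assuming add includes every power up to the larger of the two degrees. Adding the coefficients power by power, and then multiplying by one power of \(x\) for the shift:

$$(1 - 2x + 3x^2) + (2x + 4x^2 + 8x^3) = 1 + 0x + 7x^2 + 8x^3$$
$$x \cdot (1 + 0x + 7x^2 + 8x^3) = x + 7x^3 + 8x^4$$

In terms of the coefficient vector, c goes from {1, 0, 7, 8} after the addition to {0, 1, 0, 7, 8} after the shift.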
More encapsulation: member functions
We can use a class to wrap up some values inside a single object. But what
about the operations on the class, all those functions above? If they are
tied to the polynomial class, shouldn’t they be attached to it somehow? It
turns out that the answer is yes. Encapsulation means not just that a class
contains the values it needs, but that it should contain everything it needs,
values and operations. Let’s see how this works; going back to our dog class:
class dog {
public:
    string name;
    string breed;
    int age;
    string color;
    char gender; // 'm' or 'f'
};
Let’s give this class an operation speak which will print "Woof!" to cout:
class dog {
public:
    string name;
    string breed;
    int age;
    string color;
    char gender; // 'm' or 'f'

    void speak() {
        cout << "Woof!" << endl;
    }
};
We call a function like this, defined inside a class, a “member function”. Just
as you can only refer to the name of a particular dog (i.e., you must have a
dog instance), so you can only ask a dog instance to speak():
dog puppers;
puppers.name = "Puppers";
// etc.
puppers.speak(); // Prints "Woof!"
We can add all the things a dog can do as member functions of the class, and
then we no longer need them as individual functions. We’ve encapsulated all
the operations and behaviors of dog into the dog class itself. A class is
not just a container for values, rather, it’s a container for everything
that concerns that particular type. Ideally, we want a class’s instances
to stand on their own, and not rely on external functions for any important
behavior. (Some behaviors will live in external functions.)
Let’s modify the dogs-and-owners system to have some more member functions (also called methods):
class dog {
public:
    ...

    void speak() {
        cout << "Woof!" << endl;
    }

    void sleep() {
        cout << "Snooze-time" << endl;
    }

    vector<dog> have_puppies() {
        if(gender == 'm')
            return vector<dog>(); // Empty vector
        else
            return vector<dog>(4); // 4 puppies
    }
};
What’s going on in have_puppies, what is gender referring to? The answer is,
it’s referring to the gender of the current dog. E.g., when we do
dog fido = {"Fido", "Welsh Corgi", 5, "Black", 'm'};
vector<dog> puppies = fido.have_puppies();
Within fido.have_puppies(), gender refers to fido.gender. When you call
a method on an instance, that instance becomes the current instance,
and any uses of the data member names within its methods refer to that, the
current instance. This gives us a way to simplify functions that operate on
a particular dog: just make them methods, and we can access the attributes
of the current dog easily!
Another way to put it is to imagine that we wrote have_puppies as a function:
vector<dog> have_puppies(dog parent) {
    if(parent.gender == 'm')
        return vector<dog>(); // Empty vector
    else
        return vector<dog>(4); // 4 puppies
}
For a method of a class, the current instance is essentially passed as a hidden,
invisible parameter. (Technically, it’s passed as a hidden pointer to the
current instance, and that pointer is named this.) Every method gets access
to it, and whenever we use gender, name, or any other name of a data member,
it will access the current instance.
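To make the hidden pointer visible, the sketch below writes the same check twice: once using the bare data member name, and once spelling out this-> explicitly. The two should behave identically. The methods is_female, is_female_explicit and birthday are invented for this illustration.

```cpp
#include <string>
using namespace std;

class dog {
public:
    string name;
    string breed;
    int age;
    string color;
    char gender;  // 'm' or 'f'

    // Implicit: gender means "the gender of the current instance".
    bool is_female() {
        return gender == 'f';
    }

    // Explicit: the same thing, written through the hidden this pointer.
    bool is_female_explicit() {
        return this->gender == 'f';
    }

    // A method can also modify the current instance's data members.
    void birthday() {
        age = age + 1;
    }
};
```

If fido and bella are two dog instances, fido.birthday() changes only fido.age, because inside that call the current instance (what this points to) is fido.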
Polynomials, continued
Now that we have methods, we can clean up our polynomial class, moving some of the operations inside it.
class polynomial {
public:
    vector<float> coeffs;

    void normalize() {
        int i;
        for(i = coeffs.size() - 1; i >= 0; i--)
            if(coeffs.at(i) != 0)
                break;
        coeffs.resize(i + 1);
    }

    int degree() {
        normalize();
        return coeffs.size() - 1;
    }

    void shift(int powers) {
        coeffs.insert(coeffs.begin(), powers, 0.0f);
    }

    polynomial multiply(float s) {
        polynomial output = ???
        for(int i = 0; i < coeffs.size(); i++)
            output.coeffs.at(i) = coeffs.at(i) * s;
        return output;
    }

    polynomial add(polynomial b) {
        int larger = max(degree(), b.degree());
        polynomial output = ???
        for(int i = 0; i <= larger; i++) { // <= so the highest power is included
            float x = i < coeffs.size() ? coeffs.at(i) : 0;
            float y = i < b.coeffs.size() ? b.coeffs.at(i) : 0;
            output.coeffs.at(i) = x + y;
        }
        return output;
    }

    float evaluate(float x) {
        float output = 0; // must start at 0 before accumulating
        for(int i = 0; i <= degree(); i++)
            output += coeffs.at(i) * pow(x,i);
        return output;
    }

    void print() {
        for(int i = 0; i <= degree(); i++) {
            if(i > 0)
                cout << " + "; // separator between terms
            if(i == 0)
                cout << coeffs.at(i);
            else if(i == 1)
                cout << coeffs.at(i) << "x";
            else
                cout << coeffs.at(i) << "x^" << i;
        }
        cout << endl;
    }
};
I’ve left out the create function and put ???s in the lines where it was
called. We’ll deal with the proper way to create objects in the next section.
Some general observations here:
Every function that used to take a single polynomial parameter now doesn’t need to take one: it can just use the current instance directly. Similarly, functions like add that used to take two now just take one.
When a function used to take a parameter polynomial p, we just remove the parameter, and then remove every use of p in the body of the function (so p.coeffs becomes just coeffs). Effectively, p is being replaced by the current instance.
Notice how degree() refers to normalize()? As with data members, methods can call each other on the current instance, implicitly.
multiply can access the coefficients of output directly. Here coeffs is public, but this would still work if coeffs were private, because multiply is part of polynomial, even though it is not part of the same instance as output.
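The last observation can be sketched with a version of the class where coeffs really is hidden. This uses the private: label, which has only been mentioned in passing so far, and it builds output by default-creating a polynomial and copying the coefficients. That is just a stand-in for illustration, not necessarily the approach the next section takes for the ???s. The methods set_coeff and get_coeff are made up so that outside code has some way in, and both assume power is not negative.

```cpp
#include <cstddef>
#include <vector>
using namespace std;

class polynomial {
private:
    vector<float> coeffs;  // hidden from code outside the class

public:
    // Hypothetical setter: grows the vector if needed (assumes power >= 0).
    void set_coeff(int power, float value) {
        if (power >= static_cast<int>(coeffs.size()))
            coeffs.resize(power + 1, 0);
        coeffs.at(power) = value;
    }

    // Hypothetical getter: powers beyond the end have coefficient 0.
    float get_coeff(int power) {
        return power < static_cast<int>(coeffs.size()) ? coeffs.at(power) : 0;
    }

    polynomial multiply(float s) {
        polynomial output;       // a different instance from the current one
        output.coeffs = coeffs;  // allowed: we are inside class polynomial
        for (size_t i = 0; i < output.coeffs.size(); i++)
            output.coeffs.at(i) *= s;
        return output;
    }
};
```

Code outside the class could not write output.coeffs at all here, but multiply can, because access is granted per class rather than per instance.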