Please note that these notes are not yet complete. I hope that they may still useful as as a supplement to the in-class code along with the content of the lectures and the labs.
First of all, what is a vector? By now, we have been using them a lot. They have handy methods, like push_back, pop_back, size, back and clear. Furthermore, we can use square brackets to access and modify elements in the vector.
But what does a vector look like inside?
Somehow it can "contain" lots of elements. And that number can grow.
Hm, if we use the function sizeof
on a vector, we are
going to get the same result no matter how many elements our vector
holds. On my machine I just tried it, running the following line:
cout << sizeof(vector<int>) << endl;
and the result I saw was 24. That means that a
vector object takes up 24 bytes. And that won't change, no
matter how much the vector grows.
So where does all the data go? The answer is that the vector object keeps a pointer to the data, which is stored on the heap. To hold the data, we allocate an array on the heap. Because it is on the heap, and so can be created and destroyed, it is often referred to as a dynamic array.
What's an array? Just a contiguous block of memory that can hold a fixed number of things of the same type. We haven't bothered using arrays so far, simply because vectors are so much easier to deal with. Arrays are fairly "dumb". Well perhaps "ignorant" would be a more fair description. For example, you can't ask them how many elements they are holding. You can't make them grow when you add items. Simply put, vectors are easier to use, and that's why we haven't required that you use this more primitive type of thing.
Again, why do we need them now? Because we want to understand how that vector thing gets created.
Ok, the data is kept in an array and a pointer to the array is kept in the vector object. What else is there? As we noted above, I can't ask a dynamic array how big it is, so the vector object will have to store that information, too. We call this the capacity.
It may seem strange right now, be we keep two counters for the vector. How big the dynamic array is and separately how many elements we've placed into it. We'll discuss a little further down why we they aren't the same. (If you remember what you learned in data structures, this actually should be very familiar...)
So what is this data pointer? It is pointing to an array. But what
is its type? Turns out its type is actually the same as the
type of a pointer to a single item in the array. So, if we have a
dynamic array of integers, then the data pointer will be
an int*
. (And if it is an array of int pointers, then
it will be an int**
, i.e. a pointer
of int*
's.)
Now that we know the type of the data pointer, and that we need a field tracking how big the array is and how many items we have placed in it so far, we can get a start on writing the class definition.
class Vector {
public:
private:
int* data;
size_t theSize;
size_t theCapacity;
};
If I want a dynamic array that can hold 10 ints, how do I allocate it?
new int[10]
So allocating a dynamic array is very similar to allocating a single item, but we also specify the size in square brackets.
And storing that address away in a pointer variable? We said that if
the dynamic array holds ints, then the type of the pointer that
points to that array should be int*
, resulting in:
int* p = new int[10]
And just one more, if it was to be an array of int pointers the our variable would be defined and initialized as:
int** p = new int*[10]
Note the two places that there is an extra asterisk, compared to the
prior example.
Now that we know how to create the dynamic array, we are ready to initialize our Vector. Let's start out by defining a constructor that allows us to specify an initial size for the Vector and a common value for all of the entries. Using c++'s vector class, we could write:
vector<int> v(10, 17) // a c++ vector holding 10 ints, all of which are 17
Our Vector class will always hold ints so we won't be specifying
what kind of thing. We will talk later on (in a few weeks)
about how to give our type the ability of specifying what type it
will hold. So, the equivalent line using our class will be:
Vector v(10, 17) // our vector holding 10 ints, all of which are 17
How do we write that constructor? We need to allocate an array of the specified size, fill it with the specified value, and set our fields theSize and theCapacity.
Vector(size_t size, int value) {
theSize = size;
theCapacity = size;
data = new int[size];
for (size_t i = 0; i < theSize; ++i) {
data[i] = value;
}
}
You may not have noticed, but there's a somewhat surprising bit of
syntax hiding in that function.
data[i] = value;
Why do I say it is surprising? What is data? It's a pointer. But we are indexing off of it, just as if it were a vector itself. (Or as if it were the name of an array.) We will talk later about how this actually works, when we discuss pointer arithmetic.
However now our line of code to define a "default" Vector no longer works since we no longer have a default constructor. How can we get one back? We could separately define a default constructor. Or we could provide default values for the parameters of the constructor we wrote.
Vector(size_t size=0, int value=0) {
theSize = size;
theCapacity = size;
data = new int[capacity];
for (size_t i = 0; i < theSize; ++i) {
data[i] = value;
}
}
Consider what happens when we use both of the defaults. Then the array
that we create has a length of zero. That seems wierd. But C++ can
deal with it. There really is a dynamic array. It is not the
same as using a nullptr
.
What should the destructor do? The one thing we have on the heap is the dynamic array. How do we free up a dynamic array? With delete, yes, but a little bit more. Look at the next two lines.
int* p = new int[10];
delete [] p;
Above we allocated our array as before. Then when we went to free it
up, we used delete []
. The []
tells the
compiler that it is not just a single int
that we want
to delete, but instead a whole array of them.
Therefore the destructor is simply:
~Vector() {
delete [] data;
}
Now a reminder of what a copy constructor looks like:
Vector(const Vector& rhs) { // Work in progress
// Any copying code that couldn't go in the initialization list
}
What does the copy constructor need to do?
Notice that all three fields in the class are primitive, so it doesn't matter whether we set them in the initialization list or in the body of the constructor.
Handling the first bullet, our constructor will look like:
Vector(const Vector& rhs) { // Work in progress
// Copying over the size and capacity
theSize = rhs.theSize;
theCapacity = rhs.theCapacity;
}
Next we need to allocate the dynamic array to hold the data
Vector(const Vector& rhs) { // Work in progress
// Copying over the size and capacity
theSize = rhs.theSize;
theCapacity = rhs.theCapacity;
// Allocating the array
data = new int[theCapacity];
}
And finally, a loop to copy the actual values over.
Vector(const Vector& rhs) { // Done!!!
// Copying over the size and capacity
theSize = rhs.theSize;
theCapacity = rhs.theCapacity;
// Allocating the array
data = new int[theCapacity];
// Copying the data values
for (size_t index = 0; index < theSize; ++index) {
data[index] = rhs.data[index];
}
}
To start, what is the outliine that we developed for assignment operators before?
Vector& operator=(const Vector& rhs) { // In progress
// Check for self assignment
// Free up the old
// Allocate new and Copy
// Return yourself
}
How do we check for self-assignment? Compare addresses! We compare our address with that of the righthand side. Filling that in, we get:
Vector& operator=(const Vector& rhs) { // In progress
// Check for self assignment
if (this != &rhs) {
// Free up the old
// Allocate new and Copy
}
// Return yourself
}
Then jumping to the end, how do we return ourself? We need to return
the dereference of this
?
Vector& operator=(const Vector& rhs) { // In progress
// Check for self assignment
if (this != &rhs) {
// Free up the old
// Allocate new and Copy
}
// Return yourself
return *this;
}
What is it that we have to "free up"? The dynamic array that is
pointed to by data. How do we free it up? We delete it. But remember
that it is an array, so use delete []
. And also, note
that this is exactly the same thing that we did in the destructor.
Vector& operator=(const Vector& rhs) { // In progress
// Check for self assignment
if (this != &rhs) {
// Free up the old
delete [] data;
// Allocate new and Copy
}
// Return yourself
return *this;
}
And finally, what does it mean here to "allocate and copy"? Same thing that we did back in the copy constructor, so I am just going to grab that code.
Vector& operator=(const Vector& rhs) { // In progress
// Check for self assignment
if (this != &rhs) {
// Free up the old
delete [] data;
// Allocate new and Copy
// Copying over the size and capacity
theSize = rhs.theSize;
theCapacity = rhs.theCapacity;
// Allocating the array
data = new int[theCapacity];
// Copying the data values
for (size_t index = 0; index < theSize; ++index) {
data[index] = rhs.data[index];
}
}
// Return yourself
return *this;
}
The push_back method is used for a vector to allow us to add an item to the end of the vector. But above I have said that we cannot make our arrays grow. How do we deal with this?
If there isn't enough space in the the old array, then just make a new one that is big enough! And of course copy all the values from the old array to the new one. Don't forget to free up the old one...
When we are done ensuring that the array is big enough, we can insert the new item and increase the size (as opposed to the capacity that we may have just increased.
void push_back(int val) {
// In case we defined a vector with zero capacity
if (theCapacity == 0) {
++theCapacity;
data = new int[theCapacity];
}
// Have we run out of room?
if (theSize == theCapacity) {
int* oldData = data;
data = new int[2*theCapacity];
theCapacity *= 2;
for (size_t i = 0; i < theSize; ++i) {
data[i] = oldData[i];
}
delete [] oldData;
}
data[theSize] = val;
++theSize;
}
Having written the push_back method, we can now build up a vector, as in the following test code:
int main() {
Vector v;
v.push_back(17);
v.push_back(42);
v.push_back(6);
v.push_back(28);
}
However, that doesn't give us the ability to see what's now in the vector. We need to be able to write a loop and print out the values.
Later on we will explore what is needed to allow the ranged-for on our Vector. For now we will just loop using indices:
int main() {
Vector v;
v.push_back(17);
v.push_back(42);
v.push_back(6);
v.push_back(28);
for (size_t i = 0; i < v.size(); ++i) {
cout << v[i] << endl;
}
}
What needs to be implemented? We need:
size_t size() const { return theSize; }
Next we have to deal with the indexing, i.e. the square
brackets: v[i]
.
v.operator[](i)
int operator[](size_t index) const { return data[index]; }
With all of that defined, our loop should correctly print out the contents of the vector.
There are two methods that reduce the size of the
vector, pop_back
and clear
. People are
often surprised to learn that in the C++ vector class that these
functions do not change how large the dynamic array is, but rather,
only modify the value of the field that tracks how many items we are
holding, theSize
. The pop_back method will decrement it
by one and the clear method will zero it.
void pop_back() { --theSize; }
void clear() { theSize = 0; }
It is worth noting that pop_back is "dangerous" as implemented. We could, first check if theSize is already zero, to decide whether or not to decrement. Your call.
Maintained by John Sterling (jsterling@poly.edu). Last updated Mar. 3rd, 2018