CS 1124 — Object Oriented Programming

Copy Control / Big 3

Concept

What are the "Big 3"?

The Big 3 refers to the destructor, the assignment operator and the copy constructor.

Why do we care?

Most of the time you don't. At least you haven't so far. But sometimes you need to. Suppose we have a simple class such as the following:

class SimpleClass {
public:
    SimpleClass() { p = new int(17); }
private:
    int* p;
}; 

Now suppose we create an instance of SimpleClass in a function:

void aFunction() {
    SimpleClass simp;
}

What happens when we call aFunction? More importantly, what happens when it finishes? The variable simp goes "out of scope". What happens to the memory on the heap that simp.p was pointing at? Nothing points to it any more and nothing ever frees it, so it has become garbage. We now have a memory leak.

Solution? Write a destructor. But after you write the destructor, it will turn out that you want to write a copy constructor and an assignment operator, too. These three functions tend to go together - if you need one, you likely need the other two.

Destructor

The destructor is pretty simple. Its job is to free up whatever resources need to be freed up when an object is about to be "destroyed". For us, the "resources" just refer to items or arrays that were allocated on the heap. If the object being destroyed is responsible for the memory on the heap, then it should have a destructor to free it up.

How do you write a destructor? First, what's its name? The destructor's name is the same as the name of the class, except that it starts with one extra character, ~, known as "tilde" (rhymes with "Hilda").

Is there anything else special about the function? Glad you asked. Like the constructor(s), it does not have a return type. At all. Unlike the constructors, there is only one destructor. You can't overload it based on passing different parameters, because you don't ever pass parameters to a destructor.

In fact, you don't call the destructor. The "system" does.

So, here's a suitable destructor for our simple class.

~SimpleClass() { delete p; }

That's it. Free up the memory that was allocated for this object.
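
To see the destructor doing its job, here is a minimal sketch that adds a counter to our simple class. The counter liveAllocations and the helper allocationsAfterCall are my own additions for illustration; they are not part of the class above.

```cpp
// Hypothetical counter (not in the original class): tracks how many
// heap ints are currently owned by SimpleClass objects.
int liveAllocations = 0;

class SimpleClass {
public:
    SimpleClass() : p(new int(17)) { ++liveAllocations; }
    // The destructor frees the heap memory this object is responsible for.
    ~SimpleClass() { delete p; --liveAllocations; }
private:
    int* p;
};

void aFunction() {
    SimpleClass simp;   // allocates an int on the heap
}                       // simp goes out of scope here, so ~SimpleClass() runs

// Hypothetical helper: reports how many allocations are still
// outstanding after aFunction has returned.
int allocationsAfterCall() {
    aFunction();
    return liveAllocations;
}
```

Calling allocationsAfterCall() should report zero outstanding allocations. Without the destructor, the count would stay at one after every call. Note that this sketch never copies a SimpleClass, which matters for the reasons discussed next.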

Copy Constructor

What does a constructor do? Initializes the member variables of an object when it is created. Sometimes constructors have parameters so that we can pass arguments to tell the constructor how to initialize the new object.

When the argument is another object of the same type, then we call the constructor a copy constructor. This is a very important constructor. It is called a lot. Here are four somewhat different times that the copy constructor is used:

  1. SomeClass a;
    SomeClass b(a);
  2. SomeClass a;
    SomeClass b = a;
  3. void someFunc(SomeClass passedByValue) {}
  4. SomeClass anotherFunc () { SomeClass returnedByValue; return returnedByValue; }

Items 1 and 2 above are both creating an object and initializing based on another object of the same type. In both cases, it is the copy constructor that is called.

Item 3 shows a function with a parameter that is passed by value. Passing by value uses the copy constructor to initialize the parameter.

Item 4 shows a function with a value that is returned by value. Returning by value uses the copy constructor to return a copy of the value that is being returned. (That all looks cyclical, but it actually says the right thing.)
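
If you want to convince yourself, here is a rough sketch that counts copy constructor calls for items 1 through 3. The static counter copies and the function copiesForItemsOneToThree are assumptions added for the demonstration. Item 4 is deliberately left out of the count: the compiler is allowed to optimize that particular copy away (copy elision), so the number you observe for it can vary.

```cpp
// Hypothetical instrumented class: counts copy-constructor calls.
class SomeClass {
public:
    SomeClass() {}
    SomeClass(const SomeClass&) { ++copies; }
    static int copies;   // illustration only
};
int SomeClass::copies = 0;

void someFunc(SomeClass passedByValue) {}

// Runs items 1, 2 and 3 and returns how many copies were made.
int copiesForItemsOneToThree() {
    SomeClass::copies = 0;
    SomeClass a;
    SomeClass b(a);      // item 1: direct initialization from a
    SomeClass c = a;     // item 2: looks like assignment, but is initialization
    someFunc(a);         // item 3: the parameter is initialized as a copy of a
    return SomeClass::copies;
}
```

Each of the three items accounts for exactly one call, so the function returns 3.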

Here's a suitable copy constructor for our simple class:

SimpleClass(const SimpleClass& rhs) {
     p = new int;  // Allocate
    *p = *rhs.p;   // Initialize (well, actually "set").
}

The constructor is allocating space on the heap, and initializing/setting that space to hold a copy of the value in the original. (There's gotta be a simpler way to say these things.)
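
Why go to this trouble? Because each object should own its own int. The sketch below checks that a copy really is independent of the original. The accessors get and set and the function copyIsIndependent are hypothetical additions, there only so we can look inside.

```cpp
class SimpleClass {
public:
    SimpleClass() { p = new int(17); }
    // Copy constructor: allocate our own int and copy the value into it.
    SimpleClass(const SimpleClass& rhs) { p = new int(*rhs.p); }
    ~SimpleClass() { delete p; }
    // Hypothetical accessors, added only for this demonstration.
    int get() const { return *p; }
    void set(int val) { *p = val; }
private:
    int* p;
};

// Changing the copy must not change the original.
bool copyIsIndependent() {
    SimpleClass original;
    SimpleClass copy(original);
    copy.set(42);
    return original.get() == 17 && copy.get() == 42;
}
```

If the copy constructor had simply copied the pointer, both objects would share one int, the change would show up in both, and both destructors would try to delete the same memory.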

Assignment Operator

Now for the last of the Big 3. But it's the biggest.

Responsibilities

The assignment operator has more responsibility than the copy constructor. Why?

The copy constructor is only initializing an object and doesn't have to worry about any information or resources that the object already has.

The assignment operator, on the other hand, is changing an object. It has to, possibly:

  1. get rid of things the object already has
  2. replace them with new things
  3. copy over the values from the object on the "right-hand side" to the "left-hand side".

Step one is what the destructor does. Steps two and three together are what the copy constructor does. Sure it would be nice if we could just "call" those two functions - but we can't. However, it sure makes this function easier, knowing that it just consists of the same work we would have done for the other two.

In addition, the assignment operator has to return a value and guard against self-assignment. Both are discussed below.

Member or non-member?

Member. The language doesn't give us a choice.

Return Type and Value

The assignment operator has a return type. (Neither the destructor nor the copy constructor did, so we didn't have to worry about them.) What should it be? Void? NOOOOO!!!! (No matter what you may read in "some" books.)

C++ programmers expect to be able to write things like:

x = y = z;

That means the same as the parenthesized:

x = ( y = z ) ;

Guess what would happen if the value of the expression y = z was void. That line up there could not compile! That's why it has to have a value. What value? The same as what's in y after the assignment in parentheses.

That tells us that the type could be SomeClass. But should we return it by value or by reference? In other words, should the type be SomeClass or SomeClass&? The answer is SomeClass&. I'm not going to clutter this page with the reason. You can look at the gory details if you like. Otherwise, just remember to do it the right way.

Self-Assignment

What if a programmer writes:

x = x;

What would the code we've outlined above do? The first thing we've said we should do is "get rid of the things the object already has". Hm, in our example with SimpleClass, that means get rid of (i.e. delete) the int on the heap. Well, that would be a bit of a disaster, since the int we just deleted is the very one we are about to copy from. We could possibly arrange things so that we didn't end up destroying all our information, but we would still be doing a lot of unnecessary work.

We shouldn't be doing much of anything, other than recognizing that this is a simple no-brainer and returning x as the value of the expression.

How do we check for self-assignment? We need to know if the current object, since we said this is a member function, is the same object as the right-hand side. How can we check if two objects are exactly the same object? Check if their addresses are the same! What is the current object's address? this.

Example:

SimpleClass& operator= (const SimpleClass& rhs) {
    if (this != &rhs) {
        // Free up resources (as needed)
        delete p;
        // Allocate new resources (as needed)
        p = new int;
        // Copy over all data
        *p = *rhs.p;
    }
    return *this;
}
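
Putting the Big 3 together, here is a sketch of the whole class along with two quick checks: one for self-assignment and one for chained assignment. The int parameter on the constructor, the accessor get and the two check functions are assumptions added for illustration.

```cpp
class SimpleClass {
public:
    // Hypothetical: the optional argument lets us tell objects apart.
    SimpleClass(int val = 17) : p(new int(val)) {}
    SimpleClass(const SimpleClass& rhs) : p(new int(*rhs.p)) {}
    ~SimpleClass() { delete p; }
    SimpleClass& operator=(const SimpleClass& rhs) {
        if (this != &rhs) {          // skip all the work on self-assignment
            delete p;                // free up what we had
            p = new int(*rhs.p);     // allocate and copy
        }
        return *this;                // a reference, so x = y = z works
    }
    int get() const { return *p; }   // hypothetical accessor
private:
    int* p;
};

// Self-assignment must leave the object unchanged.
bool selfAssignmentKeepsValue() {
    SimpleClass x(5);
    SimpleClass& alias = x;   // another name for x
    x = alias;                // same object on both sides
    return x.get() == 5;
}

// x = y = z means x = (y = z), so all three end up with z's value.
bool chainedAssignmentWorks() {
    SimpleClass x(1), y(2), z(3);
    x = y = z;
    return x.get() == 3 && y.get() == 3 && z.get() == 3;
}
```

Self-assignment rarely looks like x = x in real code. It usually sneaks in through a reference or pointer that happens to refer to the same object, which is why the check above goes through an alias.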

Inheritance?

How does copy control work when we mix it with inheritance?

If Derived Does Not Do Copy Control?

The first question really is, if I am writing a derived class, let's call it Derived, and my class does not need to do copy control for itself, is there anything special I need to do in Derived? Does it depend on whether or not my parent class, call it Base, does do its own copy control?

Happily the answer is no. Assuming your parent class was written correctly, you don't have to know what your parent class is actually doing. All the right stuff gets done for you in that case. (Note, further down we will discuss a responsibility of the parent class...)
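
As a quick check of that claim, here is a small sketch with stand-in Base and Derived classes, where Derived does no copy control of its own. The string callLog and the function copyAndAssignLog are hypothetical, used only to record which Base functions run.

```cpp
#include <string>

std::string callLog;   // hypothetical: records which Base functions ran

class Base {
public:
    Base() { callLog += "Base() "; }
    Base(const Base&) { callLog += "Base(const Base&) "; }
    ~Base() {}
    Base& operator=(const Base&) { callLog += "Base op= "; return *this; }
};

// Derived writes none of the Big 3; the compiler generates them.
class Derived : public Base {};

std::string copyAndAssignLog() {
    callLog.clear();
    Derived a;        // generated default constructor calls Base()
    Derived b(a);     // generated copy constructor calls Base(const Base&)
    b = a;            // generated assignment operator calls Base's operator=
    return callLog;
}
```

The returned log reads "Base() Base(const Base&) Base op= ", so the generated Derived functions did call the matching Base versions without any help from us.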

So how does inheritance affect how you implement copy control in your derived class? Let's give ourselves a Base and Derived class to discuss. In the Base class, I will just put some print statements for the bodies of the Big 3. This will help in tracking what's going on. In the Derived class, I will just put a default constructor, again containing only a print statement.

#include <iostream>
using namespace std;

class Base {
public:
    Base() {
        cerr << "Base()\n";
    }
    Base(const Base& rhs) {
        cerr << "Base(const Base&)\n";
    }
    ~Base() {
        cerr << "~Base()\n";
    }
    Base& operator=(const Base& rhs) {
        cerr << "Base op=\n";
        return *this;
    }
};

class Derived : public Base {
public:
    Derived() {
        cerr << "Derived()\n";
    }
};

What happens if we create a Derived object?

int main() {
    cerr << "Derived der:" << endl;
    Derived der;
    cerr << "main finished." << endl;
}
The output will be:
Derived der:
Base()
Derived()
main finished.
~Base()

What happened?

  1. We created a Derived object. As we already knew, in its constructor's initialization list it calls the Base constructor. That's why the first thing we see is Base() and after that Derived().
  2. The Derived object gets "destroyed" when its scope, main, is finished. Then, even though we didn't write any code to call it, the Base destructor gets called, resulting in the output ~Base()

Derived Destructor

Ok, so what's next? Let's start with the easiest of the Big 3, the destructor. We will add a destructor to Derived and observe what changes in the output. I won't repeat the Base class as it is not changing (for now).

class Derived : public Base {
public:
    Derived() {
        cerr << "Derived()\n";
    }
    ~Derived() {
        cerr << "~Derived()\n";
    }
};
Using the same test code as before, now the output will be:
Derived der:
Base()
Derived()
main finished.
~Derived()
~Base()

What happened this time? Simple, when the Derived object was being destroyed, its destructor was called, resulting in the output ~Derived(). The Derived destructor, when it was done, automatically called the Base class destructor. We didn't have to do a thing!

Derived Copy Constructor

Ok, so what's next? Next we will implement a copy constructor for the Derived class. To begin with, let's just write Derived's copy constructor as we did Base's copy constructor. Then we'll want to write some test code to see how well it worked.
class Derived : public Base {
public:
    Derived() {
        cerr << "Derived()\n";
    }
    ~Derived() {
        cerr << "~Derived()\n";
    }
    Derived(const Derived& rhs) {
        cerr << "Derived(const Derived&)\n";
    }
};

int main() {
    cerr << "Derived der;" << endl;
    Derived der;
    cerr << "Derived der2(der);" << endl;
    Derived der2(der);
    cerr << "main finished." << endl;
}
The output will be:
Derived der;
Base()
Derived()
Derived der2(der);
Base()
Derived(const Derived&)
main finished.
~Derived()
~Base()
~Derived()
~Base()

How should a Derived object get copied? Same idea here as when we asked how a Derived object should get initialized back when we started talking about inheritance. Always take care of the Base portion first. The idea is that we want a "firm foundation" to build the Derived portion on top of.

But how did the Base portion get initialized here? Notice the call to Base(). The Base portion got initialized using the Base class's default constructor. But that can't be right! The Base portion of the new object is supposed to be a copy of the Base portion of the original. We needed to use the Base's copy constructor!

Where does the parent's constructor get called? Same as in our previous discussions of inheritance. It gets called in the initialization list of the child's constructor. Since we did not specify which constructor in Base to use, it used Base's default constructor. To fix this, we just have to put a call to Base's copy constructor in the initialization list of Derived's copy constructor. (That's a lot to say but not much to actually do.)

Fixing our Derived constructor:

class Derived : public Base {
public:
    Derived() {
        cerr << "Derived()\n";
    }
    ~Derived() {
        cerr << "~Derived()\n";
    }
    Derived(const Derived& rhs) : Base(rhs) {
        cerr << "Derived(const Derived&)\n";
    }
};
Using the same test code, our output becomes:
Derived der;
Base()
Derived()
Derived der2(der);
Base(const Base&)
Derived(const Derived&)
main finished.
~Derived()
~Base()
~Derived()
~Base()

The only change is that now we get the correct constructor for Base being called when we make a copy of our Derived object.
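
Print statements show which constructor ran, but not what is at stake. Here is a sketch in which Base actually holds some data, so we can see what gets lost. Everything in it (the id member, the classes Forgetful and Careful, the two helper functions) is made up for the illustration.

```cpp
class Base {
public:
    Base(int theId = 0) : id(theId) {}
    Base(const Base& rhs) : id(rhs.id) {}
    int getId() const { return id; }
private:
    int id;   // hypothetical data held by the Base portion
};

// Copy constructor does not mention Base, so Base's default
// constructor initializes the Base portion (id becomes 0).
class Forgetful : public Base {
public:
    Forgetful(int theId) : Base(theId) {}
    Forgetful(const Forgetful&) {}
};

// Copy constructor passes rhs along to Base's copy constructor.
class Careful : public Base {
public:
    Careful(int theId) : Base(theId) {}
    Careful(const Careful& rhs) : Base(rhs) {}
};

int idAfterForgetfulCopy() {
    Forgetful original(7);
    Forgetful copy(original);
    return copy.getId();   // the 7 was not copied
}

int idAfterCarefulCopy() {
    Careful original(7);
    Careful copy(original);
    return copy.getId();   // the 7 came along
}
```

The forgetful version ends up with an id of 0, while the careful version keeps the 7. Some compilers will warn about the forgetful copy constructor if you turn on extra warnings.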

Derived Assignment Operator

One more of the Big 3 to go, the assignment operator. As we did with the copy constructor, first we will write our Derived assignment operator the same way we did the one in the Base class and provide a test program.

class Derived : public Base {
public:
    Derived() {
        cerr << "Derived()\n";
    }
    ~Derived() {
        cerr << "~Derived()\n";
    }
    Derived(const Derived& rhs) : Base(rhs) {
        cerr << "Derived(const Derived&)\n";
    }
    Derived& operator= (const Derived& rhs) {
        cerr << "operator=(const Derived&)\n";
        return *this;
    }
};

int main() {
    cerr << "Derived der;" << endl;
    Derived der;
    cerr << "Derived der2;" << endl;
    Derived der2;
    cerr << "der = der2;" << endl;
    der = der2;
    cerr << "main finished." << endl;
}
The output will now be:
Derived der;
Base()
Derived()
Derived der2;
Base()
Derived()
der = der2;
operator=(const Derived&)
main finished.
~Derived()
~Base()
~Derived()
~Base()

What happened? The Derived assignment operator was called, so that part worked. But what happened to the Base portion of the target object, der? Nothing! No code was run to copy the Base portion of der2 into the Base portion of der. How can we fix that? As with the copy constructor, we have to make our Derived assignment operator call the corresponding function in the Base class.

But how?

In this case, there are a few ways we can do it. I think the most obvious way is to explicitly call the Base class's assignment operator. How exactly? Since an operator is implemented by a function we will just call the function operator= in the Base class. Remember that when we want to call a function from the parent that has the same name as a function in the child class, we need to qualify the function name. Therefore, in this example, the function we want to call is Base::operator=.

Our modified class definition becomes:
class Derived : public Base {
public:
    Derived() {
        cerr << "Derived()\n";
    }
    ~Derived() {
        cerr << "~Derived()\n";
    }
    Derived(const Derived& rhs) : Base(rhs) {
        cerr << "Derived(const Derived&)\n";
    }
    Derived& operator= (const Derived& rhs) {
        cerr << "operator=(const Derived&)\n";
        Base::operator=(rhs);
        return *this;
    }
};
And the corresponding output, using the same test program, is:
Derived der;
Base()
Derived()
Derived der2;
Base()
Derived()
der = der2;
operator=(const Derived&)
Base op=
main finished.
~Derived()
~Base()
~Derived()
~Base()

We see the same call to Derived's assignment operator and an additional call to Base's assignment operator.
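
I said there are a few ways to call the Base assignment operator. One alternative, sketched here as an assumption rather than as the recommended approach, is to treat the current object as a Base and use the ordinary = syntax. The string callLog and the function assignLog are hypothetical, added so the calls can be observed.

```cpp
#include <string>

std::string callLog;   // hypothetical: records which operators ran

class Base {
public:
    Base& operator=(const Base&) { callLog += "Base op= "; return *this; }
};

class Derived : public Base {
public:
    Derived& operator=(const Derived& rhs) {
        callLog += "Derived op= ";
        if (this != &rhs) {
            // View *this as a Base and assign to it. This selects
            // Base's operator=, the same function Base::operator=(rhs) calls.
            static_cast<Base&>(*this) = rhs;
        }
        return *this;
    }
};

std::string assignLog() {
    callLog.clear();
    Derived a, b;
    a = b;
    return callLog;
}
```

The log comes back as "Derived op= Base op= ", the same order of calls as with the explicit Base::operator=(rhs). Which form you prefer is mostly a matter of taste; the qualified call is arguably easier to read.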

One More Thing

We have one more job to take care of. Consider the following test code, using our current class definitions:

int main() {
    cerr << "Derived* p = new Derived();" << endl;
    Derived* p = new Derived();
    cerr << "delete p;" << endl;
    delete p;
    cerr << "main finished." << endl;
}
The output shows that all the necessary functions get called as we would expect:
Derived* p = new Derived();
Base()
Derived()
delete p;
~Derived()
~Base()
main finished.

But what happens if we change our test program just slightly? Instead of storing the address of our Derived object in a Derived pointer variable we will use a Base pointer variable.

int main() {
    cerr << "Base* p = new Derived();" << endl;
    Base* p = new Derived();
    cerr << "delete p;" << endl;
    delete p;
    cerr << "main finished." << endl;
}
The resulting output is:
Base* p = new Derived();
Base()
Derived()
delete p;
~Base()
main finished.

The output has changed. How? There's a line missing! Only the Base destructor got called, not the Derived destructor. This could be really bad. If the Derived class needed a destructor, then whatever it was supposed to be doing is not getting done.

How can we fix this? Well, why is it happening? Consider the line delete p. What is the type of p? It is a Base*. From the compiler's point of view we are destroying something whose type is Base and so it calls the Base destructor. We instead want the Derived destructor to get called, which in turn would call the Base destructor.

Does this situation look familiar? We have a Base pointer that is pointing to a Derived object. When we try to have a function called, here the destructor, instead of the Derived version getting called, it's the Base version that is used. This situation should remind you of when we introduced polymorphism. Base pointer. Derived object. Base method gets called instead of the Derived method. What was the solution? Mark the method as virtual in the Base class. What method are we discussing here? The destructor. So, that's the solution. In the Base class, mark its destructor as virtual. No change at all is required in the Derived class.

So, here is the modified Base class:

class Base {
public:
    Base() {
        cerr << "Base()\n";
    }
    Base(const Base& rhs) {
        cerr << "Base(const Base&)\n";
    }
    virtual ~Base() {
        cerr << "~Base()\n";
    }
    Base& operator=(const Base& rhs) {
        cerr << "Base op=\n";
        return *this;
    }
};
With no change to our last test program, the output is now (correctly):
Base* p = new Derived();
Base()
Derived()
delete p;
~Derived()
~Base()
main finished.
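
To make "really bad" concrete, here is a sketch where Derived owns heap memory, so a skipped destructor would mean a leak. The counter liveBuffers, the data member and the helper function are all assumptions for the illustration. Copying is disabled (using C++11's = delete) only to keep the sketch short.

```cpp
int liveBuffers = 0;   // hypothetical: arrays currently owned by Derived objects

class Base {
public:
    virtual ~Base() {}   // virtual, so delete through a Base* reaches ~Derived()
};

class Derived : public Base {
public:
    Derived() : data(new int[10]) { ++liveBuffers; }
    ~Derived() { delete [] data; --liveBuffers; }
    // Copying disabled to keep the sketch short; a real class that
    // owns memory would implement the copy constructor and operator=.
    Derived(const Derived&) = delete;
    Derived& operator=(const Derived&) = delete;
private:
    int* data;
};

// Create a Derived, destroy it through a Base pointer, and report
// how many arrays are still allocated afterwards.
int buffersAfterDeleteThroughBase() {
    Base* p = new Derived();
    delete p;
    return liveBuffers;
}
```

With the virtual destructor in Base, the function returns 0: the Derived destructor ran and released the array.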

That's it!

That's it for dealing with copy control and inheritance.


Maintained by John Sterling (jsterling@poly.edu). Last updated Jan. 6, 2013