Writing your own vector in C++

01

Writing your own vector in C++

Written with ❤️ by:

If you ask me, the Standard Template Library (STL) is one of the best things that happened to C++. Containers like vector , stack , queue , deque , map , priority_queue and others have made life so much easier for cpp programmers.

Back in the day, we had to focus on the intricacies of implementing the data structures used in our programs. But most of it has been abstracted away by introducing STL, so that programmers can focus more on designing and coding the important features of their applications.

But today, we’ll take some time off and dive into the basics once again. How about we take a shot at implementing our own *vector* -like container from scratch?

Wait. What’s a vector?

Well, in the most simplest of terms, we can think of a vector as a resizable array, where the user is completely unaware of how and when it’s resized. One can keep on adding or removing elements from the vector without a care in the world!

Moreover, these resizable arrays can be of any data type, be it primitive data types like int , char , float or user defined data types like an object or structure variable. Kinda cool, right?

Thinking about design

As dictated by the principles of software engineering, Design comes first. So, let us decide how we want our container to look like. By that, I mean how people can use it in their code.

/* DATA_TYPE can be primitive data type like int,float,char or any user defined data type (like your own class or structure) as well */

mycontainer<DATA_TYPE> con;
mycontainer<DATA_TYPE> con(size_of_container);
mycontainer<DATA_TYPE> con(size_of_container,default_value);
con[i] = some_value;
some_other_value = con[j];
con.push_back(some_value);
con.pop_back();
some_value = con.back();
auto length = con.size();
con.empty();  // returns true when it has zero elements

Getting down to work

Now that we know what we want and how it should look like , let’s move ahead to the implementation. Hmmmm…. So, we want our container to support any primitive or user defined data type, like a set of integers or a set of objects. How do we do that??

Templates!!

Templates in C++ allow us to write generic code. Consider this example.

Every time we call this function with a new data type, the compiler creates a new instance of this function with the given data type. Thus, if we have fun(5) and fun(&#x27;A&#x27;) in our code somewhere, the compiler creates two instances of fun , namely fun(int) and fun(char) . So yes, code is duplicated , but the compiler takes over the hassle. Moreover, the best part is that it’s done lazily . So, unless you call fun with a different data type, no new instance of the resource is created.

Well, now that we have a very brief idea about what templates are, let’s see how we can use them in our very own mycontainer . As far as mycontainer is concerned, it’s just going be a class having a generic array as one of it’s data members. This is how the declaration will look.

Data members

  • len : The number of elements in mycontainer at any instant. This is the value returned by the size() method.
  • capacity : The amount of memory allocated for the array data.
    Umm. Wait. What’s the difference between len and capacity?
    Well. Consider this. We might have allocated memory for 20 elements, but inserted only 10. At this point, capacity = 20 and len = 10. Clearly_, capacity >= len_. Always!
  • data : The array where our elements will actually be stored. It’s generic, thanks to C++ templates!!

Phew. Done with the data members. Let’s move ahead to the constructors now.

Constructors

The Default Constructor

It creates an array with space for 16 elements, but the length is still zero. This is invoked when we declare our container in the following manner

mycontainer&lt;int&gt; con;

Umm. Wait. What’s with that 16 out of nowhere? Perfectly valid question. Which leads us to the most important part. Reallocation .

We’ll cover that soon under insertion of new elements.

The Parameterized constructor

This constructor is pretty much same as the default constructor, except for the fact that the programmer can specify the initial length, capacity and default value. In default_value = T() , T() represents the default value of type T. For example, int() and float() would mean 0 and 0.0f respectively. For user-defined objects, it’ll be initialized with the default constructor.

mycontainer&lt;int&gt; con(50,-1);

Yaay! Done with the constructors :)

The Destructor

Can’t forget to free up space, can we??

Methods

The push_back method just appends a new element to the end of our container.

Every time we want to insert a new element, we check if the data array is full (i.e. when len = capacity) . If so, we create a new array with double the current capacity, copy the existing elements into the new array and free up the memory for the current array.

Assuming this reallocation makes sure that we won’t be making out-of-bounds accesses, the rest is easy. Insert the new element and increment len . Here’s the snippet.

The hero in this snippet is the reallocate() method. It does all the hard work so that our container can get arbitrarily large (within memory constraints, of course) , and we don’t encounter any out-of-bounds exceptions.

This is what our hero does behind the scenes.

Whoa. That looks like a pretty expensive operation. It has a complexity of O(n) every time we reallocate. But, notice that we’re not doing it often enough to weigh down the program. To be precise, if we have N items, the number of relocations is roughly log2(N) . During the ith relocation, the cost is proportional to 2^i . If you work out a bit of math (sum up the costs and divide the result by N) , you’ll end up getting a constant. Thus, it turns out that insertion has an average or amortized complexity of O(1) .

This means that as our container gets larger, we can ignore this cost completely!

So that’s why we allocate an initial memory for 16 elements, in order to prevent a few reallocations.

This method deletes an element from the end of the container. How do we do that? Well. Check if the container is already empty. If not, decrement len . Couldn’t be simpler, right?

Here, it gets slightly interesting . Take a look at this snippet. What’s happening here is known as operator overloading . It allows CPP operators to have user-defined meanings on user-defined classes, thereby providing an intuitive interface to the users of that class. For example, you can overload the + operator on std::string so that myString + yourString concatenates the two strings.

Looks fine, right? Let’s try to get the value at a particular index.

auto i = con[5];

Works just fine!! Hurray!! Now it’s time to set the value at a particular index.

con[5] = 10;

Umm. Things didn’t work out as expected. The code didn’t even compile. 😢

It just threw an error and came to a halt.

error: lvalue required as left operand of assignment con[5] = 10;

Evidently, we’re missing something. When we wrote con[5] = 10 , what we wanted was a value of 10 to be stored in the memory location pointed to by con[5] . But take a look at what our operator[] method returns. It just returns data[index] , which is nothing but a value and there’s no way on earth your program can find the memory address for that returned value.

Umm. Wondering what to do?

C++ has got something to save us from this trouble. References. We should be returning a reference from the method in order to set the variable. You can think of it as returning an implicit pointer to a particular variable. Be careful with returning references though. For example, make sure you never ever return a reference to a local variable. NEVER. That’s just pure evil, because the stack-space allocated for the local variable will go away and you’ll be referring to nothing.

Notice the ampersand after T. That’s all you have to add.

If you’ve been following me till now, then implementing the three other methods size() , back() and empty() should be fairly easy. I’ll leave that to you, code warrior!!

Drawing to a close (but not really)

Once you put all of this together, you’ve implemented your own vector -like container from scratch, as promised! You can check here for the complete implementation. But this is nowhere near the end. There’s a lot more functionality that you can add, like implement the operator= (so that we can assign one vector to another), the erase method which can remove any arbitrary element in the vector and so on.

Written by Deep Bhattacharyya