ADT: Data structures and encapsulation together

A data structure is a better way to store and organise data elements within the program (i.e. we do not need to resort to going to a file on the hard drive).

Just like we defined types (e.g. a Person class) to represent data, we can define types to collectively associate many Person instances.

You have heard a lot about encapsulation in this course. An Abstract Data Type (ADT) encapsulates a given data structure:

  • We define the structure in terms of its data and what operations we can perform on that data
  • We do not care about the implementation details

You have already two data structures:

  • Array
  • ArrayList




Arrays and ArrayLists

Arrays have constant sizes, programmers should ensure the index is valid when accessing an element. When creating an array developers need to specify the length of the array (number of elements) and this size cannot change during execution (runtime).

Person[] heroes = new Person[20];
heroes[0] = new Person("Ken Williamson");
heroes[1] = new Person("Virat Kohli");
...
for (int i = 0; i < heroes.length; i++){
    System.out.println("My heroes are: "+ heroes[i].getName());
}

ArrayLists have a dynamic size that can change at runtime.

import java.util.ArrayList;

ArrayList<Person> heroes = new ArrayList<>();
heroes.add(new Person("Gerard Berry"));
heroes.add(new Person("Amir Pnueli"));
heroes.add(new Person("Paul Caspi"));
for (int i = 0; i < heroes.size(); i++) {
    System.out.println("My heroes are:" + heroes.get(i).getName());
}




Abstract Data Types

An ADT contains data elements, and specifies what operations maybe performed on those elements (and the corresponding effects of those operations). NOTE: there can be many different ways of implementing a given operation.

For example, assume an airline would like to manage its booking. We can define an abstract data structure FlightSeatingwith operations:

  • reserveSeats(numberToReserve)
  • getNumberOfAvailableSeats()
  • cancelSeatReservation(seatID)

There is (purposely) no mention of the implementation. All we care about is the interface of how it gets used, and what its effects will be:

FlightSeating nz263 = new FlightSeating();
if (nz263.getNumberOfAvailableSeats() == 0){
    System.out.println(This flight is fully booked);
}
...
int reservationID = nz263.reserveSeats(numAdults + numChildren);

ADT in this course

We will be looking at standard data types, ones that you are already (conceptually) familiar with:

  • singly/doubly linked-lists
  • stacks
  • queues
  • graphs
  • binary trees

Each of these ADTs have their own set of operations that operate on the ADT itself (e.g. adding, removing elements)

You can visualize many of these using this website

We will then look at common algorithms to perform on those ADT:

  • searching
  • sorting

And get a feel for the “cost” of various operations and algorithms

Today, let us define those ADTs. In the following lectures, we will dig further into them!




Limitation of arrays

Although arrays are useful data structures, they have limitations:

  • size of the array is fixed at creation time
  • the data elements are physically arranged in memory next to each other. This is actually an advantage when it comes to performance (cache prefetching) if the array is accessed a lot for reading. However, inserting/deleting an element involves moving other elements to maintain the array, and this comes at a cost.
  • Play around with the array activities in the InteractiveDS app!

Inserting an element into heroes[4]

Edit




The Linked-List

The Linked-List solves the limitations of the array.

Each data element is associated with a node. To add a new data element, we create a new corresponding node for it.

To collectively manage the data, we link the nodes together.

Since we have access to the first node, and that node links to subsequent nodes, we have access to all the elements.

Although conceptually we visualise the nodes next to each other, they (and the data they contain) may be anywhere in memory. This way, we do not need to rearrange data in memory.

The nodes might link only to the next node (singly linked-list), or it might point to the next and previous nodes (doubly linked-list)

Edit




The singly linked-list

Edit

We have a reference (named head) to the first node in the list.

Ultimately, the head will represent the entire list and is the sole access to the data. It is through the head we operate on the list.

Typical operations on a singly linked-list include:

  • insert(), prepend(), append() new value at respective position
  • remove() the value at a particular position
  • fetch() a value at a particular position
  • get the current size() of the list

We can also have a reference (named tail) to the last node in the list




The doubly linked-list

Edit

The singly linked-list only had access to elements via the next pointer

  • At any given position in the list, we could not go “backwards”. Operations that require the node before are costly, especially as the list gets longer and longer.

The doubly linked-list overcomes this by requiring a node to store a pointer to the previous node, so we can travel up and down the list.

We also have a pointer to the head(first node) and tail(last node).

Typical operations on a doubly linked list include those of the singly linked-list, and maybe some more specific ones, e.g. insertToTail(), deleteFromTail()




The stack

Edit

A stack is a data structure that follows a Last In/First Out (LIFO) policy (unfair).

Operations are only allowed to manipulate “the top” of the stack:

  • You make a pancake, add it to the top of the other pancakes.
  • You are hungry, you eat the pancake on the top (yes, be greedy and leave the coldest ones at the bottom for someone else!).

Typical operations include:

  • push() add a new element to top of stack.
  • pop() remove top element off the stack.
  • peek() return the top element without actually removing it from the stack.
  • clear() remove all elements.
  • isEmpty() check if the stack is empty




The queue

A queue is a data structure that follows a First In/First Out (FIFO) policy (fair).

The queue is essentially a “waiting in line” scenario

  • The queue grows from the back as more people arrive.
  • The queue shrinks as the front gets served.
  • No cheating! No pushing in line! No pulling out of queue!

Typical operations include:

  • enqueue() add a new element to the end of the queue.
  • dequeue() remove first element from front of queue.
  • peek() return the first element without removing it.
  • clear() remove all elements.
  • isEmpty() check if the queue is empty.

Edit




The [binary] tree

The Linked-List was more flexible than arrays.

Stacks and queues were more controlled with specific characteristics on how the structure evolved, but were still one-dimensional.

A Tree provides an added level of hierarchy, even relationship, amongst elements.

The Tree is defined by the root node (which has no parent), and each node has some children nodes (unless it is a leaf node).

A Binary Tree is one where each node has at most 2 children.

Edit