Fundamental Data Structures: Linked Lists

Fundamental Data Structures: Linked Lists

Implementing a Linked List in Python

Linked lists are a fundamental data structure used in many algorithms and applications. In this Python tutorial, we will go over how to implement a basic linked list and some common operations associated with them. This blog covers a few abstract concepts of data structures and is based on a foundation of basic understanding of arrays, lists, dictionaries, and other fundamental pieces. If you are not familiar or very comfortable with the basics, check out the documentation here: Python Lists and study up!

Why Linked Lists?

Linked lists are one of the fundamental data structures used in computer science and programming. Unlike lists which store data in sequential order in memory, linked lists consist of nodes that contain data as well as a reference to the next node in the list. This allows for efficient insertion and removal of nodes as it does not require shifting elements as a list would.

Some key benefits of using linked lists are:

  • Efficient insertion and removal - Adding and removing elements from a linked list is very fast and constant time O(1) since you only need to update the references between nodes.

  • Flexible size - There is no need to pre-allocate a fixed amount of memory up front like lists. Linked lists can continue to grow dynamically as needed.

  • Efficient memory usage - Linked list nodes can be scattered throughout memory and don't need to take up a single contiguous block. This allows optimization of memory allocation.

Linked lists shine for applications that require frequent insertion/deletion but do not need random access to elements. One way to look at a linked list is a 'conga line' at a wedding. Everyone holds the hips of the person in front of them and their hips are held in turn by the person to their rear, excepting only those in the front and the back. The only way to add people to the line is to find the right spot and decouple that connection, then insert the new person or group of people. By using a linked list you can efficiently add and remove nodes, or dancers, as they are created and removed. The nodes point backward and forward to maintain the sequential order of commands.

Glossary of Linked List Terms

Here are a few quick definitions of some common linked list terminology:

  • Node - Each element in a linked list is called a node. It contains data and pointers to other nodes.

  • Next - The pointer in each node that refers to the next node in the list.

  • Head - The first node in a linked list. This serves as the entry point.

  • Tail - The last node in a linked list. Its next pointer refers to null/None.

  • Traversing - The act of visiting each node in sequence from head to tail by following the next pointers.

  • Pointer - The reference to another node in the list. Allows forming the chain of elements.

  • None - A special value indicating the end of the list or an empty pointer reference.

  • Index - Linked lists typically do not support indexed access unlike lists. Nodes must be traversed sequentially.

  • Insertion - Adding a new node involves updating a few pointers to insert it in the right place.

  • Deletion - Removing a node updates the previous node's next pointer to skip over the removed node.

These key terms help define the structure and operations of a linked list data structure. Understanding them is essential for implementing and working with linked lists.

Three Main Types of Linked Lists

Singly Linked List

  • Each node only has a reference to the next node.

  • Traversal is only possible in one direction.

  • Simplest implementation but limited functionality.

Doubly Linked List

  • Each node has a reference to both the next and previous nodes.

  • Can traverse backward and forward.

  • Takes more memory with the additional pointer.

Circular Linked List

  • The last node's next pointer refers back to the first node, forming a circle.

  • Useful for implementations like buffers or queues.

  • No real endpoint - can loop continually.

The main differences have to do with the number of references each node has and whether the list forms a continuous circle. Singly linked lists are the simplest, while doubly linked and circular linked provide more flexibility and power. The tradeoff is higher memory usage for the additional references. This blog deals with the most basic, a singly linked list.

Creating a Node Class

First, we need to create a Node class that will represent each element in the linked list. The node will contain the data value and a pointer to the next node:

class Node:
  def __init__(self, data):
    self.data = data
    self.next = None

This Node class initializes with the data value and next pointer set to None by default.

Implementing the Linked List Class

Now we can implement the linked list class itself:

class LinkedList:
  def __init__(self):
    self.head = None

  def print_list(self):
    curr = self.head
    while curr:
      print(curr.data)
      curr = curr.next

The linked list class contains a head pointer that points to the first node in the list. The printList() method iterates through the list and prints each node's data value.

Common Linked List Methods

Insert

To insert a new node into the linked list, we'll define an insert method:

class LinkedList:
    # ...
  def insert(self, index, data):
    if index < 0 or index > self.length():
      raise Exception("Invalid index")

    newNode = Node(data)  
    if index == 0:
      newNode.next = self.head
      self.head = newNode
      return

    curr = self.head
    for i in range(index - 1):
      curr = curr.next

    newNode.next = curr.next
    curr.next = newNode

Validate the index is within the bounds of the list length. If index is 0, insert at head.

Otherwise, traverse to the node before the insertion index. Set the new node's next value to the node at the index.

Set the previous node's next value to the new node.

Add to Tail

Adds a new node to the end of the linked list.

class LinkedList:
    # ...
  def add(self, data):
    newNode = Node(data)
    if self.head is None:
      self.head = newNode
    else:
      curr = self.head
      while curr.next is not None:
        curr = curr.next
      curr.next = newNode

Iterate to the tail node, then set next to the new node.

Push to Head

Adds a new node at the beginning of the linked list.

class LinkedList:
    # ...
  def push(self, data):
    newNode = Node(data)
    newNode.next = self.head
    self.head = newNode

Update the head to point to the new node.

Pop Head

Removes the first node in the linked list.

class LinkedList:
    # ...
  def pop(self):
    if self.head is None:
      return None

    head = self.head
    self.head = self.head.next
    return head.data

Store the head node, update head to the next node, and return data.

Remove by Value

Searches for and removes a node by its value.

class LinkedList:
    # ...
  def remove(self, value):
    curr = self.head
    prev = None

    while curr is not None:
      if curr.data == value:
        if prev is None:
          self.head = curr.next
        else:
          prev.next = curr.next
        return

      prev = curr
      curr = curr.next

Iterate over nodes, updating the previous node's next pointer to skip over the removed node.

Big O Notation and Linked Lists

Here is a comparison of the Big O runtimes for common operations on linked lists vs Python lists:

  • Access by index:

    • Linked list: O(N) - Must traverse from head to reach index

    • Python list: O(1) - Direct access by index

  • Insert/delete at index:

    • Linked list: O(1) - Only update a few pointer references

    • Python list: O(N) - Must shift all subsequent elements

  • Insert/delete at head:

    • Linked list: O(1) - Only update head pointer

    • Python list: O(N) - Must shift all elements

  • Insert/delete at tail:

    • Linked list: O(1) - Update tail pointer and prev node reference

    • Python list: O(1) - Append to end of list

  • Search for value:

    • Linked list: O(N) - Sequential search

    • Python list: O(N) - Sequential search

In summary, linked lists have better insertion and deletion performance compared to Python lists because they don't have to shift elements around. However, they lack fast indexing and random access.

Choosing between a linked list or a Python list depends on the specific operations needed. Linked lists work well for queues, stacks, and other use cases requiring frequent insertions and deletions.

Final code:

class Node:
  def __init__(self, data):
    self.data = data
    self.next = None

class LinkedList:
  def __init__(self):
    self.head = None

  def push(self, new_data):
    new_node = Node(new_data)
    new_node.next = self.head
    self.head = new_node

  def pop(self):
    if self.head is None:
      return None
    popped = self.head
    self.head = self.head.next
    return popped.data

  def add(self, new_data):
    new_node = Node(new_data)
    if self.head is None:
        self.head = new_node
        return
    last = self.head 
    while last.next:
        last = last.next
    last.next = new_node

  def remove(self, value):
    curr = self.head
    prev = None
    while curr is not None:
      if curr.data == value:
        if prev is None:
          self.head = curr.next
        else:
          prev.next = curr.next
        return
      prev = curr
      curr = curr.next

  def printList(self):
    curr = self.head
    while curr:
      print(curr.data)
      curr = curr.next

myList = LinkedList()
myList.push(1)
myList.push(2)
myList.add(3)

myList.printList()
print("Popped: " + str(myList.pop()))
myList.printList()