Fundamental Data Structures: Linked Lists
Implementing a Linked List in Python
Linked lists are a fundamental data structure used in many algorithms and applications. In this Python tutorial, we will go over how to implement a basic linked list and some common operations associated with them. This blog covers a few abstract concepts of data structures and is based on a foundation of basic understanding of arrays, lists, dictionaries, and other fundamental pieces. If you are not familiar or very comfortable with the basics, check out the documentation here: Python Lists and study up!
Why Linked Lists?
Linked lists are one of the fundamental data structures used in computer science and programming. Unlike lists which store data in sequential order in memory, linked lists consist of nodes that contain data as well as a reference to the next node in the list. This allows for efficient insertion and removal of nodes as it does not require shifting elements as a list would.
Some key benefits of using linked lists are:
Efficient insertion and removal - Adding and removing elements from a linked list is very fast and constant time O(1) since you only need to update the references between nodes.
Flexible size - There is no need to pre-allocate a fixed amount of memory up front like lists. Linked lists can continue to grow dynamically as needed.
Efficient memory usage - Linked list nodes can be scattered throughout memory and don't need to take up a single contiguous block. This allows optimization of memory allocation.
Linked lists shine for applications that require frequent insertion/deletion but do not need random access to elements. One way to look at a linked list is a 'conga line' at a wedding. Everyone holds the hips of the person in front of them and their hips are held in turn by the person to their rear, excepting only those in the front and the back. The only way to add people to the line is to find the right spot and decouple that connection, then insert the new person or group of people. By using a linked list you can efficiently add and remove nodes, or dancers, as they are created and removed. The nodes point backward and forward to maintain the sequential order of commands.
Glossary of Linked List Terms
Here are a few quick definitions of some common linked list terminology:
Node - Each element in a linked list is called a node. It contains data and pointers to other nodes.
Next - The pointer in each node that refers to the next node in the list.
Head - The first node in a linked list. This serves as the entry point.
Tail - The last node in a linked list. Its next pointer refers to null/None.
Traversing - The act of visiting each node in sequence from head to tail by following the next pointers.
Pointer - The reference to another node in the list. Allows forming the chain of elements.
None - A special value indicating the end of the list or an empty pointer reference.
Index - Linked lists typically do not support indexed access unlike lists. Nodes must be traversed sequentially.
Insertion - Adding a new node involves updating a few pointers to insert it in the right place.
Deletion - Removing a node updates the previous node's next pointer to skip over the removed node.
These key terms help define the structure and operations of a linked list data structure. Understanding them is essential for implementing and working with linked lists.
Three Main Types of Linked Lists
Singly Linked List
Each node only has a reference to the next node.
Traversal is only possible in one direction.
Simplest implementation but limited functionality.
Doubly Linked List
Each node has a reference to both the next and previous nodes.
Can traverse backward and forward.
Takes more memory with the additional pointer.
Circular Linked List
The last node's next pointer refers back to the first node, forming a circle.
Useful for implementations like buffers or queues.
No real endpoint - can loop continually.
The main differences have to do with the number of references each node has and whether the list forms a continuous circle. Singly linked lists are the simplest, while doubly linked and circular linked provide more flexibility and power. The tradeoff is higher memory usage for the additional references. This blog deals with the most basic, a singly linked list.
Creating a Node Class
First, we need to create a Node
class that will represent each element in the linked list. The node will contain the data value and a pointer to the next node:
class Node:
def __init__(self, data):
self.data = data
self.next = None
This Node
class initializes with the data
value and next
pointer set to None
by default.
Implementing the Linked List Class
Now we can implement the linked list class itself:
class LinkedList:
def __init__(self):
self.head = None
def print_list(self):
curr = self.head
while curr:
print(curr.data)
curr = curr.next
The linked list class contains a head
pointer that points to the first node in the list. The printList()
method iterates through the list and prints each node's data value.
Common Linked List Methods
Insert
To insert a new node into the linked list, we'll define an insert
method:
class LinkedList:
# ...
def insert(self, index, data):
if index < 0 or index > self.length():
raise Exception("Invalid index")
newNode = Node(data)
if index == 0:
newNode.next = self.head
self.head = newNode
return
curr = self.head
for i in range(index - 1):
curr = curr.next
newNode.next = curr.next
curr.next = newNode
Validate the index is within the bounds of the list length. If index is 0, insert at head.
Otherwise, traverse to the node before the insertion index. Set the new node's next value to the node at the index.
Set the previous node's next value to the new node.
Add to Tail
Adds a new node to the end of the linked list.
class LinkedList:
# ...
def add(self, data):
newNode = Node(data)
if self.head is None:
self.head = newNode
else:
curr = self.head
while curr.next is not None:
curr = curr.next
curr.next = newNode
Iterate to the tail node, then set next to the new node.
Push to Head
Adds a new node at the beginning of the linked list.
class LinkedList:
# ...
def push(self, data):
newNode = Node(data)
newNode.next = self.head
self.head = newNode
Update the head to point to the new node.
Pop Head
Removes the first node in the linked list.
class LinkedList:
# ...
def pop(self):
if self.head is None:
return None
head = self.head
self.head = self.head.next
return head.data
Store the head node, update head to the next node, and return data.
Remove by Value
Searches for and removes a node by its value.
class LinkedList:
# ...
def remove(self, value):
curr = self.head
prev = None
while curr is not None:
if curr.data == value:
if prev is None:
self.head = curr.next
else:
prev.next = curr.next
return
prev = curr
curr = curr.next
Iterate over nodes, updating the previous node's next pointer to skip over the removed node.
Big O Notation and Linked Lists
Here is a comparison of the Big O runtimes for common operations on linked lists vs Python lists:
Access by index:
Linked list: O(N) - Must traverse from head to reach index
Python list: O(1) - Direct access by index
Insert/delete at index:
Linked list: O(1) - Only update a few pointer references
Python list: O(N) - Must shift all subsequent elements
Insert/delete at head:
Linked list: O(1) - Only update head pointer
Python list: O(N) - Must shift all elements
Insert/delete at tail:
Linked list: O(1) - Update tail pointer and prev node reference
Python list: O(1) - Append to end of list
Search for value:
Linked list: O(N) - Sequential search
Python list: O(N) - Sequential search
In summary, linked lists have better insertion and deletion performance compared to Python lists because they don't have to shift elements around. However, they lack fast indexing and random access.
Choosing between a linked list or a Python list depends on the specific operations needed. Linked lists work well for queues, stacks, and other use cases requiring frequent insertions and deletions.
Final code:
class Node:
def __init__(self, data):
self.data = data
self.next = None
class LinkedList:
def __init__(self):
self.head = None
def push(self, new_data):
new_node = Node(new_data)
new_node.next = self.head
self.head = new_node
def pop(self):
if self.head is None:
return None
popped = self.head
self.head = self.head.next
return popped.data
def add(self, new_data):
new_node = Node(new_data)
if self.head is None:
self.head = new_node
return
last = self.head
while last.next:
last = last.next
last.next = new_node
def remove(self, value):
curr = self.head
prev = None
while curr is not None:
if curr.data == value:
if prev is None:
self.head = curr.next
else:
prev.next = curr.next
return
prev = curr
curr = curr.next
def printList(self):
curr = self.head
while curr:
print(curr.data)
curr = curr.next
myList = LinkedList()
myList.push(1)
myList.push(2)
myList.add(3)
myList.printList()
print("Popped: " + str(myList.pop()))
myList.printList()