Kruskal Algorithm : All you need to know

Kruskal’s Algorithm: A Greedy Approach to Minimum Spanning Trees

Spanning trees are fundamental to network design, cluster analysis, and routing protocols. In this guide, we will break down Kruskal’s Algorithm, one of the most elegant and widely-used greedy algorithms in computer science.

Before moving to understanding more of Kruskal Algorithm, you should be aware of below tersm

Term	Definition
Spanning Tree	A tree that connects all vertices using minimum edges (V-1 for V vertices)
Minimum Spanning Tree (MST)	A spanning tree with minimum total edge weight
Cycle	A path that starts and ends at the same vertex
Acyclic	A graph with no cycles

1. The Spanning Tree Problem

Let’s define the problem we are trying to solve:

Given: A connected, weighted, and undirected graph (G = (V, E)), where (V) is the set of vertices (nodes) and (E) is the set of edges.
Goal: Find a subset of edges that connects all vertices in the graph together without any cycles, such that the total sum of the edge weights is minimized.
Output: A set of (V – 1) edges forming the Minimum Spanning Tree (MST).

What exactly is a Spanning Tree?

A spanning tree of a graph is a subgraph that:

Spans: It connects all (V) vertices.
Is a Tree: It is acyclic (contains no cycles).
Has minimal edges: It contains exactly (V – 1) edges.

A Minimum Spanning Tree (MST) is simply a spanning tree whose sum of edge weights is the smallest possible among all potential spanning trees.

2. The Core Insight

Kruskal’s algorithm works greedily. It sorts all edges in ascending order of their weights and adds them one by one, skipping any edge that would form a cycle.

This greedy strategy is mathematically justified by the Cut Property:

[!IMPORTANT]
The Cut Property:
For any cut (partition) of the vertices in a graph, the minimum-weight edge crossing the cut must belong to some Minimum Spanning Tree of the graph.

By always selecting the cheapest available edge that connects two previously disconnected components, Kruskal’s algorithm builds up the optimal global solution through locally optimal choices.

3. Intuition with an Analogy: The Cable Network Planner

Imagine you are a network engineer tasked with connecting a series of remote cities with fiber-optic cables:

You have a map of all cities and a list of all potential connection paths along with their costs.
Your objective is simple: Connect all cities so they can communicate with one another using the lowest possible budget.

Your Strategy:

Sort your list of potential connections from cheapest to most expensive.
Buy the cheapest cable on your list.
Move to the next cheapest. If it connects a city that is not already connected to your network (either directly or indirectly), buy it.
If a cable would connect two cities that already have a path between them (e.g., City A is connected to B, and B to C, so buying a cable A-C is redundant), skip it! Doing so would waste your budget and create a redundant loop (cycle).
Stop once all cities are connected.

This is exactly how Kruskal’s algorithm behaves.

4. DSU: The Engine of Kruskal’s Algorithm

To quickly check if adding an edge will form a cycle, Kruskal’s algorithm relies on a highly efficient data structure: the Disjoint Set Union (DSU) or Union-Find.

The DSU manages a collection of disjoint (non-overlapping) sets. Each set represents a connected component in our graph. It supports three core operations:

make_set(x): Initializes a new set containing only element (x).
find(x): Returns the representative (or “root”) of the set containing (x). If two elements have the same representative, they belong to the same component.
union(x, y): Merges the set containing (x) with the set containing (y).

5. Step-by-Step Algorithm

Sort: Sort all edges in the graph in ascending order of their weights.
Initialize: Initialize the DSU such that each vertex (v \in V) is in its own independent set.
Iterate: For each sorted edge (e = (u, v)) with weight (w):

Call find(u) and find(v) to locate their components.
If the components are different (find(u) != find(v)):
- Add edge ((u, v)) to your MST.
- Call union(u, v) to merge the sets.
If they are the same (find(u) == find(v)):
- Skip the edge to prevent a cycle.

Terminate: Stop when your MST has exactly (V – 1) edges.

6. Pseudocode

def kruskal_mst(graph):
    # Initialize MST
    mst = []
    mst_weight = 0

    # 1. Retrieve and sort all edges by weight
    edges = graph.get_all_edges() # Returns list of (weight, u, v)
    edges.sort()

    # 2. Initialize DSU
    dsu = DisjointSet(graph.num_vertices)

    # 3. Iterate through sorted edges
    for weight, u, v in edges:
        # Check if u and v are in different components
        if dsu.find(u) != dsu.find(v):
            mst.append((u, v, weight))
            mst_weight += weight
            dsu.union(u, v)

            # Optimization: Stop early if MST has V-1 edges
            if len(mst) == graph.num_vertices - 1:
                break

    return mst, mst_weight

7. Step-by-Step Execution Example

Let’s trace the algorithm on a graph with 5 vertices (A, B, C, D, E) and 8 weighted edges.

The Initial Graph

The graph is composed of 5 vertices and 8 weighted edges:

Vertices (V): $A, B, C, D, E$
Weighted Edges (E):
- B-C: Weight 1
- A-C: Weight 2
- D-E: Weight 2
- A-B: Weight 4
- B-D: Weight 5
- B-E: Weight 6
- C-D: Weight 8
- C-E: Weight 10

Sorted Edges List

Priority	Edge	Weight	Action
1	B-C	1	Evaluated first
2	A-C	2	–
3	D-E	2	–
4	A-B	4	–
5	B-D	5	–
6	B-E	6	–
7	C-D	8	–
8	C-E	10	Evaluated last

Step-by-Step Trace

Step 1: Initialize

Current components: {A}, {B}, {C}, {D}, {E}
MST edges: []
MST total weight: 0

Step 2: Process Edge (B-C, Weight 1)

find(B) = B and find(C) = C.
Since (B) and (C) are in different components, we ADD this edge to the MST.
Merge components: union(B, C).
Current components: {A}, {B, C}, {D}, {E}
MST edges: [B-C]
MST total weight: 1

Step 3: Process Edge (A-C, Weight 2)

find(A) = A and find(C) = C (represented by component {B, C}).
Since (A) and (C) are in different components, we ADD this edge to the MST.
Merge components: union(A, C).
Current components: {A, B, C}, {D}, {E}
MST edges: [B-C, A-C]
MST total weight: 3

Step 4: Process Edge (D-E, Weight 2)

find(D) = D and find(E) = E.
Since (D) and (E) are in different components, we ADD this edge to the MST.
Merge components: union(D, E).
Current components: {A, B, C}, {D, E}
MST edges: [B-C, A-C, D-E]
MST total weight: 5

Step 5: Process Edge (A-B, Weight 4)

find(A) = C (representative of {A, B, C}) and find(B) = C (representative of {A, B, C}).
Since (A) and (B) belong to the same component, adding this edge would create a cycle ((A-B-C-A)).
We SKIP this edge.
MST edges: [B-C, A-C, D-E]
MST total weight: 5

Step 6: Process Edge (B-D, Weight 5)

find(B) = C (representative of {A, B, C}) and find(D) = E (representative of {D, E}).
Since (B) and (D) are in different components, we ADD this edge to the MST.
Merge components: union(B, D).
Current components: {A, B, C, D, E}
MST edges: [B-C, A-C, D-E, B-D]
MST total weight: 10
Status: MST is complete because it has selected (V-1 = 4) edges.

Final Result

All vertices are now fully connected with a minimal cost path:

[A \longleftrightarrow C \longleftrightarrow B \longleftrightarrow D \longleftrightarrow E]

Selected Edges: B-C (1), A-C (2), D-E (2), B-D (5)
Total weight: 10

8. Complexity Analysis

Time Complexity: (O(E \log E)) or (O(E \log V))

Sorting Edges: Sorting (E) edges takes (O(E \log E)) time.
DSU Operations: We perform at most (2E) find operations and (V-1) union operations. With path compression and union by rank, these take (O(E \cdot \alpha(V))) time.
Dominant term: Since (\alpha(V)) is practically a constant, the sorting step dominates the runtime. Thus, the total complexity is:
[O(E \log E)]
Since (E \le V^2), we have (\log E \le \log(V^2) = 2\log V). This means the time complexity can also be written as:
[O(E \log V)]

Space Complexity: (O(V + E))

(O(E)) to store the list of edges.
(O(V)) for the DSU parent and rank arrays.

9. Kruskal’s Algorithm: Pros and Cons

Advantages & Strengths

Optimal for Sparse Graphs: Runs extremely fast when (E) is relatively small (e.g. (E \approx V)).
Simple Intuition: Very straightforward to understand, explain, and write.
Negative Edge Weights: Works perfectly with negative weights (unlike Dijkstra’s shortest path algorithm).
Minimum Spanning Forests: If the graph is disconnected, it naturally terminates by finding the MST for each isolated component (producing a Spanning Forest).

Limitations & Constraints

Undirected Graphs Only: Cannot be applied directly to directed graphs (finding an MST on directed graphs requires Edmonds’ Algorithm).
Inefficient for Dense Graphs: When (E \approx V^2), sorting all edges takes (O(V^2 \log V)). Prim’s algorithm with a Fibonacci heap runs in (O(E + V \log V)), which is faster.
Not for Shortest Paths: It is crucial to remember that an MST minimizes total network cost, not the distance between specific pairs of vertices. If you need the shortest path from a source to other vertices, use Dijkstra’s Algorithm.

10. Kruskal’s vs. Prim’s Algorithm

Feature	Kruskal’s Algorithm	Prim’s Algorithm
Approach	Edge-centric (greedily adds sorted edges)	Vertex-centric (grows a tree from a source)
Best Suited For	Sparse graphs ((E \approx V))	Dense graphs ((E \approx V^2))
Time Complexity	(O(E \log E))	(O(E \log V)) (with Binary Heap) (O(E + V \log V)) (with Fibonacci Heap)
Cycle Detection	Handled by Disjoint Set Union (DSU)	Handled by tracking visited nodes
Data Structure	Disjoint Set Union (DSU), Array	Priority Queue / Heap

Kruskal’s Algorithm Step-by-Step Visualizer

Step 1 / 7

Speed:

Graph Canvas

Disjoint Set (Union-Find) State

Element	A	B	C	D	E
Parent	A	B	C	D	E
find(x)	A	B	C	D	E

Current execution

Step 1: Initialization

Initialize Disjoint Set (Union-Find) structures. Each vertex starts as its own component. Sort all edges by weight.

Connected Components Partition

{A} {B} {C} {D} {E}

MST weight 0

MST Edges Selected

None

Sorted Edges Queue

1. B-C (weight 1) Unprocessed

Key Strengths

✔
Optimal for Sparse Graphs: Time complexity O(E log E) makes it highly efficient when the edge count is low.
✔
Simple Union-Find Logic: Cycle detection takes nearly constant time O(α(V)) using path compression and union by rank.
✔
Disconnected Graphs: Can easily find Minimum Spanning Forests if the input graph is not fully connected.

Limitations & Complexity

✘
Dense Graph Slowdown: In very dense graphs where E ≈ V², Prim’s algorithm with a Fibonacci heap can outperform Kruskal’s.
✘
Sorting Overhead: The sorting phase takes O(E log E), which dominates the overall execution time.
i
Space Complexity: Needs O(V) space for disjoint-set representatives and ranks, and O(E) for storage.

WellAlgo