# Kruskal's Algorithm in Ruby

Last week I wrote a couple of posts showing different implementations of Prim's algorithm - an algorithm using to find a minimum spanning tree in a graph - and a similar algorithm is Kruskal's algorithm.

Kruskal's algorithm also finds a minimum spanning tree but it goes about it in a slightly different way.

Prim's algorithm takes an approach whereby we select nodes and then find connecting edges until we've covered all the nodes. Kruskal's algorithm, on the other hand, drives from the edges of lowest costs and **makes sure that we don't create a cycle** in our spanning tree by adding an edge to it.

The pseudocode for Kruskal's algorithm reads like so:

- sort edges E in order of increasing cost
- let T = minimum spanning tree, m = number of edges
- for i=1 to m:
- let e = selected edge (u,v)
- if there’s no way to go between u and v using edges in T:
- add e to T

- return T

The full code is on github and the outline of the algorithm reads like so:

```
@minimum_spanning_tree = []
edges = file.drop(1).
map { |x| x.gsub(/\n/, "").split(" ").map(&:to_i) }.
map { |one, two, weight| { :from => one, :to => two, :weight => weight}}.
sort_by { |x| x[:weight]}
edges.each do |edge|
@minimum_spanning_tree << edge unless has_cycles edge
end
```

The main bit of code is a variation on depth first search to check if adding an edge creates a cycle which would mean we're adding an edge which isn't needed as part of a minimum spanning tree.

```
def has_cycles(edge)
node_one, node_two = edge[:from], edge[:to]
@minimum_spanning_tree.each { |x| x[:explored] = false }
cycle_between(node_one, node_two, @minimum_spanning_tree.dup)
end
def cycle_between(one, two, edges)
adjacent_edges = edges.select { |edge| edge[:to] == one || edge[:from] == one}
return false if adjacent_edges.empty?
adjacent_edges.reject {|edge| edge[:explored] }.each do |edge|
edge[:explored] = true
other_node = (edge[:from] == one) ? edge[:to] : edge[:from]
return true if other_node == two || cycle_between(other_node, two, edges)
end
false
end
```

cycle_between is the depth first search but we're using it to tell us whether there's already a path between two nodes in the graph and we exit as soon as we determine that.

I added some code on line 3 to reset the explored status of edges each time that has_cycles is called but I'm not sure why this is necessary because I am making a copy of @minimum spanning_tree (on line 4) before mutating it.

Otherwise I think this is a fairly standard solution. If we wanted to go from node 1 to node 3 and we had the following edges: 1 -> 4 -> 5 -> 3 we'd make these method calls:

```
# excluding weights and explored status for brevity
cycle_between(1,3, [{:from => 1, :to => 4}, {:from => 4, :to => 5}, {:from => 5, :to => 3})
cycle_between(4,3, [{:from => 1, :to => 4}, {:from => 4, :to => 5}, {:from =>; 5, :to => 3})
cycle_between(5,3, [{:from => 1, :to => 4}, {:from => 4, :to => 5}, {:from => 5, :to => 3})
```

The function would exit on that last iteration because we'd match our target node '3'.

This algorithm wasn't one of the assignments for the class but I used the same data set that was provided for Prim's algorithm and was able to get the same output so I figure it works!

##### About the author

Mark Needham is a Developer Relations Engineer for Neo4j, the world's leading graph database.