# Sorting algorithms in Ruby, Part 2

Welcome back. This is part two of three in my “Sorting algorithms in Ruby” series. If you haven’t checked out the first one, by all means go take a quick peek: Sorting algorithms in Ruby, Part 1.

By the way, constructive comments are always welcome!

Heapsort

Heapsort is a great recursive sort algorithm. It starts by creating a heap, which is a data structure (more on that shortly).

The heap is created recursively and then used by the sort algorithm to create the new sorted array.

The algorithm has a time complexity of O (n log n). The algorithm can be implemented out of place or in-place. The latter being a little more trickier but requiring less memory to perform the sort. I went with the out of place approach in this post.

While quicksort is generally faster than heapsort, heapsort has a better worst case scenario and requires relatively few memory.

To use heapsort you must first build a heap. I had to go back and read about heaps since I had forgotten most of the details from my university computer science classes. Once I did though, everything started coming back pretty quickly. Heaps are a data structure, more precisely a binary (or ternary) tree structure. Not all trees are heaps, but all heaps are trees (binary trees actually).

A tree is a data structure where nodes have a parent (except for the root node) and can have 0 to n children nodes. In a binary tree, the nodes have between 0 to 2 children.

To qualify as a heap, a binary tree must follow these two rules:

• For a node B with parent node A. The key of B is smaller than the key of A.
• For all levels of the tree except the last one, all levels must be fully-filled. If the last level isn’t fully filled, it must filled from left to right.

Here are some illustrations of these data structures.

Heaps are typically represented in a linear array. A simple function using a formula is used to find the child and parent nodes of a node in array position i.

If the array has a zero-based index, the children of the node at index i are at:

index = 2 * i + 1
index = 2 * i + 2

To find the parent of a given node, we can use the formula:

index = floor((i – 1) / 2)

Which can be implemented in Ruby as : heap[((i – 1) / 2).floor]

Using an array seems counter intuitive at first. When heapsort was first designed, object oriented programming was still in embryonic stages. Also, using clever tricks to keep data storage space at a minimum was the norm. Even today, using an array to store the heap is a good solution on embedded systems with low memory.

To someone used to think in object-oriented principles, like myself, the heap begs to be implemented as a class. For this blog entry I decided to try out both approaches. I will present the more classical array based method. The article initially contained an object-oriented approach but as an astute reader commented the implementation was flawed.

Looking into it I noticed implementing heapsort was quicker and made more sense in the traditional approach.

I started by writing the heap code before the sort algorithm. I began by writing a first test, making it pass and then writing a second test case. Here is my test class:

```require 'test/unit'
require './heap.rb'

class HeapTest < Test::Unit::TestCase
@@unsorted = [6, 3, 5, 7, 1, 2, 8, 39, 23, 12, 34, 11, 32, 4, 13]
# This is an already generated heap, verified by hand.
@@generated_heap = [39, 34, 32, 8, 23, 11, 13, 3, 6, 1, 12, 2, 7, 4, 5]

def test_heapify_will_put_greatest_number_at_root_node
test_heap = Heap.new
test_heap.heapify(@@unsorted)
greatest_number = @@unsorted.max

assert_equal greatest_number, test_heap.heap.first
end

# test for heap property
def test_heapify_heap_will_follow_heap_property
# Heap property: child nodes have keys which are smaller or equal to their parent node's keys
test_heap = Heap.new
test_heap.heapify(@@unsorted)

(test_heap.heap.length - 1).downto(1) do |i|
node_value = test_heap.heap[i]
parent_node_value = test_heap.heap[((i - 1) / 2).floor]

assert_operator node_value, :<=, parent_node_value
end
end

def test_heapify_heap_is_equal_to_manually_calculated_heap
test_heap = Heap.new
test_heap.heapify(@@unsorted)

assert_equal @@generated_heap, test_heap.heap
end

def test_remove_removes_and_replaces_root_node
test_heap = Heap.new
test_heap.heapify(@@unsorted)

test_heap.remove()

assert_not_equal @@generated_heap.first, test_heap.heap.first
assert_equal @@generated_heap.length - 1, test_heap.heap.length
end

def test_remove_resulting_heap_will_have_34_as_root_node
test_heap = Heap.new
test_heap.heapify(@@unsorted)

test_heap.remove()

assert_equal 34, test_heap.heap.first
end

def test_remove_resulting_heap_satisfies_heap_property
# Heap property: child nodes have keys which are smaller or equal to their parent node's keys
test_heap = Heap.new
test_heap.heapify(@@unsorted)

test_heap.remove()

(test_heap.heap.length - 1).downto(1) do |i|
node_value = test_heap.heap[i]
parent_node_value = test_heap.heap[((i - 1) / 2).floor]

assert_operator node_value, :<=, parent_node_value
end
end

def test_empty_will_return_false_if_heap_contains_elements
test_heap = Heap.new
test_heap.heapify(@@unsorted)

result = test_heap.empty?()

assert !result
end

def test_empty_will_return_true_if_heap_contains_no_elements
test_heap = Heap.new
test_heap.heapify([1])
test_heap.heap.delete_at(0)

result = test_heap.empty?

assert result
end
end
```

Here is the source code for my heap implementation. This implementation builds a heap contained in an array.

```class Heap
attr_accessor :heap

def initialize()
@heap = []
end

# Creates a max-heap
def heapify(data)
data.each do |element|
@heap.push element
up_heap(@heap.length - 1)
end
end

def remove()
@heap.delete_at(0)

if not empty?
last_element = @heap.pop
@heap.insert(0, last_element)
down_heap(0)
end
end

def empty?()
return false if @heap.length > 0
return true
end

private
# method to restore heap property after adding an element to a heap
def up_heap(index)
if parent_exists?(index)
parent_index = ((index - 1) / 2).floor

if @heap[parent_index] < @heap[index]
swap(index, parent_index)
up_heap(parent_index)
end
end
end

# method to restore heap property after removing an element from the heap
def down_heap(index)
if smaller_than_children?(index)
index_of_greatest_child = index_of_greatest_child?(index)
swap(index, index_of_greatest_child)
down_heap(index_of_greatest_child)
end
end

def swap(first_index, second_index)
@heap[first_index], @heap[second_index] = @heap[second_index], @heap[first_index]
end

def parent_exists?(index)
return index != 0
end

def smaller_than_children?(index)
if @heap.length < 2 * index + 2
# element doesn't have any children
return false
elsif @heap.length == 2 * index + 2
# element only has one child
return @heap[index] < @heap[2 * index + 1]
else
# element has two children
return @heap[index] < @heap[2 * index + 1] || @heap[index] < @heap[2 * index + 2]
end
end

# This method assumes there is at least one child for the specified index.
def index_of_greatest_child?(index)
# We only have one child, it is thus the greatest
return 2 * index + 1 if @heap.length == 2 * index + 2

# We have two children
if @heap[2 * index + 1] > @heap[2 * index + 2]
return 2 * index + 1
else
return 2 * index + 2
end
end
end
```

The heap class contains the array which is accessible through an accessor.

The heapify method will take an unsorted array and will store it as a heap in the array. The implementation creates a max-heap, meaning a heap where the root node is the greatest element in the heap. The algorithm could also be created with a min-heap.

Remove, on the other hand will remove the first element on a heap (to be used by the sorting algorithm) and then recreate the heap using the recursive down_heap method.

Now for the sort algorithm and it’s test…

For the tests, I decided to create a reusable SortTest class. Heapsort was more complex to implement than my earlier sorts. Because of this, I decided to have a much more rigorous test suite. Since I’m feeling confident I will be able to use the same test cases for quicksort, I decided to put them away in a reusable SortTest class.

```require 'test/unit'
require './heap_sort.rb'
require './sort_test.rb'

class HeapSortTest < Test::Unit::TestCase
def setup
heap_sort = HeapSort.new
@sort_test = SortTest.new(heap_sort)
end

def test_empty_array
assert @sort_test.test_empty_array, "Empty array not corectly sorted"
end

def test_array_with_one_element
assert @sort_test.test_array_with_one_element, "Array with one element not correctly sorted"
end

def test_ordered_array
assert @sort_test.test_ordered_array, "Ordered array not correctly sorted"
end

def test_array_with_even_number_of_elements
assert @sort_test.test_array_with_even_number_of_elements, "Array with even number of elements not correctly sorted"
end

def test_array_with_odd_number_of_elements
assert @sort_test.test_array_with_odd_number_of_elements, "Array with odd number of elements not correctly sorted"
end

def test_medium_sized_array
assert @sort_test.test_medium_sized_array, "Medium sized array not correctly sorted"
end

def test_reversed_sorted_array
assert @sort_test.test_reversed_sorted_array, "Reversed sorted array not correctly sorted"
end

def test_large_size_array
assert @sort_test.test_large_sized_array, "Large sized array not correctly sorted"
end

def test_test_array_with_repeating_elements
assert @sort_test.test_array_with_repeating_elements, "Array with repeating elements not correctly sorted"
end
end
```
```require './heap.rb'

class HeapSort
def sort(to_sort)
heap_to_sort = Heap.new
heap_to_sort.heapify(to_sort)

sorted_array = []

until heap_to_sort.empty?
sorted_array.insert(0, heap_to_sort.heap.first)
heap_to_sort.remove()
end

return sorted_array
end
end
```

As you can see, most of the logic is in the heap class. The out of place sort algorithm is dead simple. Since the root node of a max-heap will always be it’s greatest element, we iterate over the heap taking the largest element and placing it in our new array.

Then we just have to remove the first element and make sure we reconstruct the heap correctly. After which we start over again.

That is all for now. See you next time for quicksort.

## 2 thoughts on “Sorting algorithms in Ruby, Part 2”

1. Hi,

Thanks for the detailed explaination and the lucid, self explanatory code.It has helped me a lot while I was revising all the sorting algorithms

When I tried to dry run the object based HeapSort program above, it struck to me that
a “last_inserted _node” is always going to have its left and right child nil. And since you check this for the left_child node first
– “if @last_inserted_node.left_child.nil? ”
its going to be true always . So , this function never adds a node to the right side. It keeps on adding nodes to the left subtree, thus making it more of a linked list than a tree/heap

Thanks
A.

1. Hi,

First of all thank you for your great positive comments. Secondly thanks for pointing out my mistake, I am always glad for a chance to improve.

I will correct the code and post a new version shortly.

Thanks 🙂