Heap Sort

Posted by Beetle B. on Tue 19 August 2014

One way to sort a list is to use the heap data structure.

Given an array of \(N\) unsorted elements, build a max heap in place. Start from the “bottom”, and work your way up. Initially all leaf nodes are 1-heaps. Nothing to do with those. Then go up each parent and go through all 3-heaps, swapping elements to ensure the heap invariant holds. Then go up another level and repeat until you do it with the root.

Now that we have the heap structure, keep popping the top element until your array is sorted. This is all done in-place.

Complexity

  • Building the heap structure takes \(O(N)\).
  • Sorting after you have the heap structure takes \(\Theta(N\lg N)\) - basically each pop of the max is \(O(\lg N)\) and we do it \(N\) times. The worst and best case is \(\Theta(N\lg N)\).

Properties

  • It is usually slower than quicksort.
  • It is in-place.
  • It is unstable.
  • Because it is a binary heap, it is poor with the cache.

Implementations

Python

from random import shuffle
from heapq import heapify, heappop

def heapsort(lst, comparator=simple_compare):
    """
    Heap sort.

    Note that this does /not/ sort in place! It creates temporary memory.
    
    Keyword Arguments:
    lst        -- List to sort
    comparator -- Comparator function. It should take in two arguments, and
                  return -1 if the left is smaller, 1 if the right is smaller,
		  and 0 if they are equal. If none is provided, then it will
		  assume a numerical sort.
    is_shuffled -- Flag on whether to shuffle list.
    """
    heapify(lst)
    return [heappop(lst) for x in xrange(len(lst))]

C++

One can use the heap_sort function in the algorithms library. But you have to call make_heap first!

#include <iostream>
#include <cstdlib>
#include <vector>
#include <algorithm>

bool verify_correctness(std::vector<int>& lst)
{
  for(std::vector<int>::iterator iter = lst.begin(); 
      (iter + 1) != lst.end(); ++iter) 
    {
      if (*iter > *(iter+1))
        {
          return false;
        }
    }
  return true;
}

int comp_function(const void* a, const void* b)
{
  if ((*(int*)a) < (*(int*)b))
    return -1;
  if ((*(int*)b) < (*(int*)a))
    return 1;
  return 0;
}

int main(void) {

  std::vector<int> lst;
  unsigned int N = 10;
  for (unsigned int i; i < N; i++) 
    {
      lst.push_back(std::rand() % N);
    }

  for (std::vector<int>::size_type i=0; i!=lst.size(); ++i)
    std::cout << lst[i] << std::endl;

  std::cout << std::endl;

  std::make_heap(lst.begin(), lst.end());
  std::sort_heap(lst.begin(), lst.end());

  for (std::vector<int>::size_type i=0; i!=lst.size(); ++i)
    std::cout << lst[i] << std::endl;

  if (!verify_correctness(lst))
    {
      std::cout << "Failed!" << std::endl;
    }

  return 0;
}