Shell Sort

Posted by Beetle B. on Tue 05 August 2014

A shell sort works somewhat recursively. It first sorts every \(k\) elements (i.e. the sublist formed by elements 0, \(k\), \(2k\), \(3k\), etc. Then it starts again with a smaller \(k'\), and so on until \(k=1\) at which point you have the usual sort.

The advantage is that by the time you get to small \(k\), the list is mostly sorted.

The actual sorting algorithm used for each sublist can vary. Insertion sort is a good candidate as its performance is good for small lists (when \(k\) is large), as well as for partially sorted lists (when \(k\) is small).

A list is said to be h-sorted if every \(h\) sublist is sorted. When you then sort in the next iteration using \(k<h\), the list will be both k-sorted and h-sorted.

Every h1-sorted and h2-sorted array is also (\(a_{1}h_{1}+a_{2}h_{2}\))-sorted, for any nonnegative integers \(a_{1}\) and \(a_{2}\).

Complexity

The gap sequence is the set of \(k\) elements one uses, with 1 always being in that set.

There are many open problems related to complexity, and it is nontrivial to determine the complexity for a given gap sequence. However, a good gap sequence usually performs very well (better than quadratic).

The Wikipedia page has a list of gap sequences with their complexity.

Properties

  1. The sorting is in-place. No extra memory is needed, and is thus useful in tight memory situations.
  2. It is not stable.
  3. What gap sequence you use will impact the performance heavily.
  4. For various reasons (cache mishits as an example), it is not often used in practice. However, due to its simplicity, it is used in embedded applications.

Implementations

Python

def insertion_subsort(lst, k, start, comparator):
    """
    Perform the insertion sort on the sublist, stepping by k, and
    starting at index start.
    
    Keyword Arguments:
    lst   -- List to sort
    k     -- Step size.
    start -- Starting index.
    comparator -- Comparator function. It should take in two arguments, and
                  return -1 if the left is smaller, 1 if the right is smaller,
		  and 0 if they are equal. If none is provided, then it will
		  assume a numerical sort.
    """
    for index in xrange(start, len(lst), k):
	value = lst[index]
	previous = lst[index-k]
	offset = 0
	while comparator(value, previous) < 0:
	    lst[index - offset] = previous
	    lst[index - offset - k] = value
	    offset += k
	    if index - offset < k:
		break
	    previous = lst[index - offset - k]

def shell_sort(lst, comparator=None):
    """
    Shell sort. It uses the 3k+1 rule. It only works with a list.
    
    Keyword Arguments:
    lst        -- List to sort
    comparator -- Comparator function. It should take in two arguments, and
                  return -1 if the left is smaller, 1 if the right is smaller,
		  and 0 if they are equal. If none is provided, then it will
		  assume a numerical sort.
    """
    if not comparator:
	def comp(left, right):
	    if left < right:
		return -1
	    elif left > right:
		return 1
	    else:
		return 0
	comparator = comp

    kmax = 3 * ((len(lst) - 1)/3) + 1
    k = kmax
    while (k >= 1):
	for start in xrange(0, min(k, len(lst)-k+1)):
	    insertion_subsort(lst, k, start, comparator)
	k -= 3

C++

#include <algorithm>
#include <vector>

void shell_sort(std::vector<int>& lst);

void insertion_subsort(std::vector<int>& lst, unsigned int k, unsigned int start)
{
  for(std::vector<int>::iterator iter = lst.begin() + start + k; 
      std::distance(lst.end(), iter) < 0; iter += k) 
    {
      int value = *iter;
      int previous = *(iter - k);
      unsigned int offset = 0;
      while (value < previous) 
        {
          std::iter_swap(iter - offset, iter - offset - k);
          offset += k;
          value = *(iter - offset);
          previous = *(iter - offset - k);
          if (std::distance(lst.begin(), iter - offset - k) < 0)
            {
              break;
            }
        }
    }
}



void shell_sort(std::vector<int>& lst)
{
  int k = 3 * ((lst.size() - 1)/3) + 1;
  while (k > 0) 
    {
      for (unsigned int start=0; 
           start < std::min(k, static_cast<int>(lst.size())-k); ++start) 
        {
          insertion_subsort(lst, k, start);
        }
      k -= 3;
    }
}