Lecture 7: Faster sorting: Merge Sort and Quicksort

\( \newcommand\bigO{\mathrm{O}} \)

1 Divide and conquer

  • Subdivide a larger problem into smaller parts
  • Solve each smaller part
  • Combine solutions of smaller sub problems back into the larger problem
  • We see this pattern in recursive problems where they can be subdivided
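
The three steps above can be sketched on a problem smaller than sorting: finding the largest value in a range. `MaxInRange` is a hypothetical helper written for illustration, not part of the lecture code, and it assumes the range is non-empty.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the lecture): returns the largest value in
// v[lo, hi) by divide and conquer. Assumes lo < hi.
int MaxInRange(const std::vector<int>& v, std::size_t lo, std::size_t hi) {
  // Base case: a range with one element is its own maximum.
  if (hi - lo == 1) {
    return v[lo];
  }
  // Divide: split the range in half.
  std::size_t mid = lo + (hi - lo) / 2;
  // Conquer: solve each half recursively.
  int left_max = MaxInRange(v, lo, mid);
  int right_max = MaxInRange(v, mid, hi);
  // Combine: the answer is the larger of the two partial answers.
  return std::max(left_max, right_max);
}
```

Calling `MaxInRange(v, 0, v.size())` covers the whole vector. Mergesort and Quicksort follow the same shape, with more work in the combine step (Mergesort) or the divide step (Quicksort).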

2 Divide and conquer sorting algorithms

  • We've talked about quadratic sorting algorithms
    • Bubble sort, selection sort, insertion sort
      • These run in \( \bigO(n^2) \) time in the worst and the average case. Better sorting algorithms exist.
      • Divide and conquer can improve the run time to \( \bigO(n\log n) \) in the worst case.

3 Mergesort

  • Idea:
    • Break an array into sub arrays of size 1 (a one-element array is already sorted).
    • Merge pairs of sorted sub arrays together to form a larger sorted array.
    • Repeat until the entire array has been merged.
    • Sorting is done "bottom-up"
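
To make the "bottom-up" order visible, here is a minimal iterative sketch. It is not the lecture's implementation (that one is recursive): `BottomUpMergeSort` is a hypothetical name, and the merging is delegated to the standard library's `std::inplace_merge` so the loop structure stays in focus.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical illustration (not the lecture's implementation): an iterative
// merge sort that makes the "bottom-up" order explicit.
void BottomUpMergeSort(std::vector<int>& v) {
  const std::size_t n = v.size();
  // width is the length of the already-sorted runs: 1, 2, 4, ...
  for (std::size_t width = 1; width < n; width *= 2) {
    // Merge each pair of adjacent sorted runs [lo, mid) and [mid, hi).
    for (std::size_t lo = 0; lo + width < n; lo += 2 * width) {
      const std::size_t mid = lo + width;
      const std::size_t hi = std::min(lo + 2 * width, n);
      std::inplace_merge(v.begin() + lo, v.begin() + mid, v.begin() + hi);
    }
  }
}
```

The recursive version performs the same merges; the recursion only serves to reach the size-1 sub arrays before any merging starts.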

4 Mergesort algorithm

#include <utility>  // move, swap
#include <vector>
using namespace std;

// We need to use `typename vector<T>::iterator` to tell the compiler
// that `iterator` is a type member. See the quicksort implementation for
// an alternative.
template<typename T>
void Merge(typename vector<T>::iterator begin,
           const typename vector<T>::iterator middle,
           const typename vector<T>::iterator end) {
  // a temporary vector that will contain the sorted vector, we will move the
  // elements back after merging.
  vector<T> tmp;
  tmp.reserve(end - begin);

  // maintain two needles: left will go begin->middle, right will go
  // middle->end; at each step we move whichever element is smaller into tmp.
  auto left = begin;
  auto right = middle;

  while (left != middle && right != end) {
    if (*left <= *right) {
      tmp.push_back(move(*left));
      ++left;
    } else {
      tmp.push_back(move(*right));
      ++right;
    }
  }

  // one of left or right might run out before the other, finish up the
  // remaining work
  while (left != middle) {
    tmp.push_back(move(*(left++)));
  }
  while (right != end) {
    tmp.push_back(move(*(right++)));
  }

  // move the values back
  for (auto it = tmp.begin(); it != tmp.end(); ++it, ++begin) {
    *begin = move(*it);
  }
}

template<typename T>
void MergeSortImpl(typename vector<T>::iterator begin,
                   typename vector<T>::iterator end) {
  if (end - begin > 1) {
    auto middle = begin + (end - begin) / 2;

    MergeSortImpl<T>(begin, middle);
    MergeSortImpl<T>(middle, end);
    Merge<T>(begin, middle, end);
  }
}

template<typename T>
void MergeSort(vector<T>& v) {
  MergeSortImpl<T>(v.begin(), v.end());
}

// download the code from the lecture for the main function

5 Mergesort Analysis

  • Best-case: \(\bigO(n\log n)\)
  • Average-case: \(\bigO(n\log n)\)
  • Worst-case: \(\bigO(n\log n)\)
  • Requires \(\bigO(n)\) additional space to merge two sorted sub arrays into one sorted array
    • Time / space tradeoff
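
A sketch of where the \(\bigO(n\log n)\) bound comes from, assuming for simplicity that \(n\) is a power of 2 and that merging \(n\) elements costs at most \(cn\) for some constant \(c\) (both \(c\) and \(T\) are notation introduced here, not in the lecture). The running time \(T(n)\) satisfies

\[ T(n) = 2\,T(n/2) + cn, \qquad T(1) = c. \]

Expanding the recurrence \(k\) times gives

\[ T(n) = 2^k\,T(n/2^k) + k\,cn, \]

and stopping at \(k = \log_2 n\), where the sub arrays have size 1,

\[ T(n) = n\,T(1) + cn\log_2 n = cn + cn\log_2 n = \bigO(n\log n). \]

The split is always in the middle regardless of the input values, which is why the best, average, and worst cases coincide.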

6 Quicksort

  • Idea:
    • Can subdivide array based on a "pivot" value.
      • Place elements <= pivot on the left side of the array
      • Place elements > pivot on the right side of the array
      • Repeat for each left / right portion of the array
    • When sub array sizes are all <= 1, the entire array is sorted
    • Sorting is done "top-down"
  • Not stable!

7 Quicksort Algorithm

template<typename Iterator>
Iterator Partition(Iterator begin, Iterator end) {
  // Choose the first value as the pivot
  Iterator pivot = begin;
  // We will use two iterators to go through the array in opposite direction,
  // and find some elements out-of-order-relative-to-the-pivot.
  Iterator left = begin + 1;
  Iterator right = end - 1;

  while (left <= right) {
    // advance `left` to the right while *left <= *pivot
    while (left != end && *left <= *pivot) {
      ++left;
    }
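    // step `right` to the left while *right > *pivot; the `right != begin`
    // check is unnecessary because *begin is the pivot, so the comparison
    // already fails there
    while (right != begin /* unnecessary */ && *right > *pivot) {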
      --right;
    }

    // swap *left and *right if they are in the wrong order
    if (left < right) {
      swap(*left, *right);
    }
  }

  // the final pivot should be at the index of `right`, put it in place
  swap(*pivot, *right);
  return right; // the location of the pivot
}

template<typename Iterator>
void QuicksortImpl(Iterator begin, Iterator end) {
  if (end - begin <= 1) {
    return;
  }

  Iterator pivot = Partition(begin, end);

  // sort the smaller array first, then the larger one
  if (pivot - begin > end - pivot - 1) {
    QuicksortImpl(pivot + 1, end);
    QuicksortImpl(begin, pivot);
  } else {
    QuicksortImpl(begin, pivot);
    QuicksortImpl(pivot + 1, end);
  }
}

template<typename T>
void Quicksort(vector<T>& v) {
  QuicksortImpl(v.begin(), v.end());
}

8 Quicksort Analysis

  • Best-case: \(\bigO(n\log n)\)
  • Average-case: \(\bigO(n\log n)\)
  • Worst-case: \(\bigO(n^2)\)
    • Quicksort performs poorly when the chosen pivot splits the array unevenly.
      • Try it: run this algorithm on an array that is already in sorted order.
    • If the pivot is the least or greatest value in the array, one sub array is empty and the other has all the remaining elements.
    • An optimization that tries to prevent this scenario is to pick a few candidate values from the array at random and use their median as the pivot.
  • Unlike (our version of) Mergesort, Quicksort does not require additional buffer space and can sort the array in-place.
  • It still needs \(\bigO(\log n)\) extra space. Why?
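
The pivot-selection optimization mentioned above can be sketched as follows. `MedianOfThree` and `ChoosePivot` are hypothetical helpers, not part of the lecture code; the sketch assumes a non-empty range of random-access iterators and a sample size of three.

```cpp
#include <iterator>
#include <random>

// Hypothetical helper: returns whichever of a, b, c points at the median value.
template <typename Iterator>
Iterator MedianOfThree(Iterator a, Iterator b, Iterator c) {
  if (*a < *b) {
    if (*b < *c) return b;     // a < b < c
    return (*a < *c) ? c : a;  // a < c <= b, or c <= a < b
  }
  if (*a < *c) return a;       // b <= a < c
  return (*b < *c) ? c : b;    // b < c <= a, or c <= b <= a
}

// Hypothetical helper: samples three random positions in [begin, end) and
// returns an iterator to the median of the sampled values.
// Assumes the range is non-empty.
template <typename Iterator, typename Rng>
Iterator ChoosePivot(Iterator begin, Iterator end, Rng& rng) {
  using Diff = typename std::iterator_traits<Iterator>::difference_type;
  std::uniform_int_distribution<Diff> pick(0, (end - begin) - 1);
  return MedianOfThree(begin + pick(rng), begin + pick(rng), begin + pick(rng));
}
```

Since `Partition` above always uses `*begin` as the pivot, one way to combine the two would be to call `swap(*begin, *ChoosePivot(begin, end, rng))` at the top of `Partition`, with `rng` being, for example, a `std::mt19937`. This makes the worst case unlikely on sorted input but does not eliminate it.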

Author: Mehmet Emre

The material for this class is based on Prof. Richert Wang's material for CS 32