Lecture 7: Faster sorting: Merge Sort and Quicksort
\( \newcommand\bigO{\mathrm{O}} \)
1 Divide and conquer
- Subdivide a larger problem into smaller parts
- Solve each smaller part
- Combine the solutions of the smaller sub problems into a solution of the larger problem
- We see this pattern in recursive problems, where a problem can be split into smaller instances of itself
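The three steps above can be sketched on a tiny problem: finding the largest value in a range. This is only an illustration; `MaxInRange` is a hypothetical helper introduced here, not part of the lecture code, and it assumes the range is non-empty.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the lecture): returns the largest value in
// v[lo, hi) using divide and conquer. Assumes lo < hi.
int MaxInRange(const std::vector<int>& v, std::size_t lo, std::size_t hi) {
  // Base case: a range holding one element is its own maximum.
  if (hi - lo == 1) {
    return v[lo];
  }
  // Divide: split the range into two halves.
  std::size_t mid = lo + (hi - lo) / 2;
  // Solve each smaller part recursively.
  int left_max = MaxInRange(v, lo, mid);
  int right_max = MaxInRange(v, mid, hi);
  // Combine: the answer for the whole range is the larger of the two.
  return std::max(left_max, right_max);
}
```

Calling `MaxInRange(v, 0, v.size())` on a non-empty vector `v` covers the whole vector. Both sorting algorithms in this lecture follow the same divide, solve, combine shape.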
2 Divide and conquer sorting algorithms
- We've talked about quadratic sorting algorithms
- Bubble sort, selection sort, insertion sort
- These run in \(\bigO(n^2)\) time in the worst and the average case.
- Better sorting algorithms exist
- They can improve our run time to \(\bigO(n\log n)\) in the worst case.
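To get a feel for the gap between the two growth rates, take \(n = 10^6\) as an illustrative input size (an assumption chosen here, not a figure from the lecture) and ignore constant factors:

\[ n^2 = 10^{12}, \qquad n\log_2 n \approx 10^6 \times 19.93 \approx 2 \times 10^7 \]

Under these assumptions the quadratic algorithms do roughly \(5 \times 10^4\) times more work.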
3 Mergesort
- Idea:
- Break the array into sub arrays of size 1.
- Merge pairs of small sorted sub arrays to form a larger sorted array.
- Repeat until the entire array has been merged.
- Sorting is done "bottom-up"
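Taken literally, the idea above can be written without recursion: start with runs of size 1 and keep merging neighbouring runs until one run covers the whole array. The sketch below is an assumption-laden illustration, not the lecture's implementation: `BottomUpMergeSort` is a hypothetical name, and the merge step is delegated to the standard library's `std::inplace_merge`.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical name (not from the lecture): iterative, bottom-up merge sort.
template <typename T>
void BottomUpMergeSort(std::vector<T>& v) {
  const std::size_t n = v.size();
  // width is the length of the runs that are already sorted: 1, 2, 4, ...
  for (std::size_t width = 1; width < n; width *= 2) {
    // Merge each pair of neighbouring runs [lo, mid) and [mid, hi).
    for (std::size_t lo = 0; lo + width < n; lo += 2 * width) {
      const std::size_t mid = lo + width;
      const std::size_t hi = std::min(lo + 2 * width, n);
      // std::inplace_merge merges two consecutive sorted ranges.
      std::inplace_merge(v.begin() + lo, v.begin() + mid, v.begin() + hi);
    }
  }
}
```

A run of size 1 is trivially sorted, which is why the loop can start merging right away.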
4 Mergesort algorithm
#include <utility>  // move
#include <vector>
using namespace std;  // the code below uses unqualified vector and move

// We need to use `typename vector<T>::iterator` to tell the compiler
// that `iterator` is a type member. See the quicksort implementation for
// an alternative.
template<typename T>
void Merge(typename vector<T>::iterator begin,
           const typename vector<T>::iterator middle,
           const typename vector<T>::iterator end) {
  // a temporary vector that will contain the sorted values; we will move the
  // elements back after merging.
  vector<T> tmp;
  tmp.reserve(end - begin);

  // maintain two needles: left will go begin->middle, right will go
  // middle->end; we move whichever element is smaller to the next spot in tmp.
  auto left = begin;
  auto right = middle;
  while (left != middle && right != end) {
    if (*left <= *right) {
      tmp.push_back(move(*left));
      ++left;
    } else {
      tmp.push_back(move(*right));
      ++right;
    }
  }

  // one of left or right might run out before the other, finish up the
  // remaining work
  while (left != middle) {
    tmp.push_back(move(*(left++)));
  }
  while (right != end) {
    tmp.push_back(move(*(right++)));
  }

  // move the values back
  for (auto it = tmp.begin(); it != tmp.end(); ++it, ++begin) {
    *begin = move(*it);
  }
}

template<typename T>
void MergeSortImpl(typename vector<T>::iterator begin,
                   typename vector<T>::iterator end) {
  if (end - begin > 1) {
    auto middle = begin + (end - begin) / 2;
    MergeSortImpl<T>(begin, middle);
    MergeSortImpl<T>(middle, end);
    Merge<T>(begin, middle, end);
  }
}

template<typename T>
void MergeSort(vector<T>& v) {
  MergeSortImpl<T>(v.begin(), v.end());
}

// download the code from the lecture for the main function
5 Mergesort Analysis
- Best-case: \(\bigO(n\log n)\)
- Average-case: \(\bigO(n\log n)\)
- Worst-case: \(\bigO(n\log n)\)
- Requires \(\bigO(n)\) additional space to merge the sorted sub arrays into a sorted array
- Time / space tradeoff
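A short sketch of where the \(\bigO(n\log n)\) bound comes from. It assumes, for simplicity, that \(n\) is a power of two and that merging \(n\) elements costs \(cn\) for some constant \(c\); both \(T\) and \(c\) are notation introduced here, not in the lecture. Sorting \(n\) elements means sorting two halves and merging them:

\[ T(n) = 2\,T(n/2) + cn, \qquad T(1) = c \]

Expanding the recurrence \(k\) times gives

\[ T(n) = 2^k\,T(n/2^k) + k\,cn \]

and stopping at \(k = \log_2 n\), where the sub arrays have size 1:

\[ T(n) = n\,T(1) + cn\log_2 n = cn + cn\log_2 n = \bigO(n\log n) \]

The split point never depends on the values being sorted, which is why the best, average, and worst cases all share the same bound.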
6 Quicksort
- Idea:
- Can subdivide array based on a "pivot" value.
- Place elements < pivot on the left side of the array
- Place elements >= pivot on the right side of the array
- Repeat for each left / right portion of the array
- When every sub array has size <= 1, the entire array is sorted
- Sorting is done "top-down"
- Not stable!
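A single subdivision step can be sketched with the standard library's `std::partition`. This is only an illustration of the idea, not the partition routine used in the lecture; `SplitAroundPivot` is a hypothetical helper, and the pivot value is passed in rather than chosen from the array.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the lecture): rearranges v so that every
// element < pivot_value comes before every element >= pivot_value, and
// returns the index at which the second group starts.
std::size_t SplitAroundPivot(std::vector<int>& v, int pivot_value) {
  auto boundary = std::partition(
      v.begin(), v.end(),
      [pivot_value](int x) { return x < pivot_value; });
  return static_cast<std::size_t>(boundary - v.begin());
}
```

`std::partition` makes no promise about the relative order of elements inside each group, which is one way to see why a sort built from such steps is not stable.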
7 Quicksort Algorithm
#include <utility>  // swap
#include <vector>
using namespace std;  // the code below uses unqualified vector and swap

template<typename Iterator>
Iterator Partition(Iterator begin, Iterator end) {
  // Choose the first value as the pivot
  Iterator pivot = begin;

  // We will use two iterators to go through the array in opposite directions,
  // and find elements that are out of order relative to the pivot.
  Iterator left = begin + 1;
  Iterator right = end - 1;
  while (left <= right) {
    // advance left while *left <= *pivot
    while (left != end && *left <= *pivot) {
      ++left;
    }
    // retreat right while *right > *pivot
    while (right != begin /* unnecessary */ && *right > *pivot) {
      --right;
    }
    // swap *left and *right if they are in the wrong order
    if (left < right) {
      swap(*left, *right);
    }
  }

  // the final pivot should be at the index of `right`, put it in place
  swap(*pivot, *right);
  return right;  // the location of the pivot
}

template<typename Iterator>
void QuicksortImpl(Iterator begin, Iterator end) {
  if (end - begin <= 1) {
    return;
  }
  Iterator pivot = Partition(begin, end);

  // sort the smaller array first, then the larger one
  if (pivot - begin > end - pivot - 1) {
    QuicksortImpl(pivot + 1, end);
    QuicksortImpl(begin, pivot);
  } else {
    QuicksortImpl(begin, pivot);
    QuicksortImpl(pivot + 1, end);
  }
}

template<typename T>
void Quicksort(vector<T>& v) {
  QuicksortImpl(v.begin(), v.end());
}
8 Quicksort Analysis
- Best-case: \(\bigO(n\log n)\)
- Average-case: \(\bigO(n\log n)\)
- Worst-case: \(\bigO(n^2)\)
- Quicksort's performance depends on the pivot value chosen; a bad pivot leads to the worst case.
- For example, with the first element as the pivot, an array that is already in sorted order triggers the worst case.
- If the pivot is the least or greatest value in the array, then one sub array is empty and the other holds all the remaining elements.
- An optimization that tries to prevent this scenario is to pick a few values from the array at random and use their median as the pivot.
- Unlike (our version of) Mergesort, Quicksort does not require additional buffer space and can sort the array in-place.
- It still needs \(\bigO(\log n)\) extra space. Why?