Divide-and-Conquer and Merge Sort

Review Sorts

Divide-and-Conquer

Divide and conquer is an algorithm, or design pattern, consisting of three steps:

  1. Divide: Divide the input into sub-sets, unless the input size is small enough to solve easily.

  2. Recur: Recursively solve the sub-sets problems.

  3. Conquer: Merge the solution from the sub-problems.

The main idea of the divide and conquer design pattern is if the problem is too large to solve easily then divide the problem into two until it is small enough to solve.  Naturally this is a recursive algorithm, but doses not have to be implemented recursively.  As we come out of the recursive call we have to merge the solutions from the sub-problems.

Merge Sort

Assume the problem is a Sequence S of n unordered objects and we want to return S sorted.  Using the Divide and conquer design pattern:

  1. Divide: If S has one element then trivial to solve and return S.  Otherwise divide S into two sequences S1 and S2 with ceiling(n/2) and floor(n/2) elements respectively.

  2. Recur: Recursively divide S1 and S2.

  3. Conquer: Put back the elements into S by merging the sorted sequences S1 and S2 into a sorted sequence.

Illustrate

using merge-sort tree form.

Note that the height of the merge-sort tree  is 2 lg n.

Merge

Merging is the hard part.
Assume that we have two sorted sequences, S1 and S2.

Illustrate

Algorithm merge(S1, S2, S):

// input:  ordered sequences S1 and S2
// output: ordered sequence S containing S1 and S2 elements.
while S1 is not empty and S2 is not empty do
if S1.first().element() <= S2.first().element() then
    S.insertLast(S1.remove(S1.first()))
else
    S.insertLast(S2.remove(S2.first()))
while S1 is not empty do
    S.insertLast(S1.remove(S1.first()))
while S2 is not empty do
    S.insertLast(S2.remove(S2.first()))

Note the time is O(n1 + n2)

Cost

Assume that inserting into first and last position is O(1), this is true for circular array or link list.

Let n be the size of S, the initial unsorted sequence.  Let v be a node of the merge-sort tree, and i the depth of the node.  Then the size of the sequence is n/2i. The time spent at both the divide and conquer phase is proportional to the size of the sequence at the node, O(n/2i). Note that each depth there are 2i nodes.  So the total time spent at the ith depth is (time at each node, O(n/2i) )* ( number of nodes, 2i) equal to n.  Therefore the total time is (time at each depth, n)*(height of the merge-sort tree, lg n) equal to O(n lg n).