[Algorithm] Merge Sort with Divide and Conquer Algorithm

※ I'm neither a native speaker nor an expert. I'm just a noob student who wants to develop the literacy in English and software knowledge. If you find any wrong content or awkward expressions, please feel free to let me know. Thank you!


There are many methods for sorting data efficiently, called 'sort algorithm'. In this section, merge sort will be dealt with. This algorithm achieves sorting things with the time complexity O(nlogn).

Divide and Conquer?

The basic idea of the 'divide and conquer' algorithm is breaking a whole complex problem into sub-problems that can be solved more easily. This involves three parts.

  1. Divide the given problem into smaller problems until it can be solvable.

  2. Conquer (or solve) the sub-problems.

  3. Combine the answers to each sub-problem and get the solution to the original problem.


The basic principle of sorting

Assume that there is a 'target data' to be sorted. In merge sort, the first step is dividing the target into two pieces until the size of the piece becomes one. To get a grasp on this easily, let's take an example. Suppose there is an array following.

8 6 1 5 9 3 4 5

First, divide into the 8 segments.

8 6 1 5 / 9 3 4 5

8 6 / 1 5 / 9 3 / 4 5

8 / 6 / 1 / 5 / 9 / 3 / 4 / 5

Then, the neighboring two parts get merged by the comparison between the first elements of each part, as shown below.

// First step
6 8 / 1 5 / 3 9 / 4 5

// Second step
// Given the first two parts, 
6 8 / 1 5
// compare 6 and 1, which are the first elements of each parts.
6 8 / 5
1
// Then, again, compare 6 and 5, which are the first elements of each parts.
6 8 
1 5
// Now, the remaining elements will follow.
1 5 6 8
// Repeat this
1 5 6 8 / 3 4 5 9

// Third step
1 3 4 5 6 7 8 9

How to write the code actually

Since divide and conquer is a recursive method, the sort function is also written as recursively. In the dividing step, we separate a whole data array into two intervals.

void merge_sort(int* target, int start, int end)
{
    // temporary memory for copying data
    int* copy = new int[end - start + 1];
    merge_inner(target, copy, start, end);
}

// sort function
void merge_inner(int* target, int* copy, int start, int end)
{
    // There must be 'end point' of recursive process.
    if (start == end) return;

    // Divide the interval into two parts.
    int middle = (start + end) >> 1;    
    merge_inner(target, copy, start, middle);
    merge_inner(target, copy, middle + 1, end);
    // After this, each two parts will be sorted.

Next, the conquering step will merge each two separate parts into one.

    // Copy the original array first,
    // because we need to store temporarily the current state of data
    for (int i = start; i <= end; ++i)
    {
        copy[i] = target[i];
    }

    // Sort (Conquer and Combine)
    int left_index = start, right_index = middle + 1;
    int target_index = start;
    while (left_index <= middle && right_index <= end)
    {
        // ascending sorting 
        // This part could be changed depending on 'how you want to sort'
        if (compare(copy[left_index], copy[right_index]))
        {
            target[target_index++] = copy[left_index++];
        }
        else
        {
            target[target_index++] = copy[right_index++];
        }
    }

    while (left_index <= middle)
    {
        target[target_index++] = copy[left_index++];
    }
    while (right_index <= end)
    {
        target[target_index++] = copy[right_index++];
    }
}

bool compare(int a, int b)
{
    return a <= b;
}

As you might noticed, the copied data is required in merge sort. Therefore, the space complexity is O(n).

Divide and conquer algorithm is used in other various problems, such as quick sort, multiplication, and FFT. There will be more chances to take a look at divide and conquer algorithm.

Edited by Sang Hyeok Park