算法实现 - 快速排序(Quick Sort) - 理解版

文章目录

算法介绍
算法分析
- 核心思想
- 三个版本运行过程
- - 挖坑法
  - [Hoare 原版](#Hoare 原版)
  - 前后指针法
算法稳定性和复杂度
- 稳定性
- 时间复杂度
- - 平均情况O(nlogn)
  - [最差情况O( n 2 n^2 n2)](#最差情况O( n 2 n^2 n2))
- 空间复杂度

算法介绍

快速排序是一种高效的排序算法，由英国计算机科学家C. A. R. Hoare在1960年提出，是对冒泡排序算法的一种改进。

该算法采用 分治策略 ，其基本思想：

选择一个基准元素（pivot），通过基准元素将待排序的序列分成两部分，一部分包含所有小于基准元素的元素，另一部分包含所有大于基准元素的元素，然后递归地对这两部分进行快速排序，最终使整个数组成为一个有序的序列。

算法分析

核心思想

选择基准：从数组中选择一个元素作为基准（pivot）。通常可以选择第一个元素、最后一个元素、随机元素或中间元素。
分区操作 ：
- 遍历数组，将小于基准的元素移动到基准左侧，大于基准的元素移动到右侧。这样基准元素就会被放置到其正确的排序位置。
- 此步骤会让数组分成了两部分：左侧（小于基准）和右侧（大于基准）。
递归排序 ：
- 对于基准位置左侧和右侧的两个子数组，分别递归地运用快速排序。
- 递归结束的条件是子数组的长度小于或等于 1，此时它们已经有序。
合并结果：合并已经排序好的子数组与基准，得到一个完整的有序数组。

三个版本运行过程

以数列 a[] = {40, 20, 30, 60, 10, 50} 作为分析案例;

挖坑法

分析：

x 取索引为0 的最左值为 40，即x = 40。

从"右 --> 左"查找小于x的数: 找到满足条件的数a[j]=10，此时 j=4，i=0；然后将a[j]赋值a[i]，即将 a[4] 赋值给 a[0]；接着从左往右遍历。
从"左 --> 右"查找大于x的数: 找到满足条件的数a[i]=60，此时 i=3，j=4；然后将a[i]赋值a[j]，即将 a[3] 赋值给 a[4]；接着从右往左遍历。
从"右 --> 左"查找小于x的数: 没有找到满足条件的数。当i>=j时，停止查找；然后将x赋值给a[i]，即将 40 赋值给 a[3]。此趟遍历结束！

代码：

c 复制代码

#include <stdio.h>

/**
 * Quick sort by C Lang.
 *
 * params:
 *   arr -- the data array
 *   left -- the first index of arr
 *   right -- the last index of arr
 *
 */
void quickSort(int arr[], int left, int right)
{
    // the arr is empty or left one element
    if (left >= right)
    {
        return;
    }

    // init data
    int pivot = arr[left];
    int i = left, j = right;

    while (i < j)
    {
        // right -> left: find the first element less than pivot
        // notice: when arr[j] meet the condition `arr[j] < pivot`, it will break the while loop, indicate that current element is less than pivot.
        while (i < j && arr[j] > pivot)
            j--;
        if (i < j)
        {
            arr[i++] = arr[j];
        }

        // left -> right: find the first element great than pivot
        while (i < j && arr[i] < pivot)
            i++;
        if (i < j)
        {
            arr[j--] = arr[i];
        }

        arr[i] = pivot; // The element has benn correctly placed in its corresponding position

        quickSort(arr, left, i - 1);  // Recursive invoke in the left sub array
        quickSort(arr, i + 1, right); // Recursive invoke in the right sub array
    }
}

/**
 * show array
 */
void printArray(int array[], int n)
{
    for (size_t i = 0; i < n; i++)
    {
        printf("%d ", array[i]);
    }
    printf("\n\n");
}

int main()
{
    int arr[] = {40, 20, 30, 60, 10, 50};

    // calculate the length of a array
    int n = sizeof(arr) / sizeof(arr[0]);

    printf("before sort: \n");
    printArray(arr, n);

	// The third param must be `n -1`
    quickSort(arr, 0, n - 1);

    printf("after sort: \n");
    printArray(arr, n);

    return 0;
}

代码详见：GitHub 快速排序法挖坑版

Hoare 原版

分析：

选择基准值：快速排序的第一步是选择一个基准值（pivot）。通常可以选择数组的第一个元素，但为了提高排序效率，有时也会选择随机元素或使用其他策略。
划分过程：使用两个指针，左指针和右指针，分别从数组的两端开始向中间移动。
具体步骤：
1. 左指针移动：左指针从左向右移动，直到找到一个大于或等于基准值的元素。
2. 右指针移动：右指针从右向左移动，直到找到一个小于或等于基准值的元素。
3. 交换元素：如果左指针指向的元素大于基准值，且右指针指向的元素小于基准值，则交换这两个元素。
4. 继续移动指针：继续移动左右指针，直到它们相遇。
5. 交换基准值：最后，将基准值与右指针指向的元素交换，确保基准值左侧的所有元素都不大于基准值，右侧的所有元素都不小于基准值。

代码：

c 复制代码

#include <stdio.h>

/**
 * swap two number
 */
void swap(int *a, int *b)
{
    int temp = *a;
    *a = *b;
    *b = temp;
}

/**
 * Quick sort by C Lang.
 *
 * params:
 *   arr -- the data array
 *   left -- the first index of arr
 *   right -- the last index of arr
 *
 */
int partition(int arr[], int left, int right)
{
    // init data
    int pivotIndex = left;
    int i = left, j = right;

    while (i < j)
    {
        // right -> left: find the first element less than pivot
        while (i < j && arr[j] > arr[pivotIndex])
        {
            j--;
        }

        // left -> right: find the first element great than pivot
        while (i < j && arr[i] < arr[pivotIndex])
        {
            i++;
        }

        // after upper two loop, indicate that it is able to swap
        if (i < j)
            swap(&arr[i], &arr[j]);
    }

    // Swap the pivot with the element pointed by the right pointer
    swap(&arr[pivotIndex], &arr[j]);

    return j;
}

void quickSort(int arr[], int left, int right)
{
    if (left < right)
    {
        int pivotIndex = partition(arr, left, right);
        quickSort(arr, left, pivotIndex - 1);
        quickSort(arr, pivotIndex + 1, right);
    }
}

/**
 * show array
 */
void printArray(int array[], int n)
{
    for (size_t i = 0; i < n; i++)
    {
        printf("%d ", array[i]);
    }
    printf("\n\n");
}

int main()
{
    int arr[] = {40, 20, 30, 60, 10, 50};

    // calculate the length of a array
    int n = sizeof(arr) / sizeof(arr[0]);

    printf("before sort: \n");
    printArray(arr, n);

    // The third param must be `n -1`
    quickSort(arr, 0, n - 1);

    printf("after sort: \n");
    printArray(arr, n);

    return 0;
}

代码详见：GitHub 快速排序法 hoare版

前后指针法

前后指针法（也称为荷兰国旗问题解法）是快速排序中的一种高效实现方式。在这个方法中，通常将数组划分为三个部分：

小于基准值的元素
等于基准值的元素
大于基准值的元素。

分析：

变量初始化：
- key 是基准元素的索引，初始为 left。
- prev 是逻辑分区点之前的最后一个小于基准的元素的索引，开始等于 left。
- cur 是当前遍历的元素索引，从 left + 1 开始。
分区逻辑：
- 使用 while 循环遍历数组，从 cur = left + 1 到 right。
- 如果 a[cur] < a[key]，则 prev 增加（指向下一个元素），并交换 a[prev] 和 a[cur]，确保小于基准的元素移动到前面。
- 即使 prev 和 cur 相等，也进行交换，这一步是为了简化逻辑，即不需要特判 prev 和 cur 相等的情况。实际上在交换时什么也不会变。
基准交换：

循环结束后，将基准值与 a[prev] 交换，把基准放置在正确的位置。

代码

c 复制代码

#include <stdio.h>

/**
 * swap two number
 */
void swap(int* a, int* b) {
    int temp = *a;
    *a = *b;
    *b = temp;
}

/**
 * Quick sort by C Lang.
 *
 * params:
 *   arr -- the data array
 *   left -- the first index of arr
 *   right -- the last index of arr
 *
 */
int partition(int arr[], int left, int right) {
    int pivotIndex = left;
    int prev = left;
    int cur = left + 1;

    while (cur <= right) {
        if (arr[cur] < arr[pivotIndex] && ++prev != cur) {
            swap(&arr[prev], &arr[cur]);
        }
        cur++;
    }
    swap(&arr[prev], &arr[pivotIndex]);
    return prev;
}

void quickSort(int arr[], int left, int right) {
    if (left < right) {
        int pivot_index = partition(arr, left, right);
        quickSort(arr, left, pivot_index - 1);
        quickSort(arr, pivot_index + 1, right);
    }
}

void print_array(int arr[], int size) {
    for (int i = 0; i < size; i++) {
        printf("%d ", arr[i]);
    }
    printf("\n");
}

int main() {
    int arr[] = {30, 40, 60, 10, 20, 50};
    int n = sizeof(arr) / sizeof(arr[0]);

    printf("Before sorting:\n");
    print_array(arr, n);

    quickSort(arr, 0, n - 1);

    printf("After sorting:\n");
    print_array(arr, n);

    return 0;
}

代码详见：GitHub 快速排序法前后指针版

算法稳定性和复杂度

稳定性

这一点是因为在分区过程中，元素的相对顺序可能会改变。

具体来说，当我们交换元素以确保小于基准的元素在前，大于基准的在后时，相同元素的相对位置可能改变。

例如，如果两个相等的元素分别在基准的两侧，交换后它们的位置关系就被打破了。

举个例子：

示例数组：[ 3 a 3_a 3a, 1, 2, 3 b 3_b 3b]

这里的 3 a 3_a 3a, 3 b 3_b 3b 是两个不同位置的相同元素（相同的值但用下标区分，用于说明其在数组中的初始位置）。

根据实现可能不交换，但假设实现中等于基准的情况下它也移动，这里交换 3 b 3_b 3b， 3 a 3_a 3a得到数组 [ 3 b 3_b 3b, 1, 2, 3 a 3_a 3a]，顺序已经发生了变更。

时间复杂度

平均情况O(nlogn)

基本操作

快速排序的基本操作是划分和递归调用。

操作次数

划分操作：每次划分操作将数组分为两部分。在平均情况下，我们假设每次划分都能将数组大致分为两个相等的部分。这意味着每次划分操作都会遍历整个数组一次，操作次数为 n。
递归操作：由于每次划分都将数组分为两个大致相等的部分，所以每次递归调用处理的数组大小大约是前一次的一半。这个过程会一直持续到子数组的大小为1，此时不需要再进行划分。

对于大小为 n 的数组，快速排序的递归关系可以表示为：
T ( n ) = 2 T ( n 2 ) + n T(n) = 2T\left(\frac{n}{2}\right) + n T(n)=2T(2n)+n

T(n)是对大小为 n 的数组进行排序所需的总操作次数。
T( n 2 \frac{n}{2} 2n)表示对两个大小为 n 2 \frac{n}{2} 2n的子数组进行排序所需的操作次数。
n 是当前划分操作所需的总比较次数。

大O表示法

这种形式可以用主定理求解，符合主定理的标准形式，所以复杂度为：O(n log n)

分治算法主定理，点击看我的另外一篇文章

最差情况O( n 2 n^2 n2)

基本操作

快速排序的基本操作是划分和递归调用。

操作次数

在最坏情况下，每次划分操作只能将数组分成一个大小为 n−1 的部分和一个大小为 0 的部分（例如，当数组已经有序或每次都选择最大或最小元素作为基准时）。在这种情况下，递归关系式变为：

T ( n ) = T ( n − 1 ) + n T(n) = T(n - 1) + n T(n)=T(n−1)+n

T(n - 1)是对大小为 n−1 的子数组进行排序所需的操作次数。
n 是当前划分操作所需的总比较次数。

上述等价于
T ( n ) = n + ( n − 1 ) + ( n − 2 ) + ( n − 3 ) + . . . + 2 + 1 = ( n + 1 ) n 2 = n 2 2 + n 2 T(n)= n + (n-1) + (n-2) + (n-3) + ... + 2 + 1 = \frac{(n + 1)n}{2} = \frac{n^2}{2} + \frac{n}{2} T(n)=n+(n−1)+(n−2)+(n−3)+...+2+1=2(n+1)n=2n2+2n

大O表示法

上述用大O表示法为：O( n 2 n^2 n2)

空间复杂度

快速排序是一种递归算法。每次递归调用时，程序会在栈上保存当前调用的状态信息：

平均情况下：
- 快速排序以较好的概率实现了平均分区，递归调用的深度大约是树的高度，即 O(logn)。
- 每次递归在栈上保存的信息量是常数级别的，例如数组的起始和结束索引、基准元素信息等。
最坏情况下：
- 如果每次选择的基准是最小或最大元素，导致极端不平衡的分区，递归深度达到 O(n)。
- 这种情况会在已经排序的数组或逆序数组上发生。