数据结构 | 深度解析二叉树的基本原理

个人主页-爱因斯晨

文章专栏-数据结构

最近在学习人工智能，偶然发现一个宝藏网站，和大家分享一下吧：
人工智能学习

二叉树是计算机科学中最基础也最常用的数据结构之一，它不仅是理解更复杂树结构（如 AVL 树、红黑树）的基础，也广泛应用于表达式解析、 Huffman 编码、数据库索引等领域。本文将从二叉树的基本概念出发，深入探讨其存储结构、核心操作及实际应用，并通过 C 语言代码示例帮助读者掌握这一重要数据结构。

二叉树的基本概念

二叉树是一种每个节点最多有两个子节点的树状结构，这两个子节点分别被称为左孩子（left child）和右孩子（right child）。根据节点的分布情况，二叉树可以分为以下几种特殊类型：

满二叉树：除叶子节点外，每个节点都有两个子节点，且所有叶子节点都在同一层
完全二叉树：除最后一层外，其余层都是满的，且最后一层的节点都靠左排列
平衡二叉树：左右两个子树的高度差不超过 1 的二叉搜索树

二叉树具有一个重要性质：在非空二叉树中，第 i 层最多有 2^(i-1) 个节点；深度为 k 的二叉树最多有 2^k-1 个节点。

二叉树的存储结构

在 C 语言中，二叉树通常有两种存储方式：顺序存储和链式存储。

顺序存储

顺序存储适用于完全二叉树，使用数组来存储节点，通过索引计算节点间的关系：

若父节点索引为 i，则左孩子索引为 2i+1，右孩子索引为 2i+2
若孩子节点索引为 i，则父节点索引为 (i-1)/2（整数除法）

c 复制代码

// 顺序存储二叉树示例
#define MAX_SIZE 100
typedef char BTDataType;
BTDataType array[MAX_SIZE];
int size; // 实际节点数量

顺序存储的优点是访问速度快，但对于非完全二叉树会浪费存储空间。

链式存储

链式存储是二叉树最常用的存储方式，通过指针建立节点间的联系：

c 复制代码

// 二叉树节点结构定义
typedef char BTDataType;
typedef struct BinaryTreeNode {
    BTDataType data;                  // 节点数据
    struct BinaryTreeNode* left;      // 左孩子指针
    struct BinaryTreeNode* right;     // 右孩子指针
} BTNode;

链式存储的优点是空间利用率高，适合各种类型的二叉树，也是我们接下来实现的重点。

二叉树的基本操作

1. 创建节点

创建新节点是所有操作的基础，需要为节点分配内存并初始化：

c 复制代码

// 创建新节点
BTNode* BuyBTNode(BTDataType x) {
    BTNode* node = (BTNode*)malloc(sizeof(BTNode));
    if (node == NULL) {
        printf("malloc failed\n");
        exit(-1);
    }
    node->data = x;
    node->left = NULL;
    node->right = NULL;
    return node;
}

2. 构建二叉树

手动构建一个简单的二叉树用于演示：

c 复制代码

// 构建示例二叉树
BTNode* CreateBinaryTree() {
    BTNode* nodeA = BuyBTNode('A');
    BTNode* nodeB = BuyBTNode('B');
    BTNode* nodeC = BuyBTNode('C');
    BTNode* nodeD = BuyBTNode('D');
    BTNode* nodeE = BuyBTNode('E');
    BTNode* nodeF = BuyBTNode('F');
    
    nodeA->left = nodeB;
    nodeA->right = nodeC;
    nodeB->left = nodeD;
    nodeB->right = nodeE;
    nodeC->left = nodeF;
    
    return nodeA;
}

构建的二叉树结构如下：

plaintext 复制代码

      A
     / \
    B   C
   / \ /
  D  E F

3. 遍历操作

遍历是二叉树最核心的操作，分为深度优先遍历和广度优先遍历两大类。

深度优先遍历

深度优先遍历又分为三种方式，区别在于访问根节点的时机：

c 复制代码

// 前序遍历：根 -> 左 -> 右
void PreOrder(BTNode* root) {
    if (root == NULL) {
        printf("NULL ");
        return;
    }
    printf("%c ", root->data);
    PreOrder(root->left);
    PreOrder(root->right);
}

// 中序遍历：左 -> 根 -> 右
void InOrder(BTNode* root) {
    if (root == NULL) {
        printf("NULL ");
        return;
    }
    InOrder(root->left);
    printf("%c ", root->data);
    InOrder(root->right);
}

// 后序遍历：左 -> 右 -> 根
void PostOrder(BTNode* root) {
    if (root == NULL) {
        printf("NULL ");
        return;
    }
    PostOrder(root->left);
    PostOrder(root->right);
    printf("%c ", root->data);
}

对于前面构建的二叉树，三种遍历结果分别为：

前序：A B D NULL NULL E NULL NULL C F NULL NULL NULL
中序：NULL D NULL B NULL E NULL A NULL F NULL C NULL
后序：NULL NULL D NULL NULL E B NULL NULL F NULL NULL C A

广度优先遍历（层序遍历）

层序遍历需要借助队列实现，按层次从左到右访问节点：

c 复制代码

// 层序遍历
void LevelOrder(BTNode* root) {
    if (root == NULL)
        return;
    
    // 使用队列存储节点
    Queue q;
    QueueInit(&q);
    QueuePush(&q, root);
    
    while (!QueueEmpty(&q)) {
        BTNode* front = QueueFront(&q);
        QueuePop(&q);
        printf("%c ", front->data);
        
        // 左孩子入队
        if (front->left)
            QueuePush(&q, front->left);
        // 右孩子入队
        if (front->right)
            QueuePush(&q, front->right);
    }
    QueueDestroy(&q);
}

上述二叉树的层序遍历结果为：A B C D E F

4. 二叉树的其他常用操作

c 复制代码

// 计算二叉树节点个数
int BinaryTreeSize(BTNode* root) {
    return root == NULL ? 0 : 
        1 + BinaryTreeSize(root->left) + BinaryTreeSize(root->right);
}

// 计算叶子节点个数
int BinaryTreeLeafSize(BTNode* root) {
    if (root == NULL)
        return 0;
    // 叶子节点：左右孩子都为空
    if (root->left == NULL && root->right == NULL)
        return 1;
    return BinaryTreeLeafSize(root->left) + BinaryTreeLeafSize(root->right);
}

// 计算二叉树的高度
int BinaryTreeHeight(BTNode* root) {
    if (root == NULL)
        return 0;
    int leftHeight = BinaryTreeHeight(root->left);
    int rightHeight = BinaryTreeHeight(root->right);
    return leftHeight > rightHeight ? leftHeight + 1 : rightHeight + 1;
}

// 查找值为x的节点
BTNode* BinaryTreeFind(BTNode* root, BTDataType x) {
    if (root == NULL)
        return NULL;
    if (root->data == x)
        return root;
    
    // 先在左子树查找
    BTNode* leftRet = BinaryTreeFind(root->left, x);
    if (leftRet != NULL)
        return leftRet;
    
    // 再在右子树查找
    BTNode* rightRet = BinaryTreeFind(root->right, x);
    if (rightRet != NULL)
        return rightRet;
    
    return NULL;
}

// 销毁二叉树
void BinaryTreeDestroy(BTNode** root) {
    if (*root == NULL)
        return;
    
    // 后序遍历销毁：先销毁孩子节点，再销毁根节点
    BinaryTreeDestroy(&(*root)->left);
    BinaryTreeDestroy(&(*root)->right);
    free(*root);
    *root = NULL;
}

二叉树的应用场景

二叉树作为基础数据结构，有许多重要变种和应用：

二叉搜索树（BST）：左子树所有节点值小于根节点，右子树所有节点值大于根节点，支持 O (log n) 的插入、删除和查找操作。
表达式树：用于解析数学表达式，叶子节点是操作数，内部节点是运算符。
Huffman 树：一种带权路径长度最短的二叉树，用于数据压缩算法。
线索二叉树：通过将空指针指向遍历序列中的前驱或后继节点，提高遍历效率。
平衡二叉树：如 AVL 树和红黑树，通过自平衡机制保持 O (log n) 的操作复杂度。

完整示例代码

下面是包含队列实现和二叉树所有操作的完整代码：

c 复制代码

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

// 二叉树节点结构定义
typedef char BTDataType;
typedef struct BinaryTreeNode {
    BTDataType data;
    struct BinaryTreeNode* left;
    struct BinaryTreeNode* right;
} BTNode;

// 队列结构定义（用于层序遍历）
typedef BTNode* QDataType;
typedef struct QueueNode {
    QDataType data;
    struct QueueNode* next;
} QNode;

typedef struct Queue {
    QNode* head;
    QNode* tail;
} Queue;

// 队列操作函数声明
void QueueInit(Queue* q);
void QueueDestroy(Queue* q);
void QueuePush(Queue* q, QDataType x);
void QueuePop(Queue* q);
QDataType QueueFront(Queue* q);
int QueueEmpty(Queue* q);

// 队列操作实现
void QueueInit(Queue* q) {
    assert(q);
    q->head = q->tail = NULL;
}

void QueueDestroy(Queue* q) {
    assert(q);
    QNode* cur = q->head;
    while (cur) {
        QNode* next = cur->next;
        free(cur);
        cur = next;
    }
    q->head = q->tail = NULL;
}

void QueuePush(Queue* q, QDataType x) {
    assert(q);
    QNode* newNode = (QNode*)malloc(sizeof(QNode));
    if (newNode == NULL) {
        printf("malloc failed\n");
        exit(-1);
    }
    newNode->data = x;
    newNode->next = NULL;
    
    if (q->tail == NULL) {
        q->head = q->tail = newNode;
    } else {
        q->tail->next = newNode;
        q->tail = newNode;
    }
}

void QueuePop(Queue* q) {
    assert(q);
    assert(!QueueEmpty(q));
    
    QNode* next = q->head->next;
    free(q->head);
    q->head = next;
    
    // 如果队列变空，tail也置空
    if (q->head == NULL) {
        q->tail = NULL;
    }
}

QDataType QueueFront(Queue* q) {
    assert(q);
    assert(!QueueEmpty(q));
    return q->head->data;
}

int QueueEmpty(Queue* q) {
    assert(q);
    return q->head == NULL;
}

// 二叉树操作函数
BTNode* BuyBTNode(BTDataType x) {
    BTNode* node = (BTNode*)malloc(sizeof(BTNode));
    if (node == NULL) {
        printf("malloc failed\n");
        exit(-1);
    }
    node->data = x;
    node->left = NULL;
    node->right = NULL;
    return node;
}

BTNode* CreateBinaryTree() {
    BTNode* nodeA = BuyBTNode('A');
    BTNode* nodeB = BuyBTNode('B');
    BTNode* nodeC = BuyBTNode('C');
    BTNode* nodeD = BuyBTNode('D');
    BTNode* nodeE = BuyBTNode('E');
    BTNode* nodeF = BuyBTNode('F');
    
    nodeA->left = nodeB;
    nodeA->right = nodeC;
    nodeB->left = nodeD;
    nodeB->right = nodeE;
    nodeC->left = nodeF;
    
    return nodeA;
}

void PreOrder(BTNode* root) {
    if (root == NULL) {
        printf("NULL ");
        return;
    }
    printf("%c ", root->data);
    PreOrder(root->left);
    PreOrder(root->right);
}

void InOrder(BTNode* root) {
    if (root == NULL) {
        printf("NULL ");
        return;
    }
    InOrder(root->left);
    printf("%c ", root->data);
    InOrder(root->right);
}

void PostOrder(BTNode* root) {
    if (root == NULL) {
        printf("NULL ");
        return;
    }
    PostOrder(root->left);
    PostOrder(root->right);
    printf("%c ", root->data);
}

void LevelOrder(BTNode* root) {
    if (root == NULL)
        return;
    
    Queue q;
    QueueInit(&q);
    QueuePush(&q, root);
    
    while (!QueueEmpty(&q)) {
        BTNode* front = QueueFront(&q);
        QueuePop(&q);
        printf("%c ", front->data);
        
        if (front->left)
            QueuePush(&q, front->left);
        if (front->right)
            QueuePush(&q, front->right);
    }
    QueueDestroy(&q);
}

int BinaryTreeSize(BTNode* root) {
    return root == NULL ? 0 : 
        1 + BinaryTreeSize(root->left) + BinaryTreeSize(root->right);
}

int BinaryTreeLeafSize(BTNode* root) {
    if (root == NULL)
        return 0;
    if (root->left == NULL && root->right == NULL)
        return 1;
    return BinaryTreeLeafSize(root->left) + BinaryTreeLeafSize(root->right);
}

int BinaryTreeHeight(BTNode* root) {
    if (root == NULL)
        return 0;
    int leftHeight = BinaryTreeHeight(root->left);
    int rightHeight = BinaryTreeHeight(root->right);
    return leftHeight > rightHeight ? leftHeight + 1 : rightHeight + 1;
}

BTNode* BinaryTreeFind(BTNode* root, BTDataType x) {
    if (root == NULL)
        return NULL;
    if (root->data == x)
        return root;
    
    BTNode* leftRet = BinaryTreeFind(root->left, x);
    if (leftRet != NULL)
        return leftRet;
    
    BTNode* rightRet = BinaryTreeFind(root->right, x);
    if (rightRet != NULL)
        return rightRet;
    
    return NULL;
}

void BinaryTreeDestroy(BTNode** root) {
    if (*root == NULL)
        return;
    
    BinaryTreeDestroy(&(*root)->left);
    BinaryTreeDestroy(&(*root)->right);
    free(*root);
    *root = NULL;
}

// 测试函数
int main() {
    BTNode* root = CreateBinaryTree();
    
    printf("前序遍历: ");
    PreOrder(root);
    printf("\n");
    
    printf("中序遍历: ");
    InOrder(root);
    printf("\n");
    
    printf("后序遍历: ");
    PostOrder(root);
    printf("\n");
    
    printf("层序遍历: ");
    LevelOrder(root);
    printf("\n");
    
    printf("节点总数: %d\n", BinaryTreeSize(root));
    printf("叶子节点数: %d\n", BinaryTreeLeafSize(root));
    printf("树的高度: %d\n", BinaryTreeHeight(root));
    
    BTDataType findVal = 'E';
    BTNode* found = BinaryTreeFind(root, findVal);
    if (found)
        printf("找到节点: %c\n", found->data);
    else
        printf("未找到节点: %c\n", findVal);
    
    // 销毁二叉树
    BinaryTreeDestroy(&root);
    assert(root == NULL);
    printf("二叉树已销毁\n");
    
    return 0;
}

总结

二叉树作为一种灵活的数据结构，其核心价值在于它的递归特性和高效的遍历方式。本文从基本概念出发，详细介绍了二叉树的存储结构和常用操作，并通过完整的 C 语言代码实现了这些功能。

掌握二叉树不仅有助于理解更复杂的数据结构，也能培养递归思维和问题分解能力。在实际开发中，二叉树及其变种（如二叉搜索树、平衡二叉树）在处理动态数据集合时表现出色，能够提供高效的插入、删除和查询操作。

学习二叉树的关键在于理解其递归性质，大部分二叉树操作都可以通过递归优雅地实现。同时，熟练掌握四种遍历方式（前序、中序、后序、层序）对于解决二叉树相关问题至关重要。