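A Tour of Python Data Structures: 09 - Graph Theory Basics: The Network That Connects Everything

If trees show the beauty of hierarchy in data, graphs reveal how complex and surprising the connections between things can be. From social networks to road navigation, from neural networks to recommender systems, graphs are everywhere.

Graphs: The Ultimate Expression of Relationships

A graph is a data structure made up of vertices and edges, used to represent relationships between objects.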


G = (V, E)
where:
V - the set of vertices
E - the set of edges
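
As a minimal sketch of this definition, the two sets can be written directly as Python sets. The vertex names and the use of `frozenset` for undirected edges are illustrative assumptions, not part of the notation above.

```python
# V: the vertex set; E: the edge set of an undirected graph.
# Each undirected edge is a frozenset so that {A, B} == {B, A}.
V = {"A", "B", "C", "D"}
E = {frozenset(pair) for pair in [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D")]}

G = (V, E)

# Every edge must join two vertices that belong to V.
assert all(edge <= V for edge in E)
print(len(V), len(E))  # 4 4
```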

Basic Graph Concepts

Types of Graphs

1. Undirected vs. Directed Graphs


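# Undirected graph: edges have no direction
# A -- B means A is connected to B, and B is connected to A

# Directed graph: edges have a direction
# A → B means A points to B, but B does not necessarily point to A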

2. Weighted vs. Unweighted Graphs


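# Unweighted graph: edges carry no weight
# Weighted graph: edges carry a weight (such as distance, cost, or time)
# A --5-- B means the cost of going from A to B is 5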

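3. Connected vs. Disconnected Graphs

Connected graph: there is a path between every pair of vertices
Disconnected graph: at least one pair of vertices has no path between them

Key Terms

Degree: the number of edges incident to a vertex (in a directed graph, split into in-degree and out-degree)
Path: a sequence of vertices in which each pair of consecutive vertices is joined by an edge
Cycle: a path with at least one edge that starts and ends at the same vertex
Connected component: a maximal connected subgraph of an undirected graph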
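
To make degree and connected components concrete, here is a small sketch. It assumes the graph is undirected and stored as a dictionary of neighbor lists; the function names `degrees` and `connected_components` and the sample graph are hypothetical.

```python
from collections import deque

def degrees(adj):
    """Return {vertex: degree} for an undirected graph given as neighbor lists."""
    return {v: len(neighbors) for v, neighbors in adj.items()}

def connected_components(adj):
    """Return the connected components of an undirected graph as sorted lists."""
    seen = set()
    components = []
    for source in adj:
        if source in seen:
            continue
        # Flood outward from an unseen vertex to collect its whole component.
        seen.add(source)
        queue = deque([source])
        component = []
        while queue:
            v = queue.popleft()
            component.append(v)
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        components.append(sorted(component))
    return components

# Hypothetical example: A, B, C form a triangle, D and E are joined, F is isolated.
adj = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B"],
    "D": ["E"],
    "E": ["D"],
    "F": [],
}
print(degrees(adj))
print(connected_components(adj))  # [['A', 'B', 'C'], ['D', 'E'], ['F']]
```

Because the graph has three components, it is disconnected in the sense defined above.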

Graph Representations

1. Adjacency Matrix

A two-dimensional array records which pairs of vertices are connected.


class GraphMatrix:
    def __init__(self, num_vertices):
        self.num_vertices = num_vertices
        self.matrix = [[0] * num_vertices for _ in range(num_vertices)]
    
    def add_edge(self, v1, v2, weight=1):
        self.matrix[v1][v2] = weight
        # For an undirected graph, also add:
        # self.matrix[v2][v1] = weight
    
    def get_neighbors(self, vertex):
        neighbors = []
        for i in range(self.num_vertices):
            if self.matrix[vertex][i] != 0:
                neighbors.append((i, self.matrix[vertex][i]))
        return neighbors

Example (undirected, unweighted graph):


   A  B  C  D
A  0  1  1  0
B  1  0  1  1  
C  1  1  0  0
D  0  1  0  0

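Pros and cons:

✅ Fast adjacency check between two vertices: O(1)
✅ Well suited to dense graphs
❌ High space complexity: O(V²)
❌ Adding a vertex is expensive (the whole matrix must be resized)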

2. Adjacency List

Each vertex stores its neighbors in a linked list or in an array of arrays.


class GraphList:
    def __init__(self, num_vertices):
        self.num_vertices = num_vertices
        self.adj_list = [[] for _ in range(num_vertices)]
    
    def add_edge(self, v1, v2, weight=1):
        self.adj_list[v1].append((v2, weight))
        # For an undirected graph, also add:
        # self.adj_list[v2].append((v1, weight))
    
    def get_neighbors(self, vertex):
        return self.adj_list[vertex]

Example (the same graph as the matrix above, with A to D numbered 0 to 3):


Vertex 0: [(1, 1), (2, 1)]
Vertex 1: [(0, 1), (2, 1), (3, 1)]
Vertex 2: [(0, 1), (1, 1)]
Vertex 3: [(1, 1)]

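Pros and cons:

✅ Space efficient: O(V + E)
✅ Well suited to sparse graphs
❌ Slower adjacency check: one vertex's list must be scanned, O(deg(v))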
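
The trade-off between the two representations can be seen side by side. The sketch below builds both from the four-vertex example graph (A=0, B=1, C=2, D=3); the helper names `has_edge_matrix` and `has_edge_list` are hypothetical and weights are left out to keep it short.

```python
# The example graph: edges A-B, A-C, B-C, B-D (A=0, B=1, C=2, D=3).
edges = [(0, 1), (0, 2), (1, 2), (1, 3)]
n = 4

# Adjacency matrix: n * n cells regardless of how many edges exist.
matrix = [[0] * n for _ in range(n)]
# Adjacency list: one entry per edge endpoint.
adj_list = [[] for _ in range(n)]

for u, v in edges:
    matrix[u][v] = matrix[v][u] = 1   # undirected: mirror the entry
    adj_list[u].append(v)
    adj_list[v].append(u)

def has_edge_matrix(u, v):
    return matrix[u][v] != 0          # single lookup, O(1)

def has_edge_list(u, v):
    return v in adj_list[u]           # linear scan, O(deg(u))

matrix_cells = n * n                              # 16
list_entries = sum(len(row) for row in adj_list)  # 2 * len(edges) = 8
print(matrix_cells, list_entries)
```

Even on this tiny graph the list stores half as many entries as the matrix, and the gap widens as graphs get sparser.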

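Graph Traversal Algorithms

Traversal is the foundation of graph algorithms. There are two main strategies:

Depth-First Search

"Follow one road to the very end, and turn back only when it is blocked."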


def dfs(graph, start):
    visited = [False] * graph.num_vertices
    result = []
    
    def dfs_recursive(vertex):
        visited[vertex] = True
        result.append(vertex)
        
        for neighbor, _ in graph.get_neighbors(vertex):
            if not visited[neighbor]:
                dfs_recursive(neighbor)
    
    dfs_recursive(start)
    return result

# Iterative version (explicit stack instead of recursion)
def dfs_iterative(graph, start):
    visited = [False] * graph.num_vertices
    stack = [start]
    result = []
    
    while stack:
        vertex = stack.pop()
        if not visited[vertex]:
            visited[vertex] = True
            result.append(vertex)
            # Push neighbors in reverse so the visit order matches the recursive version
            for neighbor, _ in reversed(graph.get_neighbors(vertex)):
                if not visited[neighbor]:
                    stack.append(neighbor)
    
    return result

Breadth-First Search

"Advance layer by layer, from near to far."


from collections import deque

def bfs(graph, start):
    visited = [False] * graph.num_vertices
    queue = deque([start])
    visited[start] = True
    result = []
    
    while queue:
        vertex = queue.popleft()
        result.append(vertex)
        
        for neighbor, _ in graph.get_neighbors(vertex):
            if not visited[neighbor]:
                visited[neighbor] = True
                queue.append(neighbor)
    
    return result
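
Because BFS visits vertices layer by layer, the layer in which a vertex is first reached equals its shortest distance from the start, counted in edges. The sketch below records that distance; it assumes an unweighted graph stored as a dictionary of neighbor lists, and `bfs_distances` is a hypothetical name.

```python
from collections import deque

def bfs_distances(adj, start):
    """Return {vertex: number of edges on a shortest path from start}."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:          # the first visit is via a shortest path
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

# The four-vertex example graph from the representation section.
adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1]}
print(bfs_distances(adj, 0))  # {0: 0, 1: 1, 2: 1, 3: 2}
```

Vertices that cannot be reached from the start simply do not appear in the result.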

Classic Applications of Graph Algorithms

1. Shortest Paths

Dijkstra's algorithm (weighted graphs with no negative edge weights):


import heapq

def dijkstra(graph, start):
    distances = [float('inf')] * graph.num_vertices
    distances[start] = 0
    pq = [(0, start)]  # priority queue of (distance, vertex)
    
    while pq:
        current_dist, current_vertex = heapq.heappop(pq)
        
        # Skip stale entries: a shorter path to this vertex was already found
        if current_dist > distances[current_vertex]:
            continue
            
        for neighbor, weight in graph.get_neighbors(current_vertex):
            distance = current_dist + weight
            
            if distance < distances[neighbor]:
                distances[neighbor] = distance
                heapq.heappush(pq, (distance, neighbor))
    
    return distances

Floyd-Warshall algorithm (shortest paths between all pairs of vertices):


def floyd_warshall(graph):
    n = graph.num_vertices
    dist = [[float('inf')] * n for _ in range(n)]
    
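    # Initialize: distance to self is 0; keep the cheapest of any parallel edges
    for i in range(n):
        dist[i][i] = 0
        for neighbor, weight in graph.get_neighbors(i):
            dist[i][neighbor] = min(dist[i][neighbor], weight)
    
    # Dynamic programming: allow vertex k as an intermediate stop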
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][j] > dist[i][k] + dist[k][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    
    return dist
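
A small worked example may help when checking the triple loop by hand. The sketch below is a standalone variant that takes a weight matrix instead of a graph object; `floyd_warshall_matrix`, `INF` and the sample weights are assumptions made for illustration.

```python
INF = float("inf")

def floyd_warshall_matrix(weights):
    """All-pairs shortest distances from a weight matrix (INF means no edge)."""
    n = len(weights)
    dist = [row[:] for row in weights]   # copy so the input is not modified
    for k in range(n):                   # k: intermediate vertex now allowed
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

# Directed example: weights[i][j] is the cost of the edge i -> j.
weights = [
    [0,   3,   INF, 7],
    [8,   0,   2,   INF],
    [5,   INF, 0,   1],
    [2,   INF, INF, 0],
]
for row in floyd_warshall_matrix(weights):
    print(row)
# [0, 3, 5, 6]
# [5, 0, 2, 3]
# [3, 6, 0, 1]
# [2, 5, 7, 0]
```

For instance, the direct edge 1 -> 0 costs 8, but the route 1 -> 2 -> 3 -> 0 costs 2 + 1 + 2 = 5, which is what the result reports.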

2. Minimum Spanning Tree

Prim's algorithm:


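import heapq

def prim(graph):
    # Assumes a connected undirected graph whose edges were added in both directions
    n = graph.num_vertices
    visited = [False] * n
    min_edge = [float('inf')] * n
    min_edge[0] = 0
    parent = [-1] * n
    pq = [(0, 0)]  # (weight, vertex)
    
    while pq:
        _, vertex = heapq.heappop(pq)
        if visited[vertex]:
            continue  # stale entry: this vertex is already in the tree
        visited[vertex] = True
        
        for neighbor, edge_weight in graph.get_neighbors(vertex):
            if not visited[neighbor] and edge_weight < min_edge[neighbor]:
                min_edge[neighbor] = edge_weight
                parent[neighbor] = vertex
                heapq.heappush(pq, (edge_weight, neighbor))
    
    # parent[v] is the vertex that v attaches to in the tree (-1 for the root)
    return parent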
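
To see which edges Prim's algorithm actually picks, the following sketch returns the tree edges and their total weight. It is a standalone variant under stated assumptions: the graph is connected, undirected, and given as a dictionary of `(neighbor, weight)` lists with every edge listed in both directions; `prim_mst` is a hypothetical name.

```python
import heapq

def prim_mst(adj, start=0):
    """Return (total_weight, edges) of a minimum spanning tree."""
    visited = set()
    mst_edges = []
    total = 0
    pq = [(0, start, start)]           # (edge weight, vertex, parent)
    while pq:
        weight, vertex, parent = heapq.heappop(pq)
        if vertex in visited:
            continue                   # stale entry: vertex already in the tree
        visited.add(vertex)
        total += weight
        if parent != vertex:           # the start vertex has no tree edge
            mst_edges.append((parent, vertex, weight))
        for neighbor, w in adj[vertex]:
            if neighbor not in visited:
                heapq.heappush(pq, (w, neighbor, vertex))
    return total, mst_edges

# Undirected example: 0-1 (1), 0-2 (4), 1-2 (2), 1-3 (5), 2-3 (3).
adj = {
    0: [(1, 1), (2, 4)],
    1: [(0, 1), (2, 2), (3, 5)],
    2: [(0, 4), (1, 2), (3, 3)],
    3: [(1, 5), (2, 3)],
}
print(prim_mst(adj))  # (6, [(0, 1, 1), (1, 2, 2), (2, 3, 3)])
```

The tree keeps the three cheapest edges that connect all four vertices and skips the edges of weight 4 and 5.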

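3. Topological Sort

Orders the vertices of a directed acyclic graph so that every edge points from an earlier vertex to a later one: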


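from collections import deque

def topological_sort(graph):
    n = graph.num_vertices
    in_degree = [0] * n
    result = []
    queue = deque()
    
    # Compute the in-degree of every vertex
    for i in range(n):
        for neighbor, _ in graph.get_neighbors(i):
            in_degree[neighbor] += 1
    
    # Enqueue the vertices with in-degree 0
    for i in range(n):
        if in_degree[i] == 0:
            queue.append(i)
    
    while queue:
        vertex = queue.popleft()
        result.append(vertex)
        
        # Removing this vertex lowers the in-degree of each successor
        for neighbor, _ in graph.get_neighbors(vertex):
            in_degree[neighbor] -= 1
            if in_degree[neighbor] == 0:
                queue.append(neighbor)
    
    # Vertices on a cycle never reach in-degree 0, so they are missing from result
    if len(result) != n:
        raise ValueError("The graph contains a cycle!")
    
    return result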

Graphs in the Real World

1. Social Network Analysis


class SocialNetwork:
    def __init__(self):
        self.graph = GraphList(0)
        self.user_to_id = {}
        self.id_to_user = {}
    
    def add_friendship(self, user1, user2):
        id1 = self._get_user_id(user1)
        id2 = self._get_user_id(user2)
        # Friendship is mutual, so store the edge in both directions
        self.graph.add_edge(id1, id2)
        self.graph.add_edge(id2, id1)
    
    def suggest_friends(self, user):
        user_id = self._get_user_id(user)
        # Recommend friends using BFS or a mutual-friends count (not implemented here)
        pass
    
    def _get_user_id(self, username):
        if username not in self.user_to_id:
            new_id = len(self.user_to_id)
            self.user_to_id[username] = new_id
            self.id_to_user[new_id] = username
            # Grow the graph by one vertex
            self.graph.adj_list.append([])
            self.graph.num_vertices += 1
        return self.user_to_id[username]
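
One possible way to fill in the friend suggestion step is to rank friends of friends by how many friends they share with the user. The sketch below is not the article's implementation: it is a standalone function over a plain dictionary of friend sets, and the name `suggest_friends`, the `limit` parameter and the sample data are all assumptions.

```python
from collections import Counter

def suggest_friends(friends, user, limit=3):
    """Rank non-friends of `user` by the number of mutual friends."""
    direct = friends.get(user, set())
    mutual_counts = Counter()
    for friend in direct:
        for candidate in friends.get(friend, set()):
            # Skip the user and anyone who is already a direct friend.
            if candidate != user and candidate not in direct:
                mutual_counts[candidate] += 1
    # Sort by mutual-friend count (descending), then by name for stable output.
    ranked = sorted(mutual_counts.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:limit]]

friends = {
    "Alice": {"Bob", "Carol"},
    "Bob": {"Alice", "Dave", "Eve"},
    "Carol": {"Alice", "Dave"},
    "Dave": {"Bob", "Carol"},
    "Eve": {"Bob"},
}
print(suggest_friends(friends, "Alice"))  # ['Dave', 'Eve']
```

Dave is ranked first because he shares two friends with Alice (Bob and Carol), while Eve shares only Bob.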

2. Route Planning System


class NavigationSystem:
    def __init__(self):
        self.graph = GraphList(0)
        self.location_to_id = {}
    
    def add_road(self, location1, location2, distance, traffic_factor=1):
        id1 = self._get_location_id(location1)
        id2 = self._get_location_id(location2)
        # The weight combines distance with a traffic factor
        weight = distance * traffic_factor
        self.graph.add_edge(id1, id2, weight)
        self.graph.add_edge(id2, id1, weight)  # undirected: add the reverse edge too
    
    def find_shortest_path(self, start, end):
        start_id = self._get_location_id(start)
        end_id = self._get_location_id(end)
        
        distances = dijkstra(self.graph, start_id)
        # Rebuild the path...
        return self._reconstruct_path(start_id, end_id, distances)
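
The class above leaves path reconstruction out. A common approach, sketched here under assumptions, is to record each vertex's predecessor while Dijkstra runs and then walk those links backwards. This is a standalone function rather than the article's `_reconstruct_path`; the name `dijkstra_with_path`, the dictionary-based graph and the place names are hypothetical.

```python
import heapq

def dijkstra_with_path(adj, start, end):
    """Return (distance, path) from start to end, or (inf, []) if unreachable.

    adj maps each vertex to a list of (neighbor, weight) pairs with
    non-negative weights.
    """
    dist = {start: 0}
    prev = {}                          # prev[v] = vertex before v on the best path
    pq = [(0, start)]
    while pq:
        d, v = heapq.heappop(pq)
        if d > dist.get(v, float("inf")):
            continue                   # stale queue entry
        if v == end:
            break                      # the destination's distance is final
        for w, weight in adj.get(v, []):
            nd = d + weight
            if nd < dist.get(w, float("inf")):
                dist[w] = nd
                prev[w] = v
                heapq.heappush(pq, (nd, w))
    if end not in dist:
        return float("inf"), []
    # Walk the predecessor links back from end to start.
    path = [end]
    while path[-1] != start:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[end], path

roads = {
    "Home": [("Mall", 4), ("Park", 1)],
    "Park": [("Home", 1), ("Mall", 2), ("Office", 6)],
    "Mall": [("Home", 4), ("Park", 2), ("Office", 3)],
    "Office": [("Park", 6), ("Mall", 3)],
}
print(dijkstra_with_path(roads, "Home", "Office"))
# (6, ['Home', 'Park', 'Mall', 'Office'])
```

Stopping as soon as the destination is popped is safe because, with non-negative weights, a popped vertex already has its final distance.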

3. Task Scheduling System


class TaskScheduler:
    def __init__(self):
        self.graph = GraphList(0)
        self.task_to_id = {}
    
    def add_dependency(self, task, depends_on):
        """Add a dependency: task depends on depends_on."""
        task_id = self._get_task_id(task)
        dep_id = self._get_task_id(depends_on)
        self.graph.add_edge(dep_id, task_id)  # edge dep -> task: depends_on must finish first
    
    def get_execution_order(self):
        """Return the task IDs in a valid execution order, or None if there is a cycle."""
        try:
            return topological_sort(self.graph)
        except Exception:
            print("The task dependencies contain a circular dependency!")
            return None
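
The scheduler above returns numeric IDs. As a sketch of the same idea working directly on task names, the function below applies the in-degree method to a dictionary of dependencies. The name `execution_order`, the input shape and the sample tasks are assumptions for illustration.

```python
from collections import deque

def execution_order(dependencies):
    """Return task names in an order that respects `dependencies`.

    dependencies maps each task to the list of tasks it depends on.
    Raises ValueError if the dependencies contain a cycle.
    """
    # Collect every task, including ones that only appear as prerequisites.
    tasks = set(dependencies)
    for prereqs in dependencies.values():
        tasks.update(prereqs)

    in_degree = {task: 0 for task in tasks}
    dependents = {task: [] for task in tasks}
    for task, prereqs in dependencies.items():
        for prereq in prereqs:
            dependents[prereq].append(task)   # edge: prereq -> task
            in_degree[task] += 1

    # Sorting makes the output deterministic for this example.
    queue = deque(sorted(t for t in tasks if in_degree[t] == 0))
    order = []
    while queue:
        task = queue.popleft()
        order.append(task)
        for nxt in sorted(dependents[task]):
            in_degree[nxt] -= 1
            if in_degree[nxt] == 0:
                queue.append(nxt)

    if len(order) != len(tasks):
        raise ValueError("circular dependency detected")
    return order

deps = {"deploy": ["test"], "test": ["build"], "build": ["fetch"], "lint": ["fetch"]}
print(execution_order(deps))  # ['fetch', 'build', 'lint', 'test', 'deploy']
```

Other valid orders exist (for example, running `lint` last); the sorting step only fixes one of them.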

Advanced Graph Algorithms

Strongly Connected Components


def kosaraju_scc(graph):
    # First DFS pass: record vertices in order of finishing time
    visited = [False] * graph.num_vertices
    stack = []
    
    def dfs_first_pass(vertex):
        visited[vertex] = True
        for neighbor, _ in graph.get_neighbors(vertex):
            if not visited[neighbor]:
                dfs_first_pass(neighbor)
        stack.append(vertex)
    
    for i in range(graph.num_vertices):
        if not visited[i]:
            dfs_first_pass(i)
    
    # Build the transposed graph (every edge reversed)
    transposed = GraphList(graph.num_vertices)
    for i in range(graph.num_vertices):
        for neighbor, weight in graph.get_neighbors(i):
            transposed.add_edge(neighbor, i, weight)
    
    # Second DFS pass: traverse the transposed graph in decreasing finishing time
    visited = [False] * graph.num_vertices
    scc_list = []
    
    def dfs_second_pass(vertex, component):
        visited[vertex] = True
        component.append(vertex)
        for neighbor, _ in transposed.get_neighbors(vertex):
            if not visited[neighbor]:
                dfs_second_pass(neighbor, component)
    
    while stack:
        vertex = stack.pop()
        if not visited[vertex]:
            component = []
            dfs_second_pass(vertex, component)
            scc_list.append(component)
    
    return scc_list

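A Brief Introduction to Graph Databases

Modern graph databases (such as Neo4j and Amazon Neptune) are designed specifically for storing and querying graph data: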


// Cypher query language example (Neo4j)
// Find friends of Alice's friends who are not already Alice's direct friends
MATCH (alice:Person {name: 'Alice'})-[:FRIEND]->()-[:FRIEND]->(mutual)
WHERE NOT (alice)-[:FRIEND]->(mutual) AND mutual <> alice
RETURN DISTINCT mutual.name

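Graph Processing Frameworks

For very large graphs, distributed frameworks and managed services are used:

Apache Giraph: graph processing built on Hadoop
GraphX: Spark's graph computation library
Amazon Neptune: a managed graph database service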

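Performance Considerations and Optimization

Choosing a representation:
- Dense graphs: adjacency matrix
- Sparse graphs: adjacency list
- Dynamic graphs: adjacency list + hash table
Memory layout optimization:
- Store adjacency lists in contiguous arrays
- Use cache-friendly access patterns
Parallelization:
- Level-by-level parallelism in BFS
- Graph partitioning algorithms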
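
Storing adjacency lists in contiguous arrays is often done with a compressed sparse row (CSR) layout: one array of neighbor IDs plus one array of offsets into it. The sketch below shows the layout only; `build_csr` and `neighbors` are hypothetical names, and because plain CPython lists hold object references, a real implementation would more likely use the `array` module or NumPy to get the memory benefit.

```python
def build_csr(num_vertices, edges):
    """Pack a directed edge list into CSR arrays (offsets, targets).

    The neighbors of vertex v are targets[offsets[v]:offsets[v + 1]].
    """
    offsets = [0] * (num_vertices + 1)
    for u, _ in edges:
        offsets[u + 1] += 1            # count the out-degree of u
    for v in range(num_vertices):
        offsets[v + 1] += offsets[v]   # prefix sum gives each start position

    targets = [0] * len(edges)
    cursor = offsets[:-1].copy()       # next free slot for each vertex
    for u, v in edges:
        targets[cursor[u]] = v
        cursor[u] += 1
    return offsets, targets

def neighbors(offsets, targets, v):
    """Return the out-neighbors of v as a slice of the shared targets array."""
    return targets[offsets[v]:offsets[v + 1]]

edges = [(0, 1), (0, 2), (1, 2), (1, 3), (3, 0)]
offsets, targets = build_csr(4, edges)
print(offsets)                          # [0, 2, 4, 4, 5]
print(targets)                          # [1, 2, 2, 3, 0]
print(neighbors(offsets, targets, 1))   # [2, 3]
```

The layout is compact and scans well, but inserting an edge means rebuilding the arrays, which is why it suits static graphs rather than dynamic ones.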

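Summary: The Wisdom of Connection

Graphs reveal something fundamental about the world: everything is interconnected. From tiny molecular interactions to the vast cosmic web, graphs give us a language for understanding and analyzing complex systems.

Through graph theory, we have learned how to:

Represent relationships: abstract and model connections
Analyze structure: find communities, central nodes, and critical paths
Optimize paths: search for the best solution in a complex network
Predict behavior: infer dynamics from the structure of a network

Graph theory is not only a core part of computer science; it is also a general-purpose language for understanding complex systems.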

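The End of the Data Structures Tour, and a New Journey

This completes our exploration of data structures, from simple arrays to graph theory. Looking back along the way:

Linear structures: arrays, linked lists, stacks, queues - the foundation of organizing data
Tree structures: binary trees, search trees, balanced trees - the art of hierarchy and search
Hash structures: hash tables - the magic of very fast lookup
Graph structures: graph theory - the wisdom of connecting everything

Every data structure shines in particular scenarios. There is no absolute "best", only the choice that fits the problem.

Real skill lies not in memorizing every data structure, but in understanding the design philosophy and trade-offs behind each one, so that you can make a sound choice when facing a real problem.

The world of data structures keeps evolving, and new structures and algorithms keep appearing. With these fundamentals in hand, you hold the key to understanding and using any new tool.

May you apply these ideas flexibly on your programming journey and build elegant, efficient solutions!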