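map and set in the STL: An Elegant Application of Red-Black Trees

🔍 Comparing container categories

C++ STL containers fall into two broad categories:

- Sequence containers: vector, list, deque, and so on; elements are accessed by their position in storage
- Associative containers: map, set, and so on; elements are stored and accessed by key

Put simply, a sequence container is like books on a shelf, where you find a book by where it sits; an associative container is like a dictionary, where you find an entry by its keyword (in alphabetical order).

🔑 set: a simple, efficient collection

What is set?

set is a container of unique keys kept in sorted order. It is typically implemented as a red-black tree, so insertion, deletion, and lookup all run in O(log N).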

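```cpp
#include <functional>  // std::greater
#include <set>
#include <iostream>

int main() {
    // Ascending order by default
    std::set<int> s = {5, 2, 7, 2, 8, 5, 9};

    // Duplicates are dropped and the elements are sorted automatically
    for (int num : s) {
        std::cout << num << " ";  // prints: 2 5 7 8 9
    }
    std::cout << "\n";

    // Descending order
    std::set<int, std::greater<int>> s2 = {1, 3, 2};
    // Iteration order: 3 2 1

    // C++17 structured binding on the value returned by insert
    auto [iter, inserted] = s.insert(10);
    if (inserted) {
        std::cout << "Inserted 10\n";
    }
}
```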

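Core set operations

Inserting elements:

```cpp
// Insert a single element
s.insert(10);  // returns {iterator, true} if 10 was not yet in the set
s.insert(5);   // returns {iterator, false} if 5 is already present

// Insert a range (needs <vector>)
std::vector<int> vec = {11, 12, 13};
s.insert(vec.begin(), vec.end());

// C++11 emplace (constructs the element in place, avoiding an extra copy)
s.emplace(15);
```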

Finding elements:

```cpp
// find is the recommended lookup
auto it = s.find(7);  // O(log N) lookup
if (it != s.end()) {
    std::cout << "Found: " << *it << std::endl;
}

// count tests for presence (for a set it is always 0 or 1)
if (s.count(7)) {
    std::cout << "7 is present" << std::endl;
}

// C++20 contains reads more naturally
#if __cplusplus >= 202002L
if (s.contains(7)) {
    std::cout << "7 is present" << std::endl;
}
#endif
```

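Erasing elements:

```cpp
s.erase(7);          // erase the element with value 7 (no effect if absent)
s.erase(s.begin());  // erase the first element (the set must not be empty)
// Erase the first two elements (needs <iterator> and at least two elements)
s.erase(s.begin(), std::next(s.begin(), 2));
```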

🗺️ map: a powerful key-value container

What is map?

map stores key-value pairs in which every key is unique. Like set, it is typically implemented as a red-black tree.

```cpp
#include <iostream>
#include <map>
#include <string>

int main() {
    // English-Chinese dictionary
    std::map<std::string, std::string> dict = {
        {"apple", "苹果"},
        {"banana", "香蕉"},
        {"cat", "猫"}
    };

    // Iterate with C++17 structured bindings (recommended)
    for (const auto& [key, value] : dict) {
        std::cout << key << ": " << value << std::endl;
    }

    // Traditional iteration
    for (const auto& kv : dict) {
        std::cout << kv.first << ": " << kv.second << std::endl;
    }
}
```

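Ways to insert into a map

```cpp
std::map<std::string, int> scores;

// 1. insert (no default-construct-then-assign, and it never overwrites)
scores.insert({"小明", 90});  // C++11 braced initialization
auto result = scores.insert({"小明", 85});  // key already present, so the insert fails
if (!result.second) {
    std::cout << "小明 already exists with score: " << result.first->second << std::endl;
}

// 2. make_pair
scores.insert(std::make_pair("小红", 85));

// 3. emplace (C++11, constructs the pair in place)
scores.emplace("小刚", 88);  // avoids a temporary object

// 4. operator[] (the most common, but mind the side effect)
scores["小李"] = 92;  // inserts the key automatically if it is absent
```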
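
The four approaches above differ mainly in what happens when the key already exists. As a rough sketch, assuming a C++17 compiler, `insert_or_assign` overwrites an existing value while `try_emplace` leaves it untouched. The helper names `setScore` and `addIfAbsent` are made up for illustration and are not part of the original article:

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper: insert the score, or overwrite it if the key exists.
// Returns true when a new element was inserted, false when one was overwritten.
bool setScore(std::map<std::string, int>& scores, const std::string& name, int score) {
    return scores.insert_or_assign(name, score).second;
}

// Hypothetical helper: insert only when the key is absent; an existing value is kept.
// Returns true when a new element was inserted.
bool addIfAbsent(std::map<std::string, int>& scores, const std::string& name, int score) {
    return scores.try_emplace(name, score).second;
}
```

Both calls return the same `{iterator, bool}` pair as `insert`, so the caller can still tell whether the key was new.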

Looking up and modifying a map

```cpp
// Lookup: find is recommended because it never inserts
auto it = scores.find("小明");
if (it != scores.end()) {
    std::cout << "小明's score: " << it->second << std::endl;
    it->second = 95;  // update the score
}

// Using [] directly: mind the side effect
std::cout << "小红's score: " << scores["小红"] << std::endl;  // inserts a new element if the key is absent!

// A safe lookup
std::string name = "小张";
if (auto search = scores.find(name); search != scores.end()) {
    // C++17 if statement with initializer
    std::cout << name << "'s score: " << search->second << std::endl;
} else {
    std::cout << name << " not found" << std::endl;
}

// C++20 contains
#if __cplusplus >= 202002L
if (scores.contains("小明")) {
    std::cout << "小明 is present" << std::endl;
}
#endif
```

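🔥 The very handy operator[]

operator[] is arguably map's most convenient feature, but it needs to be used with care:

```cpp
std::map<std::string, int> wordCount;

// Count word occurrences (done in one line!)
std::vector<std::string> words = {"apple", "banana", "apple", "cherry"};
for (const auto& word : words) {
    wordCount[word]++;  // a missing key is created and value-initialized to 0, then incremented
}

// wordCount is now: {"apple":2, "banana":1, "cherry":1}

// The magic of operator[]:
// - key present → returns a reference to its value
// - key absent  → inserts {key, default value} and returns a reference to the value

// The catch:
std::map<std::string, int> m;
int value = m["missing key"];  // this inserts a new element; value == 0
// If you only want to test for presence, do not use operator[]
```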
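
Because operator[] may insert, it cannot be called on a const map at all. The following is a minimal sketch of two read-only alternatives, `find` and `at`. The function names `scoreOr` and `strictScore` are illustrative only:

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical helper: read a value without inserting; return fallback if the key is absent.
int scoreOr(const std::map<std::string, int>& m, const std::string& key, int fallback) {
    auto it = m.find(key);  // find never modifies the map
    return it != m.end() ? it->second : fallback;
}

// Hypothetical helper: at() throws std::out_of_range instead of inserting.
int strictScore(const std::map<std::string, int>& m, const std::string& key) {
    return m.at(key);
}
```

`scoreOr` suits cases where a missing key is normal; `strictScore` suits cases where a missing key is a bug worth surfacing as an exception.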

🔄 multiset and multimap: duplicates allowed

multiset and multimap allow duplicate keys, which suits a different set of use cases:

```cpp
// multiset: duplicate elements allowed
std::multiset<int> ms = {1, 3, 3, 3, 5};
std::cout << ms.count(3);  // prints: 3 (there are three 3s)

// Iterate over all the duplicates
auto range = ms.equal_range(3);
for (auto it = range.first; it != range.second; ++it) {
    std::cout << *it << " ";  // prints the three 3s
}

// multimap: duplicate keys allowed
std::multimap<std::string, int> mm;
mm.insert({"小明", 90});
mm.insert({"小明", 95});  // duplicate key is allowed
mm.insert(std::make_pair("小红", 88));

// Note: multimap has no operator[], because one key may map to several values

// Find every entry for 小明 in the multimap
auto range2 = mm.equal_range("小明");
for (auto it = range2.first; it != range2.second; ++it) {
    std::cout << it->first << ": " << it->second << std::endl;
}
```
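
One thing to watch with duplicates: `erase(key)` removes every matching element, whereas `erase(iterator)` removes exactly one. A small sketch of the difference, where the helper name `eraseOne` is an assumption made for illustration:

```cpp
#include <cassert>
#include <set>

// Hypothetical helper: remove a single occurrence of value, if there is one.
// Returns true when an element was removed.
bool eraseOne(std::multiset<int>& ms, int value) {
    auto it = ms.find(value);  // iterator to one matching element, or end()
    if (it == ms.end()) return false;
    ms.erase(it);              // the iterator overload erases exactly one element
    return true;
}
```

By contrast, `ms.erase(value)` returns the number of elements it removed, which can be more than one.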

🎯 Practical use cases

Scenario 1: counting word frequencies (LeetCode style)

```cpp
#include <algorithm>  // std::max_element
#include <cctype>     // std::tolower
#include <map>
#include <string>
#include <iostream>
#include <sstream>

void countWords(const std::string& text) {
    std::map<std::string, int> freq;
    std::istringstream iss(text);
    std::string word;

    while (iss >> word) {
        // Convert to lowercase (optional); the unsigned char cast avoids
        // undefined behavior for negative char values
        for (char& c : word) {
            c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
        }
        freq[word]++;
    }

    // Print the most frequent word
    auto max_it = std::max_element(freq.begin(), freq.end(),
        [](const auto& a, const auto& b) { return a.second < b.second; });

    if (max_it != freq.end()) {
        std::cout << "Most frequent word: " << max_it->first
                  << " (" << max_it->second << " occurrences)" << std::endl;
    }
}
```
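
A map is ordered by key, not by value, which is why `max_element` above has to scan every entry. When a full ranking is needed, one common approach is to copy the entries into a vector and sort by count. The sketch below assumes that approach; the function name `byFrequency` and the alphabetical tie-break are choices made here, not something the original specifies:

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: return (word, count) pairs, most frequent first.
// Ties are broken alphabetically so the result is deterministic.
std::vector<std::pair<std::string, int>> byFrequency(const std::map<std::string, int>& freq) {
    std::vector<std::pair<std::string, int>> out(freq.begin(), freq.end());
    std::sort(out.begin(), out.end(), [](const auto& a, const auto& b) {
        if (a.second != b.second) return a.second > b.second;  // higher count first
        return a.first < b.first;                              // then by word
    });
    return out;
}
```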

Scenario 2: detecting a linked-list cycle (LeetCode 142), optimized version

```cpp
#include <set>

struct ListNode {
    int val;
    ListNode *next;
    ListNode(int x) : val(x), next(nullptr) {}
};

// Approach 1: use a set (O(N) memory)
ListNode* detectCycle_set(ListNode* head) {
    std::set<ListNode*> visited;
    ListNode* curr = head;

    while (curr) {
        if (visited.find(curr) != visited.end()) {
            return curr;  // first node seen twice: the start of the cycle
        }
        visited.insert(curr);
        curr = curr->next;
    }
    return nullptr;  // no cycle
}

// Approach 2: two pointers (O(1) memory, better)
ListNode* detectCycle_optimal(ListNode* head) {
    ListNode *slow = head, *fast = head;

    // Phase 1: determine whether there is a cycle
    while (fast && fast->next) {
        slow = slow->next;
        fast = fast->next->next;
        if (slow == fast) break;  // the pointers met
    }

    if (!fast || !fast->next) return nullptr;  // no cycle

    // Phase 2: locate the start of the cycle
    slow = head;
    while (slow != fast) {
        slow = slow->next;
        fast = fast->next;
    }
    return slow;
}
```

Scenario 3: copying a list with random pointers (LeetCode 138), extended version

```cpp
#include <map>

class Node {
public:
    int val;
    Node* next;
    Node* random;
    Node(int _val) : val(_val), next(nullptr), random(nullptr) {}
};

// Approach 1: use a map (O(N) memory)
Node* copyRandomList_map(Node* head) {
    if (!head) return nullptr;

    std::map<Node*, Node*> nodeMap;  // original node → new node

    // First pass: create the new nodes
    Node* curr = head;
    while (curr) {
        nodeMap[curr] = new Node(curr->val);
        curr = curr->next;
    }

    // Second pass: set the next and random pointers.
    // A null key is inserted by operator[] with a value-initialized
    // (null) mapped pointer, so null maps to null.
    curr = head;
    while (curr) {
        nodeMap[curr]->next = nodeMap[curr->next];
        nodeMap[curr]->random = nodeMap[curr->random];
        curr = curr->next;
    }

    return nodeMap[head];
}

// Approach 2: copy in place (O(1) extra memory)
Node* copyRandomList_inplace(Node* head) {
    if (!head) return nullptr;

    // Step 1: build the interleaved list A->A'->B->B'->...
    Node* curr = head;
    while (curr) {
        Node* copy = new Node(curr->val);
        copy->next = curr->next;
        curr->next = copy;
        curr = copy->next;
    }

    // Step 2: set the random pointers
    curr = head;
    while (curr) {
        if (curr->random) {
            curr->next->random = curr->random->next;
        }
        curr = curr->next->next;
    }

    // Step 3: separate the two lists
    Node* newHead = head->next;
    curr = head;
    while (curr) {
        Node* copy = curr->next;
        curr->next = copy->next;
        if (copy->next) {
            copy->next = copy->next->next;
        }
        curr = curr->next;
    }

    return newHead;
}
```

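⚡ Performance tips and extras

Time complexity:
- Insertion, deletion, lookup, and update are all O(log N)
- Backed by a red-black tree, so performance is predictable

Ordering:
- Iteration visits keys in sorted order (ascending by default)
- Reverse iteration is supported: rbegin(), rend()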
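
Since the elements are always sorted, reverse iteration yields them in descending order. A minimal sketch that collects the k largest elements with rbegin() and rend(); the function name `largestK` is hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical helper: return up to k of the largest elements, largest first.
std::vector<int> largestK(const std::set<int>& s, std::size_t k) {
    std::vector<int> out;
    for (auto it = s.rbegin(); it != s.rend() && out.size() < k; ++it) {
        out.push_back(*it);
    }
    return out;
}
```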

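Avoiding unnecessary copies:

```cpp
// Use move semantics to reduce copying (needs <utility>)
std::map<std::string, std::vector<int>> data;
std::vector<int> large_vec = {1, 2, 3, 4, 5};
data["key"] = std::move(large_vec);  // moved rather than copied; large_vec is left valid but unspecified
```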

Boundary operations:

```cpp
std::set<int> s = {10, 20, 30, 40, 50};

// lower_bound: first element >= 30
auto lb = s.lower_bound(30);  // points to 30

// upper_bound: first element > 30
auto ub = s.upper_bound(30);  // points to 40

// equal_range: the pair {lower_bound, upper_bound}, i.e. the half-open range [lb, ub)
auto range = s.equal_range(30);
```

Getting the extremes quickly:

```cpp
std::set<int> s = {5, 2, 8, 1};

// Minimum
std::cout << "Min: " << *s.begin() << std::endl;      // 1

// Maximum
std::cout << "Max: " << *s.rbegin() << std::endl;     // 8

// k-th smallest element
auto it = s.begin();
std::advance(it, 2);  // move to the third element; O(k), since set iterators are bidirectional
std::cout << "Third smallest: " << *it << std::endl;  // 5
```

💡 Advanced techniques

1. Custom ordering

```cpp
#include <algorithm>  // std::lexicographical_compare
#include <cctype>     // std::tolower
#include <set>
#include <string>

// Case-insensitive string comparison
struct CaseInsensitive {
    bool operator()(const std::string& a, const std::string& b) const {
        return std::lexicographical_compare(
            a.begin(), a.end(), b.begin(), b.end(),
            // unsigned char parameters keep std::tolower well-defined
            [](unsigned char c1, unsigned char c2) {
                return std::tolower(c1) < std::tolower(c2);
            }
        );
    }
};

std::set<std::string, CaseInsensitive> names;  // case-insensitive

// Ordering by several fields
struct Person {
    std::string name;
    int age;
};

struct ComparePerson {
    bool operator()(const Person& a, const Person& b) const {
        if (a.age != b.age) return a.age < b.age;
        return a.name < b.name;  // same age: order by name
    }
};

std::set<Person, ComparePerson> people;
```
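
A point worth stressing about custom comparators: set decides uniqueness with the comparator, not with operator==. Two keys a and b count as the same element when neither comp(a, b) nor comp(b, a) holds. The sketch below restates the case-insensitive idea under a different, illustrative name (`NoCase`) so that it is self-contained; `addName` is likewise a hypothetical helper:

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <set>
#include <string>

// Illustrative comparator (the name is an assumption): case-insensitive "less than".
struct NoCase {
    bool operator()(const std::string& a, const std::string& b) const {
        return std::lexicographical_compare(
            a.begin(), a.end(), b.begin(), b.end(),
            [](unsigned char x, unsigned char y) {
                return std::tolower(x) < std::tolower(y);
            });
    }
};

// Hypothetical helper: returns true if name was inserted,
// false if an equivalent key was already present.
bool addName(std::set<std::string, NoCase>& names, const std::string& name) {
    return names.insert(name).second;
}
```

With this comparator, "Alice" and "ALICE" are equivalent, so the second insertion is rejected and the set keeps whichever spelling arrived first.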

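2. Range queries and operations

```cpp
std::set<int> s = {10, 20, 30, 40, 50, 60, 70};

// Find the elements in the half-open range [30, 60)
auto low = s.lower_bound(30);  // first element >= 30
auto up = s.lower_bound(60);   // first element >= 60 (upper_bound(60) would include 60)

// Erase the range [30, 60)
s.erase(low, up);  // erases 30, 40, and 50 (30 included, 60 excluded)

// Check whether any element lies in the closed range [25, 45]
bool hasElementsBetween = (s.lower_bound(25) != s.upper_bound(45));
```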
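
Which bound to use decides whether an endpoint is included. As a sketch under the assumption that a half-open interval [lo, hi) is wanted, both ends can use lower_bound. The helper name `countInRange` is invented for this example:

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <set>

// Hypothetical helper: number of elements x with lo <= x < hi.
std::size_t countInRange(const std::set<int>& s, int lo, int hi) {
    if (lo >= hi) return 0;          // empty or inverted interval
    auto first = s.lower_bound(lo);  // first element >= lo
    auto last = s.lower_bound(hi);   // first element >= hi, so hi itself is excluded
    // std::distance walks the iterators, so this step is linear in the result
    return static_cast<std::size_t>(std::distance(first, last));
}
```

Swapping the second call for `upper_bound(hi)` would turn the interval into the closed range [lo, hi].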

3. Using map as a cache

```cpp
// Memoized Fibonacci
std::map<int, long long> fib_cache;

long long fibonacci(int n) {
    if (n <= 1) return n;

    // Check the cache
    auto it = fib_cache.find(n);
    if (it != fib_cache.end()) {
        return it->second;
    }

    // Compute and cache
    long long result = fibonacci(n-1) + fibonacci(n-2);
    fib_cache[n] = result;
    return result;
}
```

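4. Merging two maps

```cpp
std::map<std::string, int> m1 = {{"a", 1}, {"b", 2}};
std::map<std::string, int> m2 = {{"b", 3}, {"c", 4}};

// Merge m2 into m1 (C++17). On a key conflict m1 keeps its own value,
// and the conflicting element stays behind in m2.
m1.merge(m2);  // m1: {a:1, b:2, c:4}, m2: {b:3}

// Or use insert, which copies the elements and also leaves existing keys untouched
m1.insert(m2.begin(), m2.end());  // no overwrite on key conflict
```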
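
Neither merge nor insert overwrites an existing key. If the values from the second map should win on a conflict, one option is to loop with `insert_or_assign` (C++17). This is a sketch under that assumption, and the function name `mergeOverwrite` is invented for illustration:

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper: copy every entry of src into dst; on a key conflict src wins.
void mergeOverwrite(std::map<std::string, int>& dst,
                    const std::map<std::string, int>& src) {
    for (const auto& [key, value] : src) {
        dst.insert_or_assign(key, value);
    }
}
```

Unlike merge, this copies rather than moves nodes, so src is left unchanged.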

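🎓 Summary and best practices

map and set are among the most useful containers in C++:

Strengths:

- Backed by a red-black tree, so performance is predictable (O(log N))
- Sorted automatically, with no manual upkeep
- A concise yet powerful interface
- A handy tool for complex problems

Selection guide:

- Need unique keys → use set/map
- Allow duplicate keys → use multiset/multimap
- Only need a membership test → set
- Need a key-to-value mapping → map
- Need quick counting → make good use of operator[]
- No ordering needed, but fast lookup required → consider unordered_set/unordered_map

Common pitfalls and fixes:

- Side effect of map's operator[]: use find for lookups, and operator[] only when you intend to insert or modify
- Comparing custom types in a set: the comparator must define a strict weak ordering
- Iterator invalidation: erasing invalidates only the iterators to the erased element
- Performance: with many elements, unordered containers may be faster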
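
The iterator-invalidation rule above makes it safe to erase while iterating, as long as the loop continues from the iterator that erase returns. A minimal sketch; the helper name `eraseEven` is an assumption for this example:

```cpp
#include <cassert>
#include <set>

// Hypothetical helper: remove all even numbers while traversing the set.
void eraseEven(std::set<int>& s) {
    for (auto it = s.begin(); it != s.end(); ) {
        if (*it % 2 == 0) {
            it = s.erase(it);  // erase returns the iterator following the removed element
        } else {
            ++it;              // advance only when nothing was erased
        }
    }
}
```

Incrementing an iterator after erasing the element it points to would be undefined behavior, which is why the increment lives in the else branch.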

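One-line advice:

Once you are comfortable with map and set, you will find that many complex problems can be solved elegantly in a few lines of code. Remember: choosing the right data structure matters more than optimizing the algorithm!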

```cpp
// Quick sanity-check example
#include <iostream>
#include <map>
#include <set>
#include <string>

int main() {
    // Simple demo
    std::set<int> demo_set = {3, 1, 4, 1, 5};
    std::cout << "Set contents: ";
    for (int n : demo_set) std::cout << n << " ";

    std::map<std::string, int> demo_map = {{"one", 1}, {"two", 2}};
    std::cout << "\nValue of two in the map: " << demo_map["two"] << std::endl;

    return 0;
}
```

![Image](https://i-blog.csdnimg.cn/direct/3f4102ecc6dc447fa7e19574ef294eaf.jpeg)