Map and Set in Plain Terms: Core Principles and Use Cases

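I. Using set

1. What is set?

1.1 Basic concepts

set is an associative container in the C++ STL. Unlike sequence containers such as vector and list, an associative container stores and accesses its elements by key, and its logical structure is non-linear.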

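Core features of set:

Automatic ordering: elements are kept in ascending order by default (an in-order traversal of the underlying red-black tree)
Automatic deduplication: duplicate elements are not stored
Efficient lookup: insertion, deletion, and lookup all take O(log N) time
Immutable elements: changing an element in place could break the ordering the tree relies on, so set exposes its elements as const; to change a value, erase it and insert the new one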
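Since elements cannot be modified in place, a "change" is usually written as erase plus insert. The following is a minimal sketch of that idea; replace_value is a hypothetical helper name, not part of the standard library.

```cpp
#include <set>

// Hypothetical helper: "changes" old_value into new_value in a std::set<int>.
// Elements are read-only through iterators, so the old value is erased
// and the new one inserted. Returns true if old_value was present.
bool replace_value(std::set<int>& s, int old_value, int new_value) {
    auto it = s.find(old_value);
    if (it == s.end()) {
        return false;        // nothing to replace
    }
    s.erase(it);             // erase through the iterator found above
    s.insert(new_value);     // lands at its sorted position; no effect if already present
    return true;
}
```

Note that if new_value is already in the set, the set ends up one element smaller, because the insertion is a no-op.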

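1.2 Underlying structure

set is typically implemented as a red-black tree (a self-balancing binary search tree), which explains how it provides both automatic ordering and efficient lookup.

2. Constructing a set

set supports several forms of construction and requires the <set> header: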

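#include <functional>  // std::greater
#include <set>
#include <vector>
using namespace std;

// 1. Default construction
set<int> s1;

// 2. Initializer-list construction
set<int> s2 = {5, 2, 8, 2, 9};  // the duplicate 2 is stored only once

// 3. Iterator-range construction
vector<int> v = {3, 1, 4, 1, 5};
set<int> s3(v.begin(), v.end());

// 4. Copy construction
set<int> s4(s2);

// 5. Custom ordering (descending)
set<int, greater<int>> s5;  // sorted from largest to smallest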
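Beyond std::greater, the ordering can be any comparator type that defines a strict weak ordering. The sketch below assumes a hypothetical comparator, ByLengthThenAlpha, that sorts strings by length and breaks ties alphabetically; the names are illustrative only.

```cpp
#include <set>
#include <string>

// Hypothetical comparator: shorter strings first, ties broken alphabetically.
// It must be a strict weak ordering, so it returns false for equal arguments.
struct ByLengthThenAlpha {
    bool operator()(const std::string& a, const std::string& b) const {
        if (a.size() != b.size()) return a.size() < b.size();
        return a < b;
    }
};

using LengthOrderedSet = std::set<std::string, ByLengthThenAlpha>;

// Builds a small sample; the duplicate "fig" is stored only once.
LengthOrderedSet make_sample() {
    return LengthOrderedSet{"pear", "fig", "banana", "kiwi", "fig"};
}
```

Iterating over make_sample() visits fig, kiwi, pear, banana, in that order.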

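3. Iterators and traversal

set provides bidirectional iterators, so it can be traversed both forward and backward. Iteration follows the in-order traversal of the underlying tree, so the elements come out in sorted order.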

set<int> s = {5, 2, 8, 2, 9};  // actually stored: 2, 5, 8, 9

// Option 1: range-based for loop (most concise)
for (const auto& elem : s) {
    cout << elem << " ";  // prints: 2 5 8 9
}

// Option 2: forward iterators
for (auto it = s.begin(); it != s.end(); ++it) {
    cout << *it << " ";
}

// Option 3: reverse iterators (largest to smallest)
for (auto rit = s.rbegin(); rit != s.rend(); ++rit) {
    cout << *rit << " ";  // prints: 9 8 5 2
}

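4. Insert, erase, and lookup operations

4.1 Inserting elements: insert

insert adds an element to the set; if an equivalent element already exists, nothing is inserted. The single-element overload returns a pair<iterator, bool>: first points to the element in the set, and second tells whether the insertion actually took place.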

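set<int> s;
s.insert(10);
s.insert(5);
s.insert(10);  // duplicate: nothing is inserted

// Bulk insertion
s.insert({3, 7, 1, 9});

// Insertion with a position hint (faster only when the hint is accurate)
auto it = s.begin();
s.insert(it, 4);  // the search starts near it; a poor hint is still correct, just no faster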
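The returned pair is easy to overlook, so here is a small sketch that uses it. register_name is a hypothetical helper that reports whether a name was newly added.

```cpp
#include <set>
#include <string>

// Hypothetical helper: tries to add a name and reports whether it was new.
// Relies on the pair<iterator, bool> returned by set::insert.
bool register_name(std::set<std::string>& names, const std::string& name) {
    auto result = names.insert(name);
    // result.first  -> iterator to the element that is now in the set
    // result.second -> true only if this call inserted it
    return result.second;
}
```

Calling register_name twice with the same name returns true the first time and false the second, without a separate lookup.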

4.2 Erasing elements: erase

erase has three overloads:

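set<int> s = {1, 2, 3, 4, 5};

// 1. Erase by value (returns the number erased; for set that is 0 or 1)
size_t cnt = s.erase(3);    // cnt = 1
cnt = s.erase(10);          // cnt = 0

// 2. Erase the element an iterator points to
s.erase(s.begin());         // removes the smallest element

// 3. Erase an iterator range [first, last)
s.erase(s.begin(), ++s.begin());  // removes one element; iterators to erased elements become invalid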
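The iterator invalidation issue mentioned above matters most when erasing inside a loop. A common pattern, sketched here with a hypothetical erase_evens function, is to continue from the iterator that erase returns (available since C++11).

```cpp
#include <set>

// Removes every even number from the set while iterating over it.
// erase(iterator) invalidates only the erased element's iterator and
// returns the iterator that follows it, so the loop continues from there.
void erase_evens(std::set<int>& s) {
    for (auto it = s.begin(); it != s.end(); ) {
        if (*it % 2 == 0) {
            it = s.erase(it);   // never reuse the old iterator after this
        } else {
            ++it;
        }
    }
}
```

Writing s.erase(it) followed by ++it would instead advance an invalidated iterator, which is undefined behavior.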

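4.3 Finding elements: find

find returns an iterator to the matching element, or end() if there is none. It runs in O(log N), which is far faster on large sets than the O(N) of the generic std::find algorithm.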

set<int> s = {2, 4, 6, 8, 10};

// Recommended: the set's own find member
auto it = s.find(6);
if (it != s.end()) {
    cout << "Found: " << *it << endl;
} else {
    cout << "Not found" << endl;
}

// Not recommended: std::find from <algorithm> (O(N))
// auto it2 = find(s.begin(), s.end(), 6);

4.4 Other common operations

set<int> s = {1, 3, 5, 7, 9};

// Check whether an element exists (count returns 0 or 1)
if (s.count(5)) {
    cout << "5 is present" << endl;
}

// Size and emptiness
cout << s.size() << endl;      // 5
cout << s.empty() << endl;     // 0 (not empty)

// Remove all elements
s.clear();

// Bound queries
s = {10, 20, 30, 40, 50};
auto low = s.lower_bound(25);  // first element >= 25 → 30
auto up = s.upper_bound(30);   // first element > 30 → 40
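The two bound functions combine naturally into a range query. The sketch below assumes a hypothetical values_between helper that collects everything in the closed interval [lo, hi].

```cpp
#include <set>
#include <vector>

// Collects all elements in the closed interval [lo, hi].
// lower_bound(lo) is the first element >= lo and upper_bound(hi) is the
// first element > hi, so together they delimit a half-open iterator range.
std::vector<int> values_between(const std::set<int>& s, int lo, int hi) {
    if (lo > hi) {
        return {};  // guard: otherwise the two iterators could be out of order
    }
    return std::vector<int>(s.lower_bound(lo), s.upper_bound(hi));
}
```

For the set {10, 20, 30, 40, 50}, values_between(s, 25, 40) yields 30 and 40.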

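5. Differences between multiset and set

When you need ordered elements with duplicates allowed, use multiset. The two interfaces are nearly identical, but several behaviors differ in important ways.

5.1 Comparison table

Feature	set	multiset
Duplicate elements	Not allowed; deduplicated automatically	Allowed
`insert` return value	`pair<iterator, bool>`	An iterator only
`erase(value)`	Erases at most one element; returns 0 or 1	Erases all matches; returns the number erased
`count(value)`	Returns 0 or 1	Returns the actual number of matches
`find(value)`	Returns an iterator to the unique match	Returns an iterator to a matching element (the standard does not say which one; use `lower_bound` for the first)

5.2 multiset example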

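#include <iostream>
#include <set>
using namespace std;

multiset<int> ms = {3, 1, 4, 1, 5, 9, 2, 6};

// Duplicates are allowed
ms.insert(1);  // there are now three 1s

// count returns the actual number of matches
cout << ms.count(1) << endl;  // prints: 3

// lower_bound returns the first match (find may return any of them)
auto it = ms.lower_bound(1);
while (it != ms.end() && *it == 1) {
    cout << *it << " ";  // visits every element equal to 1
    ++it;
}

// erase(value) removes every matching element
size_t deleted = ms.erase(1);  // removes all 1s and returns how many
cout << deleted << endl;       // prints: 3

// equal_range returns the range of elements equal to the given value
auto range = ms.equal_range(5);
for (auto it = range.first; it != range.second; ++it) {
    cout << *it << " ";
}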
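Because erase(value) removes every match, removing a single occurrence takes one extra step. A minimal sketch, using a hypothetical erase_one helper:

```cpp
#include <set>

// Removes just one occurrence of value from a multiset.
// erase(value) would remove all of them, so one element is located
// with find() and erased through its iterator instead.
bool erase_one(std::multiset<int>& ms, int value) {
    auto it = ms.find(value);
    if (it == ms.end()) {
        return false;   // value not present
    }
    ms.erase(it);       // erases exactly the element it points to
    return true;
}
```

Which of the equal elements gets removed is unspecified, but since they compare equal this rarely matters for plain values.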

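II. Using map

1. The foundation: the pair type

Before working with map itself, we need to understand its basic storage unit: pair.

Every element stored in a map is a key-value pair, and pair is the class template that binds the two pieces of data (the key and the value) together. It is defined in the <utility> header (which the <map> header usually includes indirectly).

1.1 Definition and structure of pair

The structure of pair is very simple: it combines two values into a single object. Its definition looks roughly like this: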

template<class T1, class T2>
struct pair {
    T1 first;   // first member; the key when used in a map
    T2 second;  // second member; the value when used in a map
};

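Key properties:

Heterogeneous: first and second may be of any two types, the same or different.
Lightweight: a pair normally occupies just the memory of its two members (plus any alignment padding).

1.2 Creating and initializing a pair

C++ offers several ways to construct a pair; the most common are the make_pair function and the list initialization introduced in C++11: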

#include <iostream>
#include <string>
#include <utility> // std::pair, std::make_pair

int main() {
    // 1. Construct with explicitly specified types
    std::pair<std::string, int> p1("apple", 5);

    // 2. Let make_pair deduce the types (here pair<const char*, int>, not std::string)
    auto p2 = std::make_pair("banana", 3);

    // 3. C++11 list initialization
    std::pair<std::string, double> p3 = {"orange", 2.5};

    // Access the members
    std::cout << p1.first << ": " << p1.second << std::endl;
    return 0;
}

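Note: inside a map, each element has type pair<const Key, T>, so the key (first) is const and cannot be modified; changing it would break the ordering of the underlying binary search tree.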
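The constness of the key can be checked at compile time. The sketch below uses a hypothetical PriceMap alias and raise_prices helper to show that only second is writable.

```cpp
#include <map>
#include <string>
#include <type_traits>
#include <utility>

using PriceMap = std::map<std::string, int>;  // alias chosen for illustration

// The element type of a map is pair<const Key, T>; verified at compile time.
static_assert(
    std::is_same<PriceMap::value_type,
                 std::pair<const std::string, int>>::value,
    "map elements are pair<const Key, T>");

// Hypothetical helper: raises every price by delta.
void raise_prices(PriceMap& prices, int delta) {
    for (auto& kv : prices) {
        kv.second += delta;      // the value is writable
        // kv.first = "x";       // would not compile: first is const
    }
}
```

Uncommenting the assignment to kv.first produces a compile error, which is how the container protects its ordering.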

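2. Overview of the map container

2.1 Underlying principles and properties

std::map is an associative container in the C++ STL, usually implemented as a red-black tree (a self-balancing binary search tree). This gives map the following core properties:

Ordered: elements are automatically kept in ascending key order (std::less by default).
Efficient: insertion, deletion, and lookup all take O(log N) time, even in the worst case.
Unique keys: each key appears at most once in the container.

2.2 Key difference: map vs set

Although map and set share a similar underlying structure, they serve different purposes:

set: stores only the keys themselves; used for deduplication and set operations.
map: stores key-value pairs, so a value can be looked up or modified quickly through its key; suited to dictionaries, caches, and similar uses.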

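3. Constructing and initializing a map

map provides several constructors; the common ones are:

Constructor	Description
`map()`	Default constructor; creates an empty container
`map(InputIterator first, InputIterator last)`	Constructs from an iterator range
`map(const map& x)`	Copy constructor
`map(initializer_list<value_type> il)`	Initializer-list constructor (C++11)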

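#include <iterator>  // std::next, std::prev
#include <map>
#include <string>
using namespace std;

int main() {
    // Default construction
    map<string, int> m1;

    // Initializer-list construction
    map<string, int> m2 = {
        {"apple", 5},
        {"banana", 8},
        {"orange", 6}
    };

    // Copy construction
    map<string, int> m3(m2);

    // Iterator-range construction: skips the first and last elements,
    // so m4 holds only {"banana", 8}
    map<string, int> m4(next(m2.begin()), prev(m2.end()));

    return 0;
}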

4. Inserting, erasing, and finding in a map

4.1 Inserting elements - insert

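map<string, int> m;

// Option 1: insert a pair
m.insert(pair<string, int>("apple", 5));

// Option 2: make_pair
m.insert(make_pair("banana", 8));

// Option 3: list initialization (most concise)
m.insert({"orange", 6});

// Option 4: operator[] (most common; unlike insert, it overwrites an existing value)
m["grape"] = 10;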

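insert returns a pair<iterator, bool>:

Insertion succeeded: iterator points to the new element and bool is true
Insertion failed (key already present): iterator points to the existing element, bool is false, and the stored value is left unchanged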
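Because a failed insert still hands back an iterator to the existing element, "insert or update" needs only one lookup. A minimal sketch, with add_or_increment as a hypothetical helper name:

```cpp
#include <map>
#include <string>

// Hypothetical helper: inserts the key with count 1, or increments the
// existing count. Returns the count after the call.
int add_or_increment(std::map<std::string, int>& counts,
                     const std::string& key) {
    auto result = counts.insert({key, 1});
    if (!result.second) {
        // Key already present: insert left the old value untouched and
        // result.first points at the existing element.
        ++result.first->second;
    }
    return result.first->second;
}
```

The first call for a given key returns 1, and each later call returns the previous count plus one.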

4.2 Erasing elements - erase

map<string, int> m = {{"a",1}, {"b",2}, {"c",3}};

// Erase by key
m.erase("a");

// Erase by iterator
auto it = m.find("b");
if (it != m.end()) m.erase(it);

// Erase an iterator range
m.erase(m.begin(), m.end());

// Remove all elements (same effect as the full-range erase above)
m.clear();

4.3 Finding elements - find

auto it = m.find("apple");
if (it != m.end()) {
    cout << "Found: " << it->first << " -> " << it->second << endl;
} else {
    cout << "Not found" << endl;
}

4.4 Counting and checking

// Check whether a key exists
if (m.count("apple")) {
    // present (count returns 0 or 1 for map)
}

// Check whether the map is empty
if (m.empty()) { }

// Number of elements
size_t n = m.size();

5. Modifying data in a map

A key in a map cannot be modified, but its value can:

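map<string, int> m = {{"apple", 5}};

// Modify through an iterator
auto it = m.find("apple");
if (it != m.end()) {
    it->second = 10;  // modify the value
    // it->first = "new";  // error: the key is const
}

// Modify through operator[]
m["apple"] = 15;  // overwrites if the key exists, inserts otherwise

// Modify through at() (throws std::out_of_range if the key is missing)
m.at("apple") = 20;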

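6. map iterators and operator[]

Iterator characteristics

map iterators are bidirectional iterators
Traversal visits elements in ascending key order (in-order traversal of the red-black tree)
A non-const iterator can modify the value but not the key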

map<string, int>::iterator it;       // non-const iterator
map<string, int>::const_iterator cit; // const iterator
map<string, int>::reverse_iterator rit; // reverse iterator

// Traversal
for (auto it = m.begin(); it != m.end(); ++it) {
    it->second = 100;  // the value can be modified
}

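Making good use of operator[]

operator[] is the most commonly used way to access elements in a map, and its behavior is unusual: if the key is missing, it inserts a value-initialized element and returns a reference to it.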

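map<string, int> m;
m["apple"] = 5;  // key absent: inserts {apple, 5}
m["apple"] = 10; // key present: changes the value to 10
int price = m["banana"]; // key absent: inserts {banana, 0} and returns 0

An equivalent implementation of operator[] (the classic C++98 formulation, where k is the key argument):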

(*((this->insert(make_pair(k, mapped_type()))).first)).second
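The one-line expression above is dense, so here is the same idea unpacked as a free function. This is a sketch for illustration only: subscript is a hypothetical name, and real standard library implementations may differ in detail.

```cpp
#include <map>
#include <string>
#include <utility>

// Sketch of what map::operator[] does, written as a free function.
template <class Key, class T>
T& subscript(std::map<Key, T>& m, const Key& k) {
    // Try to insert {k, T()}; if k already exists, nothing changes.
    auto result = m.insert(std::make_pair(k, T()));
    // Either way, result.first points at the element for k.
    return result.first->second;
}
```

This makes the two cases visible: a missing key gets a value-initialized entry (0 for int), while an existing key is simply returned by reference.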

Classic application - counting word occurrences:

string words[] = {"apple", "banana", "apple", "orange", "banana", "apple"};
map<string, int> countMap;

for (const auto& w : words) {
    countMap[w]++;  // a missing key is inserted with value 0, then incremented
}

for (const auto& kv : countMap) {
    cout << kv.first << " : " << kv.second << endl;
}
// Output:
// apple : 3
// banana : 2
// orange : 1

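operator[] vs at():

Feature	operator[]	at()
Key present	Returns a reference to the value	Returns a reference to the value
Key absent	Inserts a value-initialized element and returns a reference to it	Throws std::out_of_range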
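The difference matters most for read-only lookups, where an accidental insertion is a bug. The sketch below assumes two hypothetical helpers: get_or reads without inserting, and at_throws shows the exception raised by at().

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical helper: reads a value without ever inserting.
// Works on a const map, where operator[] is not available at all.
int get_or(const std::map<std::string, int>& m,
           const std::string& key, int fallback) {
    auto it = m.find(key);
    return it != m.end() ? it->second : fallback;
}

// Returns true if at() throws std::out_of_range for the given key.
bool at_throws(const std::map<std::string, int>& m, const std::string& key) {
    try {
        (void)m.at(key);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

Neither helper changes the size of the map, whereas evaluating m[key] for a missing key adds an entry.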

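7. Differences between multimap and map

The core difference between multimap and map is whether duplicate keys are allowed:

Feature	map	multimap
Key uniqueness	Keys must be unique	Keys may repeat
`operator[]`	Supported	Not supported
`at()`	Supported	Not supported
insert return value	`pair<iterator, bool>`	An `iterator` only (insertion always succeeds)
find behavior	Returns the unique element or end()	Returns a matching element (not necessarily the first) or end()
count	Returns 0 or 1	Returns the actual number of matches

multimap example: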

#include <iostream>
#include <map>
#include <string>
using namespace std;

int main() {
    multimap<string, int> mm;

    // Insert elements (duplicate keys are allowed)
    mm.insert({"apple", 5});
    mm.insert({"apple", 8});
    mm.insert({"apple", 6});
    mm.insert({"banana", 3});

    // Count the entries for one key
    cout << "apple count: " << mm.count("apple") << endl;  // prints 3

    // Find every element with the same key
    auto range = mm.equal_range("apple");
    for (auto it = range.first; it != range.second; ++it) {
        cout << it->first << " : " << it->second << endl;
    }
    // Output (one per line, in insertion order): apple : 5, apple : 8, apple : 6

    // Traverse all elements (sorted by key)
    for (const auto& kv : mm) {
        cout << kv.first << " : " << kv.second << endl;
    }

    return 0;
}
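Since erase(key) on a multimap removes every entry for that key, deleting one specific key-value entry means scanning the equal_range. A minimal sketch, with erase_entry as a hypothetical helper:

```cpp
#include <map>
#include <string>

// Hypothetical helper: removes one (key, value) entry from a multimap.
// Returns true if a matching entry was found and erased.
bool erase_entry(std::multimap<std::string, int>& mm,
                 const std::string& key, int value) {
    auto range = mm.equal_range(key);   // all entries with this key
    for (auto it = range.first; it != range.second; ++it) {
        if (it->second == value) {
            mm.erase(it);               // erase only this entry
            return true;                // stop: it is now invalid
        }
    }
    return false;
}
```

Returning right after the erase avoids touching the invalidated iterator; the other entries for the same key are unaffected.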