第21课-C++[set和map学习和使用]

🌼引言

C++ 标准模板库（STL）中的 set 和 map 是两种非常实用的关联式容器。它们具备快速查找、有序存储的特点，因而在很多需要高效数据管理的场景中被广泛应用。本文将深入讲解 set 和 map 的用法，并通过实际例子分析如何将它们应用到编程问题中。

一、关联式容器的基础知识

1.1 什么是关联式容器？

在 STL 中，容器分为两大类：序列式容器 和关联式容器。

序列式容器 ：包括 vector、deque、list 等，它们的特点是按元素插入的顺序存储数据，并提供快速的随机访问能力。然而，在这些容器中查找某个特定值的效率较低（线性时间复杂度 O(n)），不适合频繁查找的情况。
关联式容器 ：包括 set、map、multiset、multimap 等，它们采用键值对（key-value）的形式存储数据，可以通过键快速找到相应的值。这类容器的查找、插入和删除操作的时间复杂度较低（O(log n)），且存储的元素自动按键排序。

关联式容器的特点如下：

高效查找 ：set 和 map 可以在 O(log n) 的时间内查找、插入或删除数据。
自动排序：存储的元素按键自动排序，默认是升序，可以通过自定义比较函数实现降序排序。
键值对存储 ：对于 map 和 multimap，每个元素都以键值对的形式存储，即每个键对应一个值。

1.2 键值对概念

map 和 multimap 是基于键值对（key-value pair）结构的关联式容器 。键值对是一种用于表示一一对应关系的数据结构。在键值对中，key 表示键，value 表示值。通过 key 可以快速定位到 value，这种设计非常适合用于数据查找和映射。

1.在 C++ 中，标准库提供了 pair 结构用于实现键值对。例如：

复制代码

pair<string, int> p("Alice", 90); 
cout << "Name: " << p.first << ", Score: " << p.second << endl;

pair 中的 first 表示键值，second 则表示实值，在给关联式容器中插入数据时，可以构建 pair 对象。

2.利用函数实现键值对：

make_pair 是一个方便的工具函数，用于创建 pair 对象。通过它，你可以简洁地创建键值对、返回多个值以及处理成对的数据。在 STL 容器（如 map 和 multimap）中使用 make_pair，不仅使代码简洁，而且提高了代码的可读性。

复制代码

#include <iostream>
#include <vector>
#include <algorithm>
#include <utility>

int main() {
    std::vector<std::pair<int, std::string>> items;

    // 使用 make_pair 添加元素
    items.push_back(std::make_pair(3, "Apple"));
    items.push_back(std::make_pair(1, "Banana"));
    items.push_back(std::make_pair(2, "Cherry"));

    // 按第一个元素排序
    std::sort(items.begin(), items.end());

    for (const auto& item : items) {
        std::cout << item.second << " ";
    }
    std::cout << std::endl;

    return 0;
}

1.3、树型结构的关联式容器

所以在 C++ 标准中，共提供了四种树型结构的关联式容器

set

multiset

map

multimap

关于哈希结构的关联式容器将在哈希表中学习

树型结构与哈希结构的关联式容器功能都是一模一样的，不过哈希结构查找比树型结构快得多 -> O(1)

注：STL 中选择的树型结构为红黑树 RB-Tree 树型结构中的元素中序遍历后有序，而哈希结构中的元素无序

二、`set` 容器详解

set 是 C++ STL 中的一个集合容器，用于存储不重复的元素，且所有元素是有序存储的。它适合需要唯一性和快速查找的场景。

2.1 `set` 的特点

排序+去重

唯一性 ：set 中的每个元素都是唯一的，不能重复，重复元素不插入。
自动排序 ：set 会自动对元素进行排序，默认情况下按升序排列。
无键值对 ：set 只存储键值，数据即键值，不存在 key-value 对。

2.2 `set` 的常用操作

操作	作用
`insert`	插入一个元素
`erase`	删除一个元素
`find`	查找一个元素，返回其迭代器
`count`	统计元素个数，`set` 中最多为 1
`empty`	判断 `set` 是否为空
`size`	返回 `set` 中元素的个数
`clear`	清空 `set` 中的所有元素
`begin`、`end`	返回迭代器，支持遍历

代码演示：

复制代码

#include <iostream>
#include <vector>
#include <set>
using namespace std;

int main()
{
	vector<int> arr = { 7,3,6,9,3,1,6,2 };
	set<int> s1(arr.begin(), arr.end());

	//迭代器遍历
	cout << "迭代器遍历结果: ";
	set<int>::iterator it = s1.begin();
	while (it != s1.end())
	{
		cout << *it << " ";
		++it;
	}
	cout << endl;

	//判空、求大小
	cout << "===================" << endl;
	cout << "empty(): " << s1.empty() << endl;
	cout << "size(): " << s1.size() << endl;
	cout << "max_size(): " << s1.max_size() << endl;

	//插入元素
	cout << "===================" << endl;
	cout << "insert(5): ";
	s1.insert(5);
	for (auto e : s1) cout << e << " ";
	cout << endl;

	//删除元素
	cout << "===================" << endl;
	cout << "erase(6): ";
	s1.erase(6);
	for (auto e : s1) cout << e << " ";
	cout << endl;

	//交换、查找、清理
	cout << "===================" << endl;
	set<int> s2(arr.begin() + 5, arr.end());
	s1.swap(s2);
	cout << "s1: ";
	for (auto e : s1) cout << e << " ";
	cout << endl;

	cout << "s2: ";
	for (auto e : s2) cout << e << " ";
	cout << endl;

	cout << "s1.find(9): ";
	cout << (s1.find(9) != s1.end()) << endl;

	cout << "s2.clear(): " << endl;
	s2.clear();

	cout << "s1: ";
	for (auto e : s1) cout << e << " ";
	cout << endl;

	cout << "s2: ";
	for (auto e : s2) cout << e << " ";
	cout << endl;

	return 0;
}

至于 count 也可以用来查找元素是否存在，对于 set 来说，键值 key 就是实值 value，并且因为不允许冗余，所以只有一个键值，count 统计键值数量就相当于查找。

复制代码

#include <iostream>
#include <vector>
#include <set>
using namespace std;


int main()
{
	vector<int> arr = { 7,3,6,9,3,1,6,2 };
	set<int> s1(arr.begin(), arr.end());

	for (int i = 0; i < 10; i++)
	{
		if (s1.count(i))
			cout << i << " 在 set 中" << endl;
		else
			cout << i << " 不在 set 中" << endl;
	}

	return 0;
}

2.3 `set` 的基本用法

示例：创建和插入数据

以下是一个创建 set 的例子，并展示如何插入和删除数据。

复制代码

#include <iostream>
#include <set>
using namespace std;

int main() {
    set<int> s;  // 创建一个空的 set

    // 插入元素
    s.insert(10);
    s.insert(20);
    s.insert(10);  // 重复插入 10，不会被插入

    // 输出 set 中的元素
    for (int e : s) {
        cout << e << " ";
    }
    cout << endl;

    return 0;
}

示例：查找元素

find 可以用于查找元素是否存在于 set 中。如果存在，返回该元素的迭代器；否则，返回 end 迭代器。

复制代码

int main() {
    set<int> s = {1, 2, 3, 4, 5};

    // 查找元素 3
    auto it = s.find(3);
    if (it != s.end()) {
        cout << "Element 3 found." << endl;
    } else {
        cout << "Element 3 not found." << endl;
    }

    return 0;
}

set的特点：

2.4 `multiset` 容器

multiset 是 set 的一个变体 ，multiset 和 set 的操作没什么区别，一模一样， 允许存储重复的元素。multiset 的操作与 set 基本相同，只是它允许插入重复元素。示例如下：

复制代码

#include <iostream>
#include <set>
using namespace std;

int main() {
    multiset<int> ms;
    ms.insert(5);
    ms.insert(5);
    ms.insert(3);

    for (int e : ms) {
        cout << e << " ";
    }
    cout << endl;

    return 0;
}

值得一提的是，当在 multiset 中查找冗余的数据时，返回的是 中序遍历中，第一次出现的元素。

复制代码

#include <iostream>
#include <vector>
#include <set>
using namespace std;

int main()
{
	vector<int> arr = { 3,5,3,4,5,9,2,3 };
	multiset<int> ms1(arr.begin(), arr.end());

	auto it = ms1.begin();
	while (it != ms1.end())
	{
		cout << *it << " | " << &*(it) << endl;
		++it;
	}

	cout << "================" << endl;

	cout << "ms1.find(3): " << &*(ms1.find(3)) << endl;

	return 0;
}

所以，multiset 才是真正的排序，set 则是去重 + 排序

统计键值数 count 在 multiset 中可以发挥真正效果。

复制代码

#include <iostream>
#include <vector>
#include <set>
using namespace std;


int main()
{
	vector<int> arr = { 3,5,3,4,5,9,2,3 };
	multiset<int> ms1(arr.begin(), arr.end());

	for (int i = 0; i < 10; i++)
		cout << i << "在 multiset 中的数量: " << ms1.count(i) << endl;

	return 0;
}

三、`map` 容器详解

map 是 C++ STL 中的一种键值对关联式容器，每个元素包含一个键和值。map 中的键是唯一的，不能重复；每个键都映射到一个对应的值。map 在很多场景中被用作哈希表来存储键值对数据。

3.1 `map` 的特点

唯一键 ：map 中的每个键是唯一的。
键值对存储 ：map 存储的是键值对，即 key-value。
按键自动排序 ：map 会自动对键进行排序，默认按升序排列。
可以通过键访问值 ：可以通过 operator[] 运算符直接访问键对应的值。

3.2 `map` 的常用操作

操作	作用
`insert`	插入一个键值对
`erase`	删除指定键的键值对
`find`	查找指定键，返回其迭代器
`count`	统计键的数量（`map` 中最多为 1）
`operator[]`	根据键访问值，如果键不存在则自动插入新键
`empty`	判断 `map` 是否为空
`size`	返回 `map` 中键值对的个数
`clear`	清空 `map` 中的所有键值对

map 的定义和常用操作

map 位于 <map> 头文件中，定义方式如下：

复制代码

#include <map>
#include <iostream>
using namespace std;

int main() {
    map<string, int> age; // 创建一个 map，其中键是 string，值是 int

    // 插入键值对
    age["Alice"] = 25;
    age["Bob"] = 30;

    // 查找键值对
    if (age.find("Alice") != age.end()) {
        cout << "Alice's age: " << age["Alice"] << endl;
    }

    // 遍历 map
    for (const auto& entry : age) {
        cout << entry.first << ": " << entry.second << endl;
    }

    return 0;
}

map 的基本用法示例

插入和访问键值对

#include <map>
#include <iostream>
using namespace std;

int main() {
map<string, int> scores;

复制代码

 // 插入键值对
 scores.insert(make_pair("Alice", 90));
 scores.insert(make_pair("Bob", 85));
 
 // 也可以通过 operator[] 插入和访问
 scores["Charlie"] = 95;

 // 访问值
 cout << "Alice's score: " << scores["Alice"] << endl;

 return 0;

}

查找和删除键值对

#include <map>
#include <iostream>
using namespace std;

int main() {
map<string, int> scores = {{"Alice", 90}, {"Bob", 85}, {"Charlie", 95}};

复制代码

 // 查找键 "Bob"
 auto it = scores.find("Bob");
 if (it != scores.end()) {
     cout << "Found Bob's score: " << it->second << endl;
 }

 // 删除键 "Alice"
 scores.erase("Alice");

 // 输出所有键值对
 for (const auto& entry : scores) {
     cout << entry.first << ": " << entry.second << endl;
 }

 return 0;

}

`3.3 map` 的特点与性能

自动排序 ：map 中的元素按键自动排序，因此每次插入键值对后，map 都会根据键进行重新排序。
唯一性 ：map 中不允许重复键，如果试图插入相同键的元素，将会被忽略。
高效查找 ：map 的查找、插入和删除的时间复杂度为 O(log n)，适合需要快速查找的场景。
默认初始化 ：使用 operator[] 时，如果键不存在，则会创建一个默认值的键值对。

注意事项

map[key] 如果访问的键不存在，则会自动创建该键，并给其赋一个默认值（如 int 的默认值是 0）。使用 find 可以避免这种不必要的插入。

map 中的数据是按键有序的，因此 map 不支持像 unordered_map 那样的 O(1) 查找。

map 的应用场景

词频统计 ：可以使用 map 来统计字符串中每个字符的出现频率。
缓存机制 ：可以通过 map 来实现简单的缓存机制，用于存储经常使用的键值对。
字典实现 ：map 是存储字典（如配置文件、键值对映射等）的一种常用数据结构。

总计：

总结

map 是一个非常强大的数据结构，可以存储唯一的键值对并按键排序。它在很多场景中都非常实用，尤其适合用于需要快速查找和唯一键的情况。在使用时需要注意 map[key] 会默认初始化键值对的问题，可以通过 find 函数来避免不必要的插入。

四、前K个高频单词

692. 前K个高频单词 - 力扣（LeetCode）

题目分析：题目很短，就是在一个字符串数组中，找出前 k 个出现频率最高的单词

注意：如果出现次数相同，则按字典序排序。（相同次数小写排前面）。

解法：map + 快排

利用 map 建立 <string, int> 的映射关系，在按照字典序排序的同时统计出每个单词的出现频率，再通过快排依照数量进行二次排序，选择前 k 个高频单词即可

因为基础版快排不稳定，可能会导致频率相同的单词顺序出问题，即违背题目要求：如果出现频率相同，则按字典序排序

所以这里需要使用稳定版快排 stable_sort，如果频率相同，保持原有顺序

复制代码

//map + stable_sort
class Solution {
public:
    struct Compare
    {
        bool operator()(const pair<string, int>& kv1, const pair<string, int>& kv2) const
        {
            return kv1.second > kv2.second;
        }
    };

    vector<string> topKFrequent(vector<string>& words, int k) {
        //统计每个单词出现的频率，同时先按照字典序排序
        map<string, int> table;
        for(auto e : words)
            table[e]++;

        //将处理好的数据存入数组中
        vector<pair<string, int>> vTable(table.begin(), table.end());

        //按照出现频率进行二次排序
        stable_sort(vTable.begin(), vTable.end(), Compare());

        //取出前 k 个高频单词
        vector<string> vs;
        for(int i = 0; i < k; i++)
            vs.push_back(vTable[i].first);
        
        return vs;
    }
};

LCR 154. 复杂链表的复制 - 力扣（LeetCode）

五、补充：交集与差集

交集，指两个数组中相同的元素所构成的集合

求交集的步骤如下：

先将两个数组排序 + 去重
遍历两个数组
如果不相等，小的 ++
相等就是交集，记录下来
其中一方走完，所有交集就查找完了

排序 + 去重，这就不就是**set 吗？**

349. 两个数组的交集 - 力扣（LeetCode）

复制代码

class Solution {
public:
    vector<int> intersection(vector<int>& nums1, vector<int>& nums2) {
      set<int> set1(nums1.begin(),nums1.end());
       set<int> set2(nums2.begin(),nums2.end());

       
       vector<int> num1(set1.begin(),set1.end());
       vector<int> num2(set2.begin(),set2.end());
        int i=0,j=0;
        vector<int> v;

        while(i<num1.size()&&j<num2.size())
        {
            if(num1[i]==num2[j])
            {
                v.push_back(num1[i]);
            }
            else if(num1[i]<num2[j])
            {
                i++;
            }
            else{
                j++;
            }

        }
        return v;
    }
};

总结：

容器	特点	适用场景
`set`	唯一元素，有序存储	去重，查找唯一元素
`multiset`	允许重复元素，有序存储	频率统计，处理重复数据
`map`	键值对存储，唯一键，有序	频率统计，字典查询
`multimap`	键值对存储，允许重复键，有序	频率统计，管理多值的数据

第21课-C++[set和map学习和使用]

一、关联式容器的基础知识

1.1 什么是关联式容器？

1.2 键值对概念

1.3、树型结构的关联式容器

二、set 容器详解

2.1 set 的特点

2.2 set 的常用操作

2.3 set 的基本用法

2.4 multiset 容器

三、map 容器详解

3.1 map 的特点

3.2 map 的常用操作

3.3 map 的特点与性能

总结

四、前K个高频单词

五、补充：交集与差集

总结：

二、`set` 容器详解

2.1 `set` 的特点

2.2 `set` 的常用操作

2.3 `set` 的基本用法

2.4 `multiset` 容器

三、`map` 容器详解

3.1 `map` 的特点

3.2 `map` 的常用操作

`3.3 map` 的特点与性能