[Advanced C++] Using map and set

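1. Sequence containers and associative containers

So far we have used several STL containers such as string, vector, list, deque, array, and forward_list. These are called sequence containers because their logical structure is a linear sequence: the values stored at two positions generally have no close relationship to each other, so swapping two elements still leaves a valid sequence container. Elements in a sequence container are stored and accessed by their position in the container.

Associative containers also store data, but their logical structure is usually non-linear and the elements at different positions are closely related: swapping two elements would break the container's storage structure. Elements in an associative container are stored and accessed by key. The associative containers are the map/set family and the unordered_map/unordered_set family.

The map and set covered in this chapter are built on a red-black tree, which is a self-balancing binary search tree. set is the structure for key-only lookup; map is the structure for key/value lookup.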
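To make the contrast concrete, here is a minimal sketch. The helper name throughSet is made up for illustration. A vector keeps elements wherever they were placed, while a set arranges them by key, so a round trip through a set returns the values sorted and without duplicates.

```cpp
#include <set>
#include <vector>

// Hypothetical helper: copies the values into a std::set and reads them back.
// The vector keeps its elements by position; the set stores them by key,
// so the result is sorted and duplicate-free whatever the input order was.
std::vector<int> throughSet(const std::vector<int>& values)
{
    std::set<int> s(values.begin(), values.end());
    return std::vector<int>(s.begin(), s.end());
}
```

For an input of 3 1 2 3 1, the function returns 1 2 3.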

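2. Using the set family

2.1 Introduction to the set class

• set is declared as shown below; T is the type of the keys stored in the set.

• By default set requires T to support less-than comparison. If T does not, or if you want a different ordering, implement a functor and pass it as the second template parameter.

• set obtains the memory for its elements from an allocator. If needed, you can implement your own memory pool and pass it as the third template parameter.

• In most cases the last two template parameters do not need to be specified.

• set is implemented with a red-black tree, so insertion, deletion, and lookup are O(logN). Iterators walk the search tree in order, so the elements are visited in sorted order.

• We have already learned how to use containers such as vector and list, and the STL container interfaces are designed to be highly similar. So instead of introducing every interface one by one, we go straight to the documentation and pick out the more important ones.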

template < class T, // set::key_type/value_type
 class Compare = less<T>, // set::key_compare/value_compare
 class Alloc = allocator<T> // set::allocator_type
 > class set;

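2.2 set constructors and iterators

For construction, the following interfaces are the ones to focus on. set supports forward and reverse iteration. Traversal is in ascending order by default, because the underlying structure is a binary search tree and the iterators walk it in order. Supporting iterators also means supporting range-based for. Neither iterator nor const_iterator of set allows modifying the elements, because changing a key would break the structure of the underlying search tree.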

// empty (1): default constructor
explicit set (const key_compare& comp = key_compare(),
 const allocator_type& alloc = allocator_type());

// range (2): construct from an iterator range
template <class InputIterator>
 set (InputIterator first, InputIterator last,
 const key_compare& comp = key_compare(),
 const allocator_type& = allocator_type());

// copy (3): copy constructor
set (const set& x);

// initializer list (5): construct from an initializer_list
set (initializer_list<value_type> il,
 const key_compare& comp = key_compare(),
 const allocator_type& alloc = allocator_type());

// The iterator is a bidirectional iterator
iterator -> a bidirectional iterator to const value_type

// Forward iterators
iterator begin();
iterator end();

// Reverse iterators
reverse_iterator rbegin();
reverse_iterator rend();

Example:

#include<iostream>
#include<set>
#include<vector>
using namespace std;

int main()
{
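	// 1. Default construction
	set<int> s1;

	// 2. Construction from an iterator range (the duplicate 2 is dropped)
	vector<int> v1 = { 1,2,2,3,4,5 };
	set<int> s2(v1.begin(), v1.end());

	// 3. Copy construction
	set<int> s3(s2);

	// 4. initializer_list construction (the duplicates are dropped)
	set<int> s4({ 1,2,3,4,5,1,2 });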

	return 0;
}
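The constructors above are paired with forward and reverse iterators, but the example does not traverse anything. A small sketch of reverse traversal follows; the function name descending is hypothetical.

```cpp
#include <set>
#include <vector>

// Collects the elements of a set from largest to smallest using reverse iterators.
std::vector<int> descending(const std::set<int>& s)
{
    std::vector<int> out;
    for (auto it = s.rbegin(); it != s.rend(); ++it)
    {
        // *it = 0;  // would not compile: set elements are read-only through iterators
        out.push_back(*it);
    }
    return out;
}
```

Called on a set built from { 4, 2, 7, 2 }, it yields 7 4 2.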

2.3 set insertion, deletion, and lookup

For insertion, deletion, and lookup, the following interfaces are the ones to focus on:

Member types
key_type -> The first template parameter (T)
value_type -> The first template parameter (T)

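// Insert a single value; the insertion fails if the value already exists
pair<iterator,bool> insert (const value_type& val);

// Insert an initializer list; values already in the container are not inserted
void insert (initializer_list<value_type> il);

// Insert an iterator range; values already in the container are not inserted
template <class InputIterator>
void insert (InputIterator first, InputIterator last);

// Find val; returns an iterator to val, or end() if it is not found
iterator find (const value_type& val);

// Count val; returns the number of elements equal to val (0 or 1 for set)
size_type count (const value_type& val) const;

// Erase the element at an iterator position
iterator erase (const_iterator position);

// Erase val; returns 0 if val does not exist, 1 if it was erased
size_type erase (const value_type& val);

// Erase an iterator range
iterator erase (const_iterator first, const_iterator last);

// Returns an iterator to the first element greater than or equal to val
iterator lower_bound (const value_type& val) const;

// Returns an iterator to the first element greater than val
iterator upper_bound (const value_type& val) const;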
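lower_bound and upper_bound are easiest to understand together: they delimit a half-open iterator range that covers a closed interval of values. The sketch below uses them to erase every value in [lo, hi]. The helper eraseRange is an illustrative name, not part of the library.

```cpp
#include <cstddef>
#include <set>

// Erases every value in the closed interval [lo, hi] and returns how many were removed.
// Assumes lo <= hi.
std::size_t eraseRange(std::set<int>& s, int lo, int hi)
{
    auto first = s.lower_bound(lo);   // first element >= lo
    auto last  = s.upper_bound(hi);   // first element >  hi
    std::size_t before = s.size();
    s.erase(first, last);             // erases the half-open range [first, last)
    return before - s.size();
}
```

With the set { 10, 20, 30, 40, 50, 60 }, eraseRange(s, 25, 50) removes 30, 40, and 50 and leaves 10 20 60.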

2.4 Example: insert and iterator traversal

#include<iostream>
#include<set>
#include<string>
using namespace std;

int main()
{
	// Deduplicate + sort in ascending order
	set<int> s1;
	// Deduplicate + sort in descending order: pass the greater functor
	//set<int,greater<int>> s1;
	s1.insert(1);
	s1.insert(1);
	s1.insert(2);
	s1.insert(3);

	//set<int>::iterator it1 = s1.begin();
	auto it1 = s1.begin();
	while (it1 != s1.end())
	{
		// error C3892: "it1": cannot assign to a constant
		//*it1 = 1;
		cout << *it1 << " ";
		it1++;
	}
	cout << endl;

	// Insert an initializer_list of values; values that already exist are not inserted
	s1.insert({ 1,2,3,4 });
	for (auto e : s1)
	{
		cout << e << " ";
	}
	cout << endl;

	set<string> strset = { "sort","insert","add" };
	// Strings are compared lexicographically by character code (ASCII), so traversal follows that order
	for (auto e : strset)
	{
		cout << e << " ";
	}
	cout << endl;

	return 0;
}
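Section 2.1 mentions that a custom functor can be passed as the second template parameter, and the commented-out line in the example above uses greater<int> for descending order. As a further sketch, a hand-written comparator might look like the following. ByLength and orderByLength are made-up names for illustration.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical comparator: orders strings by length, breaking ties alphabetically.
// Like the default less<T>, it must define a strict weak ordering.
struct ByLength
{
    bool operator()(const std::string& a, const std::string& b) const
    {
        if (a.size() != b.size())
            return a.size() < b.size();
        return a < b;
    }
};

// Returns the distinct words in the order imposed by ByLength.
std::vector<std::string> orderByLength(const std::vector<std::string>& words)
{
    std::set<std::string, ByLength> s(words.begin(), words.end());
    return std::vector<std::string>(s.begin(), s.end());
}
```

Two elements count as duplicates when neither compares less than the other, so the tie-break on the string itself matters: without it, any two words of equal length would be treated as the same key.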

2.5 Example: find and erase

#include<iostream>
#include<set>
#include<algorithm>
using namespace std;

int main()
{
	set<int> s1 = { 4,2,7,2,8,5,9 };
	for (auto e : s1)
	{
		cout << e << " ";
	}
	cout << endl;

	// Erase the smallest value
	s1.erase(s1.begin());
	for (auto e : s1)
	{
		cout << e << " ";
	}
	cout << endl;

	// Erase x directly by value
	int x;
	cin >> x;
	size_t num = s1.erase(x);
	if (num == 0)
	{
		cout << x << " does not exist!" << endl;
	}
	for (auto e : s1)
	{
		cout << e << " ";
	}
	cout << endl;

	// Find x first, then erase it through the iterator
	cin >> x;
	auto pos = s1.find(x);
	if (pos != s1.end())
	{
		s1.erase(pos);
	}
	else
	{
		cout << x << " does not exist!" << endl;
	}
	for (auto e : s1)
	{
		cout << e << " ";
	}
	cout << endl;

	// find from the algorithm library: linear search, O(N)
	auto pos1 = find(s1.begin(), s1.end(), x);
	// set's own member find: tree search, O(logN)
	auto pos2 = s1.find(x);

	// Use count as a quick membership test
	cin >> x;
	if (s1.count(x))
	{
		cout << x << " is present!" << endl;
	}
	else
	{
		cout << x << " is not present!" << endl;
	}

	return 0;
}

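2.6 Differences between multiset and set

multiset is used almost exactly like set. The main difference is that multiset allows duplicate values, so insert, find, count, and erase all behave differently to account for duplicates. The sample code below covers find, count, and erase: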

#include<iostream>
#include<set>
using namespace std;

int main()
{
	// Unlike set, multiset sorts the values but does not remove duplicates
	multiset<int> s = { 4,2,7,2,4,8,4,5,4,9 };
	for (auto e : s)
	{
		cout << e << " ";
	}
	cout << endl;

	// Unlike set, x may occur several times; find returns the first one in in-order traversal
	int x;
	cin >> x;
	auto pos = s.find(x);
	while (pos != s.end() && *pos == x)
	{
		cout << *pos << " ";
		pos++;
	}
	cout << endl;

	// Unlike set, count returns the actual number of occurrences of x
	cout << s.count(x) << endl;

	// Unlike set, erase by value removes every occurrence of x
	s.erase(x);
	for (auto e : s)
	{
		cout << e << " ";
	}
	cout << endl;

	return 0;
}
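Because erase by value removes every occurrence, removing just one copy takes an iterator. A minimal sketch, with eraseOne as a hypothetical helper name:

```cpp
#include <set>

// Removes a single occurrence of x from the multiset, if there is one.
// Returns true when an element was removed.
bool eraseOne(std::multiset<int>& s, int x)
{
    auto pos = s.find(x);   // iterator to an element equal to x, or end()
    if (pos == s.end())
        return false;
    s.erase(pos);           // erasing by iterator removes exactly one element
    return true;
}
```

Starting from { 4, 2, 4, 4 }, one call to eraseOne(s, 4) leaves two 4s behind, whereas s.erase(4) would have removed all three.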

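3. Using the map family

3.1 Introduction to the map class

map is declared as shown below. Key is the type of the keys and T is the type of the mapped values. By default map requires Key to support less-than comparison; if it does not, or if a different ordering is needed, implement a functor and pass it as the third template parameter. map obtains the memory for its elements from an allocator. In most cases the last two template parameters do not need to be specified. map is implemented with a red-black tree, so insertion, deletion, lookup, and modification are O(logN). Iterators walk the tree in order, so the elements are visited in ascending key order.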

template < class Key, // map::key_type: key type
           class T, // map::mapped_type: mapped value type
           class Compare = less<Key>, // map::key_compare: key comparison
           class Alloc = allocator<pair<const Key,T> > // map::allocator_type: allocator
         > class map;

3.2 Introduction to the pair type

The nodes of map's underlying red-black tree store each key/value entry as a pair<const Key, T>.

typedef pair<const Key, T> value_type;
template <class T1, class T2>
struct pair
{
    typedef T1 first_type;
    typedef T2 second_type;

    T1 first;
    T2 second;

    pair()
    :first(T1())
    ,second(T2())
    {}

    pair(const T1& a, const T2& b)
    :first(a)
    ,second(b)
    {}

    template<class U, class V>
    pair (const pair<U,V>& pr)
    :first(pr.first)
    ,second(pr.second)
    {}
};

template <class T1,class T2>
inline pair<T1,T2> make_pair (T1 x, T2 y)
{
    return ( pair<T1,T2>(x,y) );
}
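As a small usage sketch of the definitions above (the helper withLength is invented for illustration), make_pair saves spelling out the template arguments, and the two members are read back through first and second:

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper: returns a word together with its length as a pair.
std::pair<std::string, std::size_t> withLength(const std::string& word)
{
    // make_pair deduces pair<std::string, std::size_t> from its arguments
    return std::make_pair(word, word.size());
}
```

For example, withLength("map") returns a pair whose first is "map" and whose second is 3.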

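3.3 map constructors

For construction, the following interfaces are the ones to focus on.
map supports forward and reverse iteration. Traversal is in ascending key order by default, because the underlying structure is a binary search tree and the iterators walk it in order. Supporting iterators also means supporting range-based for. map allows modifying the mapped value but not the key, because changing a key would break the structure of the underlying search tree.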

// empty (1): default constructor
explicit map (const key_compare& comp = key_compare(),
            const allocator_type& alloc = allocator_type());

// range (2): construct from an iterator range
template <class InputIterator>
map (InputIterator first, InputIterator last,
     const key_compare& comp = key_compare(),
     const allocator_type& = allocator_type());

// copy (3): copy constructor
map (const map& x);

// initializer list (5): construct from an initializer_list
map (initializer_list<value_type> il,
     const key_compare& comp = key_compare(),
     const allocator_type& alloc = allocator_type());

// The iterator is a bidirectional iterator
iterator -> a bidirectional iterator to value_type

// Forward iterators
iterator begin();
iterator end();

// Reverse iterators
reverse_iterator rbegin();
reverse_iterator rend();

Example:

#include<iostream>
#include<map>
#include<string>
using namespace std;

int main()
{
	// Default construction
	map<int, int> m1;

	// Construction from named pair objects
	pair<string, int> p1 = { "English",90 };
	pair<string, int> p2 = { "Math",95 };
	pair<string, int> p3 = { "Chinese",80 };
	map<string, int> m2 = { p1,p2,p3 };

	// Initialization from a temporary (unnamed) pair object
	map<string, string> m3 = { pair<string, string>("insert", "插入") };
	// make_pair deduces its template arguments, so they rarely need to be written out
	map<string, string> m4 = { make_pair("string","字符串") };

	// initializer_list construction
	map<string, string> m5 = { {"left", "左边"}, {"right", "右边"},
		{"insert", "插入"},{ "string", "字符串" } };

	// Construction from an iterator range
	map<string, string> m6(m5.begin(), m5.end());

	// Copy construction
	map<string, string> m7(m6);

	return 0;
}
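Section 3.3 states that the mapped value can be modified through an iterator while the key cannot. A brief sketch of that rule, using a hypothetical addBonus function:

```cpp
#include <map>
#include <string>

// Adds bonus to every score. The mapped value can be changed through the
// iterator; the key (it->first) is const and cannot.
void addBonus(std::map<std::string, int>& scores, int bonus)
{
    for (auto it = scores.begin(); it != scores.end(); ++it)
    {
        // it->first = "x";  // would not compile: the key is const
        it->second += bonus;
    }
}
```

The commented-out line is rejected by the compiler because value_type is pair<const Key, T>.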

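3.4 map insertion, deletion, and lookup

For insertion, deletion, and lookup, the following interfaces are the ones to focus on.
Insertion differs from set in that map inserts a pair holding a key and a value. Lookup and deletion take only the key and work just like they do for set, except that find returns an iterator that not only tells whether the key exists but also gives access to the value mapped to it, and that value can be modified through the iterator.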

Member types
key_type -> The first template parameter (Key)
mapped_type -> The second template parameter (T)
value_type -> pair<const key_type,mapped_type>

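// Insert a single pair; the insertion fails if the key already exists, even if the value differs
pair<iterator,bool> insert (const value_type& val);

// Insert an initializer list; entries whose key already exists are not inserted
void insert (initializer_list<value_type> il);

// Insert an iterator range; entries whose key already exists are not inserted
template <class InputIterator>
void insert (InputIterator first, InputIterator last);

// Find k; returns an iterator to the element with key k, or end() if it is not found
iterator find (const key_type& k);

// Count k; returns the number of elements with key k (0 or 1 for map)
size_type count (const key_type& k) const;

// Erase the element at an iterator position
iterator erase (const_iterator position);

// Erase k; returns 0 if k does not exist, 1 if it was erased
size_type erase (const key_type& k);

// Erase an iterator range
iterator erase (const_iterator first, const_iterator last);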

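3.5 Modifying data in a map

As mentioned earlier, map allows modifying the mapped_type data but not the key, because changing a key would break the structure of the underlying search tree.
The first way to modify data is through an iterator, either while traversing or through the iterator that find returns for a key. map also has a very important modification interface, operator[]. operator[] does more than modify: it also inserts and looks up data, which makes it a multi-purpose interface.
Note that, from the implementation's point of view, what we usually call the value has type T in map and is typedef'd as mapped_type, while value_type is the key/value pair stored in a red-black tree node. In everyday usage we still call the mapped T the value.
operator[] calls insert internally, and insert returns a pair<iterator,bool>. If the insertion succeeds, first is an iterator to the node holding the newly inserted key and second is true. If the insertion fails, first is an iterator to the node holding the existing key and second is false. The value that gets inserted is mapped_type(), a value-initialized object: class types call their default constructor, while built-in types are zero-initialized, so int becomes 0 and double becomes 0.0.

When the insertion succeeds, operator[] acts as insert + modify.

When the insertion fails, operator[] acts as find + modify.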

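Member types
key_type -> The first template parameter (Key)
mapped_type -> The second template parameter (T)
value_type -> pair<const key_type,mapped_type>

// Find k; returns an iterator to the element with key k, or end() if it is not found.
// If found, the mapped_type value of the key can be modified through the iterator.
iterator find (const key_type& k);

// The documentation's description of insert's return value:
// The single element versions (1) return a pair, with its member pair::first
// set to an iterator pointing to either the newly inserted element or to the
// element with an equivalent key in the map. The pair::second element in the pair
// is set to true if a new element was inserted or false if an equivalent key
// already existed.

// insert inserts a pair<key, T> object.
// 1. If the key is already in the map, the insertion fails and insert returns a
//    pair<iterator,bool> whose first is an iterator to the node holding the key
//    and whose second is false.
// 2. If the key is not in the map, the insertion succeeds and insert returns a
//    pair<iterator,bool> whose first is an iterator to the node holding the newly
//    inserted key and whose second is true.
// In other words, whether the insertion succeeds or fails, first always points to
// the node that holds the key.
// A failed insert therefore doubles as a lookup, and that is exactly why insert
// can be used to implement operator[].
// Note that two different pairs are involved here; do not confuse them. One is the
// pair<key, T> stored in the red-black tree node, the other is the
// pair<iterator,bool> returned by insert.

pair<iterator,bool> insert (const value_type& val);
mapped_type& operator[] (const key_type& k);

// Internal implementation of operator[]
mapped_type& operator[] (const key_type& k)
{
    // 1. If k is not in the map, insert adds k with a default mapped_type value, and []
    //    returns a reference to the mapped_type value stored in the node, through which
    //    the mapped value can be modified. So [] acts as insert + modify.
    // 2. If k is in the map, insert fails, but first of the returned pair is an iterator
    //    to the node holding k, and [] again returns a reference to the mapped_type value
    //    stored there. So [] acts as find + modify.
    pair<iterator, bool> ret = insert({ k, mapped_type() });
    iterator it = ret.first;
    return it->second;
}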
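The implementation above can be tried outside the class as a free function. The sketch below does that for map<string, int> and adds a read-only lookup next to it, because the insert-on-miss behaviour of [] is not always wanted. valueRef and valueOr are hypothetical names used only for this illustration.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical free-function version of operator[], built only on insert.
int& valueRef(std::map<std::string, int>& m, const std::string& key)
{
    // If key exists the insert fails harmlessly; either way ret.first points at the entry.
    std::pair<std::map<std::string, int>::iterator, bool> ret = m.insert({ key, int() });
    return ret.first->second;
}

// Read-only lookup that never inserts: returns fallback when the key is absent.
int valueOr(const std::map<std::string, int>& m, const std::string& key, int fallback)
{
    auto it = m.find(key);
    return it == m.end() ? fallback : it->second;
}
```

One design consequence is worth noting: because [] may insert, it cannot be called on a const map, and merely reading m[key] for a missing key grows the map. When a lookup must not change the container, find (as in valueOr) is the safer choice.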

3.6 Example: construction, traversal, insertion, deletion, and lookup

#include<iostream>
#include<map>
#include<string>
using namespace std;

int main()
{
	// initializer_list construction and iterator traversal
	map<string, string> dict = { {"left", "左边"}, {"right", "右边"},
		{"insert", "插入"},{ "string", "字符串" } };

	auto it = dict.begin();
	while (it != dict.end())
	{
		//cout << (*it).first <<":"<<(*it).second << endl;
		// map iterators are almost always used through operator->; one -> is omitted here.
		// The first -> is the iterator's overloaded operator and returns a pair*;
		// the second is the ordinary pointer member access that reads the pair's data.
		//cout << it.operator->()->first << ":" << it.operator->()-> second << endl;
		cout << it->first << ":" << it->second << endl;
		it++;
	}
	cout << endl;

	// Four ways to insert a pair object; the last one is the most convenient
	pair<string, string> kv1("first", "第一个");
	dict.insert(kv1);
	dict.insert(pair<string, string>("second", "第二个"));
	dict.insert(make_pair("sort", "排序"));
	dict.insert({ "auto","自动的" });

	// map does not allow duplicate keys; inserting an existing key fails
	dict.insert({ "left", "左边，剩余" });

	for (const auto& e : dict)
	{
		cout << e.first << ":" << e.second << endl;
	}
	cout << endl;

	// Lookup
	string str;
	while (cin >> str)
	{
		auto ret = dict.find(str);
		if (ret != dict.end())
		{
			cout << "->" << ret->second << endl;
		}
		else
		{
			cout << "word not found" << endl;
		}
	}

	// erase and the remaining interfaces work just like set's

	return 0;
}

3.7 Example: map iterators and operator[]

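#include<iostream>
#include<map>
#include<string>
using namespace std;

int main()
{
	// 1. Count how often each fruit appears, using find and modification through the iterator
	string arr[] = { "apple", "watermelon", "apple", "watermelon", "apple", "apple", "watermelon",
		"apple", "banana", "apple", "banana" };
	map<string, int> countMap;
	for (const auto& str : arr)
	{
		// First check whether the fruit is already in the map
		// 1. Not there: this is its first occurrence, so insert {fruit, 1}
		// 2. There: increment the count stored in the node that was found
		auto ret = countMap.find(str);
		if (ret == countMap.end())
		{
			countMap.insert({str,1});
		}
		else
		{
			ret->second++;
		}
	}

	for (const auto& e : countMap)
	{
		cout << e.first << ":" << e.second << endl;
	}
	cout << endl;

	// 2. Count the fruits more concisely with the insert + modify behaviour of []
	map<string, int> countMap2;
	for (const auto& str : arr)
	{
		// [] first checks whether the fruit is in the map
		// 1. Not there: first occurrence, so it inserts {fruit, 0} and returns a reference
		//    to the count; ++ then makes it 1
		// 2. There: it returns a reference to the fruit's count, which ++ increments
		// When the key is absent, [] acts as insert + modify
		// When the key is present, [] acts as find + modify
		countMap2[str]++;
	}

	for (const auto& e : countMap2)
	{
		cout << e.first << ":" << e.second << endl;
	}

	// 3. What [] can do
	map<string, string> dict;
	dict.insert(make_pair("sort", "排序"));

	// Key does not exist -> inserts {"insert", string()}
	dict["insert"];

	// Insert + modify
	dict["left"] = "左边";

	// Modify
	dict["left"] = "左边、剩余";

	// Key exists -> lookup
	cout << dict["left"] << endl;

	return 0;
}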

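3.8 Differences between multimap and map

multimap is used almost exactly like map. The main difference is that multimap allows duplicate keys, so insert, find, count, and erase all behave differently to account for duplicates, exactly as with multiset versus set. For example, when several elements share a key, find returns the first one in in-order traversal. The other difference is that multimap does not support []: with duplicate keys a key no longer identifies a single value, so [] could only act as an insert and could not be used to modify.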
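Since multimap has no [], all values stored under one key are usually read through equal_range, which returns the pair of iterators delimiting the elements with that key. A minimal sketch, with valuesFor as a hypothetical helper:

```cpp
#include <map>
#include <string>
#include <vector>

// Collects every value stored under key, in the order the multimap keeps them.
std::vector<std::string> valuesFor(const std::multimap<std::string, std::string>& m,
                                   const std::string& key)
{
    std::vector<std::string> out;
    auto range = m.equal_range(key);   // [first element with key, first element past it)
    for (auto it = range.first; it != range.second; ++it)
        out.push_back(it->second);
    return out;
}
```

If the key is absent, both iterators in the returned pair are equal, the loop body never runs, and the result is empty.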

4. Two related problems

4.1 Copy List with Random Pointer

https://leetcode.cn/problems/copy-list-with-random-pointer/description/

// 1. Copy List with Random Pointer
/*
// Definition for a Node.
class Node {
public:
    int val;
    Node* next;
    Node* random;

    Node(int _val) {
        val = _val;
        next = NULL;
        random = NULL;
    }
};
*/

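// Approach 1:
// Use the nodes of the original list as keys and the matching nodes of the new list as values.
// The random pointer of a new node is then nodeMap[cur->random]:
// the original node's random target leads straight to the new list's random target.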
class Solution {
public:
    Node* copyRandomList(Node* head) {
        map<Node*, Node*> nodeMap;
        Node* copyhead = nullptr, * copytail = nullptr;
        Node* cur = head;
        while (cur)
        {
            if (copytail == nullptr)
            {
                // The copy is still empty: create its first node
                copyhead = copytail = new Node(cur->val);
            }
            else
            {
                // Append at the tail
                copytail->next = new Node(cur->val);
                copytail = copytail->next;
            }

            // The original node is the key, the new node is the value
            nodeMap[cur] = copytail;
            cur = cur->next;
        }

        cur = head;
        Node* copyCur = copyhead;
        while (cur)
        {
            if (cur->random == nullptr)
            {
                copyCur->random = nullptr;
            }
            else
            {
                copyCur->random = nodeMap[cur->random];
            }

            cur = cur->next;
            copyCur = copyCur->next;
        }

        return copyhead;
    }
};

// Approach 2: insert a copy right after each original node, set copy->random = cur->random->next, then detach the copies from the original list
class Solution {
public:
    Node* copyRandomList(Node* head) {
        if (head == nullptr)
            return nullptr;

        Node* cur = head;
        while (cur)
        {
            Node* newNode = new Node(cur->val);
            newNode->next = cur->next;
            cur->next = newNode;

            cur = cur->next->next;
        }

        cur = head;
        Node* copyCur = cur->next;
        while (cur)
        {
            if (cur->random == nullptr)
            {
                copyCur->random = nullptr;
            }
            else
            {
                copyCur->random = cur->random->next;
            }

            cur = cur->next->next;
            if (cur != nullptr)
                copyCur = copyCur->next->next;
        }

        Node* copyhead = nullptr, * copytail = nullptr;
        cur = head;
        while (cur)
        {
            Node* nextNode = cur->next->next;
            if (copyhead == nullptr)
            {
                copyhead = copytail = cur->next;
            }
            else
            {
                copytail->next = cur->next;
                copytail = copytail->next;
            }

            cur->next = nextNode;
            cur = nextNode;
        }

        return copyhead;
    }
};

4.2 Top K Frequent Words

https://leetcode.cn/problems/top-k-frequent-words/submissions/661053571/

class Solution {
public:

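    // Count the words with a map, which also keeps them sorted by key.
    // Then copy the map's entries into a vector and sort it with sort,
    // because sort only works on containers with random-access iterators.
    // Finally, write a functor that specifies the comparison rule.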

    struct kvCompare
    {
        bool operator()(const pair<string, int>& p1, const pair<string, int>& p2) const
        {
            return p1.second > p2.second ||
                (p1.second == p2.second && p1.first < p2.first);
        }
    };
    vector<string> topKFrequent(vector<string>& words, int k) {
        map<string, int> countMap;
        for (const auto& e : words)
        {
            countMap[e]++;
        }

        vector<pair<string, int>> v(countMap.begin(), countMap.end());
        sort(v.begin(), v.end(), kvCompare());
        //stable_sort(v.begin(), v.end(), kvCompare());
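        // stable_sort keeps equal elements in their original order, and the map already yields the
        // words in lexicographic order, so with it the comparator only needs return p1.second > p2.second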

        vector<string> ret;
        for (int i = 0; i < k; i++)
        {
            ret.push_back(v[i].first);
        }

        return ret;
    }
};
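The stable_sort variant mentioned in the comments can be written out as a standalone function. This is only a sketch under stated assumptions: the name topKStable is made up, the comparator is a lambda instead of the kvCompare functor, and k is assumed not to exceed the number of distinct words.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Sketch of the stable_sort variant: compare by count only.
// Assumes 0 <= k <= number of distinct words.
std::vector<std::string> topKStable(const std::vector<std::string>& words, int k)
{
    std::map<std::string, int> countMap;
    for (const auto& w : words)
        countMap[w]++;

    // The map yields its entries in ascending key order, so ties start out alphabetical.
    std::vector<std::pair<std::string, int>> v(countMap.begin(), countMap.end());
    std::stable_sort(v.begin(), v.end(),
        [](const std::pair<std::string, int>& a, const std::pair<std::string, int>& b)
        {
            return a.second > b.second;   // higher count first; ties keep their order
        });

    std::vector<std::string> ret;
    for (int i = 0; i < k; i++)
        ret.push_back(v[i].first);
    return ret;
}
```

With plain sort the tie-break on first in kvCompare is required, because sort gives no guarantee about the relative order of entries with equal counts.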