[C++ STL: the string class (Part 2)] A full tour from interface usage to the memory model

🔥 Author: 艾莉丝努力练剑 (homepage)


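Two C++ reference sites

The old friend (unofficial documentation): cplusplus

The continuously updated reference: cppreference (community-maintained rather than official, but it closely tracks the standard)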

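[3 ~> Advanced string usage and techniques: common string interfaces](#3 ~> Advanced string usage and techniques: common string interfaces)

[3.7 string layout under VS and g++](#3.7 string layout under VS and g++)

[3.7.1 string layout under VS](#3.7.1 string layout under VS)

[3.7.2 string layout under g++](#3.7.2 string layout under g++)

[3.8 Opening a file](#3.8 Opening a file)

[3.9 The four find_first_/find_last_ interfaces](#3.9 The four find_first_/find_last_ interfaces)

[3.10 Operator overloading](#3.10 Operator overloading)

[3.11 compare and npos](#3.11 compare and npos)

[3.12 operator+](#3.12 operator+)

[3.13 Stream insertion and stream extraction](#3.13 Stream insertion and stream extraction)

[3.14 Iterators and pointers](#3.14 Iterators and pointers)

[3.15 Must-do practice problems for this chapter](#3.15 Must-do practice problems for this chapter)

[4 ~> string internals: know the principles to use it well](#4 ~> string internals: know the principles to use it well)

[4.1 Checking for thrown exceptions](#4.1 Checking for thrown exceptions)

[4.2 Namespaces](#4.2 Namespaces)

[4.3 Why functions like operator+ are overloaded as non-members](#4.3 Why functions like operator+ are overloaded as non-members)

[4.4 Comparison operators](#4.4 Comparison operators)

[4.5 clear( )](#4.5 clear( ))

[4.6 The three swaps of the string class](#4.6 The three swaps of the string class)

[4.7 operator+=](#4.7 operator+=)

[4.8 insert && erase && find && substr && push_back && append](#4.8 insert && erase && find && substr && push_back && append)

[4.8.1 insert](#4.8.1 insert)

[4.8.2 erase](#4.8.2 erase)

[4.8.3 find](#4.8.3 find)

[4.8.4 substr](#4.8.4 substr)

[4.8.5 push_back](#4.8.5 push_back)

[4.8.6 append](#4.8.6 append)

[4.9 resize: changing the data](#4.9 resize: changing the data)

[4.10 reserve: growing capacity](#4.10 reserve: growing capacity)

[4.11 Why replace strcpy with memcpy || memmove?](#4.11 Why replace strcpy with memcpy || memmove?)

[4.12 getline](#4.12 getline)

[4.13 Shifting data (diagram)](#4.13 Shifting data (diagram))

[4.14 string s2(s1)](#4.14 string s2(s1))

[5 ~> Implementing a String class: the pitfalls of shallow copy and the deep-copy fix](#5 ~> Implementing a String class: the pitfalls of shallow copy and the deep-copy fix)

[5.1 Motivating example](#5.1 Motivating example)

[5.2 Shallow copy](#5.2 Shallow copy)

[5.3 Deep copy](#5.3 Deep copy)

[5.3.1 The traditional way to write the string class](#5.3.1 The traditional way to write the string class)

[5.3.2 The modern way to write the string class](#5.3.2 The modern way to write the string class)

[5.4 Copy-on-write](#5.4 Copy-on-write)

[5.4.1 Further reading](#5.4.1 Further reading)

[6 ~> Complete code for this article](#6 ~> Complete code for this article)

[1 Remaining usage-level code](#1 Remaining usage-level code)

Test.cpp:

[2 Underlying implementation of the string class](#2 Underlying implementation of the string class)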

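3 ~> Advanced string usage and techniques: common string interfaces

3.7 string layout under VS and g++

Note: the layouts below were verified on a 32-bit platform, where a pointer occupies 4 bytes.

3.7.1 string layout under VS

A string object occupies 28 bytes in total. Its layout is a little involved. First comes a union that defines where the string's characters are stored:

(1) When the string is shorter than 16 characters, it is stored in a fixed internal character array;

(2) When the length is 16 or more, space is allocated on the heap.

cpp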

union _Bxty
{ // storage for small buffer or pointer to larger one
 value_type _Buf[_BUF_SIZE];
 pointer _Ptr;
 char _Alias[_BUF_SIZE]; // to permit aliasing
} _Bx;

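This design makes sense: most strings are shorter than 16 characters, so a freshly created string object already carries a fixed 16-character array and needs no heap allocation, which is efficient.

Next, there is one size_t field holding the string length and one size_t field holding the total capacity of the heap allocation.

Finally, there is one more pointer used for other bookkeeping. That gives 16+4+4+4=28 bytes in total.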
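To observe the small-buffer idea on whatever standard library you happen to use, here is a minimal sketch. The helper name `stored_inline` is made up for this illustration; it only checks whether the character data lives inside the string object itself. Object sizes and the small-buffer threshold differ between implementations, so the exact numbers you see are platform details, not guarantees.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: true when s keeps its characters inside the
// string object itself (small buffer) rather than in a heap block.
bool stored_inline(const std::string& s)
{
    // Compare addresses as integers to avoid relational comparison
    // of pointers into unrelated objects.
    const auto obj   = reinterpret_cast<std::uintptr_t>(&s);
    const auto chars = reinterpret_cast<std::uintptr_t>(s.data());
    return chars >= obj && chars < obj + sizeof(std::string);
}
```

On implementations with a small buffer, a short string such as "hi" typically reports true, while a string far longer than sizeof(std::string) can only live on the heap.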

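3.7.2 string layout under g++

Under g++ (older libstdc++, before the C++11 ABI that arrived with GCC 5), string is implemented with copy-on-write. A string object occupies only 4 bytes: it contains a single pointer, which will point to a block of heap memory holding the following fields:

(1) the total size of the space;

(2) the valid length of the string;

(3) the reference count;

cpp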

struct _Rep_base
{
 size_type               _M_length;
 size_type               _M_capacity;
 _Atomic_word            _M_refcount;
};

(4) a pointer to the heap space that stores the string's characters.

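3.8 Opening a file

cpp

string filename("Test.cpp");
FILE* fout = fopen(filename.c_str(), "r");
if (fout)
{
	cout << "File opened successfully" << endl;
}

send: a networking interface, called whenever data is sent to a peer:
string str("hello world");
send(str.c_str()); // simplified pseudocode; the point of c_str is that string works with any C-style interface

With c_str available, data is rarely needed.
get_allocator: returns the allocator the string uses internally, that is, the object through which the string obtains and releases memory (it may be backed by a memory pool).
allocator<T> is only a default template argument. You can write your own allocator and pass it in, and the string will use yours. In other words there is a slot: use the one provided, or plug in your own if the default does not suit you.

cpp

std::string::copy

copy writes len characters starting at pos into a caller-supplied char array, and we rarely use it. We usually use string::substr instead, which is very handy: it copies len characters starting at pos and returns them as a newly constructed string object.

cpp

string suffix = filename.substr(4, 4);

The second parameter has a default value; omit it and substr takes everything up to the end.

cpp

	string suffix = filename.substr(4);
	cout << suffix << endl;

So substr is far more convenient than copy, and we will not use copy.

What if we are given a file name like this:

cpp

string filename("Testabxxxx.cpp");
FILE* fout = fopen(filename.c_str(), "r");
if (fout)
{
	cout << "File opened successfully" << endl;
}

The hard-coded position 4 no longer works, so search for the dot:

cpp

//search
//size_t pos = filename.find('.');// returns npos when nothing is found
//if (pos != string::npos)
//{
//	string suffix = filename.substr(pos);
//	cout << suffix << endl;
//}

Or say we are given Test.c:

cpp

	string file("Test.c");
	size_t pos = file.find('.');
	if (pos != string::npos)
	{
		string suffix = file.substr(pos);
		cout << suffix << endl;
	}

With several extensions, as in Test.tar.zip, search backward with rfind:

cpp

    string file("Test.tar.zip");
    // we only want .zip
	size_t pos = file.rfind('.');// also returns npos when nothing is found
	if (pos != string::npos)
	{
		string suffix = file.substr(pos);
		cout << suffix << endl;
	}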
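The rfind plus substr pattern is easy to wrap in a small helper. The sketch below is one possible formulation; the function name `file_extension` is an assumption introduced for illustration and is not part of the standard library.

```cpp
#include <cassert>
#include <string>

// Returns the last extension including the dot (".zip" for "Test.tar.zip"),
// or an empty string when the name contains no dot.
std::string file_extension(const std::string& name)
{
    const std::size_t pos = name.rfind('.');
    if (pos == std::string::npos)
        return "";
    assert(pos < name.size());  // rfind only returns valid indexes
    return name.substr(pos);
}
```

Returning an empty string for "no dot" keeps the caller free of npos checks; whether that is the right convention depends on the calling code.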

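Now a more realistic and slightly more complex example, a URL (a web address), which you will meet again in Linux network programming:

cpp

string url = "https://legacy.cplusplus.com/reference/string/string/rfind/";

Protocol: https; domain: legacy.cplusplus.com; content under the domain: reference/string/string/rfind/
The format is protocol / domain (the address of the server to contact, resolved to an IP address via DNS) / content.
The difference between http and https: https is encrypted and therefore more secure.

cpp

	size_t i1 = url.find(":") ;// call the first position i1; it points at ":"

Extract the substring holding the network protocol.

cpp

	if (i1 != string::npos)
	{
		string protocol = url.substr(0, i1);// half-open range [0, i1): the end position minus the start position is the length
		// substr manages the memory itself; copy takes a char* and leaves the size calculation to you,
		// which is why copy is awkward and we normally avoid it
		cout << protocol << endl;// prints the protocol name: https
		// the domain is harder to find by searching for "/", but every URL has a fixed format: the protocol is followed by ":" and then two "/"
		// find can start searching at a given pos, so we start right after "://", at i1 + 3, and look for the next "/"
		size_t i2 = url.find('/', i1 + 3);
		if (i2 != string::npos)
		{
			// extract the domain
			string domain = url.substr(i1 + 3, i2 - (i1 + 3));
			cout << domain << endl;

			// extract the third part, usually called the "resource"
			string uri = url.substr(i2 + 1);// no need to compute the end position, substr runs to the end
			cout << uri << endl;

			// find, rfind and substr together split the URL into its three classic parts,
			// which shows how useful they are for splitting strings in practice
		}
	}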
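The same steps can be packaged into a reusable function. This is a minimal sketch under stated assumptions: the struct `UrlParts` and the function `split_url` are names invented here, the input is expected to look like protocol://domain/resource, and ports, query strings and other URL features are deliberately ignored.

```cpp
#include <cassert>
#include <string>

// Illustrative container for the three parts; not part of the original code.
struct UrlParts
{
    std::string protocol;
    std::string domain;
    std::string resource;
};

// Splits "protocol://domain/resource". Returns false when "://" is missing.
// A URL without a '/' after the domain yields an empty resource.
bool split_url(const std::string& url, UrlParts& out)
{
    const std::size_t i1 = url.find("://");
    if (i1 == std::string::npos)
        return false;

    out.protocol = url.substr(0, i1);              // half-open range [0, i1)

    const std::size_t start = i1 + 3;              // skip "://"
    assert(start <= url.size());
    const std::size_t i2 = url.find('/', start);
    if (i2 == std::string::npos)
    {
        out.domain = url.substr(start);
        out.resource.clear();
        return true;
    }

    out.domain = url.substr(start, i2 - start);    // half-open range [start, i2)
    out.resource = url.substr(i2 + 1);
    return true;
}
```

Searching for "://" rather than ":" alone makes the function reject inputs that merely contain a colon.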

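3.9 The four find_first_/find_last_ interfaces

The following four **[string operations]** interfaces are worth a glance, but you will rarely need them:

find_first_of

Returns the position of the first character in the source string that matches any character of the argument. Calling it in a loop visits every such character, so each one can, for example, be replaced with '*' while all other characters stay untouched. Because it matches any character of the set rather than the first occurrence of the whole substring, the name is arguably misleading and find_any_of would fit better. Why not rename it? Backward compatibility: the interface has shipped for years and cannot be changed lightly. It is not used much.

find_last_of

The difference from find_first_of is that it searches backward from the end; rfind_first_of would arguably be the better name.

find_first_not_of

Returns the position of the first character that is not in the argument. Take a string such as "abcdefasfgsa" and the set "abcde": looping with this function keeps the characters a to e and modifies all the others.
In other words, each character of the source string is checked against the given set: if it is in the set it is kept, otherwise it is modified.

find_last_not_of

The counterpart of find_first_not_of, searching from the end.

cpp

	std::string str("Please,replace the vowels in this sentence by asterisks.");
	// note: with find_first_not_of every character that is NOT a vowel is replaced;
	// use find_first_of to replace the vowels themselves
	std::size_t found = str.find_first_not_of("aeiou");
	while (found != std::string::npos)
	{
		str[found] = '*';
		found = str.find_first_not_of("aeiou", found + 1);
	}

	std::cout << str << '\n';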
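To make the contrast between the two families concrete, here is a small side-by-side sketch. The helper names `mask_chars_in` and `mask_chars_not_in` are assumptions made up for this example; only `find_first_of` and `find_first_not_of` come from std::string.

```cpp
#include <cassert>
#include <string>

// Replaces every character of s that appears in set with '*'.
std::string mask_chars_in(std::string s, const std::string& set)
{
    std::size_t pos = s.find_first_of(set);
    while (pos != std::string::npos)
    {
        s[pos] = '*';
        pos = s.find_first_of(set, pos + 1);
    }
    return s;
}

// Replaces every character of s that does NOT appear in set with '*'.
std::string mask_chars_not_in(std::string s, const std::string& set)
{
    std::size_t pos = s.find_first_not_of(set);
    while (pos != std::string::npos)
    {
        s[pos] = '*';
        pos = s.find_first_not_of(set, pos + 1);
    }
    return s;
}
```

With the input "hello world" and the set "aeiou", the first helper yields "h*ll* w*rld" and the second yields "*e**o**o***".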

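3.10 Operator overloading

Operator overloading is a pleasure to use:

cpp

cout << str < url << endl;

This line fails because of operator precedence: << binds tighter than <, so cout << str is evaluated first.

cpp

cout << (str < url) << endl;

Strings are compared lexicographically, character by character, using the ASCII table.

Here the comparison starts with the lowercase h and the uppercase P (h is larger: uppercase letters come before lowercase letters in the table).
What gets compared are ASCII values. A character is not stored as a glyph; underneath, it is the integer that the ASCII table maps it to, and those integers are what the comparison looks at.

The reason for overloading these operators as non-member functions rather than member functions is that different types can then be compared as well, for example a string with a const char*.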
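As a quick check of the mixed-type comparisons, the sketch below runs the same comparison in three forms. The function name `sorts_before` is hypothetical; the operators themselves are the non-member overloads that std::string provides.

```cpp
#include <cassert>
#include <string>

// Returns true when a sorts before b, after checking that all three
// comparison forms agree with each other.
bool sorts_before(const char* a, const char* b)
{
    const std::string sa(a), sb(b);
    const bool both_strings = sa < sb;   // string      < string
    const bool right_cstr   = sa < b;    // string      < const char*
    const bool left_cstr    = a < sb;    // const char* < string (only possible with a non-member)
    assert(both_strings == right_cstr && right_cstr == left_cstr);
    return both_strings;
}
```

For "Please" and "https" the result is true, because 'P' (80) is smaller than 'h' (104) in the ASCII table.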

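3.11 compare and npos

We will skip compare and npos here.

3.12 operator+

operator+ is overloaded as a non-member function.

The difference between operator+ and operator+=: "+" leaves its operands unchanged, while "+=" modifies the left operand.

A member function cannot cover every case, because its first (left) operand is always the string object itself, so "xxxxx" + str would not compile.

cpp

cout << str + "xxxxx" << endl;
cout << str + url << endl;
cout << "xxxxx" + str + url << endl;

Be aware that operator+ returns by value, which in principle costs more: the result is a copy of a local object.

With current compilers this is rarely a concern: since C++11, move semantics and copy elision make returning objects by value efficient.

cpp

string ret = str + "xxxxxx";// ret is a new object, not a reference; the temporary is moved or elided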

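3.13 Stream insertion and stream extraction

Stream insertion is operator<< (cout); stream extraction is operator>> (cin).
Note: when several strings are read, they are separated by whitespace by default, that is, by a space " " or a newline '\n'.

cpp

	cin >> url >> str;
	cout << url << endl;
	cout << str << endl;

When it meets a space or a newline, operator>> treats the current input item as finished.

Enter xxxx yyyy: xxxx goes into url and yyyy goes into str.

getline does not stop at spaces: it reads up to the newline by default, and you can even choose your own delimiter to decide where the input ends.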
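The difference is easy to try without typing input by hand, by reading from a std::istringstream. In this sketch the helper names `first_token` and `first_field` are invented for illustration; `operator>>` and `std::getline` are the standard facilities being compared.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Reads the first token with operator>> (stops at whitespace).
std::string first_token(const std::string& input)
{
    std::istringstream in(input);
    std::string word;
    in >> word;
    return word;
}

// Reads up to delim with std::getline (spaces are kept).
std::string first_field(const std::string& input, char delim = '\n')
{
    std::istringstream in(input);
    std::string field;
    std::getline(in, field, delim);
    return field;
}
```

For the input "xxxx yyyy\nzzzz", `first_token` returns "xxxx" while `first_field` returns "xxxx yyyy"; passing ';' as the delimiter makes `first_field` stop at the first semicolon instead.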

swap: this is hard to explain at this stage without the internals; we will come back to it below when we look at the implementation.

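3.14 Iterators and pointers

3.15 Must-do practice problems for this chapter

1. 917. Reverse Only Letters

2. 125. Valid Palindrome

3. 415. Add Strings

4. 43. Multiply Strings

5. 541. Reverse String II

6. 557. Reverse Words in a String III

7. LCR 192. String to Integer (atoi)

8. HJ1 Length of the last word in a string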

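4 ~> string internals: know the principles to use it well

The earlier sections gave a brief introduction to the string class, and for everyday work it is enough to be able to use it. In interviews, however, interviewers like to ask candidates to implement a string class themselves, mainly the constructor, the copy constructor, the assignment operator overload, and the destructor (~string).

Why cover the internals? Isn't knowing how to use it enough? Fair enough, but usage alone falls short: only when we understand the principles can we use the class well. The same was true when we studied data structures.

Today we continue implementing the internals of the string class with the string documentation at hand.

4.1 Checking for thrown exceptions

Here is just the main function, showing the statements that catch exceptions:

cpp

int main()
{
	//try{}catch(const exception& e){}: catches thrown exceptions
	try
	{
		// jqj: our own namespace
		jqj::test_string3();

		//cout << typeid(jqj::string::iterator).name() << endl;
		//cout << typeid(std::string::iterator).name() << endl;
	}
	catch (const exception& e)// exception: base class of the standard exceptions
	{
		cout << e.what() << endl;
	}

	return 0;
}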

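4.2 Namespaces

As mentioned before, namespaces with the same name are merged: the compiler treats everything declared in them, even across different files, as belonging to one namespace.

Longer functions have their declaration and definition separated (.h and .cpp); a default argument may be given in only one of the two, normally the declaration. Shorter functions can simply be defined inline in the class. Here the namespace lets the contents of the three files (string.h, string.cpp, Test.cpp) behave as one unit.

4.3 Why functions like operator+ are overloaded as non-members

Overloading them as non-member functions rather than member functions lets the left operand be of a different type, for example a const char* on the left of a string.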
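The effect can be shown with a tiny wrapper type. This is only an illustration: the struct `Text` and its members are hypothetical and exist solely to demonstrate how a non-member operator+ allows conversions on both operands.

```cpp
#include <cassert>
#include <string>

// Hypothetical minimal wrapper used only to illustrate the overload rules.
struct Text
{
    std::string value;
    Text(const char* s) : value(s) {}          // implicit conversion from const char*
    Text(const std::string& s) : value(s) {}
};

// Non-member: either operand may be implicitly converted to Text.
// As a member function, the left operand would have to be a Text already.
Text operator+(const Text& lhs, const Text& rhs)
{
    return Text(lhs.value + rhs.value);
}
```

With this definition both `t + "yyy"` and `"xxx" + t` compile for a `Text t`. Had operator+ been a member, only the first form would work.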

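4.4 Comparison operators

This pattern already appeared when we implemented the Date class, so we can be "lazy" and reuse it: implement < and == and build the other operators on top of them.

cpp

	// comparison operators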
	bool string::operator<(const string& s) const
	{
		return strcmp(_str, s._str) < 0;
	}

	bool string::operator<=(const string& s) const
	{
		return *this < s || *this == s;
	}

	bool string::operator>(const string& s) const
	{
		return !(*this <= s);
	}

	bool string::operator>=(const string& s) const
	{
		return !(*this < s);
	}

	bool string::operator==(const string& s) const
	{
		return strcmp(_str, s._str) == 0;
	}

	bool string::operator!=(const string& s) const
	{
		return !(*this == s);
	}
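One caveat: strcmp stops at the first '\0', so two strings that differ only after an embedded '\0' compare as equal with the operators above. A length-aware comparison avoids this. The sketch below is one possible approach using memcmp; the free function `less_than` and its parameters are assumptions for illustration, not part of the class in this article.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

// Length-aware "less than": compares the common prefix byte by byte and,
// if the prefixes are equal, lets the shorter sequence sort first.
bool less_than(const char* a, std::size_t na, const char* b, std::size_t nb)
{
    assert(a != nullptr && b != nullptr);
    const int cmp = std::memcmp(a, b, std::min(na, nb));
    if (cmp != 0)
        return cmp < 0;
    return na < nb;
}
```

Inside the class, the call would pass `_str` and `_size` of both operands, which keeps the comparison consistent with the memcpy-based copying discussed in 4.11.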

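4.5 clear( )

cpp

		void clear()// drop the valid data, keep the capacity
		{
			_str[0] = '\0';// the first character becomes the terminator
			_size = 0;
			// that is all clear() needs to do
		}

4.6 The three swaps of the string class

cpp

// string is involved in three swaps:
// the member function, the non-member overload, and the generic algorithm (C++98: general purpose, but inefficient for string)
		void swap(string& s);

cpp

	template<class T>
	void swap(T& a, T& b) // the generic swap algorithm: one copy construction plus two assignments
	{
		T c(a); 
		a = b;
		b = c;
	}

cpp

	inline void swap(string& a, string& b)
	{
		a.swap(b);
	}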
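To see why the generic version is costly for a deep-copying class, the sketch below counts deep copies. Everything here is illustrative: `Buffer`, its `deep_copies` counter, and `copy_swap` (a stand-in for the C++98-style generic swap shown above) are names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

// Hypothetical deep-copying buffer that counts how many deep copies happen.
class Buffer
{
public:
    static int deep_copies;

    explicit Buffer(const char* s)
        : _size(std::strlen(s)), _data(new char[_size + 1])
    {
        std::memcpy(_data, s, _size + 1);
    }

    Buffer(const Buffer& other)
        : _size(other._size), _data(new char[other._size + 1])
    {
        std::memcpy(_data, other._data, _size + 1);
        ++deep_copies;
    }

    Buffer& operator=(const Buffer& other)
    {
        if (this != &other)
        {
            char* tmp = new char[other._size + 1];
            std::memcpy(tmp, other._data, other._size + 1);
            delete[] _data;
            _data = tmp;
            _size = other._size;
            ++deep_copies;
        }
        return *this;
    }

    ~Buffer() { delete[] _data; }

    // Member swap: exchanges pointers and sizes, no allocation at all.
    void swap(Buffer& other)
    {
        std::swap(_data, other._data);
        std::swap(_size, other._size);
    }

    const char* c_str() const { return _data; }

private:
    std::size_t _size;   // declared first so it is initialized before _data
    char* _data;
};

int Buffer::deep_copies = 0;

// C++98-style generic swap: one copy construction plus two copy assignments.
template <class T>
void copy_swap(T& a, T& b)
{
    T c(a);
    a = b;
    b = c;
}
```

Swapping two `Buffer` objects with `copy_swap` performs three deep copies, while the member `swap` performs none, which is the reason the non-member overload for string forwards to the member function.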

4.7 operator+=

4.8 insert && erase && find && substr && push_back && append

4.8.1 insert

4.8.2 erase

4.8.3 find

4.8.4 substr

4.8.5 push_back

4.8.6 append

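4.9 resize: changing the data

4.10 reserve: growing capacity

reserve is normally used to grow capacity. As noted earlier, on VS2022 reserve does not shrink the capacity.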

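4.11 Why replace strcpy with memcpy || memmove?

strcpy breaks down when the data contains '\0' characters in the middle. For compatibility with C strings it stops at the first '\0', so nothing after that point is copied, and what follows in the destination may be uninitialized bytes that show up as odd Chinese characters such as "烫烫烫", "屯屯屯" or "蔼蔼碍".

So we use memcpy or memmove, which copy a given number of bytes and therefore preserve every '\0'.

When there are several '\0' characters, only the last one is the terminator; the earlier ones are ordinary characters.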
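The difference can be checked directly. In the sketch below, the struct `CopyResult` and the function `compare_copies` are hypothetical names; the buffers are pre-filled with '#' as a visible stand-in for uninitialized memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct CopyResult
{
    bool strcpy_ok;   // did strcpy reproduce all size + 1 bytes?
    bool memcpy_ok;   // did memcpy reproduce all size + 1 bytes?
};

// src holds size payload bytes (possibly containing '\0') plus a final '\0'.
CopyResult compare_copies(const char* src, std::size_t size)
{
    assert(size < 64);
    char by_strcpy[64];
    char by_memcpy[64];
    std::memset(by_strcpy, '#', sizeof(by_strcpy));  // stand-in for uninitialized memory
    std::memset(by_memcpy, '#', sizeof(by_memcpy));

    std::strcpy(by_strcpy, src);             // stops at the first '\0'
    std::memcpy(by_memcpy, src, size + 1);   // copies exactly size + 1 bytes

    CopyResult r;
    r.strcpy_ok = std::memcmp(by_strcpy, src, size + 1) == 0;
    r.memcpy_ok = std::memcmp(by_memcpy, src, size + 1) == 0;
    return r;
}
```

For a payload such as "hello" followed by '\0' and "yyy", only the memcpy copy matches the source; for a plain "hello" both copies match.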

Accordingly, every other place that used strcpy has to be changed as well.

4.12 getline

4.13 Shifting data (diagram)

4.14 string s2(s1)

cpp

	//string s2(s1);
	string::string(const string& s)
	{
		_str = new char[s._capacity + 1];
		/*strcpy(_str, s._str);*/
		// strcpy stops at the first '\0', so data after an embedded '\0' would be lost
		memcpy(_str, s._str, s._size + 1);
		// memcpy (or memmove) copies every byte, embedded '\0' included
		// with several '\0', only the last one is the terminator; the earlier ones are data
		_size = s._size;
		_capacity = s._capacity;
	}

	//s1 = s3
	string& string::operator=(const string& s)
	{
		if (this != &s)
		{
			char* tmp = new char[s._capacity + 1];
			//strcpy(tmp, s._str);// changed here as well, for consistency
			memcpy(tmp, s._str, s._size + 1);

			delete[] _str;
			_str = tmp;
			_size = s._size;
			_capacity = s._capacity;
		}

		return *this;
	}

5 ~> Implementing a String class: the pitfalls of shallow copy and the deep-copy fix

5.1 Motivating example

cpp

#define  _CRT_SECURE_NO_WARNINGS  1
#include<iostream>
#include<string.h>
#include<assert.h>
// named String to distinguish it from the standard library's string
class String
{
public:
	/*String()
	:_str(new char[1])
	{*_str = '\0';}
	*/
	//String(const char* str = "\0") wrong
	//String(const char* str = nullptr) wrong
	String(const char* str = "")
	{
		// passing a nullptr when constructing a String can be treated as a program error
		if (nullptr == str)
		{
			assert(false);
			return;
		}
		_str = new char[strlen(str) + 1];
		strcpy(_str, str);
	}
	~String()
	{
		if (_str)
		{
			delete[] _str;
			_str = nullptr;
		}
	}
private:
	char* _str;
};
// test
void TestString()
{
	String s1("hello world!!!");
	String s2(s1);
}

Note: the String class above does not explicitly define a copy constructor or an assignment operator overload, so the compiler generates default ones. When s2 is constructed from s1, the compiler-generated copy constructor is called. The result is that s1 and s2 share the same block of memory, and on destruction that block is released more than once, which crashes the program. This way of copying is called shallow copy.

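5.2 Shallow copy

Shallow copy, also called bitwise copy: the compiler simply copies the values held in the object. If the object manages a resource, several objects end up sharing the same resource. When one of them is destroyed it releases the resource, while the others do not know it has been released and still consider it valid, so when they go on to use it an access violation occurs.

As an analogy, picture a family with two children where the parents bought only one toy. If the children are happy to play together, all is well; if they refuse to share, they fight over it and the toy gets broken. Succession struggles have always been like this.
Deep copy solves the shallow-copy problem: each object gets its own independent copy of the resource and shares it with no one. The parents buy each child a toy, everyone plays with their own, and the trouble disappears.

5.3 Deep copy

If a class manages a resource, its copy constructor, assignment operator overload, and destructor must be provided explicitly. They are normally implemented with deep-copy semantics.

5.3.1 The traditional way to write the string class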

5.3.2 The modern way to write the string class
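A common way to write the "modern" version is the copy-and-swap idiom. The sketch below is one such formulation, offered on the assumption that this is what the heading refers to; the class name String follows 5.1, while the `_size` member and the `swap`, `c_str` and `size` helpers are additions made for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

class String
{
public:
    String(const char* str = "")
        : _size(std::strlen(str)), _str(new char[_size + 1])
    {
        std::memcpy(_str, str, _size + 1);
    }

    // Deep copy: allocate a separate buffer and copy the bytes.
    String(const String& s)
        : _size(s._size), _str(new char[s._size + 1])
    {
        std::memcpy(_str, s._str, _size + 1);
    }

    // Copy-and-swap: the by-value parameter is already a deep copy.
    // Swapping hands its buffer to *this, and the old buffer is released
    // when tmp is destroyed at the end of the function.
    String& operator=(String tmp)
    {
        swap(tmp);
        return *this;
    }

    ~String() { delete[] _str; }

    void swap(String& s)
    {
        std::swap(_str, s._str);
        std::swap(_size, s._size);
    }

    const char* c_str() const { return _str; }
    std::size_t size() const { return _size; }

private:
    std::size_t _size;   // declared first so it is initialized before _str
    char* _str;
};
```

Compared with a hand-written assignment, this version needs no self-assignment check and leaves the target untouched if the allocation in the copy constructor throws.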

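5.4 Copy-on-write

Copy-on-write is a form of procrastination: it builds on shallow copy by adding a reference count.
**Reference count:** records how many users a resource has. At construction the count is set to 1, and each additional object that uses the resource increments it by 1. When an object is destroyed, it first decrements the count and then checks whether the resource has to be released: if the count has dropped to 0, this object was the last user of the resource and releases it; otherwise the resource must not be released, because other objects are still using it.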
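The idea can be sketched in a few lines by letting std::shared_ptr supply the reference count. This is a single-threaded illustration under stated assumptions: the class `CowString` and its members are hypothetical, and this is not how any particular standard library implements its string.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical copy-on-write wrapper (single-threaded sketch).
// std::shared_ptr supplies the reference count described above.
class CowString
{
public:
    explicit CowString(const std::string& s)
        : _data(std::make_shared<std::string>(s)) {}

    // Copying is shallow: the compiler-generated copy shares _data
    // and increments the reference count.

    char get(std::size_t i) const
    {
        assert(i < _data->size());
        return (*_data)[i];
    }

    // Writing detaches first if anyone else still shares the buffer.
    void set(std::size_t i, char ch)
    {
        assert(i < _data->size());
        if (_data.use_count() > 1)
            _data = std::make_shared<std::string>(*_data);  // the delayed deep copy
        (*_data)[i] = ch;
    }

    long users() const { return _data.use_count(); }
    bool shares_with(const CowString& other) const { return _data == other._data; }

private:
    std::shared_ptr<std::string> _data;
};
```

After `CowString b(a);` both objects share one buffer with a count of 2; the first `b.set(...)` makes the deep copy, and from then on each object owns its buffer alone.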

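Let's turn to articles by the late Chen Hao (陈皓), which cover the topic in great detail:

The copy-on-write technique of C++ STL string

The "copy on read as well" technique of C++ std::string!

Chen Hao's blog homepage: 陈皓 (左耳朵耗子)

5.4.1 Further reading

(CoolShell) A correct way to write a string class in a C++ interview

What is wrong with the STL string class?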

6 ~> Complete code for this article

1 Remaining usage-level code

Test.cpp:

cpp

#define  _CRT_SECURE_NO_WARNINGS  1
#include<iostream>
#include<string>
#include<algorithm>
#include<list>
#include<cstdio>
using namespace std;

int main()
{
	// c_str() is what makes string work with C-style interfaces such as fopen
	string filename("Test.cpp");
	FILE* fout = fopen(filename.c_str(), "r");
	if (fout)
	{
		cout << "File opened successfully" << endl;
		fclose(fout);
	}

	// substr(pos, len) copies len characters starting at pos into a new string;
	// without len it runs to the end, which makes it handier than string::copy
	cout << filename.substr(4) << endl;

	// several extensions (Test.tar.zip): search backward with rfind to get only .zip
	string file("Test.tar.zip");
	size_t pos = file.rfind('.');// returns npos when nothing is found
	if (pos != string::npos)
	{
		string suffix = file.substr(pos);
		cout << suffix << endl;
	}

	// a more realistic example: split a URL into protocol, domain and resource
	string url = "https://legacy.cplusplus.com/reference/string/string/rfind/";
	size_t i1 = url.find(":");// i1 is the position of ":"
	if (i1 != string::npos)
	{
		string protocol = url.substr(0, i1);// half-open range [0, i1)
		cout << protocol << endl;// https

		// the format is fixed as "://", so look for the next '/' starting at i1 + 3
		size_t i2 = url.find('/', i1 + 3);
		if (i2 != string::npos)
		{
			string domain = url.substr(i1 + 3, i2 - (i1 + 3));
			cout << domain << endl;

			string uri = url.substr(i2 + 1);// no need to compute the end position
			cout << uri << endl;
		}
	}

	// find_first_not_of: every character that is NOT in "aeiou" becomes '*'
	std::string str("Please,replace the vowels in this sentence by asterisks.");
	std::size_t found = str.find_first_not_of("aeiou");
	while (found != std::string::npos)
	{
		str[found] = '*';
		found = str.find_first_not_of("aeiou", found + 1);
	}
	std::cout << str << '\n';

	// comparison operators: the parentheses are required because << binds tighter than <
	cout << (str < url) << endl;// lexicographic comparison by ASCII value

	// operator+ is a non-member, so a const char* may appear on the left
	cout << str + "xxxxx" << endl;
	cout << str + url << endl;
	cout << "xxxxx" + str + url << endl;

	// returned by value; since C++11 move semantics and copy elision keep this cheap
	string ret = str + "xxxxxx";

	// operator>> splits input on whitespace: "xxxx yyyy" puts xxxx in url and yyyy in str
	cin >> url >> str;
	cout << url << endl;
	cout << str << endl;

	return 0;
}

2 Underlying implementation of the string class

string.h:

cpp

#pragma once
#include<iostream>
#include<string.h>
#include<assert.h>
#include<algorithm>

// namespace that keeps our string apart from std::string
namespace jqj
{
	class string
	{
	public:
		//string()
		//	:_str(new char[1] {'\0'})
		//	, _size(0)
		//	, _capacity(0)
		//{}

		typedef char* iterator;
		typedef const char* const_iterator;

		iterator begin()
		{
			return _str;
		}

		iterator end()
		{
			return _str + _size;
		}

		const_iterator begin() const
		{
			return _str;
		}

		const_iterator end() const
		{
			return _str + _size;
		}

		char& operator[](size_t pos) 
		{
			assert(pos < _size);
			return _str[pos];
		}

		const char& operator[](size_t pos) const
		{
			assert(pos < _size);
			return _str[pos];
		}

		size_t size() const
		{
			return _size;
		}

		const char* c_str() const
		{
			return _str;
		}

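		// string is involved in three swaps: the member function, the non-member overload, and the generic algorithm (C++98: general purpose, but inefficient for string)
		void swap(string& s);

		string(const char* str = "");
		~string();
		string(const string& s);
		string& operator=(const string& s);

		void resize(size_t n, char ch = '\0');// change the size, filling new positions with ch
		void reserve(size_t n);
		void push_back(char ch);
		void append(const char* str);

		void clear()// drop the valid data, keep the capacity
		{
			_str[0] = '\0';// the first character becomes the terminator
			_size = 0;
		}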

		string& operator+=(const char* str)
		{
			append(str);
			return *this;
		}

		string& operator+=(char ch)
		{
			push_back(ch);
			return *this;
		}

		void insert(size_t pos, char ch);
		void insert(size_t pos, const char* ch);
		void erase(size_t pos = 0, size_t len = npos);
		size_t find(char ch, size_t pos = 0);
		size_t find(const char* str, size_t pos = 0);

		string substr(size_t pos = 0, size_t len = npos);
		// with substr there is no need to compute the end position: omit len to go to the end

		// the same pattern appeared in the Date class earlier, so we can be "lazy" and reuse it
		bool operator<(const string& s) const;
		bool operator<=(const string& s) const;
		bool operator>(const string& s) const;
		bool operator>=(const string& s) const;
		bool operator==(const string& s) const;
		bool operator!=(const string& s) const;

	private:
		char* _str;
		size_t _size;
		size_t _capacity;

	public:
		// a const static member of integral type may be initialized inside the class; this is a special case for integral types
		const static size_t npos = -1;

		// why "integral"? because floating-point types are not allowed:
		//const static double x = 1.1;
	};

	// stream insertion (ostream) and stream extraction (istream)
	std::ostream& operator<<(std::ostream& out, const string& s);
	std::istream& operator>>(std::istream& in, string& s);
	std::istream& getline(std::istream& in, string& s,char delim = '\n');

	// one of the three swaps: this template mimics the generic algorithm from the standard library
	template<class T>
	void swap(T& a, T& b)
	{
		T c(a); 
		a = b;
		b = c;
	}

	inline void swap(string& a, string& b)
	{
		a.swap(b);
	}
}

string.cpp:

cpp

#define  _CRT_SECURE_NO_WARNINGS  1
#include"string.h"

namespace jqj
{
	void string::swap(string& s)
	{
		std::swap(_str, s._str);
		std::swap(_size, s._size);
		std::swap(_capacity, s._capacity);
	}

	string::string(const char* str)
		:_size(strlen(str))
	{
		// _size was set in the initializer list; reuse it for the allocation
		_str = new char[_size + 1];
		_capacity = _size;
		strcpy(_str, str);
	}

	string::~string()
	{
		delete[] _str;
		_str = nullptr;
		_size = 0;
		_capacity = 0;
	}

	//string s2(s1);
	string::string(const string& s)
	{
		_str = new char[s._capacity + 1];
		/*strcpy(_str, s._str);*/
		// strcpy stops at the first '\0', so data after an embedded '\0' would be lost
		memcpy(_str, s._str, s._size + 1);// memcpy (or memmove) copies every byte, embedded '\0' included
		// with several '\0', only the last one is the terminator; the earlier ones are data
		_size = s._size;
		_capacity = s._capacity;
	}

	//s1 = s3
	string& string::operator=(const string& s)
	{
		if (this != &s)
		{
			char* tmp = new char[s._capacity + 1];
			//strcpy(tmp, s._str);// changed here as well, for consistency
			memcpy(tmp, s._str, s._size + 1);

			delete[] _str;
			_str = tmp;
			_size = s._size;
			_capacity = s._capacity;
		}

		return *this;
	}

	void string::reserve(size_t n)
	{
		if (n > _capacity)
		{
			std::cout << "reserve:" << n << std::endl;

			// grow
			char* tmp = new char[n + 1];
			//strcpy(tmp, _str);// changed here as well
			memcpy(tmp, _str, _size + 1);
			delete[] _str;
			_str = tmp;
			_capacity = n;
		}
	}

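	//resize: change the size
	void string::resize(size_t n, char ch)
	{
		if (n <= _size)
		{
			// shrink: keep the first n characters
			_size = n;
			_str[_size] = '\0';
		}
		else
		{
			reserve(n);
			// fill only the new positions so the existing characters are preserved
			for (size_t i = _size; i < n; i++)
			{
				_str[i] = ch;
			}
			_size = n;
			_str[_size] = '\0';
		}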
	}

	void string::push_back(char ch)
	{
		if (_size == _capacity)
		{
			reserve(_capacity == 0 ? 4 : _capacity * 2);
		}
		_str[_size] = ch;
		_size++;
		_str[_size] = '\0';
	}

	void string::append(const char* str)
	{
		size_t len = strlen(str);
		if (_size + len > _capacity)
		{
			reserve(std::max(_size + len, _capacity * 2));
		}

		//strcpy(_str + _size, str);// replaced by memcpy
		memcpy(_str + _size, str, len + 1);
		_size += len;
	}

	void string::insert(size_t pos, char ch)
	{
		assert(pos <= _size);

		if (_size == _capacity)
		{
			reserve(_capacity == 0 ? 4 : _capacity * 2);
		}

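		// shift the data
		//int end = _size;
		//while (end >= (int)pos)// the cast is needed, otherwise pos == 0 never terminates
		//{
		//	_str[end + 1] = _str[end];
		//	--end;
		//}

		//_str[pos] = ch;
		//_size++;
		// alternative without the cast: end starts one past the old terminator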
		size_t end = _size + 1;
		while (end > pos)
		{
			_str[end] = _str[end - 1];
			--end;
		}

		_str[pos] = ch;
		_size++;
	}

	void string::insert(size_t pos, const char* str)
	{
		assert(pos <= _size);

		// version 1:
		//size_t len = strlen(str);
		//if (len == 0)
		//	return;

		//// make sure there is enough capacity
		//while (_size + len >= _capacity)
		//{
		//	reserve(_capacity == 0 ? 4 : _capacity * 2);
		//}

		//// shift the existing characters to make room for the insertion
		//int end = _size;
		//while (end >= (int)pos)// cast needed
		//{
		//	_str[end + len] = _str[end];
		//	--end;
		//}

		//// insert the new string
		//for (size_t i = 0; i < len; ++i)
		//{
		//	_str[pos + i] = str[i];
		//}
		//_size += len;

		// version 2:
		size_t len = strlen(str);
		if (len == 0)
			return;// nothing to insert; also keeps pos + len - 1 below from wrapping around
		if (_size + len >= _capacity)
		{
			reserve(std::max(_size + len, _capacity * 2));
		}

		//// shift the data
		//int end = _size;
		//while (end >= (int)pos)// cast needed
		//{
		//	_str[end + len] = _str[end];
		//	--end;
		//}
		size_t end = _size + len;
		while (end > pos + len - 1)
		{
			_str[end] = _str[end - len];
			--end;
		}

		/*strncpy(_str + pos, str, len);*/// replaced by memcpy
		memcpy(_str + pos, str, len);

		_size += len;
	}

	void string::erase(size_t pos, size_t len)
	{
		//// version 1:
		//// if len is npos or exceeds the remaining length, delete through to the end
		//if (len == npos || pos + len >= _size)
		//{
		//	_size = pos;
		//}
		//else
		//{
		//	// move the tail forward over the deleted part
		//	size_t tmp = pos + len;
		//	while (tmp <= _size)
		//	{
		//		_str[pos + (tmp - (pos + len))] = _str[tmp];
		//		++tmp;
		//	}
		//	_size -= len;
		//}
		//_str[_size] = '\0';

		// version 2:
		assert(pos < _size);

		if (len == npos || len >= _size - pos)
		{
			// delete everything from pos to the end
			_size = pos;
			_str[_size] = '\0';
		}
		else
		{
			// delete the middle part; source and destination overlap, so memmove is required
			/*strcpy(_str + pos, _str + pos + len);*/
			memmove(_str + pos, _str + pos + len, _size - (pos + len) + 1);

			_size -= len;
		}
	}

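	//const size_t string::npos = -1;// not needed: npos is initialized inside the class. (With separate declaration and definition, a default argument may be given in only one of them.)

	// find implementation
	size_t string::find(char ch, size_t pos)
	{
		assert(pos < _size);
		for (size_t i = pos; i < _size; i++)// start at pos, not at 0
		{
			if (_str[i] == ch)
				return i;
		}

		return npos;
	}

	size_t string::find(const char* str, size_t pos)
	{
		assert(pos < _size);

		const char* ptr = strstr(_str + pos, str);
		if (ptr)
		{
			return ptr - _str;// offset from the start of our own buffer
		}
		else
		{
			return npos;
		}
	}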

	// substr implementation
	string string::substr(size_t pos, size_t len)
	{
		assert(pos < _size);

		if (len == npos || len > _size - pos)
		{
			len = _size - pos;
		}

		string sub;
		sub.reserve(len);
		for (size_t i = 0; i < len; i++)
		{
			sub += _str[pos + i];
		}
		return sub;
	}

	// comparison operators
	bool string::operator<(const string& s) const
	{
		return strcmp(_str, s._str) < 0;
	}

	bool string::operator<=(const string& s) const
	{
		return *this < s || *this == s;
	}

	bool string::operator>(const string& s) const
	{
		return !(*this <= s);
	}

	bool string::operator>=(const string& s) const
	{
		return !(*this < s);
	}

	bool string::operator==(const string& s) const
	{
		return strcmp(_str, s._str) == 0;
	}

	bool string::operator!=(const string& s) const
	{
		return !(*this == s);
	}

	// stream insertion (ostream) and stream extraction (istream)
	std::ostream& operator<<(std::ostream& out, const string& s)
	{
		for (auto ch : s)
		{
			out << ch;
		}

		return out;
	}

	std::istream& operator>>(std::istream& in, string& s)
	{
		s.clear();

		char buff[256];
		int i = 0;

		char ch;
		/*in >> ch;*/// operator>> would skip whitespace, so read raw characters with get()
		ch = in.get();
		while (in && ch != ' ' && ch != '\n')// stop at a space, a newline, or end of input
		{
			buff[i++] = ch;
			if (i == 255)// buffer full: flush it into s
			{
				buff[i] = '\0';
				s += buff;
				i = 0;
			}

			ch = in.get();
		}

		if (i > 0)
		{
			buff[i] = '\0';
			s += buff;
		}

		return in;
	}

	std::istream& getline(std::istream& in, string& s, char delim)
	{
		s.clear();

		char buff[256];
		int i = 0;

		char ch;
		/*in >> ch;*/
		ch = in.get();
		while (in && ch != delim)// stop at the delimiter or at end of input
		{
			buff[i++] = ch;

			if (i == 255)
			{
				buff[i] = '\0';
				s += buff;
				i = 0;
			}

			ch = in.get();
		}
		if (i > 0)
		{
			buff[i] = '\0';
			s += buff;
		}

		return in;
	}
}

Test.cpp:

cpp

#define  _CRT_SECURE_NO_WARNINGS  1
// tests for the string implementation
#include"string.h"
#include<iostream>
using namespace std;

namespace jqj
{
	void test_string1()
	{
		jqj::string s1;
		cout << s1.c_str() << endl;// c_str() returns the underlying char* _str

		string s2("hello world");
		cout << s2.c_str() << endl;
		s2[0] = 'x';

		for (size_t i = 0; i < s2.size(); i++)
		{
			s2[i]++;
		}
		cout << s2.c_str() << endl;

		string s3 = "hello world";// implicit conversion: construct + copy-construct, optimized into a single construction
		// direct construction
		string s4("hello world");
		const string s5("hello world");

		for (size_t i = 0; i < s5.size(); i++)
		{
			cout << s5[i] << "-";
		}
		cout << endl;

		// range-based for
		for (auto ch : s4)
		{
			cout << ch << " ";
		}
		cout << endl;

		string::iterator it4 = s4.begin();
		while (it4 != s4.end())
		{
			*it4 += 1;
			cout << *it4 << " ";
			++it4;
		}
		cout << endl;

		for (auto ch : s5)
		{
			cout << ch << " ";
		}
		cout << endl;

		string::const_iterator it5 = s5.begin();
		while (it5 != s5.end())
		{
			//*it5 += 1;//
			cout << *it5 << " ";
			++it5;
		}
		cout << endl;
	}

	void test_string2()
	{
		jqj::string s1;
		cout << s1.c_str() << endl;
		s1.push_back('x');
		s1.push_back('x');
		s1.push_back('x');
		cout << s1.c_str() << endl;

		string s2("hello world");
		cout << s2.c_str() << endl;
		s2.push_back('x');
		s2.push_back('y');
		s2.push_back('z');
		cout << s2.c_str() << endl;

		string s3("hello");
		s3.append("xxxxxxxxxxxxxxxxxxxxx");
		cout << s3.c_str() << endl;

		string s4("hello");
		s4.append("xx");
		s4.append("xx");
		cout << s4.c_str() << endl;

		s4 += '+';
		s4 += "hello jqj";
		cout << s4.c_str() << endl;
	}

	void test_string3()
	{
		string s1("hello world");
		cout << s1.c_str() << endl;
		s1.insert(5, 'x');
		cout << s1.c_str() << endl;

		s1.insert(0, 'x');
		cout << s1.c_str() << endl;

		string s2("hello world");
		cout << s2.c_str() << endl;
		s2.insert(5, "xxx");
		cout << s2.c_str() << endl;

		s2.insert(0, "yyy");
		cout << s2.c_str() << endl;
	}

	void test_string4()
	{
		string s1("hello world");
		cout << s1.c_str() << endl;
		s1.erase(4, 3);
		cout << s1.c_str() << endl;

		string s2("hello world");
		cout << s2.c_str() << endl;
		s2.erase(4);
		cout << s2.c_str() << endl;

		string s3("hello world");
		cout << s3.c_str() << endl;
		s3.erase(4, 100);
		cout << s3.c_str() << endl;
	}

	void test_string5()
	{
		string s1("hello world");
		string s2(s1);
		cout << s1.c_str() << endl;
		cout << s2.c_str() << endl;

		s1[0] = 'x';
		cout << s1.c_str() << endl;
		cout << s2.c_str() << endl;

		string s3("hello worldxxxxxx");
		s1 = s3;
		cout << s1.c_str() << endl;
		cout << s3.c_str() << endl;

		s3 = s3;
		cout << s3.c_str() << endl;
		cout << s3.c_str() << endl;
	}

	void test_string6()
	{
		string s1("hello world");
		s1 += 'x';
		s1 += '\0';
		s1 += "yyy";
		cout << s1 << endl;
		cout << s1.c_str() << endl;

		string s2(s1);
		cout << s1 << endl;
		cout << s2 << endl;
	}

	void test_string7()
	{
		string s1;
		s1.resize(100, '*');
		cout << s1 << endl;

		s1.resize(10);
		cout << s1 << endl;

		s1.resize(20, '#');
		cout << s1 << endl;

		string url = "https://legacy.cplusplus.com/reference/string/string/rfind/";
		size_t i1 = url.find(":");
		if (i1 != string::npos)
		{
			string protocol = url.substr(0, i1);
			cout << protocol << endl;

			size_t i2 = url.find('/', i1 + 3);
			if (i2 != string::npos)
			{
				string domain = url.substr(i1 + 3, i2 - (i1 + 3));
				cout << domain << endl;
				string uri = url.substr(i2 + 1);
				cout << uri << endl;
			}
		}
	}

	void test_string8()
	{
		//jqj::string s1, s2("xxxxxxxxxx");
		//cin >> s1 >> s2;
		//cout << s1 << endl;
		//cout << s2 << endl;

		//getline(cin, s1);
		//cout << s1 << endl;

		jqj::string s3("hello world"), s4("xxxxxxxxx");
		s3.swap(s4);
	}
}

int main()
{
	//try{}catch(const exception& e){}: catches thrown exceptions
	try
	{
		// jqj: our own namespace
		jqj::test_string8();

		//cout << typeid(jqj::string::iterator).name() << endl;
		//cout << typeid(std::string::iterator).name() << endl;
	}
	catch (const exception& e)// exception: base class of the standard exceptions
	{
		cout << e.what() << endl;
	}

	return 0;
}

Closing

Previous post:

[C++: STL] A deep dive into the string class (Part 1): learning the string class by starting from the documentation
