C++ Learning: Six Months from Basics to Employment - STL Algorithms (1): Basics and Searching Algorithms

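This is the twenty-fifth technical article in my C++ learning journey series and the third article of stage two, "Advanced C++ Features". It introduces the fundamentals of the C++ STL algorithm library and its searching algorithms. See the complete series index for more.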

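Introduction

In the previous two articles we looked closely at STL containers and iterators. Today we start on the third core STL component, the algorithm library, one of the most powerful parts of the C++ standard library. To keep the material manageable, I will split it into four parts:

Basics and searching algorithms (this article)
Sorting and reordering algorithms
Numeric and set algorithms
Practical application examples

As the first article in the STL algorithm series, this one covers the fundamentals of the algorithm library, together with the commonly used searching and non-modifying sequence algorithms.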

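Overview of the STL Algorithm Library

What is the STL algorithm library?

The STL algorithm library is part of the C++ Standard Template Library and provides a set of generic function templates for processing container elements. The algorithms reach the elements through iterators, so the same algorithm can be applied to different container types, which makes the code highly reusable.

Most of the algorithms live in the <algorithm> header; some numeric algorithms are in the <numeric> header.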
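
As a minimal sketch of this decoupling, the helper below hands the iterators of any container to the same `std::find` call. The name `contains` is hypothetical and invented for this illustration (it is not a standard library function); only `std::find`, `std::begin` and `std::end` are standard.

```cpp
#include <algorithm>
#include <iterator>
#include <list>
#include <vector>

// Hypothetical helper for illustration: true if `value` occurs in `c`.
// std::begin/std::end work for standard containers and for built-in arrays,
// so the single std::find call below serves all of them.
template <typename Container, typename T>
bool contains(const Container& c, const T& value) {
    return std::find(std::begin(c), std::end(c), value) != std::end(c);
}
```

With this in place, `contains(v, 2)` works unchanged whether `v` is a `std::vector<int>`, a `std::list<int>` or an `int[3]`, because the algorithm only ever sees a pair of iterators.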

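Features of the STL algorithm library

Generic design: the algorithms are templates, so they work with any element type that meets the algorithm's requirements
Efficient implementation: the algorithms are carefully optimized, and the standard specifies a complexity bound for each of them
Expressive: choosing the right algorithm lets you express a complex operation in very little code
Tightly integrated with iterators: algorithms operate on containers through the iterator interface, which decouples containers from algorithms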

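Algorithm categories

The algorithms in the STL algorithm library fall roughly into the following categories:

Non-modifying sequence operations: algorithms that do not change element values, such as searching and counting
Modifying sequence operations: algorithms that change element values or positions, such as copying, replacing and removing
Sorting-related algorithms: sorting and operations that rely on sorted data, such as sorting, merging and binary search
Numeric algorithms: algorithms for numeric computation, such as sums and products
Set operations: algorithms that operate on sorted ranges treated as sets, such as union and intersection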
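
To make the five categories concrete, here is a minimal sketch that uses one algorithm from each. The `CategoryDemo` struct and the `demoCategories` function are hypothetical names invented for this illustration; the algorithms themselves (`std::count_if`, `std::replace`, `std::sort`, `std::accumulate`, `std::set_intersection`) are standard.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <numeric>
#include <vector>

// Hypothetical result type, for illustration only.
struct CategoryDemo {
    std::ptrdiff_t evens;     // filled by a non-modifying algorithm
    std::vector<int> sorted;  // filled after modifying and sorting
    int sum;                  // filled by a numeric algorithm
    std::vector<int> common;  // filled by a set operation
};

CategoryDemo demoCategories(std::vector<int> a, std::vector<int> b) {
    CategoryDemo r{};
    // Non-modifying: count the even values without changing anything.
    r.evens = std::count_if(a.begin(), a.end(), [](int n) { return n % 2 == 0; });
    // Modifying: replace every 0 with -1 in place.
    std::replace(a.begin(), a.end(), 0, -1);
    // Sorting: both ranges must be sorted for the set operation below.
    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());
    r.sorted = a;
    // Numeric (from <numeric>): sum of all elements.
    r.sum = std::accumulate(a.begin(), a.end(), 0);
    // Set operation: elements present in both sorted ranges.
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                          std::back_inserter(r.common));
    return r;
}
```

Calling `demoCategories({3, 0, 2, 4}, {4, 3, 9})` counts three even values, then works on the modified and sorted range `-1 2 3 4`, whose sum is 8 and whose intersection with `3 4 9` is `3 4`.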

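Non-modifying sequence algorithms

Non-modifying sequence algorithms do not change the values of container elements. They are mainly used to search, count and compare elements.

Traversal: `std::for_each` and `std::for_each_n`

for_each is a very practical algorithm: it applies the given function to every element in a range.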

#include <iostream>
#include <vector>
#include <algorithm>

// Prints a single element
void print(int n) {
    std::cout << n << " ";
}

int main() {
    std::vector<int> nums = {1, 2, 3, 4, 5};

    // Pass a function pointer
    std::cout << "Using a function pointer: ";
    std::for_each(nums.begin(), nums.end(), print);
    std::cout << std::endl;

    // Pass a lambda expression
    std::cout << "Using a lambda expression: ";
    std::for_each(nums.begin(), nums.end(), [](int n) {
        std::cout << n * n << " "; // print the square
    });
    std::cout << std::endl;

    // for_each_n (C++17) processes only the first n elements
    std::cout << "Using for_each_n on the first 3 elements: ";
    std::for_each_n(nums.begin(), 3, print);
    std::cout << std::endl;

    return 0;
}

Output:

Using a function pointer: 1 2 3 4 5 
Using a lambda expression: 1 4 9 16 25 
Using for_each_n on the first 3 elements: 1 2 3 

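for_each is very flexible and can perform any operation, including modifying the elements (if the callable takes its parameter by non-const reference). Note, however, that if you only want to do something simple with each element, a range-based for loop (introduced in C++11) may be clearer.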
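
The two options can be compared side by side. This is a small sketch with hypothetical function names (`doubleWithForEach`, `doubleWithRangeFor`) chosen for the illustration; both double every element in place.

```cpp
#include <algorithm>
#include <vector>

// Doubles every element in place. The lambda takes int& (a non-const
// reference), which is what allows for_each to write back to the element.
void doubleWithForEach(std::vector<int>& v) {
    std::for_each(v.begin(), v.end(), [](int& n) { n *= 2; });
}

// The same operation written as a range-based for loop.
void doubleWithRangeFor(std::vector<int>& v) {
    for (int& n : v) {
        n *= 2;
    }
}
```

For a whole container the loop is arguably easier to read; for_each becomes more attractive when you already have a named function object or want to process only a sub-range given by two iterators.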

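Searching: `std::find`, `std::find_if` and `std::find_if_not`

These algorithms locate a specific element in a range. Each returns an iterator to the first match, or the end iterator if there is none.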

#include <iostream>
#include <vector>
#include <algorithm>
#include <iterator>
#include <string>
#include <utility>

struct Person {
    std::string name;
    int age;

    Person(std::string n, int a) : name(std::move(n)), age(a) {}
};

int main() {
    std::vector<int> nums = {10, 20, 30, 40, 50};

    // Find a specific value
    auto it1 = std::find(nums.begin(), nums.end(), 30);
    if (it1 != nums.end()) {
        std::cout << "Found 30 at position: " << std::distance(nums.begin(), it1) << std::endl;
    } else {
        std::cout << "30 not found" << std::endl;
    }

    // Find the first element that satisfies a predicate
    auto it2 = std::find_if(nums.begin(), nums.end(), [](int n) {
        return n > 35;
    });
    if (it2 != nums.end()) {
        std::cout << "First value greater than 35: " << *it2 << std::endl;
    }

    // Find the first element that does NOT satisfy a predicate
    auto it3 = std::find_if_not(nums.begin(), nums.end(), [](int n) {
        return n < 30;
    });
    if (it3 != nums.end()) {
        std::cout << "First value not less than 30: " << *it3 << std::endl;
    }

    // Search a vector of structs
    std::vector<Person> people = {
        Person("Alice", 30),
        Person("Bob", 25),
        Person("Charlie", 35),
        Person("David", 28)
    };

    // Find the person named Bob
    auto personIt = std::find_if(people.begin(), people.end(), [](const Person& p) {
        return p.name == "Bob";
    });

    if (personIt != people.end()) {
        std::cout << "Found Bob, age: " << personIt->age << std::endl;
    }

    return 0;
}

Output:

Found 30 at position: 2
First value greater than 35: 40
First value not less than 30: 30
Found Bob, age: 25
Searching for sequences: `std::search`, `std::find_end` and `std::find_first_of`

These algorithms search for a subsequence, or for any element out of a set of candidates:

#include <iostream>
#include <vector>
#include <algorithm>
#include <cstdlib>
#include <iterator>
#include <string>

// `it` is a const_iterator so that it has the same type as v.begin() on a
// const vector; std::distance requires both arguments to have the same type.
void printPosition(const std::string& name, const std::vector<int>& v,
                   std::vector<int>::const_iterator it) {
    if (it != v.end()) {
        std::cout << name << " found at position: " << std::distance(v.begin(), it) << std::endl;
    } else {
        std::cout << name << " not found" << std::endl;
    }
}

int main() {
    std::vector<int> haystack = {1, 2, 3, 4, 5, 1, 2, 3, 4, 5};
    std::vector<int> needle = {2, 3, 4};

    // search - first occurrence of the subsequence
    auto it1 = std::search(haystack.begin(), haystack.end(), needle.begin(), needle.end());
    printPosition("search: subsequence", haystack, it1);

    // find_end - last occurrence of the subsequence
    auto it2 = std::find_end(haystack.begin(), haystack.end(), needle.begin(), needle.end());
    printPosition("find_end: subsequence", haystack, it2);

    // find_first_of - first element that matches any element of a candidate set
    std::vector<int> anyOf = {5, 10, 15};
    auto it3 = std::find_first_of(haystack.begin(), haystack.end(), anyOf.begin(), anyOf.end());
    printPosition("find_first_of: any candidate", haystack, it3);

    // Predicate version
    auto it4 = std::search(haystack.begin(), haystack.end(), needle.begin(), needle.end(),
                         [](int a, int b) { return std::abs(a - b) <= 1; }); // values may differ by 1
    printPosition("search (predicate): similar subsequence", haystack, it4);

    // Boyer-Moore search (C++17)
    // Needs a C++17 compiler and #include <functional>
    /*
    auto it5 = std::search(haystack.begin(), haystack.end(),
                         std::boyer_moore_searcher(needle.begin(), needle.end()));
    printPosition("Boyer-Moore search: subsequence", haystack, it5);
    */

    // Searching inside a string
    std::string text = "The quick brown fox jumps over the lazy dog";
    std::string pattern = "fox";

    auto strIt = std::search(text.begin(), text.end(), pattern.begin(), pattern.end());
    if (strIt != text.end()) {
        std::cout << "Found string \"" << pattern << "\" at position: "
                  << std::distance(text.begin(), strIt) << std::endl;
    }

    return 0;
}

Output:

search: subsequence found at position: 1
find_end: subsequence found at position: 6
find_first_of: any candidate found at position: 4
search (predicate): similar subsequence found at position: 0
Found string "fox" at position: 16

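These searching algorithms are useful when working with larger data sets, especially when we need to find a specific pattern or any element out of a set of candidates.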
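
For pattern searches, C++17 also lets `std::search` take a searcher object such as `std::boyer_moore_searcher` from <functional>. The following is a minimal sketch assuming a C++17 compiler; `findWithBoyerMoore` is a hypothetical wrapper name invented for this illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <iterator>
#include <string>

// Hypothetical helper: index of the first occurrence of `pattern` in `text`,
// or std::string::npos if the search reaches the end of `text`. Requires C++17.
std::size_t findWithBoyerMoore(const std::string& text, const std::string& pattern) {
    // The searcher preprocesses the pattern once when it is constructed.
    std::boyer_moore_searcher searcher(pattern.begin(), pattern.end());
    auto it = std::search(text.begin(), text.end(), searcher);
    if (it == text.end()) {
        return std::string::npos;
    }
    return static_cast<std::size_t>(std::distance(text.begin(), it));
}
```

Because the pattern is preprocessed up front, a searcher like this tends to pay off when the pattern is long or the same searcher is reused across many texts; for a short one-off search the plain four-iterator `std::search` is usually enough.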

Counting: `std::count` and `std::count_if`

These algorithms count the elements that equal a value or satisfy a condition:

#include <iostream>
#include <vector>
#include <algorithm>
#include <cctype>
#include <string>

int main() {
    std::vector<int> nums = {1, 2, 3, 2, 4, 2, 5, 6, 2, 7};

    // Count occurrences of a specific value
    const auto count2 = std::count(nums.begin(), nums.end(), 2);
    std::cout << "Occurrences of 2: " << count2 << std::endl;

    // Count the elements that satisfy a predicate
    const auto countEven = std::count_if(nums.begin(), nums.end(), [](int n) {
        return n % 2 == 0;
    });
    std::cout << "Number of even values: " << countEven << std::endl;

    // Counting inside a string
    std::string text = "The quick brown fox jumps over the lazy dog";

    // Count the spaces
    const auto spaces = std::count(text.begin(), text.end(), ' ');
    std::cout << "Number of spaces: " << spaces << std::endl;

    // Count the vowels. The parameter is unsigned char because passing a
    // negative char value to std::tolower is undefined behavior.
    auto isVowel = [](unsigned char c) {
        const int lower = std::tolower(c);
        return lower == 'a' || lower == 'e' || lower == 'i' || lower == 'o' || lower == 'u';
    };

    const auto vowels = std::count_if(text.begin(), text.end(), isVowel);
    std::cout << "Number of vowels: " << vowels << std::endl;

    return 0;
}

Output:

Occurrences of 2: 4
Number of even values: 6
Number of spaces: 8
Number of vowels: 11

Comparing: `std::equal`, `std::mismatch` and `std::lexicographical_compare`

These algorithms compare two sequences:

#include <iostream>
#include <vector>
#include <algorithm>
#include <iterator>
#include <string>

int main() {
    std::vector<int> v1 = {1, 2, 3, 4, 5};
    std::vector<int> v2 = {1, 2, 3, 4, 5};
    std::vector<int> v3 = {1, 2, 3, 5, 4};

    // Check whether two sequences are equal
    bool equal12 = std::equal(v1.begin(), v1.end(), v2.begin());
    bool equal13 = std::equal(v1.begin(), v1.end(), v3.begin());

    std::cout << "v1 equal to v2? " << (equal12 ? "yes" : "no") << std::endl;
    std::cout << "v1 equal to v3? " << (equal13 ? "yes" : "no") << std::endl;

    // Find the first position where the sequences differ (structured bindings need C++17)
    auto [it1, it2] = std::mismatch(v1.begin(), v1.end(), v3.begin());
    if (it1 != v1.end()) {
        std::cout << "First mismatch at position: " << std::distance(v1.begin(), it1);
        std::cout << ", value in v1: " << *it1 << ", value in v3: " << *it2 << std::endl;
    }

    // Compare sequences lexicographically
    bool less13 = std::lexicographical_compare(v1.begin(), v1.end(), v3.begin(), v3.end());
    bool less31 = std::lexicographical_compare(v3.begin(), v3.end(), v1.begin(), v1.end());

    std::cout << "v1 lexicographically less than v3? " << (less13 ? "yes" : "no") << std::endl;
    std::cout << "v3 lexicographically less than v1? " << (less31 ? "yes" : "no") << std::endl;

    // String comparison
    std::string s1 = "apple";
    std::string s2 = "banana";

    bool lessStr = std::lexicographical_compare(s1.begin(), s1.end(), s2.begin(), s2.end());
    std::cout << "apple lexicographically less than banana? " << (lessStr ? "yes" : "no") << std::endl;

    // Comparing sequences of different lengths
    std::vector<int> v4 = {1, 2, 3};

    // The three-iterator overload assumes the second range is at least as long
    // as the first, so the sizes must be checked by hand
    bool equal14 = v1.size() == v4.size() && std::equal(v1.begin(), v1.end(), v4.begin());

    // Since C++14 you can also pass the end of the second range
    // bool equal14_cpp14 = std::equal(v1.begin(), v1.end(), v4.begin(), v4.end());

    std::cout << "v1 equal to v4? " << (equal14 ? "yes" : "no") << std::endl;

    return 0;
}

Output:

v1 equal to v2? yes
v1 equal to v3? no
First mismatch at position: 3, value in v1: 4, value in v3: 5
v1 lexicographically less than v3? yes
v3 lexicographically less than v1? no
apple lexicographically less than banana? yes
v1 equal to v4? no
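
When the two ranges may differ in length, the four-iterator overloads of `std::equal` and `std::mismatch` (both available since C++14) take care of the length check themselves. Here is a small sketch; the function names `sameSequence` and `firstDifference` are hypothetical and chosen only for this illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// True only if both ranges have the same length and equal elements.
bool sameSequence(const std::vector<int>& a, const std::vector<int>& b) {
    return std::equal(a.begin(), a.end(), b.begin(), b.end());
}

// Index of the first position where the ranges differ. If one range is a
// prefix of the other, this is the length of the shorter range.
std::size_t firstDifference(const std::vector<int>& a, const std::vector<int>& b) {
    auto p = std::mismatch(a.begin(), a.end(), b.begin(), b.end());
    return static_cast<std::size_t>(std::distance(a.begin(), p.first));
}
```

With `a = {1, 2, 3, 4, 5}` and `b = {1, 2, 3}`, `sameSequence(a, b)` is false and `firstDifference(a, b)` is 3, without ever reading past the end of `b`.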

Checking conditions: `std::all_of`, `std::any_of` and `std::none_of`

These algorithms check whether all, some, or none of the elements in a sequence satisfy a condition:

#include <iostream>
#include <vector>
#include <algorithm>
#include <cctype>
#include <string>

// Wrapper around std::isdigit; the parameter is unsigned char because passing
// a negative char value to std::isdigit is undefined behavior.
bool isDigitChar(unsigned char c) {
    return std::isdigit(c) != 0;
}

int main() {
    std::vector<int> v1 = {2, 4, 6, 8, 10};
    std::vector<int> v2 = {1, 3, 5, 7, 9};
    std::vector<int> v3 = {1, 2, 3, 4, 5};

    // Are all elements even?
    bool allEven1 = std::all_of(v1.begin(), v1.end(), [](int n) {
        return n % 2 == 0;
    });
    bool allEven2 = std::all_of(v2.begin(), v2.end(), [](int n) {
        return n % 2 == 0;
    });

    std::cout << "All elements of v1 are even? " << (allEven1 ? "yes" : "no") << std::endl;
    std::cout << "All elements of v2 are even? " << (allEven2 ? "yes" : "no") << std::endl;

    // Is there at least one even element?
    bool anyEven2 = std::any_of(v2.begin(), v2.end(), [](int n) {
        return n % 2 == 0;
    });
    bool anyEven3 = std::any_of(v3.begin(), v3.end(), [](int n) {
        return n % 2 == 0;
    });

    std::cout << "v2 contains an even number? " << (anyEven2 ? "yes" : "no") << std::endl;
    std::cout << "v3 contains an even number? " << (anyEven3 ? "yes" : "no") << std::endl;

    // Is there no even element at all?
    bool noneEven2 = std::none_of(v2.begin(), v2.end(), [](int n) {
        return n % 2 == 0;
    });

    std::cout << "v2 contains no even numbers? " << (noneEven2 ? "yes" : "no") << std::endl;

    // Practical use: validating user input
    std::vector<std::string> userInputs = {"123", "abc", "456", "xyz"};

    bool allDigits = std::all_of(userInputs.begin(), userInputs.end(), [](const std::string& s) {
        return std::all_of(s.begin(), s.end(), isDigitChar);
    });

    std::cout << "All inputs are numeric? " << (allDigits ? "yes" : "no") << std::endl;

    // Find the inputs that contain a digit
    auto hasDigit = [](const std::string& s) {
        return std::any_of(s.begin(), s.end(), isDigitChar);
    };

    std::cout << "Inputs containing digits: ";
    for (const auto& input : userInputs) {
        if (hasDigit(input)) {
            std::cout << input << " ";
        }
    }
    std::cout << std::endl;

    return 0;
}

Output:

All elements of v1 are even? yes
All elements of v2 are even? no
v2 contains an even number? no
v3 contains an even number? yes
v2 contains no even numbers? yes
All inputs are numeric? no
Inputs containing digits: 123 456 
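
One detail worth knowing when using these checks for validation is how they behave on an empty range: `std::all_of` and `std::none_of` return true, while `std::any_of` returns false. The sketch below makes that visible; all function names in it are hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <vector>

bool isEven(int n) { return n % 2 == 0; }

// On an empty range: all_of -> true, any_of -> false, none_of -> true.
bool allEven(const std::vector<int>& v) {
    return std::all_of(v.begin(), v.end(), isEven);
}

bool anyEven(const std::vector<int>& v) {
    return std::any_of(v.begin(), v.end(), isEven);
}

bool noneEven(const std::vector<int>& v) {
    return std::none_of(v.begin(), v.end(), isEven);
}

// Stricter variant for validation: rejects empty input explicitly.
bool nonEmptyAndAllEven(const std::vector<int>& v) {
    return !v.empty() && allEven(v);
}
```

So an empty input list "passes" an `all_of` check. If empty input should count as invalid, add an explicit emptiness test as in `nonEmptyAndAllEven`.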

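Specialized searching: `std::adjacent_find` and `std::search_n`

adjacent_find finds the first pair of adjacent elements that are equal (or that satisfy a given predicate), while search_n finds the first run of n consecutive elements equal to a given value: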

#include <iostream>
#include <vector>
#include <algorithm>
#include <cctype>
#include <cmath>
#include <iterator>
#include <string>

int main() {
    std::vector<int> v1 = {1, 3, 3, 5, 7, 9, 9};

    // Find the first pair of adjacent equal elements
    auto it1 = std::adjacent_find(v1.begin(), v1.end());
    if (it1 != v1.end()) {
        std::cout << "Found adjacent equal elements: " << *it1
                  << ", position: " << std::distance(v1.begin(), it1) << std::endl;
    }

    // Find adjacent elements with a custom condition
    auto it2 = std::adjacent_find(v1.begin(), v1.end(), [](int a, int b) {
        return b - a > 1;  // the two elements differ by more than 1
    });
    if (it2 != v1.end()) {
        std::cout << "Found adjacent elements differing by more than 1: " << *it2 << " and " << *(it2 + 1)
                  << ", position: " << std::distance(v1.begin(), it2) << std::endl;
    }

    // Find n consecutive equal elements
    std::vector<int> v2 = {1, 2, 3, 3, 3, 4, 5, 5, 6};
    auto it3 = std::search_n(v2.begin(), v2.end(), 3, 3);  // three consecutive 3s
    if (it3 != v2.end()) {
        std::cout << "Found 3 consecutive 3s, starting at position: " << std::distance(v2.begin(), it3) << std::endl;
    }

    // Predicate version of search_n
    std::vector<double> v3 = {1.1, 1.2, 3.5, 3.4, 3.6, 5.5};
    auto it4 = std::search_n(v3.begin(), v3.end(), 3, 3.0, [](double val, double target) {
        return std::abs(val - target) <= 1.0;  // three consecutive values close to 3
    });
    if (it4 != v3.end()) {
        std::cout << "Found 3 consecutive values close to 3: ";
        for (int i = 0; i < 3; ++i) {
            std::cout << *(it4 + i) << " ";
        }
        std::cout << "(starting at position " << std::distance(v3.begin(), it4) << ")" << std::endl;
    }

    // Application to a string (note the three spaces after "how")
    std::string text = "Hello, how   are you doing?";
    auto it5 = std::adjacent_find(text.begin(), text.end(), [](unsigned char a, unsigned char b) {
        return std::isspace(a) && std::isspace(b);
    });
    if (it5 != text.end()) {
        std::cout << "Found consecutive spaces at position: " << std::distance(text.begin(), it5) << std::endl;
    }

    return 0;
}

Output:

Found adjacent equal elements: 3, position: 1
Found adjacent elements differing by more than 1: 1 and 3, position: 0
Found 3 consecutive 3s, starting at position: 2
Found 3 consecutive values close to 3: 3.5 3.4 3.6 (starting at position 2)
Found consecutive spaces at position: 10
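
Since the predicate form of `std::adjacent_find` looks for the first adjacent pair that satisfies a condition, negating the result turns it into a check that no adjacent pair violates an ordering. The sketch below uses that idea; `isStrictlyIncreasing` is a hypothetical helper name invented for this illustration.

```cpp
#include <algorithm>
#include <vector>

// True if every element is strictly greater than its predecessor.
// adjacent_find returns end() when no adjacent pair satisfies the predicate,
// i.e. when there is no pair (a, b) with a >= b.
bool isStrictlyIncreasing(const std::vector<int>& v) {
    return std::adjacent_find(v.begin(), v.end(),
                              [](int a, int b) { return a >= b; }) == v.end();
}
```

Ranges with fewer than two elements have no adjacent pairs, so the function returns true for them.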

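Simple modifying sequence algorithms

Although this article focuses on searching and non-modifying sequence algorithms, we will also look at a few simple modifying sequence algorithms, because they are often used together with the searching algorithms.

Generating and filling: `std::fill`, `std::fill_n` and `std::generate`

These algorithms fill a sequence with a fixed value or with generated data: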

#include <iostream>
#include <vector>
#include <algorithm>
#include <numeric>
#include <random>
#include <string>

template<typename T>
void printVector(const std::vector<T>& vec, const std::string& name) {
    std::cout << name << ": ";
    for (const auto& item : vec) {
        std::cout << item << " ";
    }
    std::cout << std::endl;
}

int main() {
    // Fill a whole vector with fill
    std::vector<int> v1(10);
    std::fill(v1.begin(), v1.end(), 5);
    printVector(v1, "fill(5)");

    // Fill part of a vector with fill_n
    std::vector<int> v2(10, 0);  // ten zeros
    std::fill_n(v2.begin() + 2, 5, 7);  // five 7s starting at index 2
    printVector(v2, "fill_n(7)");

    // Generate a sequence with generate
    std::vector<int> v3(10);
    int value = 1;
    std::generate(v3.begin(), v3.end(), [&value]() { return value++; });
    printVector(v3, "generate (increasing)");

    // Generate part of a sequence with generate_n
    std::vector<int> v4(10, 0);
    value = 10;
    std::generate_n(v4.begin() + 3, 5, [&value]() { return value--; });
    printVector(v4, "generate_n (decreasing)");

    // Generate random numbers
    std::vector<int> v5(10);
    std::random_device rd;
    std::mt19937 gen(rd());
    std::uniform_int_distribution<> dis(1, 100);

    std::generate(v5.begin(), v5.end(), [&]() { return dis(gen); });
    printVector(v5, "random numbers");

    // Fill with an increasing sequence using iota (from <numeric>)
    std::vector<int> v6(10);
    std::iota(v6.begin(), v6.end(), 100);  // increasing sequence starting at 100
    printVector(v6, "iota (from 100)");

    return 0;
}

Output (the random numbers will differ from run to run):

fill(5): 5 5 5 5 5 5 5 5 5 5 
fill_n(7): 0 0 7 7 7 7 7 0 0 0 
generate (increasing): 1 2 3 4 5 6 7 8 9 10 
generate_n (decreasing): 0 0 0 10 9 8 7 6 0 0 
random numbers: 37 59 88 7 21 42 99 63 84 45 
iota (from 100): 100 101 102 103 104 105 106 107 108 109 
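
The `_n` variants write through an output iterator, so they can also append to an initially empty container when combined with `std::back_inserter`. A minimal sketch, with `firstSquares` as a hypothetical function name used only for this illustration:

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical helper: returns the first n square numbers (1, 4, 9, ...).
// generate_n calls the lambda n times and back_inserter appends each result,
// so the vector does not need to be sized in advance.
std::vector<int> firstSquares(int n) {
    std::vector<int> out;
    int i = 0;
    std::generate_n(std::back_inserter(out), n, [&i]() {
        ++i;
        return i * i;
    });
    return out;
}
```

`firstSquares(4)` yields `1 4 9 16`. Writing through `v.begin()` instead, as in the examples above, requires the destination to already hold enough elements.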

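Transforming elements: `std::transform`

transform applies a function to the elements of a sequence and stores the results in a destination sequence, which may be a different sequence or the source itself: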

#include <iostream>
#include <vector>
#include <algorithm>
#include <string>
#include <cctype>

template<typename T>
void printVector(const std::vector<T>& vec, const std::string& name) {
    std::cout << name << ": ";
    for (const auto& item : vec) {
        std::cout << item << " ";
    }
    std::cout << std::endl;
}

int main() {
    std::vector<int> v1 = {1, 2, 3, 4, 5};

    // Unary operation - compute squares
    std::vector<int> squares(v1.size());
    std::transform(v1.begin(), v1.end(), squares.begin(),
                  [](int x) { return x * x; });
    printVector(squares, "squares");

    // Use transform to modify the original vector in place
    std::transform(v1.begin(), v1.end(), v1.begin(),
                  [](int x) { return x * 2; });
    printVector(v1, "times 2");

    // Binary operation - add two vectors element by element
    std::vector<int> v2 = {10, 20, 30, 40, 50};
    std::vector<int> sum(v1.size());
    std::transform(v1.begin(), v1.end(), v2.begin(), sum.begin(),
                  [](int x, int y) { return x + y; });
    printVector(sum, "v1 + v2");

    // Character conversion - convert a string to upper case
    std::string text = "Hello, World!";
    std::string upper(text.size(), ' ');
    std::transform(text.begin(), text.end(), upper.begin(),
                  [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
    std::cout << "Original: " << text << std::endl;
    std::cout << "Converted: " << upper << std::endl;

    // Transforming more complex objects
    struct Person {
        std::string name;
        int age;
    };

    std::vector<Person> people = {
        {"Alice", 25},
        {"Bob", 30},
        {"Charlie", 35},
        {"David", 40}
    };

    // Extract the names
    std::vector<std::string> names(people.size());
    std::transform(people.begin(), people.end(), names.begin(),
                  [](const Person& p) { return p.name; });

    std::cout << "Names: ";
    for (const auto& name : names) {
        std::cout << name << " ";
    }
    std::cout << std::endl;

    // Compute everyone's age next year
    std::vector<int> nextYearAges(people.size());
    std::transform(people.begin(), people.end(), nextYearAges.begin(),
                  [](const Person& p) { return p.age + 1; });

    std::cout << "Ages next year: ";
    for (int age : nextYearAges) {
        std::cout << age << " ";
    }
    std::cout << std::endl;

    return 0;
}

Output:

squares: 1 4 9 16 25 
times 2: 2 4 6 8 10 
v1 + v2: 12 24 36 48 60 
Original: Hello, World!
Converted: HELLO, WORLD!
Names: Alice Bob Charlie David 
Ages next year: 26 31 36 41 
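
The examples above size the destination before calling transform. An alternative is to start with an empty destination and let `std::back_inserter` append each result. The sketch below shows that variant; `wordLengths` is a hypothetical function name invented for this illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: the length of each word, in order.
std::vector<std::size_t> wordLengths(const std::vector<std::string>& words) {
    std::vector<std::size_t> lengths;
    lengths.reserve(words.size());  // optional: avoids repeated reallocation
    // back_inserter calls push_back for every transformed element.
    std::transform(words.begin(), words.end(), std::back_inserter(lengths),
                   [](const std::string& w) { return w.size(); });
    return lengths;
}
```

This form avoids default-constructing elements that are overwritten immediately, and it also works for element types that have no default constructor.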

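A practical example

Let's work through a practical example that combines the algorithms above to solve a real problem.

Text analysis application

The following example is a simple text analysis application that tokenizes text, counts word frequencies, and searches for specific words: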

#include <iostream>
#include <vector>
#include <string>
#include <algorithm>
#include <iterator>
#include <sstream>
#include <map>
#include <cctype>
#include <iomanip>
#include <utility>

// Convert text to lower case
std::string toLowerCase(const std::string& text) {
    std::string result;
    std::transform(text.begin(), text.end(), std::back_inserter(result),
                  [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return result;
}

// Remove punctuation and digits (keep only letters and whitespace)
std::string cleanText(const std::string& text) {
    std::string result;
    std::copy_if(text.begin(), text.end(), std::back_inserter(result),
                [](unsigned char c) {
                    return std::isalpha(c) || std::isspace(c);
                });
    return result;
}

// Split text into lower-case words
std::vector<std::string> tokenize(const std::string& text) {
    std::vector<std::string> tokens;
    std::istringstream iss(text);
    std::string token;

    while (iss >> token) {
        if (!token.empty()) {
            tokens.push_back(toLowerCase(token));
        }
    }

    return tokens;
}

// Count how often each word occurs
std::map<std::string, int> countWordFrequency(const std::vector<std::string>& words) {
    std::map<std::string, int> frequency;
    for (const auto& word : words) {
        ++frequency[word];
    }
    return frequency;
}

int main() {
    // Sample text
    std::string text = "This is a sample text. This text is used to demonstrate "
                      "the power of STL algorithms for text analysis. "
                      "Algorithms are powerful tools in C++ programming.";

    std::cout << "Original text:\n" << text << std::endl << std::endl;

    // 1. Clean the text
    std::string cleanedText = cleanText(text);

    // 2. Tokenize
    std::vector<std::string> words = tokenize(cleanedText);

    std::cout << "Word count: " << words.size() << std::endl << std::endl;

    // 3. Count word frequencies
    auto wordFreq = countWordFrequency(words);

    // Sort by frequency
    std::vector<std::pair<std::string, int>> freqPairs(wordFreq.begin(), wordFreq.end());
    std::sort(freqPairs.begin(), freqPairs.end(),
             [](const auto& a, const auto& b) {
                 // Descending by frequency; ties are broken alphabetically
                 // so that the output order is deterministic
                 if (a.second != b.second) return a.second > b.second;
                 return a.first < b.first;
             });

    // Print the frequencies (top 10)
    std::cout << "Word frequencies (top 10):" << std::endl;
    int count = 0;
    for (const auto& [word, freq] : freqPairs) {
        std::cout << std::setw(15) << std::left << word << ": " << freq << std::endl;
        if (++count >= 10) break;
    }
    std::cout << std::endl;

    // 4. Search for a specific word
    std::string searchWord = "algorithms";
    auto it = std::find(words.begin(), words.end(), searchWord);
    if (it != words.end()) {
        auto position = std::distance(words.begin(), it);
        std::cout << "Word \"" << searchWord << "\" found at position " << position << std::endl;

        // Find every position where it occurs
        std::cout << "All positions: ";
        int pos = 0;
        for (auto wordIt = words.begin(); wordIt != words.end(); ++wordIt, ++pos) {
            if (*wordIt == searchWord) {
                std::cout << pos << " ";
            }
        }
        std::cout << std::endl << std::endl;
    } else {
        std::cout << "Word \"" << searchWord << "\" not found" << std::endl << std::endl;
    }

    // 5. Find the shortest and the longest word.
    // With ties, minmax_element returns the first smallest and the last largest element.
    auto [minIt, maxIt] = std::minmax_element(words.begin(), words.end(),
                                            [](const std::string& a, const std::string& b) {
                                                return a.length() < b.length();
                                            });

    std::cout << "Shortest word: " << *minIt << " (length " << minIt->length() << ")" << std::endl;
    std::cout << "Longest word: " << *maxIt << " (length " << maxIt->length() << ")" << std::endl;
    std::cout << std::endl;

    // 6. Count the words that start with a given letter
    char startLetter = 't';
    auto startsWithT = std::count_if(words.begin(), words.end(),
                                  [startLetter](const std::string& word) {
                                      return !word.empty() &&
                                             std::tolower(static_cast<unsigned char>(word[0])) == startLetter;
                                  });

    std::cout << "Words starting with '" << startLetter << "': " << startsWithT << std::endl;

    // 7. Check whether every word is shorter than 20 characters
    bool allShort = std::all_of(words.begin(), words.end(),
                              [](const std::string& word) {
                                  return word.length() < 20;
                              });

    std::cout << "All words shorter than 20 characters? " << (allShort ? "yes" : "no") << std::endl;

    return 0;
}

Output:

Original text:
This is a sample text. This text is used to demonstrate the power of STL algorithms for text analysis. Algorithms are powerful tools in C++ programming.

Word count: 26

Word frequencies (top 10):
text           : 3
algorithms     : 2
is             : 2
this           : 2
a              : 1
analysis       : 1
are            : 1
c              : 1
demonstrate    : 1
for            : 1

Word "algorithms" found at position 15
All positions: 15 19 

Shortest word: a (length 1)
Longest word: programming (length 11)

Words starting with 't': 8
All words shorter than 20 characters? yes

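This example shows how STL algorithms can be combined into a small text analysis application that cleans text, tokenizes it, counts word frequencies, searches for words, and computes a few text features. Operations like these are very common in areas such as natural language processing and information retrieval.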

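Summary

In this article we covered the fundamentals of the STL algorithm library and looked in detail at the non-modifying sequence algorithms plus a few simple modifying sequence algorithms, including:

Traversal algorithms: for_each and for_each_n
Searching algorithms: find, find_if, search and find_first_of
Counting algorithms: count and count_if
Comparison algorithms: equal, mismatch and lexicographical_compare
Condition-checking algorithms: all_of, any_of and none_of
Specialized searching algorithms: adjacent_find and search_n
Simple modifying sequence algorithms: fill, generate and transform

The practical example showed how these algorithms can be combined to solve real problems such as text processing.

These algorithms give us powerful tools for processing container data concisely and efficiently, without hand-writing loops and conditionals. That improves readability and maintainability and reduces the chance of errors.

In the next article we will continue with the sorting and reordering algorithms of the STL algorithm library, including sort, stable_sort and partial_sort, as well as reordering algorithms such as reverse, rotate and shuffle.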

If you have any questions or suggestions, feel free to leave a comment!
