cpp-httplib: Dissecting the Path Parameter Matcher Class

In Java, Spring Boot lets you write code like this:

java
@GetMapping("/user/{userId}/")
public User getuser(@PathVariable int userId) {
	return userMapper.selectById(userId);
}

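This gives the handler direct access to the path parameter. The open-source cpp-httplib library offers the same feature through what it calls a matcher. The library implements two matchers; this article focuses on PathParamsMatcher, whose source is shown below:

cpp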
class MatcherBase 
{
public:
  MatcherBase(std::string pattern) : pattern_(std::move(pattern)) {}
  virtual ~MatcherBase() = default;

  const std::string &pattern() const { return pattern_; }

  // Match request path and populate its matches and
  virtual bool match(Request &request) const = 0;

private:
  std::string pattern_;
};

/**
 * Captures parameters in request path and stores them in Request::path_params
 *
 * Capture name is a substring of a pattern from : to /.
 * The rest of the pattern is matched against the request path directly
 * Parameters are captured starting from the next character after
 * the end of the last matched static pattern fragment until the next /.
 *
 * Example pattern:
 * "/path/fragments/:capture/more/fragments/:second_capture"
 * Static fragments:
 * "/path/fragments/", "more/fragments/"
 *
 * Given the following request path:
 * "/path/fragments/:1/more/fragments/:2"
 * the resulting capture will be
 * {{"capture", "1"}, {"second_capture", "2"}}
 */
class PathParamsMatcher final : public MatcherBase {
public:
  PathParamsMatcher(const std::string &pattern);

  bool match(Request &request) const override;

private:
  // Treat segment separators as the end of path parameter capture
  // Does not need to handle query parameters as they are parsed before path
  // matching
  static constexpr char separator = '/';

  // Contains static path fragments to match against, excluding the '/' after
  // path params
  // Fragments are separated by path params
  std::vector<std::string> static_fragments_;
  // Stores the names of the path parameters to be used as keys in the
  // Request::path_params map
  std::vector<std::string> param_names_;
};


inline PathParamsMatcher::PathParamsMatcher(const std::string &pattern)
    : MatcherBase(pattern) {
  constexpr const char marker[] = "/:";

  // One past the last ending position of a path param substring
  std::size_t last_param_end = 0;

#ifndef CPPHTTPLIB_NO_EXCEPTIONS
  // Needed to ensure that parameter names are unique during matcher
  // construction
  // If exceptions are disabled, only last duplicate path
  // parameter will be set
  std::unordered_set<std::string> param_name_set;
#endif

  while (true) {
    const auto marker_pos = pattern.find(
        marker, last_param_end == 0 ? last_param_end : last_param_end - 1);
    if (marker_pos == std::string::npos) { break; }

    static_fragments_.push_back(
        pattern.substr(last_param_end, marker_pos - last_param_end + 1));

    const auto param_name_start = marker_pos + str_len(marker);

    auto sep_pos = pattern.find(separator, param_name_start);
    if (sep_pos == std::string::npos) { sep_pos = pattern.length(); }

    auto param_name =
        pattern.substr(param_name_start, sep_pos - param_name_start);

#ifndef CPPHTTPLIB_NO_EXCEPTIONS
    if (param_name_set.find(param_name) != param_name_set.cend()) {
      std::string msg = "Encountered path parameter '" + param_name +
                        "' multiple times in route pattern '" + pattern + "'.";
      throw std::invalid_argument(msg);
    }
#endif

    param_names_.push_back(std::move(param_name));

    last_param_end = sep_pos + 1;
  }

  if (last_param_end < pattern.length()) {
    static_fragments_.push_back(pattern.substr(last_param_end));
  }
}

inline bool PathParamsMatcher::match(Request &request) const {
  request.matches = std::smatch();
  request.path_params.clear();
  request.path_params.reserve(param_names_.size());

  // One past the position at which the path matched the pattern last time
  std::size_t starting_pos = 0;
  for (size_t i = 0; i < static_fragments_.size(); ++i) {
    const auto &fragment = static_fragments_[i];

    if (starting_pos + fragment.length() > request.path.length()) {
      return false;
    }

    // Avoid unnecessary allocation by using strncmp instead of substr +
    // comparison
    if (std::strncmp(request.path.c_str() + starting_pos, fragment.c_str(),
                     fragment.length()) != 0) {
      return false;
    }

    starting_pos += fragment.length();

    // Should only happen when we have a static fragment after a param
    // Example: '/users/:id/subscriptions'
    // The 'subscriptions' fragment here does not have a corresponding param
    if (i >= param_names_.size()) { continue; }

    auto sep_pos = request.path.find(separator, starting_pos);
    if (sep_pos == std::string::npos) { sep_pos = request.path.length(); }

    const auto &param_name = param_names_[i];

    request.path_params.emplace(
        param_name, request.path.substr(starting_pos, sep_pos - starting_pos));

    // Mark everything up to '/' as matched
    starting_pos = sep_pos + 1;
  }
  // Returns false if the path is longer than the pattern
  return starting_pos >= request.path.length();
}

About 100 lines of code are enough for clean parameter extraction. Usage:

cpp
svr.Get("/user/:userId/", [](const httplib::Request& req, httplib::Response& res) {
    auto userId = req.path_params.at("userId");
    res.set_content("User ID: " + userId, "text/plain");
});

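MatcherBase

  • Defines the matcher interface (the pure virtual match function)
  • Stores the full route pattern string in pattern_; for example, /user/:id is stored as-is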

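PathParamsMatcher

Constructor: splits the pattern into a "static fragment array" plus a "parameter name array".

match: uses the static fragments as anchors, treats the substring between two anchors as a parameter value, and stores it in request.path_params.

Pattern: /api/v1/users/:id/books/:isbn/chapter

After splitting, static_fragments_ contains, in order:

  1. /api/v1/users/
  2. books/
  3. chapter

The '/' that ends a parameter is consumed as the separator, so every fragment after the first has no leading '/'. The static fragments are the non-variable part of the route: they are fixed.

param_names_ contains, in order:

  1. "id"
  2. "isbn"

The parameters are the variable part: the names themselves are fixed keys, but the values captured for them change from request to request.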
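The split described above can be reproduced in isolation. The following is a minimal sketch that mirrors the constructor's loop as a free function; `SplitPattern` and `split_pattern` are hypothetical names introduced here for illustration and are not part of cpp-httplib.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical container for the two arrays (not a cpp-httplib type).
struct SplitPattern {
    std::vector<std::string> static_fragments;
    std::vector<std::string> param_names;
};

// Hypothetical helper that follows the same steps as the constructor.
inline SplitPattern split_pattern(const std::string& pattern) {
    const std::string marker = "/:";
    SplitPattern result;
    std::size_t last_param_end = 0;  // one past the end of the previous parameter

    while (true) {
        // Back up one character so the '/' that ended the previous
        // parameter can start the next marker.
        const auto marker_pos = pattern.find(
            marker, last_param_end == 0 ? 0 : last_param_end - 1);
        if (marker_pos == std::string::npos) { break; }

        // Static text up to and including the '/' of the marker.
        result.static_fragments.push_back(
            pattern.substr(last_param_end, marker_pos - last_param_end + 1));

        const auto name_start = marker_pos + marker.size();
        auto sep_pos = pattern.find('/', name_start);
        if (sep_pos == std::string::npos) { sep_pos = pattern.length(); }

        result.param_names.push_back(
            pattern.substr(name_start, sep_pos - name_start));
        last_param_end = sep_pos + 1;
    }

    // Trailing static text after the last parameter, if any.
    if (last_param_end < pattern.length()) {
        result.static_fragments.push_back(pattern.substr(last_param_end));
    }
    return result;
}
```

Calling `split_pattern("/api/v1/users/:id/books/:isbn/chapter")` yields the fragments `/api/v1/users/`, `books/`, `chapter` and the names `id`, `isbn`.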

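Constructor

cpp
constexpr const char marker[] = "/:";

// Defines the marker to search for; the code below uses it to locate the parameters
// Note: constexpr already makes marker a compile-time constant and implies const, so the extra const is redundant (but harmless)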
cpp
std::size_t last_param_end = 0;

// One past the end of the previous path parameter (0 before the first match)
cpp
while (true) {
    // code..
}

// Keep scanning until no marker is left
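cpp
const auto marker_pos = pattern.find(
        marker, last_param_end == 0 ? last_param_end : last_param_end - 1);
if (marker_pos == std::string::npos) { break; }

// Search the pattern for the marker. The first search starts at 0 (last_param_end is 0)
// Later searches start at last_param_end - 1, i.e. at the '/' that ended the previous parameter,
// so a parameter that immediately follows another one (e.g. /:a/:b) is still found

// If no marker is found, leave the loop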
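To see why the search backs up by one character, it helps to try both starting offsets on a pattern with two adjacent parameters. This is a small sketch; `next_marker` is a hypothetical helper written for this illustration, not a cpp-httplib function.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: position of the next "/:" marker when the search
// starts at index `from`, or std::string::npos if there is none.
inline std::size_t next_marker(const std::string& pattern, std::size_t from) {
    return pattern.find("/:", from);
}
```

For the pattern `/:a/:b`, the first parameter ends at the '/' at index 3, so last_param_end becomes 4. `next_marker("/:a/:b", 4)` returns `std::string::npos`, because the search would start after the '/' it needs, while `next_marker("/:a/:b", 3)` returns 3. Starting at last_param_end - 1 is what lets the second parameter be found.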
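cpp
static_fragments_.push_back(
    pattern.substr(last_param_end, marker_pos - last_param_end + 1));

// Take the substring that starts at last_param_end and has length marker_pos - last_param_end + 1
// i.e. the closed range [last_param_end, marker_pos], which includes the '/' of the marker

// On the first iteration last_param_end is 0, so everything up to and including the marker's '/' is a static fragment
// Example: /api/v1/users/:id/books/:isbn/chapter
// last_param_end == 0, marker_pos == 13 (the "/:" after users)
// This extracts /api/v1/users/ and appends it to static_fragments_

// /:id/user
// -> static_fragments_[0] == "/"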
cpp
const auto param_name_start = marker_pos + str_len(marker);

// marker_pos plus the marker length is the index where the parameter name starts
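cpp
auto sep_pos = pattern.find(separator, param_name_start);
if (sep_pos == std::string::npos) { sep_pos = pattern.length(); }

// Note: separator is '/'
// Search the pattern for '/' starting at the parameter name
// If none is found, the parameter name is the last segment of the pattern, so sep_pos is set to the end of the pattern
// If one is found, the parameter sits in the middle of the pattern and more text follows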
cpp
auto param_name =
        pattern.substr(param_name_start, sep_pos - param_name_start);

// Take the substring that starts at param_name_start and has length sep_pos - param_name_start
// i.e. the closed range [param_name_start, sep_pos - 1]
// Example: /api/:id/123
// param_name_start == 6 (i)
// sep_pos == 8 (/)
// Example: /api/:id
// param_name_start == 6 (i)
// sep_pos == 8 (pattern.length(), one past d)
cpp
param_names_.push_back(std::move(param_name));
last_param_end = sep_pos + 1;

// Store the parameter name in param_names_
// Move last_param_end to one past the separator
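The walkthrough above skips one step in the library source: when CPPHTTPLIB_NO_EXCEPTIONS is not defined, the constructor rejects a pattern that uses the same parameter name twice. A minimal sketch of that check follows, pulled out into a hypothetical free function `check_unique_param_names` (the library performs the check inline, inside the loop, rather than through a helper like this).

```cpp
#include <stdexcept>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper (not part of cpp-httplib): throws std::invalid_argument
// if any parameter name occurs more than once in `names`.
inline void check_unique_param_names(const std::vector<std::string>& names,
                                     const std::string& pattern) {
    std::unordered_set<std::string> seen;
    for (const auto& name : names) {
        // insert() reports through .second whether the name was new.
        if (!seen.insert(name).second) {
            throw std::invalid_argument(
                "Encountered path parameter '" + name +
                "' multiple times in route pattern '" + pattern + "'.");
        }
    }
}
```

Per the comment in the source, when exceptions are disabled no such check runs and only the last duplicate parameter ends up set.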

match

cpp
std::size_t starting_pos = 0;

// Current position in the request path; matching starts at 0
cpp
for (size_t i = 0; i < static_fragments_.size(); ++i) {
    // code...
}

// Iterate over the static fragments
cpp
const auto &fragment = static_fragments_[i];

if (starting_pos + fragment.length() > request.path.length()) {
  return false;
}

// Fetch the current fragment
// If starting_pos plus the fragment length exceeds the request path length, the path is too short to match
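cpp
if (std::strncmp(request.path.c_str() + starting_pos, fragment.c_str(),
                 fragment.length()) != 0) {
  return false;
}

// Compare the request path at starting_pos with the current fragment; a mismatch means the route does not match
// For the pattern /:id/user, the first fragment is "/"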
cpp
starting_pos += fragment.length();
// Skip past the static fragment; a parameter value comes next
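cpp
if (i >= param_names_.size()) { continue; }

// If the index is past the number of parameters, this is a trailing static fragment with no parameter after it, so there is nothing to capture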
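cpp
auto sep_pos = request.path.find(separator, starting_pos);
if (sep_pos == std::string::npos) { sep_pos = request.path.length(); }

// Search the request path for the separator '/' starting at starting_pos
// If none is found, the parameter value is the last segment and runs to the end of the path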
cpp
const auto &param_name = param_names_[i];

request.path_params.emplace(
    param_name, request.path.substr(starting_pos, sep_pos - starting_pos));

// Mark everything up to '/' as matched
starting_pos = sep_pos + 1;

// Fetch the current parameter name
// Take the substring that starts at starting_pos and has length sep_pos - starting_pos
// Store it in the map: the key is the parameter name, the value is the substring cut from the path
// Advance starting_pos to one past the separator
cpp
return starting_pos >= request.path.length();

// The whole path must be consumed; leftover characters mean the path is longer than the pattern, so the match fails
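The matching steps above can be exercised without a server. The sketch below restates the loop as a free function over plain strings; `match_path` is a hypothetical name for this illustration, and it uses std::string::compare in place of the library's std::strncmp call.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical free function (not part of cpp-httplib) that follows the same
// steps as PathParamsMatcher::match, with the two arrays passed in explicitly.
inline bool match_path(const std::vector<std::string>& static_fragments,
                       const std::vector<std::string>& param_names,
                       const std::string& path,
                       std::unordered_map<std::string, std::string>& params) {
    params.clear();
    std::size_t starting_pos = 0;

    for (std::size_t i = 0; i < static_fragments.size(); ++i) {
        const auto& fragment = static_fragments[i];

        // The path must be long enough to hold the fragment at this position.
        if (starting_pos + fragment.length() > path.length()) { return false; }

        // The fragment must appear exactly at starting_pos.
        if (path.compare(starting_pos, fragment.length(), fragment) != 0) {
            return false;
        }
        starting_pos += fragment.length();

        // Trailing static fragment: no parameter follows it.
        if (i >= param_names.size()) { continue; }

        // The parameter value runs up to the next '/' or the end of the path.
        auto sep_pos = path.find('/', starting_pos);
        if (sep_pos == std::string::npos) { sep_pos = path.length(); }

        params.emplace(param_names[i],
                       path.substr(starting_pos, sep_pos - starting_pos));
        starting_pos = sep_pos + 1;
    }

    // Leftover characters mean the path is longer than the pattern.
    return starting_pos >= path.length();
}
```

With the fragments `/api/v1/users/`, `books/`, `chapter` and the names `id`, `isbn`, the path `/api/v1/users/7/books/978/chapter` matches and captures id = "7" and isbn = "978". Appending `/extra` to that path, or cutting it off after `books`, makes the function return false.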

Implementing your own path parameter extraction

cpp
// @author: NemaleSu
// @brief: extract parameters from an HTTP request path


#pragma once

#include <string>
#include <vector>
#include <unordered_map>

/*
 * todo 
 * add
 * - placeholders in formats other than /:
 * - path separators other than /
*/
class HttpPathMatcher 
{
public:
    explicit HttpPathMatcher(const std::string& pat);
    bool match(const std::string& path, std::unordered_map<std::string, std::string>& out) const;

private:
    struct Segment 
    {
        bool        is_param = false;
        std::string literal;
        std::string name;
    };

    std::vector<Segment> segments_;
    void build(const std::string& pat);
};
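The header above only declares the class; the bodies of the constructor, build, and match are not shown. What follows is one possible implementation, offered as a sketch rather than the author's code. It repeats the declaration so the block is self-contained, and it adds a private `split` helper that is not in the original header. It rests on two assumptions: empty segments are dropped (so a trailing '/' is tolerated and "//" collapses), and `out` is overwritten only when the match succeeds.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

class HttpPathMatcher {
public:
    explicit HttpPathMatcher(const std::string& pat) { build(pat); }
    bool match(const std::string& path,
               std::unordered_map<std::string, std::string>& out) const;

private:
    struct Segment {
        bool        is_param = false;
        std::string literal;  // set when is_param is false
        std::string name;     // set when is_param is true
    };

    std::vector<Segment> segments_;
    void build(const std::string& pat);

    // Assumed helper: split on '/' and drop empty pieces.
    static std::vector<std::string> split(const std::string& s) {
        std::vector<std::string> pieces;
        std::size_t pos = 0;
        while (pos < s.size()) {
            std::size_t next = s.find('/', pos);
            if (next == std::string::npos) { next = s.size(); }
            if (next > pos) { pieces.push_back(s.substr(pos, next - pos)); }
            pos = next + 1;
        }
        return pieces;
    }
};

inline void HttpPathMatcher::build(const std::string& pat) {
    for (auto& piece : split(pat)) {
        Segment seg;
        if (piece.size() > 1 && piece[0] == ':') {
            seg.is_param = true;
            seg.name = piece.substr(1);  // strip the leading ':'
        } else {
            seg.literal = std::move(piece);
        }
        segments_.push_back(std::move(seg));
    }
}

inline bool HttpPathMatcher::match(
    const std::string& path,
    std::unordered_map<std::string, std::string>& out) const {
    const auto pieces = split(path);
    // Same number of non-empty segments, otherwise no match.
    if (pieces.size() != segments_.size()) { return false; }

    std::unordered_map<std::string, std::string> captured;
    for (std::size_t i = 0; i < pieces.size(); ++i) {
        const Segment& seg = segments_[i];
        if (seg.is_param) {
            captured[seg.name] = pieces[i];
        } else if (seg.literal != pieces[i]) {
            return false;
        }
    }
    out = std::move(captured);
    return true;
}
```

Unlike PathParamsMatcher, this version compares whole segments instead of scanning character offsets, which makes it shorter at the cost of allocating a vector per match. Because empty pieces are dropped, a parameter can never capture an empty value, so `/42/file/` does not match `/:id/file/:filename`.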

Tests

cpp
#include <cstdlib>
#include <iostream>
#include <string>
#include <vector>
#include <unordered_map>
#include "httppathmatcher.h"

using namespace std;

// Test harness macro (std::abort requires <cstdlib>)
#define TEST(name, expr) do { \
    if (!(expr)) { \
        std::cerr << "❌  " << name << "  FAILED\n"; \
        std::abort(); \
    } else { \
        std::cout << "✅  " << name << "  PASSED\n"; \
    } \
} while (0)

// Test cases
int main() {
    std::unordered_map<std::string, std::string> params;

    // Root path
    HttpPathMatcher root("/");
    TEST("root match /", root.match("/", params));
    TEST("root not match /extra", !root.match("/extra", params));

    // Single-parameter path
    HttpPathMatcher id("/:id");
    TEST("id match /123", id.match("/123", params) && params["id"] == "123");
    TEST("id match /123/", id.match("/123/", params) && params["id"] == "123");
    TEST("id not match /", !id.match("/", params));
    TEST("id not match /123/extra", !id.match("/123/extra", params));

    // Multi-parameter path
    HttpPathMatcher file("/:id/file/:filename");
    TEST("file match /42/file/report.pdf",
         file.match("/42/file/report.pdf", params) &&
         params["id"] == "42" && params["filename"] == "report.pdf");
    TEST("file match /42/file/report.pdf/",
         file.match("/42/file/report.pdf/", params) &&
         params["id"] == "42" && params["filename"] == "report.pdf");
    TEST("file not match /42/file", !file.match("/42/file", params));
    TEST("file not match /42/file/", !file.match("/42/file/", params));

    // Path with more parameters and segments
    HttpPathMatcher files("/:id/dir/:dirname/file/:filename");
    TEST("files match /42/dir/testdir/file/report.pdf",
        files.match("/42/dir/testdir/file/report.pdf", params) &&
        params["id"] == "42" && params["dirname"] == "testdir" && params["filename"] == "report.pdf");
    TEST("files match /42/dir/testdir/file/report.pdf/",
        files.match("/42/dir/testdir/file/report.pdf/", params) &&
        params["id"] == "42" && params["dirname"] == "testdir" && params["filename"] == "report.pdf");
    TEST("files not match /42/dir/file/report.pdf",
        !files.match("/42/dir/file/report.pdf", params));

    std::cout << "\n🎉 All tests passed!\n";
    return 0;
}

Test output:

shell
✅  root match /  PASSED
✅  root not match /extra  PASSED
✅  id match /123  PASSED
✅  id match /123/  PASSED
✅  id not match /  PASSED
✅  id not match /123/extra  PASSED
✅  file match /42/file/report.pdf  PASSED
✅  file match /42/file/report.pdf/  PASSED
✅  file not match /42/file  PASSED
✅  file not match /42/file/  PASSED
✅  files match /42/dir/testdir/file/report.pdf  PASSED
✅  files match /42/dir/testdir/file/report.pdf/  PASSED
✅  files not match /42/dir/file/report.pdf  PASSED

🎉 All tests passed!