A Boost Search Engine Based on Forward and Inverted Indexes

The cpp-httplib library
- Introduction to cpp-httplib
- Common features and functions
  - 1. Server side
  - 2. Client side
- Download and usage
Web page module
Full code
- common.h
- Index.hpp
- Log.hpp
- main.cc
- makefile
- Parser.cc
- Parser.h
- Search.hpp
- Util.hpp
Results

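The cpp-httplib library

Introduction to cpp-httplib

cpp-httplib is a lightweight C++ HTTP client/server library developed by the Japanese developer yhirose. Its main features are:

- Single-header design (including httplib.h is all that is needed)
- HTTP/1.1 support
- Both client and server functionality
- Cross-platform (Windows, Linux, macOS, and others)
- No extra dependencies (only a compiler that supports C++11 or later)
- SSL/TLS support (requires OpenSSL)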

Common features and functions

1. Server side

Creating a server

httplib::Server svr;

Registering route handlers

// Handle GET requests
svr.Get("/hello", [](const httplib::Request& req, httplib::Response& res) {
  res.set_content("Hello World!", "text/plain");
});

// Handle POST requests
svr.Post("/submit", [](const httplib::Request& req, httplib::Response& res) {
  // The submitted form data is available in req.body
  res.set_content("Received!", "text/plain");
});

Starting the server

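// Listen on 0.0.0.0:8080. listen() blocks while the server is running
// and returns false if it could not bind to the address.
if (!svr.listen("0.0.0.0", 8080)) {
  // Handle the startup failure
}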

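Main members of the Request class

- method: the request method (GET, POST, and so on)
- path: the request path
- body: the request body
- headers: the collection of request headers
- params: the URL query parameters
- has_param(key) / get_param_value(key): check for and read a query parameter

Main members of the Response class

- status: the status code (200, 404, and so on)
- body: the response body
- headers: the collection of response headers
- set_content(content, content_type): set the response body and its content type
- set_header(name, value): set a response header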
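Query parameters and headers are multi-valued: the same key can appear more than once in a request. The sketch below illustrates that lookup behaviour with a plain `std::multimap`, on the assumption that the library's parameter container behaves like one. `ParamMap`, `has_param` and `param_value` here are hypothetical stand-ins written for this illustration, not the library's own code.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for a query-parameter container
// (assumed to behave like a multimap of key -> value).
using ParamMap = std::multimap<std::string, std::string>;

// Returns true if at least one value is stored under `key`.
bool has_param(const ParamMap& params, const std::string& key) {
  return params.find(key) != params.end();
}

// Returns the id-th value stored under `key`, or an empty string
// when the key is missing or has fewer than id + 1 values.
std::string param_value(const ParamMap& params, const std::string& key,
                        std::size_t id = 0) {
  auto range = params.equal_range(key);
  auto it = range.first;
  for (std::size_t i = 0; i < id && it != range.second; ++i) {
    ++it;
  }
  return it != range.second ? it->second : std::string();
}
```

Returning an empty string for a missing key is why handlers usually check `has_param` (or test for an empty value) before using the result.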

2. Client side

Creating a client

httplib::Client cli("http://example.com");

Sending a GET request

auto res = cli.Get("/api/data");
if (res && res->status == 200) {
  // The response body is available in res->body
}

Sending a POST request

httplib::Params params;
params.emplace("name", "test");
params.emplace("value", "123");

auto res = cli.Post("/api/submit", params);
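When a POST carries key/value parameters like this, the pairs are typically sent as an `application/x-www-form-urlencoded` body. The following is a minimal sketch of that encoding, not the library's implementation: `percent_encode` and `encode_form` are hypothetical helper names, and the exact escaping rules the library applies (for example how it encodes spaces) may differ.

```cpp
#include <cctype>
#include <map>
#include <string>

// Percent-encodes every byte except the RFC 3986 unreserved characters.
std::string percent_encode(const std::string& in) {
  static const char* hex = "0123456789ABCDEF";
  std::string out;
  for (unsigned char c : in) {
    if (std::isalnum(c) || c == '-' || c == '_' || c == '.' || c == '~') {
      out += static_cast<char>(c);
    } else {
      out += '%';
      out += hex[c >> 4];
      out += hex[c & 0x0F];
    }
  }
  return out;
}

// Joins the pairs as key=value, separated by '&'.
// A multimap iterates in key order, so the output order is deterministic.
std::string encode_form(const std::multimap<std::string, std::string>& params) {
  std::string body;
  for (const auto& kv : params) {
    if (!body.empty()) {
      body += '&';
    }
    body += percent_encode(kv.first) + '=' + percent_encode(kv.second);
  }
  return body;
}
```

For the two parameters above, this sketch produces the body `name=test&value=123`.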

Sending a POST with a request body

std::string json_data = R"({"key": "value"})";
auto res = cli.Post("/api/json", json_data, "application/json");

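Download and usage

Download location

GitHub repository: https://github.com/yhirose/cpp-httplib

Direct header download: https://raw.githubusercontent.com/yhirose/cpp-httplib/master/httplib.h

How to use it

1. Download the httplib.h file.

2. Include it in your project: #include "httplib.h"

3. Compile with C++11 or later (for example g++ -std=c++11 main.cpp). On Linux you will usually also need -pthread, because the server uses threads.

4. To use SSL, define CPPHTTPLIB_OPENSSL_SUPPORT and link against the OpenSSL libraries.

5. An old compiler may fail to build the header; upgrading the compiler resolves this.

Simple example

Below is a complete server example: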

#include "httplib.h"
#include <iostream>

int main() {
  httplib::Server svr;

  // Handle requests to the root path
  svr.Get("/", [](const httplib::Request& req, httplib::Response& res) {
    res.set_content("<h1>Hello World!</h1>", "text/html");
  });

  // Handle requests that carry a query parameter
  svr.Get("/greet", [](const httplib::Request& req, httplib::Response& res) {
    auto name = req.get_param_value("name");
    if (name.empty()) {
      res.status = 400;
      res.set_content("Name parameter is required", "text/plain");
    } else {
      res.set_content("Hello, " + name + "!", "text/plain");
    }
  });

  std::cout << "Server running on http://localhost:8080" << std::endl;
  svr.listen("localhost", 8080);
  
  return 0;
}

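The library is well suited to quickly building small HTTP services or clients. Because it is lightweight and easy to use, it is widely used in the C++ community.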

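Web page module

Modeling the page on an established search page

The search page of a large, established company is a mature design, so our own page can be modeled on it.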

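**After a search is submitted, the page URL carries the search keyword. The keyword is then looked up in a database or in some other prebuilt search module, and the results are rendered back onto the page.**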
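To make the "keyword in the URL" step concrete, here is a minimal sketch of how a request target such as `/s?word=boost%20asio` can be split into decoded query parameters. An HTTP library normally does this for you, so `parse_query`, `percent_decode` and `hex_value` are hypothetical helpers written only for illustration, and the `/s?word=` shape is taken from the handler described later in this article.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Returns the numeric value of a hex digit, or -1 if `c` is not one.
int hex_value(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

// Decodes %XX escapes and treats '+' as a space.
// Malformed escapes are copied through unchanged.
std::string percent_decode(const std::string& in) {
  std::string out;
  for (std::size_t i = 0; i < in.size(); ++i) {
    if (in[i] == '%' && i + 2 < in.size()) {
      int hi = hex_value(in[i + 1]);
      int lo = hex_value(in[i + 2]);
      if (hi >= 0 && lo >= 0) {
        out += static_cast<char>(hi * 16 + lo);
        i += 2;
        continue;
      }
    }
    out += (in[i] == '+') ? ' ' : in[i];
  }
  return out;
}

// Extracts key/value pairs from the part of `target` after '?'.
std::multimap<std::string, std::string> parse_query(const std::string& target) {
  std::multimap<std::string, std::string> params;
  std::size_t pos = target.find('?');
  if (pos == std::string::npos) return params;

  std::string query = target.substr(pos + 1);
  std::size_t start = 0;
  while (start <= query.size()) {
    std::size_t end = query.find('&', start);
    if (end == std::string::npos) end = query.size();
    std::string pair = query.substr(start, end - start);
    if (!pair.empty()) {
      std::size_t eq = pair.find('=');
      std::string key = percent_decode(pair.substr(0, eq));
      std::string value = (eq == std::string::npos)
                              ? std::string()
                              : percent_decode(pair.substr(eq + 1));
      params.emplace(key, value);
    }
    start = end + 1;
  }
  return params;
}
```

Decoding matters for this project in particular: non-ASCII keywords reach the server as UTF-8 bytes written as `%XX` escapes, and they must be turned back into raw bytes before being matched against the index.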

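Writing the main program entry point

The search module is initialized once at startup, before any browser connects to the listening port; the documents it searches have already been built. The program first sets the directory that holds the main HTML page, then registers a GET handler. When the page requests /s, it passes the keyword with ?word=, which triggers a lookup in the search module, and the resulting JSON string is returned to the browser. Finally the server binds the host and port and starts running.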

#include "Log.hpp"
#include "common.h"
#include "Parser.h"
#include "Search.hpp"
#include "httplib.h"
#include <cstdio>
#include <cstring>
#include <string>
// Set to true to run the one-off parsing step before starting the server.
const bool INIT = false;

int main()
{
    if (INIT)
    {
        Parser parser(Orignaldir, Tragetfile);
        parser.Init();
    }

    // Built once and shared by every request.
    ns_search::Search search;

    httplib::Server svr;
    svr.set_base_dir(Basewwwroot);

    svr.Get("/s", [&](const httplib::Request& req, httplib::Response& rep) {
        const std::string param = "word";
        std::string word;
        std::string out;

        if (req.has_param(param))
        {
            word = req.get_param_value(param);
            Log(LogModule::DEBUG) << "Search keyword: " << word;
        }

        if (search.SearchBy(word, out))
        {
            rep.set_content(out, "application/json");
        }
        else
        {
            Log(LogModule::DEBUG) << "Search failed";
            // Still reply with valid JSON so the page can show "no results".
            rep.set_content("[]", "application/json");
        }
    });

    svr.listen("0.0.0.0", 8080);
    return 0;
}
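The handler above simply forwards whatever JSON string `SearchBy` writes into `out`. The page script reads `title`, `desc` and `url` from each element of that array, so the response is expected to look like a JSON array of such objects. How the project actually builds the string is not shown in this section and it may well use a JSON library; the following is only a hand-rolled sketch of the expected shape, with `Hit`, `json_escape` and `to_json` as hypothetical names.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical result record; the field names follow what the page script reads.
struct Hit {
  std::string title;
  std::string desc;
  std::string url;
};

// Escapes the characters that may not appear raw inside a JSON string.
std::string json_escape(const std::string& s) {
  std::string out;
  for (unsigned char c : s) {
    switch (c) {
      case '"':  out += "\\\""; break;
      case '\\': out += "\\\\"; break;
      case '\n': out += "\\n";  break;
      case '\r': out += "\\r";  break;
      case '\t': out += "\\t";  break;
      default:
        if (c < 0x20) {
          // Remaining control characters become \u00XX escapes.
          char buf[8];
          std::snprintf(buf, sizeof buf, "\\u%04x", static_cast<unsigned>(c));
          out += buf;
        } else {
          out += static_cast<char>(c);
        }
    }
  }
  return out;
}

// Serializes the hits as a JSON array of {title, desc, url} objects.
std::string to_json(const std::vector<Hit>& hits) {
  std::string out = "[";
  for (std::size_t i = 0; i < hits.size(); ++i) {
    if (i != 0) out += ',';
    out += "{\"title\":\"" + json_escape(hits[i].title) +
           "\",\"desc\":\"" + json_escape(hits[i].desc) +
           "\",\"url\":\"" + json_escape(hits[i].url) + "\"}";
  }
  return out + "]";
}
```

An empty result list serializes to `[]`, which the page treats as "no results" rather than as an error.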

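Writing the web page

The page is written starting from a rough skeleton that covers the main parts, then styled with CSS, and finally wired up with the relevant JavaScript functions.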
**```text

<!DOCTYPE html>
<html lang="zh-CN">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>Boost Search Engine</title>
  <script src="https://apps.bdimg.com/libs/jquery/2.1.4/jquery.min.js"></script>
  <style>
    * { margin: 0; padding: 0; box-sizing: border-box; }
    html, body { height: 100%; font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif; background-color: #f5f7fa; }
    .container { max-width: 1000px; margin: 0 auto; padding: 20px; }
    .header { text-align: center; margin-bottom: 30px; }
    .header h1 { color: #2c3e50; font-size: 2.5rem; margin-bottom: 10px; }
    .header p { color: #7f8c8d; font-size: 1.1rem; }
    .search-box { display: flex; margin-bottom: 30px; box-shadow: 0 4px 12px rgba(0, 0, 0, 0.1); border-radius: 8px; overflow: hidden; }
    .search-box input { flex: 1; height: 60px; padding: 0 20px; border: none; font-size: 1.2rem; background-color: white; }
    .search-box input:focus { outline: none; background-color: #f8f9fa; }
    .search-box button { width: 140px; height: 60px; border: none; background-color: #3498db; color: white; font-size: 1.2rem; cursor: pointer; transition: background-color 0.3s; }
    .search-box button:hover { background-color: #2980b9; }
    .intro { background-color: white; padding: 20px; border-radius: 8px; margin-bottom: 20px; box-shadow: 0 2px 8px rgba(0, 0, 0, 0.05); }
    .intro h2 { color: #2c3e50; margin-bottom: 10px; }
    .results-container { display: none; /* hidden initially, shown once there are results */ }
    .result-item { background-color: white; padding: 20px; border-radius: 8px; margin-bottom: 15px; box-shadow: 0 2px 8px rgba(0, 0, 0, 0.05); transition: transform 0.2s, box-shadow 0.2s; }
    .result-item:hover { transform: translateY(-2px); box-shadow: 0 4px 12px rgba(0, 0, 0, 0.1); }
    .result-title { font-size: 1.3rem; color: #3498db; margin-bottom: 10px; text-decoration: none; display: block; }
    .result-title:hover { text-decoration: underline; }
    .result-desc { color: #5a6c7d; line-height: 1.5; margin-bottom: 10px; }
    .result-url { color: #95a5a6; font-size: 0.9rem; font-style: italic; }
    .no-results { text-align: center; padding: 40px; color: #7f8c8d; background-color: white; border-radius: 8px; box-shadow: 0 2px 8px rgba(0, 0, 0, 0.05); }
    .loading { text-align: center; padding: 30px; color: #3498db; }
    .footer { text-align: center; margin-top: 40px; color: #95a5a6; font-size: 0.9rem; }
    @media (max-width: 600px) {
      .container { padding: 10px; }
      .search-box { flex-direction: column; }
      .search-box input { height: 50px; border-radius: 8px 8px 0 0; }
      .search-box button { width: 100%; height: 50px; border-radius: 0 0 8px 8px; }
    }
  </style>
</head>
<body>

Boost Search Engine

An efficient document retrieval system based on forward and inverted indexes

Welcome to the Boost Search Engine

This is a search engine implemented in C++ with the Boost libraries. It uses forward and inverted indexes to provide efficient document retrieval.

Enter a keyword in the search box and click the search button to find related documents.

Searching, please wait...

');

  // Send the search request
  $.ajax({
    url: "/s?word=" + encodeURIComponent(keywords),
    type: "GET",
    dataType: "json",
    success: function(data) {
      buildResults(data);
    },
    error: function(xhr) {
      console.log("error", xhr.status);
      resultsContainer.html('

Search failed, please try again later

');
    }
  });
}

function buildResults(data) {
  const resultsContainer = $('#results-container');

  // Clear the previous results
  resultsContainer.empty();

  if (!data || data.length === 0) {
    resultsContainer.html('

No matching results found; please try other keywords

');
    return;
  }

  // Build the result list
  data.forEach(function(item, index) {
    const resultItem = $('

', { class: 'result-item' });
    const title = $('', { class: 'result-title', href: item.url || '#', text: item.title || 'Untitled', target: '_blank' });
    const desc = $('

', { class: 'result-desc', text: item.desc || 'No description available' });
    const url = $('