C++ AI Large-Model Access SDK --- Gemini Integration Wrapper
Table of Contents
- C++ AI Large-Model Access SDK --- Gemini Integration Wrapper
  - 1. Model Initialization
  - 2. The Gemini API
  - 3. Sending Messages -- Full Response
  - 4. Sending Messages -- Full-Response Test
  - 5. Sending Messages -- Streaming Response
  - 6. Sending Messages -- Streaming-Response Test
Project repository: 橘子师兄/ai-model-acess-tech - Gitee.com
1. Model Initialization
Initialization sets the model's API key, base URL, and a short model description.
c++
///////////////////////////////// GeminiProvider.h
////////////////////////////////
#pragma once
#include "ILLMProvider.h"
#include <functional>
#include <map>
#include <string>
#include <vector>
namespace ai_chat_sdk {
class GeminiProvider : public ILLMProvider {
public:
    // Initialize the model from a config map (keys: "api_key", optional "endpoint")
    virtual bool initModel(const std::map<std::string, std::string>& model_config) override;
    // Check whether the model is ready to use
    virtual bool isAvailable() override;
    // Get the model name
    virtual std::string getModelName() const override;
    // Get the model description
    virtual std::string getModelDesc() const override;
    // Send messages to the model - full response
    virtual std::string sendMessage(
        const std::vector<Message>& messages,
        const std::map<std::string, std::string>& request_param) override;
    // Send messages to the model - streaming response
    virtual std::string sendMessageStream(
        const std::vector<Message>& messages,
        const std::map<std::string, std::string>& request_param,
        std::function<void(const std::string&, bool)> callback) override;

private:
    // State used by the .cpp below (declared here assuming the base
    // interface does not already provide these members)
    std::string _api_key;
    std::string _endpoint;
    bool _isAvailable = false;
};
} // namespace ai_chat_sdk
///////////////////////////////// GeminiProvider.cpp
////////////////////////////////
#include "GeminiProvider.h"
#include "../../util/my_logger.h"
#include <string>
namespace ai_chat_sdk {
// Initialize the model from a config map (keys: "api_key", optional "endpoint")
bool GeminiProvider::initModel(const std::map<std::string, std::string>& model_config) {
    // Set the API key
    auto it = model_config.find("api_key");
    if (it != model_config.end()) {
        _api_key = it->second;
    } else {
        // API key not provided
        ERR("GeminiProvider initModel failed, api_key not found!");
        return false;
    }
    // Set the endpoint (fall back to the official base URL)
    it = model_config.find("endpoint");
    if (it != model_config.end()) {
        _endpoint = it->second;
    } else {
        _endpoint = "https://generativelanguage.googleapis.com";
    }
    _isAvailable = true;
    INFO("GeminiProvider initialized successfully with endpoint: {}", _endpoint);
    return true;
}
// Check whether the model is ready to use
bool GeminiProvider::isAvailable() {
    return _isAvailable;
}
// Get the model name
std::string GeminiProvider::getModelName() const {
    return "gemini-2.0-flash";
}
// Get the model description
std::string GeminiProvider::getModelDesc() const {
    return "Google's fast-response model, designed for low-latency deployment and rapid-interaction scenarios.";
}
} // namespace ai_chat_sdk
2. The Gemini API
Official docs: https://ai.google.dev/gemini-api/docs?hl=zh-cn
API reference: https://ai.google.dev/api?hl=zh-cn&lang=python
Gemini is also OpenAI-compatible, i.e. you can access the Gemini models using an OpenAI-style request format.
Base URL: https://generativelanguage.googleapis.com
The OpenAI-compatible chat completions endpoint is configured as follows:
Request URL: POST /v1beta/openai/chat/completions
Request headers:
| Field | Type | Description |
|---|---|---|
| Content-Type | string | application/json |
| Authorization | string | "Bearer " + api_key (note the space after "Bearer") |
Request body parameters:
| Field | Type | Description |
|---|---|---|
| model | string | Model name |
| messages | array | Conversation history; an array of objects, each with role and content fields |
| temperature | number | Sampling temperature |
| max_tokens | integer | Maximum number of tokens to generate |
Response example:
json
{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "I am a large language model, trained by Google.\n",
        "role": "assistant"
      }
    }
  ],
  "created": 1754897666,
  "id": "Ap2ZaPGBEMS1nvgPu6DPwAg",
  "model": "gemini-2.0-flash",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 12,
    "prompt_tokens": 3,
    "total_tokens": 15
  }
}
Like DeepSeek, Gemini does not store conversation history on the server side. Every request must therefore include the prior turns of the conversation so that Gemini can respond with full context.
3. Sending Messages -- Full Response
For the Gemini family, Google provides a native API and also an OpenAI-compatible one. To keep the implementation simple and get connected quickly, this article uses the OpenAI-compatible API.
URL: /v1beta/openai/chat/completions
Request parameters:
| Field | Type | Description |
|---|---|---|
| model | string | Model name |
| messages | array | Conversation history; an array of objects, each with role and content fields |
| temperature | number | Sampling temperature |
| max_tokens | integer | Maximum number of tokens to generate |
Response format:
json
{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "Hello! Happy to serve you. Is there anything I can help you with?\n",
        "role": "assistant"
      }
    }
  ],
  "created": 1756716818,
  "id": "El-1aPv6Etau1MkPr-6ymQg",
  "model": "gemini-2.0-flash",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 14,
    "prompt_tokens": 1,
    "total_tokens": 15
  }
}
c++
//////////////////////////// GeminiProvider.cpp
////////////////////////////////////
// ...
// Send messages to the model - full response
std::string GeminiProvider::sendMessage(const std::vector<Message>& messages,
        const std::map<std::string, std::string>& request_param) {
    // Make sure the provider has been initialized
    if (!_isAvailable) {
        ERR("GeminiProvider is not init!");
        return "";
    }
    // Read sampling temperature and max_tokens (with defaults)
    double temperature = 0.7;
    int max_tokens = 2048;
    if (request_param.find("temperature") != request_param.end()) {
        temperature = std::stod(request_param.at("temperature"));
    }
    if (request_param.find("max_tokens") != request_param.end()) {
        max_tokens = std::stoi(request_param.at("max_tokens"));
    }
    // Build the message history
    Json::Value messages_array(Json::arrayValue);
    for (const auto& message : messages) {
        Json::Value msg;
        msg["role"] = message.role;
        msg["content"] = message.content;
        messages_array.append(msg);
    }
    // Build the request body
    Json::Value request_body;
    request_body["model"] = "gemini-2.0-flash";
    request_body["messages"] = messages_array;
    request_body["temperature"] = temperature;
    request_body["max_tokens"] = max_tokens;
    // Serialize
    Json::StreamWriterBuilder writer;
    std::string json_string = Json::writeString(writer, request_body);
    DBG("GeminiProvider: request_body: {}", json_string);
    // Create the HTTP client
    httplib::Client client(_endpoint);
    client.set_connection_timeout(30, 0);  // 30-second connection timeout
    client.set_read_timeout(60, 0);        // 60-second read timeout
    client.set_proxy("127.0.0.1", 7890);   // a proxy may be required in some network environments
    // Set the request headers
    httplib::Headers headers = {
        {"Authorization", "Bearer " + _api_key}
    };
    // Send the POST request
    auto response = client.Post("/v1beta/openai/chat/completions", headers,
                                json_string, "application/json");
    if (!response) {
        ERR("Failed to connect to Gemini API - check network and SSL");
        return "";
    }
    DBG("Gemini API response status: {}", response->status);
    DBG("Gemini API response body: {}", response->body);
    // Check for an HTTP-level error
    if (response->status != 200) {
        ERR("Gemini API returned non-200 status: {} - {}", response->status,
            response->body);
        return "";
    }
    // Parse the response body
    Json::Value response_json;
    Json::CharReaderBuilder reader_builder;
    std::string parse_errors;
    std::istringstream response_stream(response->body);
    if (!Json::parseFromStream(reader_builder, response_stream, &response_json,
                               &parse_errors)) {
        ERR("Failed to parse Gemini API response: {}", parse_errors);
        return "";
    }
    // Extract the model's reply, which lives in the "choices" JSON array
    if (response_json.isMember("choices") &&
        response_json["choices"].isArray() &&
        !response_json["choices"].empty()) {
        auto& choice = response_json["choices"][0];
        if (choice.isMember("message") && choice["message"].isMember("content")) {
            std::string reply_content = choice["message"]["content"].asString();
            INFO("Received Gemini response: {}", reply_content);
            return reply_content;
        }
    }
    // Unexpected response shape: report the error
    ERR("Invalid response format from Gemini API");
    return "Invalid response format from Gemini API";
}
4. Sending Messages -- Full-Response Test
c++
///////////////////////////////// testLLM.cpp
///////////////////////////////////
// ...
TEST(GeminiProviderTest, sendMessageGemini) {
    auto provider = std::make_shared<ai_chat_sdk::GeminiProvider>();
    ASSERT_TRUE(provider != nullptr);
    std::map<std::string, std::string> modelParam;
    modelParam["api_key"] = std::getenv("gemini_apikey");
    modelParam["endpoint"] = "https://generativelanguage.googleapis.com";
    provider->initModel(modelParam);
    ASSERT_TRUE(provider->isAvailable());
    std::map<std::string, std::string> requestParam = {
        {"temperature", "0.7"},
        {"max_tokens", "2048"}
    };
    std::vector<ai_chat_sdk::Message> messages;
    messages.push_back({"user", "Who are you?"});
    // Call sendMessage
    std::string fullData = provider->sendMessage(messages, requestParam);
    ASSERT_FALSE(fullData.empty());
    INFO("response : {}", fullData);
}
5. Sending Messages -- Streaming Response
URL: /v1beta/openai/chat/completions
Request parameters:
| Field | Type | Description |
|---|---|---|
| model | string | Model name |
| messages | array | Conversation history; an array of objects, each with role and content fields |
| temperature | number | Sampling temperature |
| max_tokens | integer | Maximum number of tokens to generate |
| stream | boolean | Enable streaming responses; defaults to false |
Response format:
json
data: {
  "choices": [
    {
      "delta": {
        "content": " usually",
        "role": "assistant"
      },
      "index": 0
    }
  ],
  "created": 1756720923,
  "id": "Gm-1aKjSM-ag7dcP77rtoAs",
  "model": "gemini-2.0-flash",
  "object": "chat.completion.chunk"
}
...
data: {
  "choices": [
    {
      "delta": {
        "content": "Are you researching the stock market? (Hint: Shanghai Stock Exchange)\n\nOnce you provide more information, I can give a more precise explanation\n",
        "role": "assistant"
      },
      "finish_reason": "stop",
      "index": 0
    }
  ],
  "id": "Gm-1aKjSM-ag7dcP77rtoAs",
  "model": "gemini-2.0-flash",
  "object": "chat.completion.chunk"
}
data: [DONE]
c++
////////////////////////////// GeminiProvider.cpp
//////////////////////////////////
// ...
std::string GeminiProvider::sendMessageStream(const std::vector<Message>& messages,
        const std::map<std::string, std::string>& request_param,
        std::function<void(const std::string&, bool)> callback) {
    if (!_isAvailable) {
        ERR("GeminiProvider is not available");
        return "";
    }
    // Read sampling temperature and max_tokens (with defaults)
    double temperature = 0.7;
    int max_tokens = 2048;
    if (request_param.find("temperature") != request_param.end()) {
        temperature = std::stod(request_param.at("temperature"));
    }
    if (request_param.find("max_tokens") != request_param.end()) {
        max_tokens = std::stoi(request_param.at("max_tokens"));
    }
    // Build the message history
    Json::Value messages_array(Json::arrayValue);
    for (const auto& message : messages) {
        Json::Value msg;
        msg["role"] = message.role;
        msg["content"] = message.content;
        messages_array.append(msg);
    }
    // Build the request body
    Json::Value request_body;
    request_body["model"] = "gemini-2.0-flash";
    request_body["messages"] = messages_array;
    request_body["temperature"] = temperature;
    request_body["max_tokens"] = max_tokens;
    request_body["stream"] = true;
    // Serialize
    Json::StreamWriterBuilder writer;
    std::string json_string = Json::writeString(writer, request_body);
    DBG("GeminiProvider: request_body: {}", json_string);
    // Create the HTTP client
    httplib::Client client(_endpoint);
    client.set_connection_timeout(60, 0);  // 60-second connection timeout
    client.set_read_timeout(300, 0);       // 300-second read timeout (streams can run long)
    client.set_proxy("127.0.0.1", 7890);   // a proxy may be required in some network environments
    // Set the request headers
    httplib::Headers headers = {
        {"Authorization", "Bearer " + _api_key}
    };
    // Streaming state
    std::string buffer;
    bool gotError = false;
    std::string errorMsg;
    int statusCode = 0;
    bool streamFinished = false;
    // Accumulates the full response text
    std::string fullResponse;
    // Build the request object
    httplib::Request req;
    req.method = "POST";
    req.path = "/v1beta/openai/chat/completions";
    req.headers = headers;
    req.body = json_string;
    // Handle the response headers
    req.response_handler = [&](const httplib::Response& response) {
        statusCode = response.status;
        DBG("Received HTTP status {}", statusCode);
        if (200 != statusCode) {
            gotError = true;
            errorMsg = "HTTP Error " + std::to_string(statusCode);
            return false;  // abort the request
        }
        return true;  // keep receiving data
    };
    // Handle the streamed body
    req.content_receiver = [&](const char* data, size_t len, uint64_t offset,
                               uint64_t totalLength) {
        // If the HTTP request already failed, stop receiving
        if (gotError) {
            return false;
        }
        // Append incoming data to the buffer
        buffer.append(data, len);
        DBG("buffer : {}", buffer);
        // Process every complete event; events are separated by "\n\n"
        size_t pos = 0;
        while ((pos = buffer.find("\n\n")) != std::string::npos) {
            std::string event = buffer.substr(0, pos);
            buffer.erase(0, pos + 2);
            if (event.empty() || event[0] == ':') {
                continue;  // skip empty events and SSE comments
            }
            DBG("event : {}", event);
            // Only handle "data: " events
            if (0 == event.compare(0, 6, "data: ")) {
                std::string jsonStr = event.substr(6);
                // Handle the end-of-stream marker
                if (jsonStr == "[DONE]") {
                    callback("", true);
                    streamFinished = true;
                    return true;
                }
                // Parse the JSON chunk
                Json::Value chunk;
                Json::CharReaderBuilder readerBuild;
                std::string errs;
                std::istringstream jsonStream(jsonStr);
                if (Json::parseFromStream(readerBuild, jsonStream, &chunk, &errs)) {
                    // Extract the incremental content
                    if (chunk.isMember("choices") &&
                        chunk["choices"].isArray() &&
                        !chunk["choices"].empty() &&
                        chunk["choices"][0].isMember("delta") &&
                        chunk["choices"][0]["delta"].isMember("content")) {
                        std::string content = chunk["choices"][0]["delta"]["content"].asString();
                        // Accumulate the full text
                        fullResponse += content;
                        callback(content, false);
                    }
                } else {
                    ERR("Gemini SSE JSON parse error {}", errs);
                    return false;
                }
            }
        }
        return true;  // keep receiving data
    };
    // Send the request to the model
    auto result = client.send(req);
    if (!result) {
        // Network-level failure, e.g. DNS resolution error or connection timeout
        ERR("Network error {}", result.error());
        return "";
    }
    // Make sure the stream is terminated even without a [DONE] marker
    if (!streamFinished) {
        WARN("stream ended without [DONE] marker");
        callback("", true);
    }
    return fullResponse;
}
6. Sending Messages -- Streaming-Response Test
c++
///////////////////////////////// testLLM.cpp
//////////////////////////////////////
// ...
TEST(GeminiProviderTest, sendMessageStreamGemini) {
    auto provider = std::make_shared<ai_chat_sdk::GeminiProvider>();
    ASSERT_TRUE(provider != nullptr);
    std::map<std::string, std::string> modelParam;
    modelParam["api_key"] = std::getenv("gemini_apikey");
    modelParam["endpoint"] = "https://generativelanguage.googleapis.com";
    provider->initModel(modelParam);
    ASSERT_TRUE(provider->isAvailable());
    std::map<std::string, std::string> requestParam = {
        {"temperature", "0.7"},
        {"max_tokens", "2048"}
    };
    std::vector<ai_chat_sdk::Message> messages;
    messages.push_back({"user", "Who are you?"});
    auto write_chunk = [&](const std::string& chunk, bool last) {
        INFO(chunk);
        if (last) {
            INFO("[DONE]");
        }
    };
    std::string fulldata = provider->sendMessageStream(messages, requestParam, write_chunk);
    ASSERT_FALSE(fulldata.empty());
    INFO("fulldata {}", fulldata);
}
plaintext
// Partial run output:
// ...
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from GeminiProviderTest
[ RUN ] GeminiProviderTest.sendMessage
[07:47:23][testLLM][info ][/home/bit/will/ai-model-acess-tech/ai-model-acesstech/AIModelAcessTech/sdk/src/GeminiProvider.cpp:32] GeminiProvider::initModel: init model success, endpoint: https://generativelanguage.googleapis.com
[07:47:25][testLLM][info ][/home/bit/will/ai-model-acess-tech/ai-model-acesstech/AIModelAcessTech/test/testLLM.cpp:103] chunk : I
[07:47:25][testLLM][info ][/home/bit/will/ai-model-acess-tech/ai-model-acesstech/AIModelAcessTech/test/testLLM.cpp:103] chunk :  am a large language
[07:47:25][testLLM][info ][/home/bit/will/ai-model-acess-tech/ai-model-acesstech/AIModelAcessTech/test/testLLM.cpp:103] chunk :  model, trained by Google.
[07:47:25][testLLM][info ][/home/bit/will/ai-model-acess-tech/ai-model-acesstech/AIModelAcessTech/test/testLLM.cpp:103] chunk :
[07:47:25][testLLM][info ][/home/bit/will/ai-model-acess-tech/ai-model-acesstech/AIModelAcessTech/test/testLLM.cpp:105] [DONE]
[07:47:33][testLLM][info ][/home/bit/will/ai-model-acess-tech/ai-model-acesstech/AIModelAcessTech/test/testLLM.cpp:110] response : I am a large language model, trained by Google.
[ OK ] GeminiProviderTest.sendMessage (9534 ms)
[----------] 1 test from GeminiProviderTest (9534 ms total)
[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (9534 ms total)
[ PASSED ] 1 test.