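Reading from right to left, the ChatClientAgent pipeline consists of roughly three parts: the IChatClient that connects to the LLM together with its middleware chain; a chain of multiple AIContextProvider objects that enriches inputs and outputs; and the AIAgent middleware chain. This article focuses on the first part, which we call the IChatClient pipeline.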

1. IChatClient
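IChatClient is the connector through which an agent talks to an LLM. If we compare the LLM to a database, IChatClient plays the role of IDbConnection. Just as IDbConnection abstracts away the concrete database so that we can use one programming model against any of them, IChatClient lets us write code without caring which vendor's model sits behind it. The GetResponseAsync and GetStreamingResponseAsync methods of the interface interact with the LLM in two different ways: the former is a non-streaming call that asynchronously returns the complete response in one piece, while the latter is a streaming call that yields the response as a sequence of incremental updates.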
```csharp
public interface IChatClient : IDisposable
{
    Task<ChatResponse> GetResponseAsync(
        IEnumerable<ChatMessage> messages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default);

    IAsyncEnumerable<ChatResponseUpdate> GetStreamingResponseAsync(
        IEnumerable<ChatMessage> messages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default);

    object? GetService(Type serviceType, object? serviceKey = null);
}
```
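To make the difference between the two call styles concrete, here is a minimal sketch. It assumes a console project with implicit usings and a reference to the Microsoft.Extensions.AI package; `EchoChatClient` is a hypothetical stand-in for a real provider client, written only for this illustration.

```csharp
using System.Runtime.CompilerServices;
using Microsoft.Extensions.AI;

// EchoChatClient is a hypothetical stub, not part of the library.
IChatClient client = new EchoChatClient();
List<ChatMessage> history = [new(ChatRole.User, "ping")];

// Non-streaming: a single awaitable call that yields the complete response.
ChatResponse response = await client.GetResponseAsync(history);
Console.WriteLine(response.Text);

// Streaming: updates are consumed one by one as they arrive.
await foreach (ChatResponseUpdate update in client.GetStreamingResponseAsync(history))
{
    Console.Write(update.Text);
}
Console.WriteLine();

sealed class EchoChatClient : IChatClient
{
    public Task<ChatResponse> GetResponseAsync(
        IEnumerable<ChatMessage> messages, ChatOptions? options = null,
        CancellationToken cancellationToken = default)
        => Task.FromResult(new ChatResponse(
            new ChatMessage(ChatRole.Assistant, $"echo: {messages.Last().Text}")));

    public async IAsyncEnumerable<ChatResponseUpdate> GetStreamingResponseAsync(
        IEnumerable<ChatMessage> messages, ChatOptions? options = null,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        // Emit the same reply in two chunks to mimic incremental delivery.
        foreach (string chunk in new[] { "echo: ", messages.Last().Text })
        {
            await Task.Yield();
            yield return new ChatResponseUpdate(ChatRole.Assistant, chunk);
        }
    }

    public object? GetService(Type serviceType, object? serviceKey = null) => null;

    public void Dispose() { }
}
```

Because the calling code only depends on IChatClient, swapping the stub for a real provider client should not require changing the two call sites.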
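Besides the collection of ChatMessage objects that represents the conversation history, GetResponseAsync and GetStreamingResponseAsync also accept a ChatOptions object that carries settings for the current call. ChatOptions standardizes the parameters commonly exposed by the major model vendors (OpenAI, Anthropic, Google and so on).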
```csharp
public class ChatOptions
{
    public string? ConversationId { get; set; }
    public string? Instructions { get; set; }
    public float? Temperature { get; set; }
    public int? MaxOutputTokens { get; set; }
    public float? TopP { get; set; }
    public int? TopK { get; set; }
    public float? FrequencyPenalty { get; set; }
    public float? PresencePenalty { get; set; }
    public long? Seed { get; set; }
    public ReasoningOptions? Reasoning { get; set; }
    public ChatResponseFormat? ResponseFormat { get; set; }
    public string? ModelId { get; set; }
    public IList<string>? StopSequences { get; set; }
    public bool? AllowMultipleToolCalls { get; set; }
    public ChatToolMode? ToolMode { get; set; }
    public IList<AITool>? Tools { get; set; }
    public bool? AllowBackgroundResponses { get; set; }
    public ResponseContinuationToken? ContinuationToken { get; set; }
    public Func<IChatClient, object?>? RawRepresentationFactory { get; set; }
    public AdditionalPropertiesDictionary? AdditionalProperties { get; set; }
}
```
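The individual options are as follows:
- ConversationId: the conversation ID, used to associate multiple requests with the same conversation;
- Instructions: the system instructions (system prompt) that steer the model toward the expected kind of response;
- Temperature: controls how random the generated text is; higher values give more random output, lower values give more deterministic output;
- MaxOutputTokens: the maximum number of tokens to generate, which caps the length of the response;
- TopP: nucleus sampling; sampling is restricted to the most probable tokens whose cumulative probability reaches this value, so smaller values concentrate on high-probability choices and larger values spread out;
- TopK: the number of candidate tokens considered when generating each token; smaller values concentrate on high-probability choices and larger values spread out;
- FrequencyPenalty: penalizes tokens according to how often they have already appeared; the larger the value, the fewer repeated tokens in the output;
- PresencePenalty: penalizes tokens that have already appeared at all; the larger the value, the more the model is pushed toward tokens it has not used yet;
- Seed: the random seed; using the same seed with the same request makes results more reproducible, although providers generally treat this as best effort;
- Reasoning: reasoning options that control how the model reasons while generating, such as how much effort it spends and how much of the reasoning is returned;
- ResponseFormat: the format of the response, such as plain text or JSON; supplying a JSON Schema enables structured output;
- ModelId: the ID of the model that should generate the response;
- StopSequences: sequences that stop generation as soon as the model produces them;
- AllowMultipleToolCalls: whether a single response may contain more than one tool call;
- ToolMode: how the model is expected to use tools when generating the response;
- Tools: the tools available to the model for this request;
- AllowBackgroundResponses: whether the response may be produced as a background operation instead of within the call itself;
- ContinuationToken: a token used to resume a previously started response, or to fetch its result, rather than starting a new one;
- RawRepresentationFactory: a callback that creates the provider-specific raw options object for the underlying client;
- AdditionalProperties: a dictionary for extra settings that have no dedicated property.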
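As a rough illustration of how these options are typically populated, the sketch below builds a set of shared defaults and then derives a per-request variant with `Clone()`. The model ID, instruction text and stop sequence are placeholder values chosen for this example, and a provider may ignore settings it does not support.

```csharp
using Microsoft.Extensions.AI;

// Shared defaults; "demo-model" is a placeholder model ID.
var defaults = new ChatOptions
{
    ModelId = "demo-model",
    Instructions = "You are a concise assistant.",
    Temperature = 0.2f,
    MaxOutputTokens = 256,
    StopSequences = ["###"],
    ResponseFormat = ChatResponseFormat.Text,
};

// Clone() returns a copy, so one request can deviate without touching the defaults.
ChatOptions perRequest = defaults.Clone();
perRequest.Temperature = 0.9f;
perRequest.ResponseFormat = ChatResponseFormat.Json;

Console.WriteLine($"default: {defaults.Temperature}, per-request: {perRequest.Temperature}");
```

Every property is nullable, so anything left unset simply falls back to whatever the provider uses by default.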
1.1 ReasoningOptions
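The ReasoningOptions type that holds the reasoning settings is defined as follows. Its Effort property controls how much reasoning effort the model should spend, and its Output property controls how much of the reasoning is included in the output. Together they give us finer control over the model's reasoning process, which helps produce responses closer to what we expect.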
```csharp
public sealed class ReasoningOptions
{
    public ReasoningEffort? Effort { get; set; }
    public ReasoningOutput? Output { get; set; }
}

public enum ReasoningEffort
{
    None,
    Low,
    Medium,
    High,
    ExtraHigh
}

public enum ReasoningOutput
{
    None,
    Summary,
    Full
}
```
1.2 ChatToolMode
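ChatToolMode defines how the AI model treats and selects tools during a conversation. You can think of it as the tool-usage instruction given to the model.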
```csharp
public class ChatToolMode
{
    public static AutoChatToolMode Auto { get; } = new AutoChatToolMode();
    public static NoneChatToolMode None { get; } = new NoneChatToolMode();
    public static RequiredChatToolMode RequireAny { get; } = new RequiredChatToolMode(null);
    public static RequiredChatToolMode RequireSpecific(string functionName) => new RequiredChatToolMode(functionName);
}

public sealed class AutoChatToolMode : ChatToolMode
{
    // All AutoChatToolMode instances are considered equal.
    public override bool Equals(object? obj) => obj is AutoChatToolMode;
    public override int GetHashCode() => typeof(AutoChatToolMode).GetHashCode();
}

public sealed class NoneChatToolMode : ChatToolMode
{
    public override bool Equals(object? obj) => obj is NoneChatToolMode;
    public override int GetHashCode() => typeof(NoneChatToolMode).GetHashCode();
}

public sealed class RequiredChatToolMode : ChatToolMode
{
    // null means "any tool"; a non-null value names the one required function.
    public string? RequiredFunctionName { get; }

    public RequiredChatToolMode(string? requiredFunctionName)
    {
        if (requiredFunctionName != null)
        {
            Throw.IfNullOrWhitespace(requiredFunctionName, "requiredFunctionName");
        }
        RequiredFunctionName = requiredFunctionName;
    }

    public override bool Equals(object? obj)
    {
        if (obj is RequiredChatToolMode requiredChatToolMode)
        {
            return RequiredFunctionName == requiredChatToolMode.RequiredFunctionName;
        }
        return false;
    }

    public override int GetHashCode()
        => RequiredFunctionName?.GetHashCode(StringComparison.Ordinal) ?? typeof(RequiredChatToolMode).GetHashCode();
}
```
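The three static properties and the one static method of ChatToolMode produce objects that represent four tool-usage modes:
- Auto: automatic mode; the model decides from the conversation context whether to use a tool and which one;
- None: no-tool mode; the model does not use any tool when generating the response;
- RequireAny: any tool is required; the model must call at least one tool when generating the response;
- RequireSpecific: a specific tool is required; the model must call the named tool when generating the response.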
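The following sketch shows how a tool mode might be combined with a tool list on ChatOptions. The `get_weather` function is hypothetical and exists only for this illustration; it is created with `AIFunctionFactory.Create` from the Microsoft.Extensions.AI package.

```csharp
using Microsoft.Extensions.AI;

// get_weather is a hypothetical tool defined only for this sketch.
AIFunction getWeather = AIFunctionFactory.Create(
    (string city) => $"Sunny in {city}",
    name: "get_weather",
    description: "Returns a canned weather report for a city.");

var options = new ChatOptions
{
    Tools = [getWeather],
    // Force the model to call get_weather instead of answering directly.
    ToolMode = ChatToolMode.RequireSpecific("get_weather"),
};

// RequireSpecific carries the function name; RequireAny leaves it null.
bool namesTool = options.ToolMode is RequiredChatToolMode { RequiredFunctionName: "get_weather" };
bool anyHasNoName = ChatToolMode.RequireAny.RequiredFunctionName is null;

// Equality is value-based: two modes naming the same function compare equal.
bool sameMode = ChatToolMode.RequireSpecific("get_weather").Equals(options.ToolMode);

Console.WriteLine($"{namesTool} {anyHasNoName} {sameMode}");
```

Value-based equality matters mostly for middleware that compares or caches options; for everyday use it is enough to assign one of the four modes.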
2. DelegatingChatClient
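The IChatClient pipeline is built on the DelegatingChatClient class shown below. DelegatingChatClient implements IChatClient and holds an InnerClient property that references the next IChatClient in the pipeline. By deriving from DelegatingChatClient we can create individual middleware components that run custom logic before and after calling into InnerClient, processing both the request and the response and thereby controlling and customizing the whole pipeline.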
```csharp
public class DelegatingChatClient : IChatClient, IDisposable
{
    protected IChatClient InnerClient { get; }

    protected DelegatingChatClient(IChatClient innerClient)
        => InnerClient = Throw.IfNull(innerClient, "innerClient");

    public virtual Task<ChatResponse> GetResponseAsync(
        IEnumerable<ChatMessage> messages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default)
        => InnerClient.GetResponseAsync(messages, options, cancellationToken);

    public virtual IAsyncEnumerable<ChatResponseUpdate> GetStreamingResponseAsync(
        IEnumerable<ChatMessage> messages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default)
        => InnerClient.GetStreamingResponseAsync(messages, options, cancellationToken);

    // GetService and Dispose are omitted from this listing.
}
```
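A custom client derived from DelegatingChatClient overrides GetResponseAsync and GetStreamingResponseAsync, forwards the call to the wrapped InnerClient, and adds its own logic before and after that call. This is simply one way of implementing middleware: a chain of DelegatingChatClient objects and a middleware pipeline are the same thing. With this approach we can add capabilities such as logging, performance monitoring, or request rewriting without modifying the original IChatClient implementation, which makes the IChatClient pipeline both more capable and easier to customize.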
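As one possible shape for such a component, the sketch below measures how long each call takes and overrides both the non-streaming and the streaming path. `TimingChatClient` and `StubChatClient` are hypothetical names introduced for this example; a console project with implicit usings is assumed.

```csharp
using System.Diagnostics;
using System.Runtime.CompilerServices;
using Microsoft.Extensions.AI;

IChatClient client = new TimingChatClient(new StubChatClient());
List<ChatMessage> history = [new(ChatRole.User, "hi")];

ChatResponse response = await client.GetResponseAsync(history);
Console.WriteLine(response.Text);

await foreach (ChatResponseUpdate update in client.GetStreamingResponseAsync(history))
{
    Console.WriteLine(update.Text);
}

// Hypothetical middleware: reports the elapsed time of each call.
sealed class TimingChatClient(IChatClient innerClient) : DelegatingChatClient(innerClient)
{
    public override async Task<ChatResponse> GetResponseAsync(
        IEnumerable<ChatMessage> messages, ChatOptions? options = null,
        CancellationToken cancellationToken = default)
    {
        var stopwatch = Stopwatch.StartNew();
        ChatResponse result = await base.GetResponseAsync(messages, options, cancellationToken);
        Console.WriteLine($"GetResponseAsync took {stopwatch.ElapsedMilliseconds} ms");
        return result;
    }

    public override async IAsyncEnumerable<ChatResponseUpdate> GetStreamingResponseAsync(
        IEnumerable<ChatMessage> messages, ChatOptions? options = null,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        var stopwatch = Stopwatch.StartNew();
        // Pass every update through unchanged; only the timing is added.
        await foreach (var update in base.GetStreamingResponseAsync(messages, options, cancellationToken))
        {
            yield return update;
        }
        Console.WriteLine($"Streaming took {stopwatch.ElapsedMilliseconds} ms");
    }
}

// Hypothetical stand-in for a real provider client.
sealed class StubChatClient : IChatClient
{
    public Task<ChatResponse> GetResponseAsync(
        IEnumerable<ChatMessage> messages, ChatOptions? options = null,
        CancellationToken cancellationToken = default)
        => Task.FromResult(new ChatResponse(new ChatMessage(ChatRole.Assistant, "Hello world!")));

    public async IAsyncEnumerable<ChatResponseUpdate> GetStreamingResponseAsync(
        IEnumerable<ChatMessage> messages, ChatOptions? options = null,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        await Task.Yield();
        yield return new ChatResponseUpdate(ChatRole.Assistant, "Hello world!");
    }

    public object? GetService(Type serviceType, object? serviceKey = null) => null;

    public void Dispose() { }
}
```

Note that in the streaming override the final timing line only runs when the caller enumerates the stream to the end; a middleware that must always report would need a try/finally around the loop.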

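2.1 Execution flow of the IChatClient pipeline
The program below demonstrates DelegatingChatClient used as IChatClient middleware. We derive a middleware class named Middleware from DelegatingChatClient; it accepts two delegates, one to process the request and one to process the response. The example creates three Middleware objects and nests them so that foo is the outermost, bar sits in the middle, and baz is the innermost. Each one prints a log line when it handles the request and the response, which reveals the call order. Finally we call GetResponseAsync to run the whole IChatClient pipeline and print the resulting response.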
```csharp
using Microsoft.Extensions.AI;

// Wrap from the inside out: baz is innermost, foo is outermost.
IChatClient chatClient = new LLMChatClient();
chatClient = new Middleware(chatClient,
    preHandler: (messages, options) => {
        Console.WriteLine("baz.pre-handler");
        return ValueTask.FromResult(messages);
    },
    postHandler: (response, options) => {
        Console.WriteLine("baz.post-handler");
        return ValueTask.FromResult(response);
    });
chatClient = new Middleware(chatClient,
    preHandler: (messages, options) => {
        Console.WriteLine("bar.pre-handler");
        return ValueTask.FromResult(messages);
    },
    postHandler: (response, options) => {
        Console.WriteLine("bar.post-handler");
        return ValueTask.FromResult(response);
    });
chatClient = new Middleware(chatClient,
    preHandler: (messages, options) => {
        Console.WriteLine("foo.pre-handler");
        return ValueTask.FromResult(messages);
    },
    postHandler: (response, options) => {
        Console.WriteLine("foo.post-handler");
        return ValueTask.FromResult(response);
    });

var response = await chatClient.GetResponseAsync([]);
Console.WriteLine($"response: {response.Messages.Single().Text}");

// Stand-in for a real model client: always answers "Hello world!".
class LLMChatClient : IChatClient
{
    public void Dispose() { }

    public Task<ChatResponse> GetResponseAsync(IEnumerable<ChatMessage> messages,
        ChatOptions? options = null, CancellationToken cancellationToken = default)
        => Task.FromResult(new ChatResponse(new ChatMessage(role: ChatRole.Assistant, content: "Hello world!")));

    public object? GetService(Type serviceType, object? serviceKey = null) => null;

    public IAsyncEnumerable<ChatResponseUpdate> GetStreamingResponseAsync(
        IEnumerable<ChatMessage> messages, ChatOptions? options = null,
        CancellationToken cancellationToken = default)
        => throw new NotImplementedException();
}

// The handlers take ChatOptions? because callers may pass no options at all.
public class Middleware(IChatClient innerClient,
    Func<IEnumerable<ChatMessage>, ChatOptions?, ValueTask<IEnumerable<ChatMessage>>>? preHandler = null,
    Func<ChatResponse, ChatOptions?, ValueTask<ChatResponse>>? postHandler = null) : DelegatingChatClient(innerClient)
{
    private readonly Func<IEnumerable<ChatMessage>, ChatOptions?, ValueTask<IEnumerable<ChatMessage>>>? _preHandler = preHandler;
    private readonly Func<ChatResponse, ChatOptions?, ValueTask<ChatResponse>>? _postHandler = postHandler;

    public override async Task<ChatResponse> GetResponseAsync(IEnumerable<ChatMessage> messages,
        ChatOptions? options = null, CancellationToken cancellationToken = default)
    {
        // Run the pre-handler, delegate to the inner client, then run the post-handler.
        messages = _preHandler != null ? await _preHandler.Invoke(messages, options) : messages;
        var response = await base.GetResponseAsync(messages, options, cancellationToken);
        if (_postHandler != null)
        {
            response = await _postHandler.Invoke(response, options);
        }
        return response;
    }
}
```
Output:
```
foo.pre-handler
bar.pre-handler
baz.pre-handler
baz.post-handler
bar.post-handler
foo.post-handler
response: Hello world!
```
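2.2 Predefined IChatClient middleware
The framework ships a number of middleware components built by deriving from DelegatingChatClient. Here are a few of them:
- LoggingChatClient: transparently logs the details of every interaction with the AI model without touching business logic;
- FunctionInvokingChatClient: arguably the most powerful built-in middleware. It intercepts the model's reply and, if the model returned a function call request, performs the actual invocation and feeds the result back to the model, repeating until the model gives a final text reply. It can be used to implement automated plug-in features such as web search or database queries. FunctionInvokingChatClient is what brings the all-important ReAct loop into ChatClientAgent;
- CachingChatClient/DistributedCachingChatClient: manages a cache of chat requests and responses. When the same conversation history is sent again, it first checks the cache (for example Redis or memory) and, on a hit, returns the cached result without calling the expensive AI API. Used well, it saves token cost and speeds up answers to repeated questions;
- OpenTelemetryChatClient: integrates distributed tracing and metrics. It uses OpenTelemetry to automatically record metadata such as the duration of each request, token consumption, and the model name, which makes it suitable for monitoring the stability, performance, and cost of AI services in production;
- ConfigureOptionsChatClient: modifies ChatOptions dynamically before a request is sent. It accepts a callback that lets you inject specific parameters into every request without changing business code (for example a default Temperature or MaxOutputTokens). This enables global policies, such as limiting output length according to the user's tier;
- ReducingChatClient: manages overly long conversation context. When the history exceeds what the model can accept, this client can compress, truncate, or summarize it so that the request can still be sent, which prevents token overflow in long sessions;
- ImageGeneratingChatClient: adds image generation and processing to an ordinary text chat client. Its core logic amounts to a substitution plus an automatic translation step: through function calling, it lets a model that can only handle text gain the ability to generate images;
- AIContextProviderChatClient: uses the specified set of AIContextProvider objects to produce the AIContext for the call, and uses the message list contained in that context as the input;
- PerServiceCallChatHistoryPersistingChatClient: automatically persists and manages chat history on every call to the AI service, so that the model sees the earlier conversation when handling a request and the new exchange is stored once the request completes.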
3. ChatClientBuilder
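To make it easy to build an IChatClient pipeline with multiple middleware components, the framework provides the ChatClientBuilder class. As the snippet below shows, a ChatClientBuilder is created from either an IChatClient object or a factory method. Each of the four Use methods creates and registers a DelegatingChatClient that acts as middleware, and the IChatClient returned by Build is the entry point of the pipeline composed of those components.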
```csharp
// Signatures only; method bodies are omitted.
public sealed class ChatClientBuilder
{
    public ChatClientBuilder(IChatClient innerClient);
    public ChatClientBuilder(Func<IServiceProvider, IChatClient> innerClientFactory);

    public IChatClient Build(IServiceProvider? services = null);

    public ChatClientBuilder Use(Func<IChatClient, IChatClient> clientFactory);
    public ChatClientBuilder Use(Func<IChatClient, IServiceProvider, IChatClient> clientFactory);
    public ChatClientBuilder Use(
        Func<IEnumerable<ChatMessage>, ChatOptions?, Func<IEnumerable<ChatMessage>, ChatOptions?, CancellationToken, Task>, CancellationToken, Task> sharedFunc);
    public ChatClientBuilder Use(
        Func<IEnumerable<ChatMessage>, ChatOptions?, IChatClient, CancellationToken, Task<ChatResponse>>? getResponseFunc,
        Func<IEnumerable<ChatMessage>, ChatOptions?, IChatClient, CancellationToken, IAsyncEnumerable<ChatResponseUpdate>>? getStreamingResponseFunc);
}
```
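Of the four Use methods, the first two are straightforward: each wraps the given IChatClient to create a DelegatingChatClient that serves as middleware. The third Use method overrides both GetResponseAsync and GetStreamingResponseAsync of the DelegatingChatClient with a single shared delegate; that delegate can only process the incoming message list and ChatOptions, and the response from InnerClient is returned unchanged. The fourth Use method overrides GetResponseAsync and GetStreamingResponseAsync separately, using the two delegates supplied.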
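The third overload can be sketched as follows. The shared delegate prepends a system message before handing the request to the next stage. `CountingChatClient` is a hypothetical stub that simply reports how many messages reached it, so the effect of the middleware is visible; a console project with implicit usings is assumed.

```csharp
using Microsoft.Extensions.AI;

IChatClient client = new ChatClientBuilder(new CountingChatClient())
    .Use(async (messages, options, next, cancellationToken) =>
    {
        // Only the request can be reshaped here; the response passes through untouched.
        IEnumerable<ChatMessage> augmented =
            [new ChatMessage(ChatRole.System, "Be brief."), .. messages];
        await next(augmented, options, cancellationToken);
    })
    .Build();

ChatResponse response = await client.GetResponseAsync(
    [new ChatMessage(ChatRole.User, "Hello")]);
Console.WriteLine(response.Text);

// Hypothetical stub: echoes back the number of messages it received.
sealed class CountingChatClient : IChatClient
{
    public Task<ChatResponse> GetResponseAsync(
        IEnumerable<ChatMessage> messages, ChatOptions? options = null,
        CancellationToken cancellationToken = default)
        => Task.FromResult(new ChatResponse(
            new ChatMessage(ChatRole.Assistant, $"received {messages.Count()} message(s)")));

    public IAsyncEnumerable<ChatResponseUpdate> GetStreamingResponseAsync(
        IEnumerable<ChatMessage> messages, ChatOptions? options = null,
        CancellationToken cancellationToken = default)
        => throw new NotSupportedException();

    public object? GetService(Type serviceType, object? serviceKey = null) => null;

    public void Dispose() { }
}
```

Since one delegate covers both call styles, this overload suits cross-cutting concerns that only touch the request, such as injecting instructions or enforcing limits; anything that must inspect or rewrite the response calls for the fourth overload or a dedicated DelegatingChatClient subclass.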
```csharp
namespace Microsoft.Extensions.AI;

public static class ChatClientBuilderChatClientExtensions
{
    public static ChatClientBuilder AsBuilder(this IChatClient innerClient)
        => new ChatClientBuilder(innerClient);
}
```
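The framework also provides the AsBuilder extension method for IChatClient, shown above, which turns an IChatClient directly into a ChatClientBuilder so that middleware can be attached to it. The earlier example can therefore be rewritten as follows: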
```csharp
using Microsoft.Extensions.AI;

// The first middleware registered (foo) becomes the outermost one.
var chatClient = new LLMChatClient()
    .AsBuilder()
    .Use(getResponseFunc: async (messages, options, client, cancelToken) =>
    {
        Console.WriteLine("foo.pre-handler");
        var response = await client.GetResponseAsync(messages, options, cancelToken);
        Console.WriteLine("foo.post-handler");
        return response;
    }, getStreamingResponseFunc: null)
    .Use(getResponseFunc: async (messages, options, client, cancelToken) =>
    {
        Console.WriteLine("bar.pre-handler");
        var response = await client.GetResponseAsync(messages, options, cancelToken);
        Console.WriteLine("bar.post-handler");
        return response;
    }, getStreamingResponseFunc: null)
    .Use(getResponseFunc: async (messages, options, client, cancelToken) =>
    {
        Console.WriteLine("baz.pre-handler");
        var response = await client.GetResponseAsync(messages, options, cancelToken);
        Console.WriteLine("baz.post-handler");
        return response;
    }, getStreamingResponseFunc: null)
    .Build();

var response = await chatClient.GetResponseAsync([]);
Console.WriteLine($"response: {response.Messages.Single().Text}");

class LLMChatClient : IChatClient
{
    public void Dispose() { }

    public Task<ChatResponse> GetResponseAsync(IEnumerable<ChatMessage> messages,
        ChatOptions? options = null, CancellationToken cancellationToken = default)
        => Task.FromResult(new ChatResponse(new ChatMessage(role: ChatRole.Assistant, content: "Hello world!")));

    public object? GetService(Type serviceType, object? serviceKey = null) => null;

    public IAsyncEnumerable<ChatResponseUpdate> GetStreamingResponseAsync(
        IEnumerable<ChatMessage> messages, ChatOptions? options = null,
        CancellationToken cancellationToken = default)
        => throw new NotImplementedException();
}
```
Output:
```
foo.pre-handler
bar.pre-handler
baz.pre-handler
baz.post-handler
bar.post-handler
foo.post-handler
response: Hello world!
```
For the predefined middleware listed above, the framework defines the following extension methods to register them:
```csharp
// Signatures only; method bodies are omitted.
public static ChatClientBuilder UsePerServiceCallChatHistoryPersistence(this ChatClientBuilder builder);

public static ChatClientBuilder UseAIContextProviders(
    this ChatClientBuilder builder,
    params AIContextProvider[] providers);

public static ChatClientBuilder UseChatReducer(
    this ChatClientBuilder builder,
    IChatReducer? reducer = null,
    Action<ReducingChatClient>? configure = null);

public static ChatClientBuilder UseDistributedCache(
    this ChatClientBuilder builder,
    IDistributedCache? storage = null,
    Action<DistributedCachingChatClient>? configure = null);

public static ChatClientBuilder UseLogging(
    this ChatClientBuilder builder,
    ILoggerFactory? loggerFactory = null,
    Action<LoggingChatClient>? configure = null);

public static ChatClientBuilder UseOpenTelemetry(
    this ChatClientBuilder builder,
    ILoggerFactory? loggerFactory = null,
    string? sourceName = null,
    Action<OpenTelemetryChatClient>? configure = null);

public static ChatClientBuilder UseFunctionInvocation(
    this ChatClientBuilder builder,
    ILoggerFactory? loggerFactory = null,
    Action<FunctionInvokingChatClient>? configure = null);

public static ChatClientBuilder UseImageGeneration(
    this ChatClientBuilder builder,
    IImageGenerator? imageGenerator = null,
    Action<ImageGeneratingChatClient>? configure = null);
```
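To see one of these registrations at work, the sketch below adds UseFunctionInvocation on top of a scripted stub. `ScriptedChatClient` and the `get_weather` tool are hypothetical and exist only for this illustration: the stub first asks for a function call and, once it sees a function result in the history, produces a final text reply, which is roughly how a real model behaves inside the tool-calling loop.

```csharp
using Microsoft.Extensions.AI;

// get_weather is a hypothetical tool defined only for this sketch.
AIFunction getWeather = AIFunctionFactory.Create(
    (string city) => $"Sunny in {city}",
    name: "get_weather",
    description: "Returns a canned weather report for a city.");

IChatClient client = new ChatClientBuilder(new ScriptedChatClient())
    .UseFunctionInvocation()
    .Build();

ChatResponse response = await client.GetResponseAsync(
    [new ChatMessage(ChatRole.User, "What is the weather in Paris?")],
    new ChatOptions { Tools = [getWeather] });
Console.WriteLine(response.Text);

// Hypothetical stub that plays the part of a tool-calling model.
sealed class ScriptedChatClient : IChatClient
{
    public Task<ChatResponse> GetResponseAsync(
        IEnumerable<ChatMessage> messages, ChatOptions? options = null,
        CancellationToken cancellationToken = default)
    {
        // Look for a tool result that the middleware appended to the history.
        FunctionResultContent? toolResult = messages
            .SelectMany(m => m.Contents)
            .OfType<FunctionResultContent>()
            .LastOrDefault();

        ChatMessage reply = toolResult is null
            // First round: request a call to get_weather.
            ? new ChatMessage(ChatRole.Assistant, new List<AIContent>
              {
                  new FunctionCallContent("call-1", "get_weather",
                      new Dictionary<string, object?> { ["city"] = "Paris" }),
              })
            // Second round: turn the tool result into the final answer.
            : new ChatMessage(ChatRole.Assistant, $"Tool said: {toolResult.Result}");

        return Task.FromResult(new ChatResponse(reply));
    }

    public IAsyncEnumerable<ChatResponseUpdate> GetStreamingResponseAsync(
        IEnumerable<ChatMessage> messages, ChatOptions? options = null,
        CancellationToken cancellationToken = default)
        => throw new NotSupportedException();

    public object? GetService(Type serviceType, object? serviceKey = null) => null;

    public void Dispose() { }
}
```

The calling code issues a single GetResponseAsync call; the round trip of invoking the function and calling the inner client again happens inside the middleware. Other registrations such as UseLogging or UseDistributedCache chain onto the builder in the same way, with the first one registered becoming the outermost layer.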