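Project Overview
SuperSonic
SuperSonic is a next-generation AI+BI platform open-sourced by Tencent Music. It unifies two paradigms: Chat BI (driven by LLMs) and Headless BI (driven by a semantic layer).
Core features:
- Chat BI interface: business users query data in natural language and view the results as charts
- Headless BI interface: analytics engineers build semantic data models (definitions of metrics, dimensions, and tags, together with their meanings and relationships)
- Extensible architecture: custom implementations plug in through the Java SPI mechanism
- Semantic augmentation: data semantics (business terms, column values, and so on) are injected into the prompt to reduce LLM hallucination
- Query simplification: complex SQL constructs (JOINs, formulas, and so on) are offloaded from the LLM to the semantic layer
Architecture components:
| Component | Function |
|---|---|
| Knowledge Base | Extracts schema information from the semantic models and builds dictionaries and indexes |
| Schema Mapper | Identifies references to schema elements in the user query |
| Semantic Parser | Understands the user query and generates a semantic query statement (rules + LLM) |
| Semantic Corrector | Checks the validity of the semantic query statement and corrects it |
| Semantic Translator | Translates the semantic query statement into executable SQL |
| Chat Plugin | Extends functionality through third-party tools |
| Chat Memory | Encapsulates historical query trajectories for few-shot prompting |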
GitHub: https://github.com/tencentmusic/supersonic (⭐ 4.9k)
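Apache DolphinScheduler
Apache DolphinScheduler is a modern data orchestration platform from the Apache Software Foundation, focused on handling complex task dependencies in data pipelines.
Core features:
- Easy to deploy: four deployment modes (Standalone, Cluster, Docker, Kubernetes)
- Easy to use: workflows are created and managed through the Web UI, the Python SDK, or the Open API
- Highly reliable: decentralized multi-master, multi-worker architecture with native horizontal scaling
- High performance: according to the project, several times faster than other orchestration platforms and able to handle tens of millions of tasks per day
- Cloud native: orchestrates workflows across multiple clouds and data centers
- Version control: versioning for workflows and task instances
- Flexible state control: workflows can be paused, stopped, and resumed at any time
- Multi-tenant support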
GitHub: https://github.com/apache/dolphinscheduler (⭐ 14.3k)
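dsctl (DolphinScheduler CLI)
dsctl is a command-line tool for managing DolphinScheduler resources.
Main functions:
- Project management (project)
- Workflow management (workflow)
- Task management (task)
- Datasource management (datasource)
- Environment management (environment)
- User management (user)
- Monitoring and auditing (monitor/audit)
Example commands: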
bash
dsctl project list # list all projects
dsctl workflow run daily-etl # run a workflow
dsctl workflow-instance watch 123 # watch a workflow instance
dsctl task-instance log 456 --raw # fetch the raw task log
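Integration Background
Why integrate?
| Scenario | Traditional approach | After integration |
|---|---|---|
| Run a workflow | Log in to the DS Web UI → find the workflow → click Run | Type "run the daily-etl workflow" in the chat |
| Monitor an instance | Log in to the DS Web UI → open the instance list → find the target | Type "monitor instance 123" in the chat |
| View logs | Log in to the DS Web UI → find the task instance → open the log | Type "show the log of task 456" in the chat |
| Batch operations | Repeat the same UI steps many times | Type a batch instruction in the chat |
Value of the integration:
- Higher efficiency: fewer UI steps, with a complex operation completed in a single sentence
- Lower barrier: non-technical users can operate workflows in natural language
- Single entry point: data queries and workflow operations are managed from the same SuperSonic chat interface
Architecture Design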
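User's natural-language input
↓
WorkflowParser (ChatQueryParser SPI)
├── accept(): checks whether the Agent has a WORK_FLOW_CTL tool configured
└── parse(): calls the LLM to map natural language to a dsctl subcommand
the result is written to SemanticParseInfo.properties
↓
WorkflowExecutor (ChatQueryExecutor SPI)
├── accept(): checks queryMode == "WORKFLOW_CTL"
└── execute(): runs the dsctl command through ProcessBuilder
injects the DS_API_URL / DS_API_TOKEN environment variables
returns QueryResult.textResult (Markdown)
↓
Frontend ChatItem (whitelist modified so that WORKFLOW_CTL mode is rendered)
Data flow:
- The user enters natural language → SuperSonic Chat API
- WorkflowParser receives the request → calls the LLM to generate a dsctl command
- WorkflowExecutor executes the command → spawns the dsctl process
- dsctl calls the DolphinScheduler API → returns the result
- The result is formatted as Markdown → rendered by the frontend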
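The hand-off between the two SPI implementations can be sketched in plain Java. This is a simplified illustration, not SuperSonic code: `HandOffSketch` is a hypothetical class, a fixed lookup stands in for the LLM call, and the query mode is stored under an assumed `queryMode` key in the same map, whereas the real parser sets it on `SemanticParseInfo`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical, simplified stand-in for the parser/executor pair; not SuperSonic classes.
class HandOffSketch {
    static final String QUERY_MODE = "WORKFLOW_CTL";

    // Parser side: map a question to a dsctl subcommand and record it as properties.
    // A fixed lookup replaces the LLM call so the sketch stays self-contained.
    static Optional<Map<String, Object>> parse(String question) {
        if (!question.toLowerCase().contains("list projects")) {
            return Optional.empty(); // intent unclear: leave the query to later parsers
        }
        Map<String, Object> props = new HashMap<>();
        props.put("queryMode", QUERY_MODE); // assumed key, for illustration only
        props.put("workflow_ctl_cmd", "project list");
        return Optional.of(props);
    }

    // Executor side: accept only results produced by the parser above.
    static boolean accept(Map<String, Object> props) {
        return QUERY_MODE.equals(props.get("queryMode"));
    }

    // Executor side: build the argument vector that would be handed to ProcessBuilder.
    static String[] toArgv(Map<String, Object> props) {
        String subCmd = (String) props.get("workflow_ctl_cmd");
        return ("dsctl " + subCmd).split("\\s+");
    }

    public static void main(String[] args) {
        parse("Please list projects")
                .filter(HandOffSketch::accept)
                .ifPresent(p -> System.out.println(String.join(" ", toArgv(p))));
    }
}
```

The point of the sketch is that the parser and the executor never call each other directly; they communicate only through the query mode and the properties map.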
Backend Implementation
File 1: AgentToolType.java (modified)
Path: chat/server/src/main/java/com/tencent/supersonic/chat/server/agent/AgentToolType.java
java
package com.tencent.supersonic.chat.server.agent;
import java.util.HashMap;
import java.util.Map;
public enum AgentToolType {
DATASET("Text2SQL数据集"),
PLUGIN("第三方插件"),
WORK_FLOW_CTL("工作流命令行");
private final String title;
AgentToolType(String title) {
this.title = title;
}
public static Map<AgentToolType, String> getToolTypes() {
Map<AgentToolType, String> map = new HashMap<>();
map.put(DATASET, DATASET.title);
map.put(PLUGIN, PLUGIN.title);
map.put(WORK_FLOW_CTL, WORK_FLOW_CTL.title); // newly added
return map;
}
}
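Once the constant is registered, the getToolTypes() API returns the new type automatically, and the tool-type dropdown on the Agent management page shows a new entry labeled with the enum title (工作流命令行 in the code above).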
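Because getToolTypes() lists each constant by hand, a new constant has to be added in two places. One possible alternative, shown here as a sketch and not as the existing SuperSonic implementation, is to build the map from `values()`. `ToolTypeSketch` is a hypothetical stand-in for `AgentToolType`, and its English titles are placeholders.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for AgentToolType; the constant names mirror the enum above.
enum ToolTypeSketch {
    DATASET("Text2SQL dataset"),
    PLUGIN("Third-party plugin"),
    WORK_FLOW_CTL("Workflow CLI");

    private final String title;

    ToolTypeSketch(String title) {
        this.title = title;
    }

    // Build the map from values() so that a newly added constant is always included.
    static Map<String, String> toolTypes() {
        Map<String, String> map = new LinkedHashMap<>();
        for (ToolTypeSketch type : values()) {
            map.put(type.name(), type.title);
        }
        return map;
    }

    public static void main(String[] args) {
        System.out.println(toolTypes());
    }
}
```

With this shape, referring to a constant that does not exist is impossible, since the loop never names the constants individually.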
File 2: WorkflowTool.java (new)
Path: chat/server/src/main/java/com/tencent/supersonic/chat/server/agent/WorkflowTool.java
java
package com.tencent.supersonic.chat.server.agent;
import java.util.List;
import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;
@Data
@NoArgsConstructor
@AllArgsConstructor
public class WorkflowTool extends AgentTool {
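/** Path to the dsctl executable; defaults to "dsctl" */
private String dsctlPath = "dsctl";
/** Name of the DolphinScheduler project used by default */
private String defaultProject;
/** Base URL of the DolphinScheduler API */
private String dsApiUrl;
/** DolphinScheduler API token */
private String dsToken;
/** Example questions, used for embedding recall */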
private List<String> exampleQuestions;
}
File 3: WorkflowParser.java (new)
Path: chat/server/src/main/java/com/tencent/supersonic/chat/server/parser/WorkflowParser.java
java
package com.tencent.supersonic.chat.server.parser;
import com.alibaba.fastjson2.JSON;
import com.alibaba.fastjson2.JSONArray;
import com.alibaba.fastjson2.JSONObject;
import com.tencent.supersonic.chat.server.agent.AgentToolType;
import com.tencent.supersonic.chat.server.agent.WorkflowTool;
import com.tencent.supersonic.chat.server.pojo.ParseContext;
import com.tencent.supersonic.common.pojo.ChatApp;
import com.tencent.supersonic.common.pojo.enums.AppModule;
import com.tencent.supersonic.common.util.ChatAppManager;
import com.tencent.supersonic.headless.api.pojo.SemanticParseInfo;
import com.tencent.supersonic.headless.api.pojo.response.ParseResp;
import dev.langchain4j.model.chat.ChatModel;
import dev.langchain4j.model.input.Prompt;
import dev.langchain4j.model.input.PromptTemplate;
import dev.langchain4j.model.output.structured.Description;
import dev.langchain4j.provider.ModelProvider;
import dev.langchain4j.service.AiServices;
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;
import lombok.Data;
import lombok.extern.slf4j.Slf4j;
import org.apache.commons.collections4.CollectionUtils;
/**
 * Configuration and call flow:
 *
 * <pre>
 * Agent management page
 *   └── add tool: type=WORK_FLOW_CTL, dsctlPath=/usr/bin/dsctl, defaultProject=etl-prod
 *         ↓ stored in Agent.toolConfig (JSON)
 * WorkflowParser.accept()
 *   └── agent.getTools(AgentToolType.WORK_FLOW_CTL) is non-empty → activate
 * WorkflowParser.parse()
 *   └── read the WorkflowTool config → LLM resolves the command → write parseInfo.properties
 * WorkflowExecutor.execute()
 *   └── read workflow_ctl_path + workflow_ctl_cmd → ProcessBuilder → return the result
 * </pre>
 */
@Slf4j
public class WorkflowParser implements ChatQueryParser {
public static final String QUERY_MODE = "WORKFLOW_CTL";
public static final String APP_KEY = "WORKFLOW_PARSER";
/** Cached output of "dsctl schema", so the command is not re-run on every request */
private static volatile String cachedSchema;
private static volatile long schemaCacheTime;
private static final long SCHEMA_CACHE_TTL_MS = 5 * 60 * 1000; // 5 minutes
private static final String INSTRUCTION =
"""
# Role: You are a DolphinScheduler CLI expert using dsctl.
# Task: Convert user's natural language into dsctl commands.
# dsctl Usage:
{{dsctl_help}}
# Available Commands:
{{dsctl_commands}}
# Command Generation Rules:
1. Match user intent to the appropriate command
2. For resource selectors, use discovery commands listed below
3. Include required arguments and options
4. Output ONLY the dsctl command
# Intent Patterns:
- "list/view/show" + resource → dsctl <resource> list
- "get/detail" + resource → dsctl <resource> get <selector>
- "create/new" + resource → dsctl <resource> create --name <name>
- "delete/remove" + resource → dsctl <resource> delete <selector> --force
- "run/execute" + workflow → dsctl workflow run <workflow>
- "stop" + instance → dsctl workflow-instance stop <id>
- "watch/monitor" + instance → dsctl workflow-instance watch <id>
- "log" + task → dsctl task-instance log <id> --raw
# Resource Selectors:
- project: name or code (discover: dsctl project list)
- workflow: name or code (discover: dsctl workflow list)
- task: name or code (discover: dsctl task list)
- workflow-instance: numeric id (discover: dsctl workflow-instance list)
- task-instance: numeric id (discover: dsctl task-instance list)
# Output Rules:
1. Output ONLY the dsctl command
2. If intent is unclear, output: unknown
3. Do NOT include explanations
# Question: {{question}}
# Command:
""";
public WorkflowParser() {
ChatAppManager.register(
APP_KEY,
ChatApp.builder()
.prompt(INSTRUCTION)
.name("工作流命令行")
.appModule(AppModule.CHAT)
.description("将自然语言转换为工作流命令")
.enable(true)
.build());
}
/** Structured-output class; @Description tells AiServices what each field means */
@Data
static class DsctlCommand {
@Description("the dsctl subcommand without 'dsctl' prefix, e.g. 'workflow run daily-etl'")
private String command;
@Description("brief explanation in Chinese of what this command does")
private String thought;
}
/** Extraction interface; AiServices generates the implementation */
interface DsctlCommandExtractor {
DsctlCommand extractCommand(String text);
}
@Override
public boolean accept(ParseContext parseContext) {
// Activate only when the Agent has a workflow CLI tool configured
List<String> tools = parseContext.getAgent().getTools(AgentToolType.WORK_FLOW_CTL);
return !CollectionUtils.isEmpty(tools);
}
@Override
public void parse(ParseContext parseContext) {
// Read the WorkflowTool configured on the Agent (the first one is used)
List<String> toolJsonList = parseContext.getAgent().getTools(AgentToolType.WORK_FLOW_CTL);
WorkflowTool workflowTool = JSONObject.parseObject(toolJsonList.getFirst(), WorkflowTool.class);
// Read the ChatApp configuration (LLM model binding + prompt template)
ChatApp chatApp = parseContext.getAgent().getChatAppConfig().get(APP_KEY);
if (Objects.isNull(chatApp) || !chatApp.isEnable()) {
log.warn("WorkflowParser ChatApp [{}] is not enabled, skip.", APP_KEY);
return;
}
// Build the prompt and fill in the placeholders
Map<String, Object> variables = new HashMap<>();
// 1. Run "dsctl --help" to get the usage text
String dsctlHelp = executeDsctlCommand(workflowTool.getDsctlPath(), workflowTool, "--help");
// 2. Run "dsctl schema" to get the command list and format it (cached)
String dsctlSchema = getCachedOrExecuteSchema(workflowTool);
String formattedCommands = formatCommandsFromSchema(dsctlSchema);
variables.put("dsctl_help", dsctlHelp);
variables.put("dsctl_commands", formattedCommands);
variables.put("question", parseContext.getRequest().getQueryText());
Prompt prompt = PromptTemplate.from(chatApp.getPrompt()).apply(variables);
// Call the LLM, using AiServices for structured output
ChatModel model = ModelProvider.getChatModel(chatApp.getChatModelConfig());
DsctlCommandExtractor extractor = AiServices.create(DsctlCommandExtractor.class, model);
DsctlCommand cmd = extractor.extractCommand(prompt.toUserMessage().singleText());
log.info(
"WorkflowParser input=[{}] -> command=[{}] thought=[{}]",
parseContext.getRequest().getQueryText(),
cmd == null ? "null" : cmd.getCommand(),
cmd == null ? "null" : cmd.getThought());
if (cmd == null || "unknown".equalsIgnoreCase(cmd.getCommand())) {
// Not recognized: leave parseInfo unset so that later parsers can handle the query
return;
}
// Write the parse result into parseInfo for WorkflowExecutor to consume
SemanticParseInfo parseInfo = new SemanticParseInfo();
parseInfo.setQueryMode(QUERY_MODE);
parseInfo.setId(1);
Map<String, Object> props = new HashMap<>();
// e.g. "workflow run daily-etl"
props.put("workflow_ctl_cmd", cmd.getCommand());
// The LLM's explanation
props.put("workflow_ctl_thought", cmd.getThought());
// Path to the executable
props.put("workflow_ctl_path", workflowTool.getDsctlPath());
if (workflowTool.getDefaultProject() != null) {
props.put("default_project", workflowTool.getDefaultProject());
}
if (workflowTool.getDsApiUrl() != null) {
props.put("ds_api_url", workflowTool.getDsApiUrl());
}
if (workflowTool.getDsToken() != null) {
props.put("ds_token", workflowTool.getDsToken());
}
parseInfo.setProperties(props);
parseContext.getResponse().getSelectedParses().add(parseInfo);
parseContext.getResponse().setState(ParseResp.ParseState.COMPLETED);
}
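/**
* Runs a dsctl command and returns its output.
*
* @param dsctlPath path to the dsctl executable
* @param workflowTool workflow tool configuration (supplies the environment variables)
* @param args command arguments
* @return the command output, or an empty string on timeout or failure to start
*/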
private String executeDsctlCommand(String dsctlPath, WorkflowTool workflowTool, String... args) {
try {
String[] cmdArray = new String[args.length + 1];
cmdArray[0] = dsctlPath;
System.arraycopy(args, 0, cmdArray, 1, args.length);
ProcessBuilder pb = new ProcessBuilder(cmdArray);
pb.redirectErrorStream(true);
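// Add the environment variables
Map<String, String> env = pb.environment();
if (workflowTool.getDsApiUrl() != null) {
env.put("DS_API_URL", workflowTool.getDsApiUrl());
}
if (workflowTool.getDsToken() != null) {
env.put("DS_API_TOKEN", workflowTool.getDsToken());
}
Process process = pb.start();
// Drain the output on a separate thread. Reading it inline would block until the
// process exits, so the 30-second timeout below could never take effect.
java.util.concurrent.CompletableFuture<String> outputFuture =
java.util.concurrent.CompletableFuture.supplyAsync(
() -> {
try (BufferedReader reader =
new BufferedReader(new InputStreamReader(process.getInputStream()))) {
return reader.lines().collect(Collectors.joining("\n"));
} catch (java.io.IOException e) {
throw new java.io.UncheckedIOException(e);
}
});
boolean finished = process.waitFor(30, TimeUnit.SECONDS);
if (!finished) {
process.destroyForcibly();
log.warn("dsctl command timed out: {}", String.join(" ", cmdArray));
return "";
}
String output = outputFuture.get(5, TimeUnit.SECONDS);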
if (process.exitValue() != 0) {
log.warn("dsctl command failed with exit code {}: {}", process.exitValue(), output);
}
return output;
} catch (Exception e) {
log.error("Failed to execute dsctl command: {}", e.getMessage(), e);
return "";
}
}
/**
* Returns the cached schema, or runs "dsctl schema" when the cache is empty or expired.
*
* @param workflowTool workflow tool configuration
* @return the dsctl schema as JSON
*/
private String getCachedOrExecuteSchema(WorkflowTool workflowTool) {
long now = System.currentTimeMillis();
if (cachedSchema != null && (now - schemaCacheTime) < SCHEMA_CACHE_TTL_MS) {
log.debug("Using cached dsctl schema");
return cachedSchema;
}
synchronized (WorkflowParser.class) {
// Double-check locking
if (cachedSchema != null && (now - schemaCacheTime) < SCHEMA_CACHE_TTL_MS) {
return cachedSchema;
}
String schema = executeDsctlCommand(workflowTool.getDsctlPath(), workflowTool, "schema");
if (schema != null && !schema.isBlank()) {
cachedSchema = schema;
schemaCacheTime = now;
}
return schema;
}
}
/**
* Formats the command list from the dsctl schema JSON.
*
* @param schemaJson JSON output of "dsctl schema"
* @return the formatted command list
*/
private String formatCommandsFromSchema(String schemaJson) {
if (schemaJson == null || schemaJson.isBlank()) {
return getDefaultCommands();
}
try {
JSONObject schema = JSON.parseObject(schemaJson);
JSONObject data = schema.getJSONObject("data");
if (data == null) {
return getDefaultCommands();
}
JSONArray commands = data.getJSONArray("commands");
if (commands == null) {
return getDefaultCommands();
}
StringBuilder sb = new StringBuilder();
for (int i = 0; i < commands.size(); i++) {
JSONObject group = commands.getJSONObject(i);
if (!"group".equals(group.getString("kind"))) {
continue;
}
String groupName = group.getString("name");
String groupSummary = group.getString("summary");
sb.append("\n## ").append(groupName).append("\n");
sb.append("# ").append(groupSummary).append("\n");
JSONArray cmdList = group.getJSONArray("commands");
if (cmdList != null) {
for (int j = 0; j < cmdList.size(); j++) {
JSONObject cmd = cmdList.getJSONObject(j);
String cmdName = cmd.getString("name");
String summary = cmd.getString("summary");
// Build command syntax
StringBuilder syntax = new StringBuilder();
syntax.append("dsctl ").append(groupName).append(" ").append(cmdName);
// Add required arguments
JSONArray args = cmd.getJSONArray("arguments");
if (args != null) {
for (int k = 0; k < args.size(); k++) {
JSONObject arg = args.getJSONObject(k);
if (arg.getBooleanValue("required")) {
syntax.append(" <").append(arg.getString("name")).append(">");
}
}
}
sb.append("- ").append(syntax).append(": ").append(summary).append("\n");
// Add discovery command if available
if (args != null) {
for (int k = 0; k < args.size(); k++) {
JSONObject arg = args.getJSONObject(k);
String discovery = arg.getString("discovery_command");
if (discovery != null && !discovery.isEmpty()) {
sb.append(" Discovery: ").append(discovery).append("\n");
break;
}
}
}
}
}
}
return sb.toString();
} catch (Exception e) {
log.error("Failed to parse dsctl schema: {}", e.getMessage(), e);
return getDefaultCommands();
}
}
/**
* Returns the default command list (used when the schema is missing or cannot be parsed).
*
* @return the default command list
*/
private String getDefaultCommands() {
return """
- dsctl project list: List projects
- dsctl project get <project>: Get one project
- dsctl project create --name <name>: Create a project
- dsctl project update <project>: Update a project
- dsctl project delete <project>: Delete a project
- dsctl workflow list: List workflows
- dsctl workflow get <workflow>: Get one workflow
- dsctl workflow run <workflow>: Run a workflow
- dsctl workflow delete <workflow>: Delete a workflow
- dsctl workflow-instance list: List instances
- dsctl workflow-instance get <id>: Get one instance
- dsctl workflow-instance watch <id>: Watch instance
- dsctl workflow-instance stop <id>: Stop instance
- dsctl workflow-instance digest <id>: Get instance summary
- dsctl task list --workflow <workflow>: List tasks
- dsctl task get <task>: Get one task
- dsctl task-instance list --workflow-instance <id>: List task instances
- dsctl task-instance log <id>: Get task log
- dsctl cluster list: List clusters
- dsctl datasource list: List datasources
- dsctl environment list: List environments
- dsctl user list: List users
- dsctl tenant list: List tenants
- dsctl schedule list: List schedules
- dsctl doctor: Run diagnostics
""";
}
}
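Key design notes:
- ChatAppManager.register() registers the prompt template in the constructor, which makes it configurable on the Agent management page and lets it be bound to different LLMs
- AiServices.create() automatically appends a JSON schema constraint to the prompt and deserializes the LLM output into a DsctlCommand object
- accept() decides whether to activate by checking whether the Agent has a WORK_FLOW_CTL tool configured, which gives per-Agent control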
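The schema cache in WorkflowParser keeps the output of "dsctl schema" for five minutes and skips caching blank results. The same idea can be isolated into a small helper, sketched below. `TtlCacheSketch` is a hypothetical class, not part of SuperSonic; it uses a single synchronized method in place of the double-checked locking in the parser, and takes an injectable clock so the expiry logic can be tested.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical helper, not part of SuperSonic: caches one value for a fixed time window.
class TtlCacheSketch {
    private final long ttlMs;
    private final LongSupplier clock; // injectable for tests; System::currentTimeMillis otherwise
    private String value;
    private long loadedAt;

    TtlCacheSketch(long ttlMs, LongSupplier clock) {
        this.ttlMs = ttlMs;
        this.clock = clock;
    }

    // Returns the cached value while it is fresh; otherwise calls the loader.
    // Blank results are returned but not cached, so a failed load is retried next time.
    synchronized String get(Supplier<String> loader) {
        long now = clock.getAsLong();
        if (value != null && now - loadedAt < ttlMs) {
            return value;
        }
        String loaded = loader.get();
        if (loaded != null && !loaded.isBlank()) {
            value = loaded;
            loadedAt = now;
        }
        return loaded;
    }

    public static void main(String[] args) {
        TtlCacheSketch cache = new TtlCacheSketch(5 * 60 * 1000L, System::currentTimeMillis);
        System.out.println(cache.get(() -> "{\"data\":{}}"));
    }
}
```

A synchronized method is simpler than double-checked locking and is usually sufficient here, because the cached read is cheap compared with the LLM call that follows it.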
File 4: WorkflowExecutor.java (new)
Path: chat/server/src/main/java/com/tencent/supersonic/chat/server/executor/WorkflowExecutor.java
java
package com.tencent.supersonic.chat.server.executor;
import com.tencent.supersonic.chat.api.pojo.response.QueryResult;
import com.tencent.supersonic.chat.server.parser.WorkflowParser;
import com.tencent.supersonic.chat.server.pojo.ExecuteContext;
import com.tencent.supersonic.headless.api.pojo.response.QueryState;
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.TimeUnit;
import lombok.extern.slf4j.Slf4j;
@Slf4j
public class WorkflowExecutor implements ChatQueryExecutor {
@Override
public boolean accept(ExecuteContext executeContext) {
return WorkflowParser.QUERY_MODE.equals(executeContext.getParseInfo().getQueryMode());
}
@Override
public QueryResult execute(ExecuteContext executeContext) {
Map<String, Object> props = executeContext.getParseInfo().getProperties();
String dsctlPath = (String) props.getOrDefault("workflow_ctl_path", "dsctl");
String subCmd = (String) props.get("workflow_ctl_cmd");
String thought = (String) props.getOrDefault("workflow_ctl_thought", "");
QueryResult result = new QueryResult();
result.setQueryId(executeContext.getRequest().getQueryId());
// Keep WORKFLOW_CTL here; overwriting it with "PLAIN_TEXT" would bypass the frontend's WORKFLOW_CTL branch
result.setQueryMode(WorkflowParser.QUERY_MODE);
if (subCmd == null || subCmd.isBlank()) {
result.setQueryState(QueryState.INVALID);
result.setErrorMsg("无法识别 dsctl 命令");
return result;
}
try {
String output = runDsctl(dsctlPath, subCmd, props);
result.setQueryState(QueryState.SUCCESS);
// textResult is what the chat interface displays
result.setTextResult(buildTextResult(thought, subCmd, output));
} catch (Exception e) {
log.error("WorkflowExecutor failed, cmd=[dsctl {}]", subCmd, e);
result.setQueryState(QueryState.SEARCH_EXCEPTION);
result.setErrorMsg("dsctl 执行失败: " + e.getMessage());
}
return result;
}
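/**
* Runs a dsctl subcommand and returns its combined stdout/stderr. Times out after 60 seconds
* (watch-style commands may need longer; adjust as required).
*/
private String runDsctl(String dsctlPath, String subCmd, Map<String, Object> props)
throws Exception {
String fullCmd = dsctlPath + " " + subCmd;
log.info("WorkflowExecutor executing: {}", fullCmd);
// Note: splitting on whitespace does not support quoted arguments that contain spaces
ProcessBuilder pb = new ProcessBuilder(fullCmd.split("\\s+"));
pb.redirectErrorStream(true);
// Explicitly inject the environment variables into the child process
Map<String, String> env = pb.environment();
String dsApiUrl = (String) props.get("ds_api_url");
String dsToken = (String) props.get("ds_token");
if (dsApiUrl != null && !dsApiUrl.isBlank()) {
env.put("DS_API_URL", dsApiUrl);
}
if (dsToken != null && !dsToken.isBlank()) {
env.put("DS_API_TOKEN", dsToken);
}
Process process = pb.start();
// Drain the output on a separate thread while waiting. If the output were read only
// after waitFor(), a full pipe buffer would block the child until the timeout.
java.util.concurrent.CompletableFuture<String> outputFuture =
java.util.concurrent.CompletableFuture.supplyAsync(() -> readAll(process));
boolean finished = process.waitFor(60, TimeUnit.SECONDS);
if (!finished) {
process.destroyForcibly();
throw new RuntimeException("dsctl 命令超时(60s): " + fullCmd);
}
return outputFuture.get(5, TimeUnit.SECONDS).trim();
}
/** Reads the whole process output as UTF-8 text. */
private static String readAll(Process process) {
try (BufferedReader reader =
new BufferedReader(
new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
StringBuilder sb = new StringBuilder();
String line;
while ((line = reader.readLine()) != null) {
sb.append(line).append("\n");
}
return sb.toString();
} catch (java.io.IOException e) {
throw new java.io.UncheckedIOException(e);
}
}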
private String buildTextResult(String thought, String subCmd, String output) {
StringBuilder sb = new StringBuilder();
if (thought != null && !thought.isBlank()) {
sb.append("**").append(thought).append("**\n\n");
}
sb.append("```\n$ dsctl ").append(subCmd).append("\n");
sb.append(output).append("\n```");
return sb.toString();
}
}
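Key design notes:
- ProcessBuilder.environment() explicitly injects DS_API_URL / DS_API_TOKEN. A child process inherits only the JVM's own environment, so without this step dsctl would not see the variables unless the SuperSonic JVM itself was started with them exported
- redirectErrorStream(true) merges stderr into stdout, so error output is handled in one place
- buildTextResult() formats the output as Markdown for the frontend to render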
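The executor turns the command string into an argument vector with `split("\\s+")`, which breaks any argument that contains spaces, for example a quoted project name. A quote-aware tokenizer is one way to handle that case. The sketch below is a hypothetical helper, not part of SuperSonic or dsctl; it supports single and double quotes and does not handle backslash escapes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of SuperSonic or dsctl.
class CommandTokenizerSketch {
    // Splits on whitespace but keeps text inside single or double quotes as one token.
    // Backslash escapes are not handled.
    static List<String> tokenize(String command) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        char quote = 0;          // 0 means "not inside quotes"
        boolean inToken = false;
        for (char c : command.toCharArray()) {
            if (quote != 0) {
                if (c == quote) {
                    quote = 0;   // closing quote: stay in the current token
                } else {
                    current.append(c);
                }
            } else if (c == '"' || c == '\'') {
                quote = c;
                inToken = true;  // allows empty quoted tokens such as ""
            } else if (Character.isWhitespace(c)) {
                if (inToken) {
                    tokens.add(current.toString());
                    current.setLength(0);
                    inToken = false;
                }
            } else {
                current.append(c);
                inToken = true;
            }
        }
        if (quote != 0) {
            throw new IllegalArgumentException("Unterminated quote in: " + command);
        }
        if (inToken) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("project create --name \"nightly etl\""));
    }
}
```

The resulting list can be passed to the `ProcessBuilder(List<String>)` constructor after prepending the dsctl path. Since no shell is involved, quoting only affects how arguments are grouped.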
File 5: spring.factories (modified)
Path: launchers/standalone/src/main/resources/META-INF/spring.factories
Add the new entries to the existing content (WorkflowParser goes before PlainTextParser, and WorkflowExecutor goes before SqlExecutor):
properties
com.tencent.supersonic.chat.server.parser.ChatQueryParser=\
com.tencent.supersonic.chat.server.parser.NL2PluginParser, \
com.tencent.supersonic.chat.server.parser.NL2SQLParser,\
com.tencent.supersonic.chat.server.parser.WorkflowParser,\
com.tencent.supersonic.chat.server.parser.PlainTextParser
com.tencent.supersonic.chat.server.executor.ChatQueryExecutor=\
com.tencent.supersonic.chat.server.executor.PluginExecutor, \
com.tencent.supersonic.chat.server.executor.WorkflowExecutor,\
com.tencent.supersonic.chat.server.executor.SqlExecutor,\
com.tencent.supersonic.chat.server.executor.PlainTextExecutor
Note:
WorkflowExecutor must be listed before SqlExecutor, because SqlExecutor.accept() always returns true and would otherwise intercept every request.
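The effect of the registration order can be illustrated with a small model. It assumes that the chain hands the query to the first executor whose accept() returns true, which is what the ordering requirement implies. `ExecutorOrderSketch` and its `Executor` record are hypothetical, not SuperSonic classes.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical model of an executor chain, not SuperSonic code.
// Assumption: the first executor whose accept() returns true handles the query.
class ExecutorOrderSketch {
    record Executor(String name, Predicate<String> accept) {}

    static final Executor WORKFLOW =
            new Executor("WorkflowExecutor", mode -> "WORKFLOW_CTL".equals(mode));
    // Stand-in for an executor that accepts every query.
    static final Executor CATCH_ALL = new Executor("SqlExecutor", mode -> true);

    static String dispatch(List<Executor> chain, String queryMode) {
        for (Executor executor : chain) {
            if (executor.accept().test(queryMode)) {
                return executor.name();
            }
        }
        return "none";
    }

    public static void main(String[] args) {
        // Correct order: the specific executor is consulted first.
        System.out.println(dispatch(List.of(WORKFLOW, CATCH_ALL), "WORKFLOW_CTL"));
        // Wrong order: the catch-all shadows the specific executor.
        System.out.println(dispatch(List.of(CATCH_ALL, WORKFLOW), "WORKFLOW_CTL"));
    }
}
```

Under that assumption, any executor with a narrow accept() condition belongs ahead of a catch-all in the list.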
Frontend Implementation
File 6: type.ts (modified)
Path: webapp/packages/supersonic-fe/src/pages/Agent/type.ts
typescript
export enum AgentToolTypeEnum {
NL2SQL_RULE = 'NL2SQL_RULE',
NL2SQL_LLM = 'NL2SQL_LLM',
PLUGIN = 'PLUGIN',
DATASET = 'DATASET',
WORK_FLOW_CTL = 'WORK_FLOW_CTL', // newly added
}
export type AgentToolType = {
id?: string;
type: AgentToolTypeEnum;
name: string;
queryModes?: QueryModeEnum[];
plugins?: number[];
metricOptions?: MetricOptionType[];
exampleQuestions?: string[];
modelIds?: number[];
// New fields for WORK_FLOW_CTL
dsctlPath?: string;
dsApiUrl?: string;
dsToken?: string;
defaultProject?: string;
};
File 7: ToolModal.tsx (modified)
Path: webapp/packages/supersonic-fe/src/pages/Agent/ToolModal.tsx
Add the WORK_FLOW_CTL form fields after the existing PLUGIN FormItem:
tsx
{toolType === AgentToolTypeEnum.WORK_FLOW_CTL && (
<>
<FormItem
name="dsApiUrl"
label="DS API 地址"
rules={[{ required: true, message: '请输入 DolphinScheduler API 地址' }]}
>
<Input placeholder="例如:http://ds-host:12345" allowClear />
</FormItem>
<FormItem name="dsToken" label="DS Token">
<Input.Password placeholder="DolphinScheduler API Token" allowClear />
</FormItem>
<FormItem name="dsctlPath" label="dsctl 路径">
<Input placeholder="dsctl 可执行文件路径,默认 dsctl" allowClear />
</FormItem>
<FormItem name="defaultProject" label="默认项目">
<Input placeholder="默认使用的 DolphinScheduler 项目名" allowClear />
</FormItem>
</>
)}
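The options of the tool-type dropdown are loaded dynamically from the backend getToolTypes() API. Once WORK_FLOW_CTL is added to the backend enum, the new entry appears in the dropdown automatically, labeled with the enum title, and the option list needs no further change.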
File 8: index.tsx (modified)
Path: webapp/packages/chat-sdk/src/components/ChatItem/index.tsx
Add WORKFLOW_CTL to the whitelist in the updateData function, and also accept responses that carry code: 1:
typescript
const updateData = (res: Result<MsgDataType>) => {
let tip: string = '';
let data: MsgDataType | undefined = undefined;
const { queryColumns, queryResults, queryState, queryMode, response, chatContext, errorMsg } =
res.data || {};
setExecuteErrorMsg(errorMsg);
if (res.code === 400 || res.code === 401 || res.code === 412) {
tip = res.msg;
} else if (res.code !== 200 && res.code !== 1) { // also accept code: 1
tip = SEARCH_EXCEPTION_TIP;
} else if (queryState !== 'SUCCESS') {
tip = response && typeof response === 'string' ? response : SEARCH_EXCEPTION_TIP;
} else if (
(queryColumns && queryColumns.length > 0 && queryResults) ||
queryMode === 'WEB_PAGE' ||
queryMode === 'WEB_SERVICE' ||
queryMode === 'PLAIN_TEXT' ||
queryMode === 'WORKFLOW_CTL' // newly added
) {
data = res.data;
tip = '';
}
// ... the rest of the logic is unchanged
File 9: ExecuteItem.tsx (modified)
Path: webapp/packages/chat-sdk/src/components/ChatItem/ExecuteItem.tsx
Change 1: title prefix (around line 50)
typescript
const titlePrefix =
queryMode === 'PLAIN_TEXT' || queryMode === 'WEB_SERVICE' || queryMode === 'WORKFLOW_CTL'
? '问答'
: '数据';
Change 2: rendering logic (around line 163). Add a WORKFLOW_CTL branch that uses ReactMarkdown to render the Markdown-formatted command output:
tsx
{renderCustomExecuteNode && executeItemNode ? (
executeItemNode
) : data?.queryMode === 'PLAIN_TEXT' || data?.queryMode === 'WEB_SERVICE' ? (
data?.textResult
) : data?.queryMode === 'WORKFLOW_CTL' ? (
<ReactMarkdown>{data?.textResult ?? ''}</ReactMarkdown>
) : data?.queryMode === 'WEB_PAGE' ? (
<WebPage id={queryId!} data={data} />
) : (
<ChatMsg
isSimpleMode={isSimpleMode}
forceShowTable={showMsgContentTable}
queryId={queryId}
question={question}
data={data}
chartIndex={chartIndex}
triggerResize={triggerResize}
onMsgContentTypeChange={setMsgContentType}
/>
)}
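ReactMarkdown is already imported in this file, so no additional import is needed.
Usage and Configuration
Prerequisites
- Install dsctl (run from the root of the dsctl source checkout):
bash
pip install -e .
- Configure the environment variables:
bash
export DS_API_URL=http://your-dolphinscheduler-host:12345/dolphinscheduler
export DS_API_TOKEN=your-api-token
- Verify the installation: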
bash
dsctl doctor
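Agent Configuration Steps
- Open the SuperSonic Agent management page
- Create or edit an Agent, and click "新增工具" (add tool) on the "工具" (Tools) tab
- For the tool type, select the workflow CLI entry (displayed with the enum title, 工作流命令行)
- Fill in the configuration fields:
| Field | Description | Example |
|---|---|---|
| DS API 地址 (DS API address) | DolphinScheduler API address | http://ds-host:12345/dolphinscheduler |
| DS Token | API authentication token | your_token_here |
| dsctl 路径 (dsctl path) | Path to the dsctl executable | /usr/local/bin/dsctl |
| 默认项目 (default project) | Name of the project used by default | etl-prod |
- In the Agent's ChatApp configuration, confirm that the WORKFLOW_PARSER app (工作流命令行) is enabled and bound to an LLM model
Supported Natural-Language Instructions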
| Example user input | Corresponding dsctl command |
|---|---|
| "Run a health check" | dsctl doctor |
| "List all projects" | dsctl project list |
| "Switch to the etl-prod project" | dsctl use project etl-prod |
| "List all workflows" | dsctl workflow list |
| "Run the daily-etl workflow" | dsctl workflow run daily-etl |
| "Monitor workflow instance 123" | dsctl workflow-instance watch 123 |
| "Show the summary of workflow instance 123" | dsctl workflow-instance digest 123 |
| "List the tasks of workflow instance 123" | dsctl task-instance list --workflow-instance 123 |
| "Show the raw log of task 456" | dsctl task-instance log 456 --raw |
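Since the subcommand comes from an LLM, it may be worth checking it against the set of command groups that the integration is meant to support before running it. The sketch below is one possible guard, not something the integration above implements. `CommandGuardSketch` is hypothetical, and its allowlist is an assumption assembled from the commands that appear in this document.

```java
import java.util.Set;

// Hypothetical guard, not part of the integration described above.
class CommandGuardSketch {
    // Assumed allowlist, taken from the command groups that appear in this document.
    static final Set<String> ALLOWED_GROUPS = Set.of(
            "doctor", "use", "project", "workflow", "workflow-instance",
            "task", "task-instance");

    // Returns true when the first token of the subcommand is an allowed group.
    static boolean isAllowed(String subCommand) {
        if (subCommand == null || subCommand.isBlank()) {
            return false;
        }
        String firstToken = subCommand.trim().split("\\s+")[0];
        return ALLOWED_GROUPS.contains(firstToken);
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("workflow run daily-etl"));
        System.out.println(isAllowed("unknown"));
    }
}
```

A check like this could run at the end of parse(), so that a rejected command falls through to later parsers in the same way as an "unknown" result.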
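Summary of Extension Points
| Extension point | Interface | Role |
|---|---|---|
| Intent recognition | ChatQueryParser | Maps natural language to dsctl commands |
| Command execution | ChatQueryExecutor | Invokes the dsctl process and returns the result |
| Tool type | AgentToolType | Makes the connection parameters configurable on the Agent management page |
| Tool configuration | AgentTool subclass | Carries the dsctl path, API address, and other settings |
| LLM invocation | AiServices + ChatAppManager | Extracts the command as structured output; the prompt can be edited in the UI |
The integration leaves SuperSonic's core parsing and execution framework untouched. Apart from registering the new tool type and adjusting the frontend rendering, it only implements the SPI interfaces and registers them in spring.factories, which illustrates the pluggable design of the SPI mechanism.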