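We are currently exploring scenario-based BI in the Higress community and came across Google's open-source genai-toolbox. Building on it, we have worked out an AI agent that uses a BI proxy to decompose tasks and compute metrics.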
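- Architecture design

Based on the genai-toolbox architecture, the automated BI reporting system consists of the following components:
- Toolbox server: the middle layer that connects the AI agent to the data sources
- Data sources: analytical databases such as BigQuery and PostgreSQL
- AI agent: a report-generation agent built with LangChain or the Core SDK
- Report toolset: predefined SQL query tools for generating each type of report
Interaction diagram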
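
The division of labour among these components can be illustrated with a toy, in-process model. This is only a sketch and not how genai-toolbox is implemented: `MiniToolbox`, `Tool`, `register`, `invoke`, and the `sales-total` tool are hypothetical names, and an in-memory SQLite database stands in for BigQuery or PostgreSQL. The point it tries to show is that the agent only names a tool and supplies parameters, while the SQL and the connection stay in the middle layer.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Tool:
    description: str
    statement: str  # parameterized SQL owned by the middle layer


class MiniToolbox:
    """Hypothetical stand-in for the Toolbox server (illustration only)."""

    def __init__(self, conn):
        self.conn = conn
        self.tools = {}

    def register(self, name, tool):
        self.tools[name] = tool

    def invoke(self, name, params):
        # The caller never sees or edits the SQL; it only supplies parameters.
        tool = self.tools[name]
        return self.conn.execute(tool.statement, params).fetchall()


# Stand-in data source: an in-memory SQLite table instead of a real warehouse
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (order_date TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("2024-05-01", 100.0), ("2024-05-02", 50.0), ("2024-06-01", 70.0)],
)

toolbox = MiniToolbox(conn)
toolbox.register(
    "sales-total",
    Tool(
        description="Total revenue in a date range",
        statement=(
            "SELECT SUM(revenue) FROM sales "
            "WHERE order_date BETWEEN :start_date AND :end_date"
        ),
    ),
)

# What the agent side does: pick a tool by name and pass parameters
rows = toolbox.invoke(
    "sales-total", {"start_date": "2024-05-01", "end_date": "2024-05-31"}
)
print(rows)  # [(150.0,)]
```
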

- Data source configuration
Configure the BI data sources in tools.yaml. The example defines a BigQuery source and a PostgreSQL source:
yaml
sources:
  bi-bigquery:
    kind: bigquery
    project: your-gcp-project
    dataset: analytics_data
  bi-postgres:
    kind: postgres
    host: ${DB_HOST}
    port: 5432
    database: ${BI_DATABASE}
    user: ${DB_USER}
    password: ${DB_PASSWORD}
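
The `${DB_HOST}`-style placeholders are filled from environment variables, so a missing variable tends to surface only when the server starts. A minimal pre-flight check might look like the sketch below; the helper `missing_env_vars` and the sample environment are hypothetical and only illustrate the idea.

```python
import os

# Variables referenced by the bi-postgres source above
REQUIRED_VARS = ["DB_HOST", "BI_DATABASE", "DB_USER", "DB_PASSWORD"]


def missing_env_vars(required, environ=None):
    """Return the names in `required` that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


# Demonstration with an explicit mapping instead of the real environment
example_env = {"DB_HOST": "localhost", "DB_USER": "bi", "DB_PASSWORD": ""}
print(missing_env_vars(REQUIRED_VARS, example_env))
# ['BI_DATABASE', 'DB_PASSWORD']
```

Calling `missing_env_vars(REQUIRED_VARS)` with no second argument checks the real process environment.
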
- BI report tool definitions
Create dedicated BI report tools that cover the various analytical queries:
yaml
tools:
  # Sales trend report
  sales-trend-report:
    kind: bigquery-sql
    source: bi-bigquery
    description: Generate sales trend report by date range and product category
    parameters:
      - name: start_date
        type: string
        description: Start date in YYYY-MM-DD format
      - name: end_date
        type: string
        description: End date in YYYY-MM-DD format
      - name: category
        type: string
        description: Product category filter (optional)
        required: false
    statement: |
      SELECT
        DATE(order_date) as date,
        product_category,
        SUM(revenue) as total_revenue,
        COUNT(DISTINCT order_id) as order_count,
        AVG(order_value) as avg_order_value
      FROM `your-project.analytics_data.sales`
      WHERE DATE(order_date) BETWEEN @start_date AND @end_date
        AND (@category IS NULL OR product_category = @category)
      GROUP BY DATE(order_date), product_category
      ORDER BY date DESC
  # User behavior analysis report
  user-behavior-report:
    kind: bigquery-sql
    source: bi-bigquery
    description: Analyze user behavior patterns and engagement metrics
    parameters:
      - name: time_period
        type: string
        description: Time period (7d, 30d, 90d)
    statement: |
      WITH user_metrics AS (
        SELECT
          user_id,
          COUNT(DISTINCT session_id) as sessions,
          SUM(page_views) as total_page_views,
          AVG(session_duration) as avg_session_duration
        FROM `your-project.analytics_data.user_activity`
        WHERE DATE(event_timestamp) >= DATE_SUB(CURRENT_DATE(), INTERVAL
          CASE @time_period
            WHEN '7d' THEN 7
            WHEN '30d' THEN 30
            WHEN '90d' THEN 90
            ELSE 30
          END DAY)
        GROUP BY user_id
      )
      SELECT
        COUNT(*) as total_users,
        AVG(sessions) as avg_sessions_per_user,
        AVG(total_page_views) as avg_page_views_per_user,
        AVG(avg_session_duration) as avg_session_duration
      FROM user_metrics
  # Financial summary report
  financial-summary:
    kind: postgres-sql
    source: bi-postgres
    description: Generate financial summary report
    parameters:
      - name: month
        type: string
        description: Month in YYYY-MM format
    statement: |
      -- $1 is a YYYY-MM string; a plain $1::date cast rejects it, so parse it with TO_DATE
      SELECT
        'Revenue' as metric,
        SUM(amount) as value
      FROM transactions
      WHERE DATE_TRUNC('month', transaction_date) = TO_DATE($1, 'YYYY-MM')
        AND transaction_type = 'revenue'
      UNION ALL
      SELECT
        'Expenses' as metric,
        SUM(amount) as value
      FROM transactions
      WHERE DATE_TRUNC('month', transaction_date) = TO_DATE($1, 'YYYY-MM')
        AND transaction_type = 'expense'
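
The optional `category` filter in the sales tool relies on the `(@category IS NULL OR product_category = @category)` pattern. The sketch below reproduces that pattern against an in-memory SQLite table so it can be tried locally; the table contents and the helper `revenue_by_category` are made up for illustration, and SQLite's `:category` placeholder stands in for BigQuery's `@category`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product_category TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("books", 10.0), ("books", 5.0), ("toys", 20.0)],
)

QUERY = """
SELECT product_category, SUM(revenue) AS total_revenue
FROM sales
WHERE (:category IS NULL OR product_category = :category)
GROUP BY product_category
ORDER BY product_category
"""


def revenue_by_category(category=None):
    # Binding None sends NULL, which makes the first branch true
    # and therefore disables the filter.
    return conn.execute(QUERY, {"category": category}).fetchall()


print(revenue_by_category())         # [('books', 15.0), ('toys', 20.0)]
print(revenue_by_category("books"))  # [('books', 15.0)]
```

The benefit of this pattern is that one statement serves both the filtered and the unfiltered report, so the tool definition does not need two variants.
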
- Toolset organization
Group the report tools by business domain:
yaml
toolsets:
  sales-analytics:
    - sales-trend-report
    - product-performance-report
    - customer-segmentation-report
  user-analytics:
    - user-behavior-report
    - user-retention-report
    - engagement-metrics-report
  financial-reports:
    - financial-summary
    - revenue-breakdown
    - cost-analysis-report
  executive-dashboard:
    - sales-trend-report
    - user-behavior-report
    - financial-summary
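
Several toolsets above list tools (for example `product-performance-report`) that are not defined in the tools section of this post, so they would have to be added before the configuration is complete. A small consistency check could catch this early. The sketch below works on plain Python structures rather than parsing tools.yaml, and `undefined_tools` is a hypothetical helper name.

```python
def undefined_tools(tools, toolsets):
    """Map each toolset name to the tool names it lists that are not defined."""
    problems = {}
    for toolset_name, tool_names in toolsets.items():
        missing = [name for name in tool_names if name not in tools]
        if missing:
            problems[toolset_name] = missing
    return problems


# Tool names actually defined in the tools section above
defined = {"sales-trend-report", "user-behavior-report", "financial-summary"}

toolsets = {
    "sales-analytics": [
        "sales-trend-report",
        "product-performance-report",
        "customer-segmentation-report",
    ],
    "user-analytics": [
        "user-behavior-report",
        "user-retention-report",
        "engagement-metrics-report",
    ],
    "financial-reports": [
        "financial-summary",
        "revenue-breakdown",
        "cost-analysis-report",
    ],
    "executive-dashboard": [
        "sales-trend-report",
        "user-behavior-report",
        "financial-summary",
    ],
}

for toolset_name, missing in undefined_tools(defined, toolsets).items():
    print(f"{toolset_name}: {', '.join(missing)}")
```

Only `executive-dashboard` passes the check as written, which is why the agent in this post loads that toolset.
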
- AI agent implementation
Build the automated report-generation agent with the LangChain SDK:
python
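import asyncio

from langchain.agents import AgentExecutor, create_openai_functions_agent
from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_openai import ChatOpenAI
from toolbox_langchain import ToolboxClient


class BIReportAgent:
    def __init__(self, toolbox_url="http://127.0.0.1:5000"):
        self.client = ToolboxClient(toolbox_url)
        self.llm = ChatOpenAI(model="gpt-4")

    async def initialize(self):
        # Load the BI tools in the executive-dashboard toolset
        self.tools = self.client.load_toolset("executive-dashboard")
        # Prompt template for report generation
        self.prompt = ChatPromptTemplate.from_messages([
            ("system", """You are a professional BI analyst. You can use the following tools to generate business reports:
- Sales trend analysis
- User behavior analysis
- Financial summary report
Choose the appropriate tool for the user's request and generate the corresponding report.
Provide professional analysis and insights on the resulting data."""),
            ("user", "{input}"),
            # The agent's intermediate tool calls are injected here as messages
            MessagesPlaceholder("agent_scratchpad"),
        ])
        # Build the agent, then wrap it in an executor that actually runs the tool calls
        agent = create_openai_functions_agent(
            llm=self.llm,
            tools=self.tools,
            prompt=self.prompt,
        )
        self.agent = AgentExecutor(agent=agent, tools=self.tools)

    async def generate_report(self, request: str) -> str:
        """Generate a report from a natural-language request."""
        response = await self.agent.ainvoke({"input": request})
        # The executor returns a dict; the final answer is under "output"
        return response["output"]


# Usage example
async def main():
    agent = BIReportAgent()
    await agent.initialize()
    # Generate reports automatically
    reports = [
        "Generate last month's sales trend report",
        "Analyze user behavior data for the last 30 days",
        "Provide this month's financial summary report",
    ]
    for request in reports:
        result = await agent.generate_report(request)
        print(f"Request: {request}")
        print(f"Result: {result}")
        print("-" * 50)


if __name__ == "__main__":
    asyncio.run(main())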
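
Requests phrased as "last month" leave the date arithmetic to the model, while the sales tool expects explicit `start_date` and `end_date` values in YYYY-MM-DD format. One option, sketched here with hypothetical helpers `last_month_range` and `sales_trend_request`, is to resolve the period in code and put concrete dates into the request text.

```python
from datetime import date, timedelta


def last_month_range(today):
    """Return (start_date, end_date) of the previous calendar month as YYYY-MM-DD strings."""
    first_of_this_month = today.replace(day=1)
    end = first_of_this_month - timedelta(days=1)  # last day of the previous month
    start = end.replace(day=1)
    return start.isoformat(), end.isoformat()


def sales_trend_request(today):
    """Build a request string with explicit dates for the agent."""
    start, end = last_month_range(today)
    return f"Generate the sales trend report from {start} to {end}"


print(sales_trend_request(date(2024, 3, 15)))
# Generate the sales trend report from 2024-02-01 to 2024-02-29
```

Subtracting one day from the first of the current month avoids any per-month length table and handles leap years and the January-to-December rollover.
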
- Scheduled report generation
python
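import asyncio
import os
from datetime import datetime, timedelta

import schedule


class ScheduledReportGenerator:
    def __init__(self, agent: BIReportAgent):
        self.agent = agent
        self._tasks = set()  # keep references so running jobs are not garbage-collected

    def _submit(self, job):
        # schedule calls plain functions, so each coroutine is handed to the running event loop
        def start():
            task = asyncio.create_task(job())
            self._tasks.add(task)
            task.add_done_callback(self._tasks.discard)
        return start

    def setup_schedules(self):
        # Daily sales report
        schedule.every().day.at("09:00").do(self._submit(self.generate_daily_sales_report))
        # Weekly user analytics report
        schedule.every().monday.at("10:00").do(self._submit(self.generate_weekly_user_report))
        # Monthly financial report: schedule has no monthly unit, so check daily
        # and only run on the 1st (the 08:00 slot is an arbitrary choice)
        schedule.every().day.at("08:00").do(self._submit(self.generate_monthly_financial_report))

    async def run_forever(self):
        # Must run inside the event loop so that create_task has a loop to attach to
        while True:
            schedule.run_pending()
            await asyncio.sleep(30)

    async def generate_daily_sales_report(self):
        yesterday = (datetime.now() - timedelta(days=1)).strftime("%Y-%m-%d")
        request = f"Generate the sales data report for {yesterday}"
        result = await self.agent.generate_report(request)
        # Send by email or save to a file
        self.save_report("daily_sales", result)

    async def generate_weekly_user_report(self):
        result = await self.agent.generate_report("Analyze user behavior data for the last 7 days")
        self.save_report("weekly_user", result)

    async def generate_monthly_financial_report(self):
        if datetime.now().day != 1:
            return
        last_month = (datetime.now().replace(day=1) - timedelta(days=1)).strftime("%Y-%m")
        result = await self.agent.generate_report(f"Provide the financial summary report for {last_month}")
        self.save_report("monthly_financial", result)

    def save_report(self, report_type: str, content: str):
        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
        os.makedirs("reports", exist_ok=True)
        filename = f"reports/{report_type}_{timestamp}.txt"
        with open(filename, 'w', encoding='utf-8') as f:
            f.write(str(content))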
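
Monthly jobs need a little date logic of their own, since, as far as I know, the `schedule` package only offers units up to weeks. The stdlib-only sketch below shows one way to compute the next run time and to decide whether a run falls on the first day of a month. `next_run` and `is_first_of_month` are hypothetical helper names, not part of any library.

```python
from datetime import datetime, timedelta


def next_run(now, hour, minute=0):
    """Return the next datetime at hour:minute that is strictly after `now`."""
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        # Today's slot has already passed, so move to tomorrow
        candidate += timedelta(days=1)
    return candidate


def is_first_of_month(moment):
    """True when `moment` falls on day 1, i.e. when a monthly report is due."""
    return moment.day == 1


now = datetime(2024, 5, 31, 9, 30)
run_at = next_run(now, hour=9)
print(run_at, is_first_of_month(run_at))
# 2024-06-01 09:00:00 True
```

A scheduler loop could sleep for `(next_run(now, 9) - now).total_seconds()` and then call the monthly job only when `is_first_of_month` returns true.
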