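MCP + Agent: Exploring an Approach to Automated BI Reports

While exploring scenario-based BI in the Higress community, I came across Google's open-source genai-toolbox. This post sketches an AI agent built on it that acts as a BI proxy: it breaks a reporting request into tasks and computes the metrics.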

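  1. Architecture design

Built on the genai-toolbox architecture, the automated BI reporting system consists of the following components:

  • Toolbox server: the middle layer that connects the AI agent to the data sources
  • Data sources: analytical databases such as BigQuery and PostgreSQL
  • AI agent: a report-generation agent built with LangChain or the Core SDK
  • Report toolset: predefined SQL query tools that produce the individual reports

Interaction diagram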

  2. Data source configuration

Configure the BI data sources in tools.yaml. The example below defines a BigQuery source and a PostgreSQL source:

yaml
sources:  
  bi-bigquery:  
    kind: bigquery  
    project: your-gcp-project  
    dataset: analytics_data  
    
  bi-postgres:  
    kind: postgres  
    host: ${DB_HOST}  
    port: 5432  
    database: ${BI_DATABASE}  
    user: ${DB_USER}  
    password: ${DB_PASSWORD}
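
The PostgreSQL source takes its connection settings from `${...}` placeholders, so an unset environment variable is an easy way to end up with a server that cannot connect. A minimal pre-flight sketch, assuming the placeholders follow the `${NAME}` form shown above; `missing_env_vars` is a hypothetical helper and not part of genai-toolbox:

```python
import os
import re

# Matches ${NAME} placeholders like the ones in the source config above.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def missing_env_vars(config_text, env=None):
    """Return placeholder names in config_text that are unset or empty in env."""
    env = os.environ if env is None else env
    names = sorted(set(PLACEHOLDER.findall(config_text)))
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    # Hypothetical excerpt of tools.yaml; a real check would read the file.
    sample = (
        "host: ${DB_HOST}\n"
        "database: ${BI_DATABASE}\n"
        "user: ${DB_USER}\n"
        "password: ${DB_PASSWORD}\n"
    )
    print(missing_env_vars(sample, env={"DB_HOST": "localhost", "DB_USER": "bi"}))
```

Running a check like this before starting the Toolbox server turns a late connection failure into an early, readable error.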
  3. BI report tool definitions

Create dedicated BI report tools that cover the different analytical queries:

yaml
tools:  
  # Sales trend report
  sales-trend-report:  
    kind: bigquery-sql  
    source: bi-bigquery  
    description: Generate sales trend report by date range and product category  
    parameters:  
      - name: start_date  
        type: string  
        description: Start date in YYYY-MM-DD format  
      - name: end_date  
        type: string  
        description: End date in YYYY-MM-DD format  
      - name: category  
        type: string  
        description: Product category filter (optional)  
        required: false  
    statement: |  
      SELECT   
        DATE(order_date) as date,  
        product_category,  
        SUM(revenue) as total_revenue,  
        COUNT(DISTINCT order_id) as order_count,  
        AVG(order_value) as avg_order_value  
      FROM `your-gcp-project.analytics_data.sales`
      WHERE DATE(order_date) BETWEEN @start_date AND @end_date  
        AND (@category IS NULL OR product_category = @category)  
      GROUP BY DATE(order_date), product_category  
      ORDER BY date DESC  
  
  # User behavior analysis report
  user-behavior-report:  
    kind: bigquery-sql  
    source: bi-bigquery  
    description: Analyze user behavior patterns and engagement metrics  
    parameters:  
      - name: time_period  
        type: string  
        description: Time period (7d, 30d, 90d)  
    statement: |  
      WITH user_metrics AS (  
        SELECT   
          user_id,  
          COUNT(DISTINCT session_id) as sessions,  
          SUM(page_views) as total_page_views,  
          AVG(session_duration) as avg_session_duration  
        FROM `your-gcp-project.analytics_data.user_activity`
        WHERE DATE(event_timestamp) >= DATE_SUB(CURRENT_DATE(), INTERVAL   
          CASE @time_period   
            WHEN '7d' THEN 7  
            WHEN '30d' THEN 30  
            WHEN '90d' THEN 90  
            ELSE 30  
          END DAY)  
        GROUP BY user_id  
      )  
      SELECT   
        COUNT(*) as total_users,  
        AVG(sessions) as avg_sessions_per_user,  
        AVG(total_page_views) as avg_page_views_per_user,  
        AVG(avg_session_duration) as avg_session_duration  
      FROM user_metrics  
  
  # Financial summary report
  financial-summary:  
    kind: postgres-sql  
    source: bi-postgres  
    description: Generate financial summary report  
    parameters:  
      - name: month  
        type: string  
        description: Month in YYYY-MM format  
    statement: |  
      SELECT   
        'Revenue' as metric,  
        SUM(amount) as value  
      FROM transactions   
      WHERE DATE_TRUNC('month', transaction_date) = TO_DATE($1, 'YYYY-MM')
        AND transaction_type = 'revenue'
      UNION ALL
      SELECT
        'Expenses' as metric,
        SUM(amount) as value
      FROM transactions
      WHERE DATE_TRUNC('month', transaction_date) = TO_DATE($1, 'YYYY-MM')
        AND transaction_type = 'expense'
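
The tool parameters are plain strings, so the date formats named in the descriptions (YYYY-MM-DD and YYYY-MM) are only a convention. A small client-side sketch of how a caller might check the values before invoking a tool; `parse_date_range` and `month_start` are hypothetical helpers, not part of the Toolbox SDKs:

```python
from datetime import date, datetime


def parse_date_range(start_date, end_date):
    """Parse two YYYY-MM-DD strings and check that start is not after end."""
    start = datetime.strptime(start_date, "%Y-%m-%d").date()
    end = datetime.strptime(end_date, "%Y-%m-%d").date()
    if start > end:
        raise ValueError(f"start_date {start} is after end_date {end}")
    return start, end


def month_start(month):
    """Turn a YYYY-MM string into the first day of that month."""
    return datetime.strptime(month, "%Y-%m").date()


if __name__ == "__main__":
    print(parse_date_range("2024-01-01", "2024-01-31"))
    print(month_start("2024-05"))
```

Both helpers raise `ValueError` on input that is not a valid calendar date, which is easier to report back to the agent than a database error.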
  4. Toolset organization

Group the report tools by business domain:

yaml
toolsets:  
  sales-analytics:  
    - sales-trend-report  
    - product-performance-report  
    - customer-segmentation-report  
    
  user-analytics:  
    - user-behavior-report  
    - user-retention-report  
    - engagement-metrics-report  
    
  financial-reports:  
    - financial-summary  
    - revenue-breakdown  
    - cost-analysis-report  
    
  executive-dashboard:  
    - sales-trend-report  
    - user-behavior-report  
    - financial-summary
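
The toolsets above list several tools, such as product-performance-report, that have no definition in the tools section shown earlier. A minimal sketch of a consistency check, assuming the configuration has already been loaded into plain dicts; `undefined_tools` is a hypothetical helper:

```python
def undefined_tools(tools, toolsets):
    """Map each toolset name to the entries that have no tool definition."""
    problems = {}
    for toolset_name, tool_names in toolsets.items():
        missing = [name for name in tool_names if name not in tools]
        if missing:
            problems[toolset_name] = missing
    return problems


if __name__ == "__main__":
    # Plain dicts mirroring part of the YAML above; a real check would load tools.yaml first.
    tools = {
        "sales-trend-report": {},
        "user-behavior-report": {},
        "financial-summary": {},
    }
    toolsets = {
        "sales-analytics": [
            "sales-trend-report",
            "product-performance-report",
            "customer-segmentation-report",
        ],
        "executive-dashboard": [
            "sales-trend-report",
            "user-behavior-report",
            "financial-summary",
        ],
    }
    print(undefined_tools(tools, toolsets))
```

With only the three tools defined so far, the executive-dashboard toolset is the one that resolves completely, which is also the toolset the agent loads later.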
  5. AI agent implementation

Build the automated report-generation agent with the LangChain SDK:

python
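import asyncio

from langchain.agents import AgentExecutor, create_openai_functions_agent
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_openai import ChatOpenAI
from toolbox_langchain import ToolboxClient


class BIReportAgent:
    def __init__(self, toolbox_url="http://127.0.0.1:5000"):
        self.client = ToolboxClient(toolbox_url)
        self.llm = ChatOpenAI(model="gpt-4")

    async def initialize(self):
        # Load the BI tools in the executive-dashboard toolset
        self.tools = await self.client.aload_toolset("executive-dashboard")

        # Prompt template for report generation
        self.prompt = ChatPromptTemplate.from_messages([
            ("system", """You are a professional BI analyst. You can use the following tools to generate business reports:
            - Sales trend analysis
            - User behavior analysis
            - Financial summary report

            Pick the right tool for the user's request and generate the report.
            Provide professional analysis and insights on the returned data."""),
            ("user", "{input}"),
            # Intermediate tool calls are passed back to the model as a list of messages
            MessagesPlaceholder("agent_scratchpad"),
        ])

        # Build the agent, then wrap it in an executor that actually runs the tool calls
        agent = create_openai_functions_agent(
            llm=self.llm,
            tools=self.tools,
            prompt=self.prompt,
        )
        self.agent = AgentExecutor(agent=agent, tools=self.tools)

    async def generate_report(self, request: str):
        """Generate a report from a natural-language request."""
        response = await self.agent.ainvoke({"input": request})
        return response["output"]


# Usage example
async def main():
    agent = BIReportAgent()
    await agent.initialize()

    # Generate the reports automatically
    reports = [
        "Generate last month's sales trend report",
        "Analyze user behavior data for the last 30 days",
        "Provide this month's financial summary report",
    ]

    for request in reports:
        result = await agent.generate_report(request)
        print(f"Request: {request}")
        print(f"Result: {result}")
        print("-" * 50)


if __name__ == "__main__":
    asyncio.run(main())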
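
Requests phrased relative to today, such as last month's sales trend report, leave the model to guess the actual dates, while the sales tool expects explicit YYYY-MM-DD values. One option is to resolve the range in code and append it to the request. A minimal sketch; `previous_month_range` and `with_date_context` are hypothetical helpers, not part of LangChain or the Toolbox SDK:

```python
from datetime import date, timedelta


def previous_month_range(today):
    """Return (first_day, last_day) of the month before today as ISO strings."""
    first_of_this_month = today.replace(day=1)
    last_of_prev = first_of_this_month - timedelta(days=1)
    first_of_prev = last_of_prev.replace(day=1)
    return first_of_prev.isoformat(), last_of_prev.isoformat()


def with_date_context(request, today):
    """Append today's date and last month's range to a natural-language request."""
    start, end = previous_month_range(today)
    return f"{request}\n(Today is {today.isoformat()}; last month ran from {start} to {end}.)"


if __name__ == "__main__":
    print(with_date_context("Generate last month's sales trend report", date.today()))
```

The enriched string can be passed to `generate_report` in place of the bare request, so the start_date and end_date parameters no longer depend on what the model assumes the current date to be.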
  6. Scheduled report generation

python
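import asyncio
import os
import time
from datetime import datetime, timedelta

import schedule


class ScheduledReportGenerator:
    def __init__(self, agent: BIReportAgent):
        self.agent = agent

    def setup_schedules(self):
        # Daily sales report
        schedule.every().day.at("09:00").do(self.run_job, self.generate_daily_sales_report)

        # Weekly user analysis report
        schedule.every().monday.at("10:00").do(self.run_job, self.generate_weekly_user_report)

        # Monthly financial report. The schedule library has no monthly unit,
        # so check every day and only run on the first day of the month.
        schedule.every().day.at("08:00").do(self.run_monthly_job)

    def run_job(self, job):
        # schedule calls plain functions, so run each coroutine to completion here
        asyncio.run(job())

    def run_monthly_job(self):
        if datetime.now().day == 1:
            self.run_job(self.generate_monthly_financial_report)

    async def generate_daily_sales_report(self):
        yesterday = (datetime.now() - timedelta(days=1)).strftime("%Y-%m-%d")
        request = f"Generate the sales report for {yesterday}"
        result = await self.agent.generate_report(request)
        # Send by email or save to a file
        self.save_report("daily_sales", result)

    async def generate_weekly_user_report(self):
        request = "Analyze user behavior data for the last 7 days"
        result = await self.agent.generate_report(request)
        self.save_report("weekly_user", result)

    async def generate_monthly_financial_report(self):
        # The job runs on the 1st, so report on the month that just ended
        last_month = (datetime.now().replace(day=1) - timedelta(days=1)).strftime("%Y-%m")
        request = f"Provide the financial summary report for {last_month}"
        result = await self.agent.generate_report(request)
        self.save_report("monthly_financial", result)

    def save_report(self, report_type: str, content):
        os.makedirs("reports", exist_ok=True)
        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
        filename = f"reports/{report_type}_{timestamp}.txt"
        with open(filename, "w", encoding="utf-8") as f:
            f.write(str(content))

    def run_forever(self):
        # Register the jobs, then poll once a minute for anything that is due
        self.setup_schedules()
        while True:
            schedule.run_pending()
            time.sleep(60)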
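
The schedule library calls plain functions, while the report methods here are coroutines. An alternative is to keep everything inside one asyncio event loop and sleep until the next run time. A minimal standard-library sketch of that idea; `next_run` and `run_daily` are hypothetical helpers, and `job` stands for any coroutine function such as `generate_daily_sales_report`:

```python
import asyncio
from datetime import datetime, timedelta


def next_run(now, hour, minute=0):
    """Return the next datetime at hour:minute that is strictly after now."""
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


async def run_daily(job, hour, minute=0):
    """Await job() once a day at hour:minute, forever."""
    while True:
        now = datetime.now()
        delay = (next_run(now, hour, minute) - now).total_seconds()
        await asyncio.sleep(delay)
        await job()


if __name__ == "__main__":
    print(next_run(datetime.now(), 9))
```

Several `run_daily` coroutines can be combined with `asyncio.gather`, and a monthly report can reuse the same loop with a check on `datetime.now().day` inside the job.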