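MCP + Agent: Exploring an Implementation Approach for Automated BI Reports

While exploring how to put scenario-based BI into practice in the Higress community, I came across Google's open-source genai-toolbox. This post explores an AI agent, built around a BI agent, that decomposes tasks and computes metrics.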

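  1. Architecture design

Based on the genai-toolbox architecture, the automated BI report system consists of the following components:

  • Toolbox server: the middle layer that connects the AI agent to the data sources
  • Data sources: analytical databases such as BigQuery and PostgreSQL
  • AI agent: a report-generation agent built with LangChain or the Core SDK
  • Report toolset: predefined SQL query tools for generating each kind of report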
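
To make the layering concrete, here is a minimal sketch of the idea, not genai-toolbox's actual API. `ToyToolbox`, `build_demo_toolbox` and the `daily-revenue` tool are hypothetical names, and an in-memory SQLite database stands in for the analytical data source. The point it illustrates is that the agent side only chooses a tool name and fills in parameters, while the SQL and the connection stay inside the middle layer.

```python
import sqlite3


class ToyToolbox:
    """Hypothetical stand-in for the Toolbox layer: owns the connection and the SQL."""

    def __init__(self, conn, tools):
        self._conn = conn
        self._tools = tools  # tool name -> parameterized SQL

    def list_tools(self):
        return sorted(self._tools)

    def invoke(self, name, **params):
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        # Named parameters are bound by the driver, never spliced into the SQL text
        return self._conn.execute(self._tools[name], params).fetchall()


def build_demo_toolbox():
    # In-memory SQLite plays the role of the analytical data source
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (order_date TEXT, revenue REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("2024-05-01", 100.0), ("2024-05-01", 50.0), ("2024-05-02", 70.0)],
    )
    tools = {
        "daily-revenue": (
            "SELECT order_date, SUM(revenue) FROM sales "
            "WHERE order_date BETWEEN :start_date AND :end_date "
            "GROUP BY order_date ORDER BY order_date"
        )
    }
    return ToyToolbox(conn, tools)


# The "agent" side: pick a tool by name and supply parameters only
toolbox = build_demo_toolbox()
rows = toolbox.invoke("daily-revenue", start_date="2024-05-01", end_date="2024-05-02")
print(rows)
```

In the real setup the Toolbox server plays the `ToyToolbox` role over the network, and the tool definitions live in tools.yaml rather than in a Python dict.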

Interaction diagram

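  1. Data source configuration

Configure the BI data sources in tools.yaml. The example takes BigQuery as the main source and adds a PostgreSQL source alongside it:

yaml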
sources:  
  bi-bigquery:  
    kind: bigquery  
    project: your-gcp-project  
    dataset: analytics_data  
    
  bi-postgres:  
    kind: postgres  
    host: ${DB_HOST}  
    port: 5432  
    database: ${BI_DATABASE}  
    user: ${DB_USER}  
    password: ${DB_PASSWORD}
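
The PostgreSQL source above takes its connection details from `${...}` placeholders, which are expected to be filled from environment variables. A small pre-flight check can catch a missing variable before the server is started. The following is a sketch under that assumption; `missing_env_vars` is a hypothetical helper, not part of genai-toolbox.

```python
import os
import re

# Matches ${NAME} placeholders such as ${DB_HOST}
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def missing_env_vars(config_text, env=None):
    """Return the ${VAR} names used in config_text that are unset or empty in env."""
    env = os.environ if env is None else env
    names = set(PLACEHOLDER.findall(config_text))
    return sorted(name for name in names if not env.get(name))


sample = """
sources:
  bi-postgres:
    kind: postgres
    host: ${DB_HOST}
    port: 5432
    database: ${BI_DATABASE}
    user: ${DB_USER}
    password: ${DB_PASSWORD}
"""

# Pretend only two of the four variables are set
missing = missing_env_vars(sample, env={"DB_HOST": "localhost", "DB_USER": "bi"})
print(missing)
```

In practice the text would be read from the real tools.yaml and `env` left at its default so that `os.environ` is checked.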

  1. BI report tool definitions

Create dedicated BI report tools that support a range of analytical queries:

yaml
tools:
  # Sales trend report
  sales-trend-report:  
    kind: bigquery-sql  
    source: bi-bigquery  
    description: Generate sales trend report by date range and product category  
    parameters:  
      - name: start_date  
        type: string  
        description: Start date in YYYY-MM-DD format  
      - name: end_date  
        type: string  
        description: End date in YYYY-MM-DD format  
      - name: category  
        type: string  
        description: Product category filter (optional)  
        required: false  
    statement: |  
      SELECT   
        DATE(order_date) as date,  
        product_category,  
        SUM(revenue) as total_revenue,  
        COUNT(DISTINCT order_id) as order_count,  
        AVG(order_value) as avg_order_value  
      FROM `your-project.analytics_data.sales`  
      WHERE DATE(order_date) BETWEEN @start_date AND @end_date  
        AND (@category IS NULL OR product_category = @category)  
      GROUP BY DATE(order_date), product_category  
      ORDER BY date DESC  
  
  # User behavior analysis report
  user-behavior-report:  
    kind: bigquery-sql  
    source: bi-bigquery  
    description: Analyze user behavior patterns and engagement metrics  
    parameters:  
      - name: time_period  
        type: string  
        description: Time period (7d, 30d, 90d)  
    statement: |  
      WITH user_metrics AS (  
        SELECT   
          user_id,  
          COUNT(DISTINCT session_id) as sessions,  
          SUM(page_views) as total_page_views,  
          AVG(session_duration) as avg_session_duration  
        FROM `your-project.analytics_data.user_activity`  
        WHERE DATE(event_timestamp) >= DATE_SUB(CURRENT_DATE(), INTERVAL   
          CASE @time_period   
            WHEN '7d' THEN 7  
            WHEN '30d' THEN 30  
            WHEN '90d' THEN 90  
            ELSE 30  
          END DAY)  
        GROUP BY user_id  
      )  
      SELECT   
        COUNT(*) as total_users,  
        AVG(sessions) as avg_sessions_per_user,  
        AVG(total_page_views) as avg_page_views_per_user,  
        AVG(avg_session_duration) as avg_session_duration  
      FROM user_metrics  
  
  # Financial summary report
  financial-summary:
    kind: postgres-sql
    source: bi-postgres
    description: Generate financial summary report
    parameters:
      - name: month
        type: string
        description: Month in YYYY-MM format
    # The month arrives as 'YYYY-MM', which cannot be cast to date directly;
    # TO_DATE maps it to the first day of that month.
    statement: |
      SELECT
        'Revenue' as metric,
        SUM(amount) as value
      FROM transactions
      WHERE DATE_TRUNC('month', transaction_date) = TO_DATE($1, 'YYYY-MM')
        AND transaction_type = 'revenue'
      UNION ALL
      SELECT
        'Expenses' as metric,
        SUM(amount) as value
      FROM transactions
      WHERE DATE_TRUNC('month', transaction_date) = TO_DATE($1, 'YYYY-MM')
        AND transaction_type = 'expense'
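
The sales trend tool makes its category filter optional with the pattern `(@category IS NULL OR product_category = @category)`. The sketch below reproduces that pattern locally so its behavior is easy to see. It uses an in-memory SQLite table with made-up rows and SQLite's `:name` parameter syntax instead of BigQuery's `@name`; the `sales_by_category` helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (order_date TEXT, product_category TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("2024-05-01", "books", 20.0),
        ("2024-05-01", "toys", 35.0),
        ("2024-05-02", "books", 15.0),
    ],
)

# When :category is NULL the second condition is always true, so no filter applies
QUERY = """
SELECT product_category, SUM(revenue) AS total_revenue
FROM sales
WHERE order_date BETWEEN :start_date AND :end_date
  AND (:category IS NULL OR product_category = :category)
GROUP BY product_category
ORDER BY product_category
"""


def sales_by_category(start_date, end_date, category=None):
    params = {"start_date": start_date, "end_date": end_date, "category": category}
    return conn.execute(QUERY, params).fetchall()


all_rows = sales_by_category("2024-05-01", "2024-05-02")
books_only = sales_by_category("2024-05-01", "2024-05-02", category="books")
print(all_rows)
print(books_only)
```

Omitting the category returns every category in the date range, while passing one narrows the result to that category, which is what lets a single tool serve both kinds of request.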

  1. Toolset organization

Group the report tools by business domain:

yaml
toolsets:  
  sales-analytics:  
    - sales-trend-report  
    - product-performance-report  
    - customer-segmentation-report  
    
  user-analytics:  
    - user-behavior-report  
    - user-retention-report  
    - engagement-metrics-report  
    
  financial-reports:  
    - financial-summary  
    - revenue-breakdown  
    - cost-analysis-report  
    
  executive-dashboard:  
    - sales-trend-report  
    - user-behavior-report  
    - financial-summary
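
The toolsets above reference several tools, such as product-performance-report and user-retention-report, that this article does not define; they would need their own entries under `tools:` before the configuration is complete. A quick consistency check along the following lines can list the gaps. It is a sketch that works on plain dictionaries (as a YAML parser would produce), and `undefined_tools` is a hypothetical helper.

```python
def undefined_tools(tools, toolsets):
    """Map each toolset name to the tools it references that are not defined."""
    defined = set(tools)
    problems = {}
    for toolset_name, members in toolsets.items():
        missing = [name for name in members if name not in defined]
        if missing:
            problems[toolset_name] = missing
    return problems


# The three tools defined in this article
tools = {"sales-trend-report": {}, "user-behavior-report": {}, "financial-summary": {}}

# Two of the toolsets from the configuration above
toolsets = {
    "sales-analytics": [
        "sales-trend-report",
        "product-performance-report",
        "customer-segmentation-report",
    ],
    "executive-dashboard": [
        "sales-trend-report",
        "user-behavior-report",
        "financial-summary",
    ],
}

problems = undefined_tools(tools, toolsets)
print(problems)
```

Only executive-dashboard is fully backed by the tools defined here, which is why the agent in the next section loads that toolset.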

  1. AI agent implementation

Build the automated report-generation agent with the LangChain SDK:

python
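import asyncio

from langchain.agents import AgentExecutor, create_openai_functions_agent
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_openai import ChatOpenAI
from toolbox_langchain import ToolboxClient


class BIReportAgent:
    def __init__(self, toolbox_url="http://127.0.0.1:5000"):
        self.client = ToolboxClient(toolbox_url)
        self.llm = ChatOpenAI(model="gpt-4")

    async def initialize(self):
        # Load the BI tools (here, the executive-dashboard toolset)
        self.tools = self.client.load_toolset("executive-dashboard")

        # Prompt template for report generation
        self.prompt = ChatPromptTemplate.from_messages([
            ("system", """You are a professional BI analyst. You can use the following tools to generate business reports:
            - Sales trend analysis
            - User behavior analysis
            - Financial summary report

            Pick the right tool for the user's request and generate the corresponding report.
            Provide professional analysis and insights on the resulting data."""),
            ("user", "{input}"),
            # The agent's intermediate tool calls and results are injected here
            MessagesPlaceholder(variable_name="agent_scratchpad"),
        ])

        # Create the agent, then wrap it in an executor that actually runs the tool calls
        agent = create_openai_functions_agent(
            llm=self.llm,
            tools=self.tools,
            prompt=self.prompt,
        )
        self.agent = AgentExecutor(agent=agent, tools=self.tools)

    async def generate_report(self, request: str) -> str:
        """Generate a report from a natural-language request."""
        response = await self.agent.ainvoke({"input": request})
        return response["output"]


# Usage example
async def main():
    agent = BIReportAgent()
    await agent.initialize()

    # Generate the reports automatically
    reports = [
        "Generate last month's sales trend report",
        "Analyze user behavior over the last 30 days",
        "Provide this month's financial summary report",
    ]

    for request in reports:
        result = await agent.generate_report(request)
        print(f"Request: {request}")
        print(f"Result: {result}")
        print("-" * 50)


if __name__ == "__main__":
    asyncio.run(main())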
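
Requests such as "last month" leave the model to work out the `start_date` and `end_date` values that the sales trend tool expects in YYYY-MM-DD format. One optional way to reduce that guesswork is to resolve the range in code and put explicit dates into the request. The helper below is a sketch of that idea; `last_month_range` is a hypothetical name and is not part of the agent above.

```python
from datetime import date, timedelta


def last_month_range(today):
    """Return (first_day, last_day) of the month before `today` as YYYY-MM-DD strings."""
    first_of_this_month = today.replace(day=1)
    last_day = first_of_this_month - timedelta(days=1)  # last day of the previous month
    first_day = last_day.replace(day=1)
    return first_day.isoformat(), last_day.isoformat()


# Fixed date for illustration; in practice this would be date.today()
start, end = last_month_range(date(2024, 3, 15))
request = f"Generate the sales trend report from {start} to {end}"
print(request)
```

The resulting string can be passed to the agent's report request as-is, so the model only has to copy the two dates into the tool parameters.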

  1. Scheduled report generation

python
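import asyncio
import os
import time
from datetime import datetime, timedelta

import schedule

# Assumes BIReportAgent from the previous section is defined in, or imported into, this module.


class ScheduledReportGenerator:
    def __init__(self, agent: BIReportAgent):
        self.agent = agent
        # schedule only calls plain functions, so the async jobs are driven
        # on one long-lived event loop owned by the scheduler.
        self.loop = asyncio.new_event_loop()

    def _run(self, job):
        self.loop.run_until_complete(job())

    def _run_monthly(self):
        # schedule has no monthly unit: check every day, run only on the 1st
        if datetime.now().day == 1:
            self._run(self.generate_monthly_financial_report)

    def setup_schedules(self):
        # Daily sales report
        schedule.every().day.at("09:00").do(self._run, self.generate_daily_sales_report)

        # Weekly user analytics report
        schedule.every().monday.at("10:00").do(self._run, self.generate_weekly_user_report)

        # Monthly financial report (the 08:00 time is an arbitrary choice)
        schedule.every().day.at("08:00").do(self._run_monthly)

    async def generate_daily_sales_report(self):
        yesterday = (datetime.now() - timedelta(days=1)).strftime("%Y-%m-%d")
        request = f"Generate the sales report for {yesterday}"
        result = await self.agent.generate_report(request)
        # Send by email or save to a file
        self.save_report("daily_sales", result)

    async def generate_weekly_user_report(self):
        result = await self.agent.generate_report("Analyze user behavior over the last 7 days")
        self.save_report("weekly_user", result)

    async def generate_monthly_financial_report(self):
        # Runs on the 1st, so report on the month that just ended
        last_month = (datetime.now().replace(day=1) - timedelta(days=1)).strftime("%Y-%m")
        result = await self.agent.generate_report(f"Provide the financial summary for {last_month}")
        self.save_report("monthly_financial", result)

    def save_report(self, report_type: str, content: str):
        os.makedirs("reports", exist_ok=True)
        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
        filename = f"reports/{report_type}_{timestamp}.txt"
        with open(filename, "w", encoding="utf-8") as f:
            f.write(str(content))

    def run_forever(self):
        # Register the jobs, then poll for due ones
        self.setup_schedules()
        while True:
            schedule.run_pending()
            time.sleep(30)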
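
The scheduler code mentions sending the report by email as an alternative to saving it to a file. A minimal sketch of that path using only the standard library is shown below. `build_report_email`, the addresses and the SMTP host are hypothetical placeholders, and the actual send is left as a comment because it needs a reachable mail server.

```python
from email.message import EmailMessage


def build_report_email(report_type, content, sender, recipients):
    """Wrap a generated report in a plain-text email message."""
    msg = EmailMessage()
    msg["Subject"] = f"[BI report] {report_type}"
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg.set_content(content)
    return msg


msg = build_report_email(
    "daily_sales",
    "Total revenue: 150.0",
    sender="bi-bot@example.com",
    recipients=["team@example.com"],
)
print(msg["Subject"])

# Sending would look roughly like this, given a reachable SMTP server:
#   import smtplib
#   with smtplib.SMTP("smtp.example.com") as smtp:
#       smtp.send_message(msg)
```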