A Deep Dive into AI Agent Architecture on Android

From Intent Recognition to Security Isolation: A Complete Guide to Building Production-Grade Agentic AI Capabilities

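1. Background: Why Android Needs AI Agents

On February 26, 2026, at the Samsung Galaxy S26 launch event, Google officially previewed Android 17 and announced that Android would be upgraded from a traditional "operating system" to an **"Intelligent System"**. The Gemini 3 family of models is deeply embedded in the lower layers of the system and can understand user intent, ask follow-up questions, generate suggestions, and carry out instructions.

This means Android app development has formally entered the era of agentic AI: AI no longer just answers questions passively but can plan tasks on its own, call tools, and coordinate multiple apps to accomplish complex goals.

The core change in one sentence: the user says "get me a ride to the airport", and the agent recognizes the intent → checks the flight time → opens the ride-hailing app → fills in the destination → confirms the order.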

2. Core Technical Architecture of an AI Agent

A complete Android AI agent system consists of four core stages:

┌─────────────────────────────────────────────────────────────────┐
│         User input (natural language / voice / gesture)         │
└──────────────────────────┬──────────────────────────────────────┘
                           ▼
               ┌───────────────────────┐
               │ ① Intent Recognition  │  NLU / Gemini semantic understanding
               │   module              │  Multimodal input parsing
               └───────────┬───────────┘
                           ▼
               ┌───────────────────────┐
               │ ② Task Planning       │  CoT reasoning / ReAct pattern
               │   engine              │  Subtask decomposition and dependency ordering
               └───────────┬───────────┘
                           ▼
               ┌───────────────────────┐
               │ ③ Tool Invocation     │  Function Calling
               │   layer               │  API orchestration / app control
               └───────────┬───────────┘
                           ▼
               ┌───────────────────────┐
               │ ④ Result Feedback     │  Result aggregation / state updates
               │   & Memory            │  Multi-turn conversation context management
               └───────────────────────┘

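2.1 Intent Recognition

Intent recognition is the agent's "ears and eyes": it converts unstructured user input into a structured task description.

Implementation tiers:

Tier	Technical approach	Use cases
Lightweight on-device	Gemini Nano + AICore	Offline scenarios, privacy-sensitive data
Standard cloud	Gemini Pro API	Complex multimodal understanding
Hybrid	On-device preprocessing + cloud refinement	Balances latency and accuracy

Key code: on-device intent recognition with Gemini Nano.

// build.gradle dependencies (verify coordinates and versions against the current AI Edge SDK release)
dependencies {
    implementation("com.google.ai.edge:ai-edge:0.3.0")
    implementation("com.google.ai.edge.aicore:aicore:0.0.4-exp")
}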

class IntentRecognizer(private val context: Context) {

    data class UserIntent(
        val action: String,                    // e.g., "order_ride", "order_food"
        val entities: Map<String, String>,      // e.g., {"destination": "airport"}
        val confidence: Float
    )

    private val generativeModel by lazy {
        GenerativeModel(
            generationConfig = generationConfig {
                this.context = this@IntentRecognizer.context
                temperature = 0.1f  // low temperature keeps the output stable
                topK = 16
                maxOutputTokens = 256
            }
        )
    }

    suspend fun recognize(userInput: String): UserIntent {
        val systemPrompt = """
            You are an intent recognition engine. Parse the user input into JSON:
            {"action": "<action type>", "entities": {"<name>": "<value>"}, "confidence": 0.0-1.0}
            Supported action types: order_ride, order_food, send_message, search_info, set_reminder
            Output JSON only, nothing else.
        """.trimIndent()

        val response = generativeModel.generateContent(
            content {
                text(systemPrompt)
                text("User input: $userInput")
            }
        )
        return parseIntentFromJson(response.text ?: "")
    }
}
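
The `recognize` function above calls `parseIntentFromJson` without defining it. A minimal sketch of what it might look like follows. It is an assumption, not the article's implementation: it uses regular expressions so that it runs without dependencies, handles only the flat schema requested in the prompt, and does not cope with escaped quotes inside values. Production code would more likely use a real JSON parser such as `org.json` or kotlinx.serialization. The `UNKNOWN_INTENT` fallback and the allowlist check are illustrative choices.

```kotlin
// Stand-in for IntentRecognizer.UserIntent so the sketch compiles on its own.
data class UserIntent(
    val action: String,
    val entities: Map<String, String>,
    val confidence: Float
)

// Actions the prompt allows; anything else is treated as unrecognized.
val SUPPORTED_ACTIONS = setOf(
    "order_ride", "order_food", "send_message", "search_info", "set_reminder")

val ACTION_RE = Regex("\"action\"\\s*:\\s*\"([^\"]*)\"")
val CONFIDENCE_RE = Regex("\"confidence\"\\s*:\\s*([0-9]*\\.?[0-9]+)")
val ENTITIES_RE = Regex("\"entities\"\\s*:\\s*\\{([^{}]*)\\}")
val PAIR_RE = Regex("\"([^\"]+)\"\\s*:\\s*\"([^\"]*)\"")

// Fallback for replies that cannot be trusted; confidence 0 forces a clarification.
val UNKNOWN_INTENT = UserIntent("unknown", emptyMap(), 0f)

fun parseIntentFromJson(raw: String): UserIntent {
    val action = ACTION_RE.find(raw)?.groupValues?.get(1) ?: return UNKNOWN_INTENT
    if (action !in SUPPORTED_ACTIONS) return UNKNOWN_INTENT

    // Collect the flat "name": "value" pairs inside the entities object, if any.
    val entities = ENTITIES_RE.find(raw)?.groupValues?.get(1)
        ?.let { body -> PAIR_RE.findAll(body).associate { it.groupValues[1] to it.groupValues[2] } }
        ?: emptyMap()

    // A missing or malformed confidence counts as 0; out-of-range values are clamped.
    val confidence = CONFIDENCE_RE.find(raw)?.groupValues?.get(1)?.toFloatOrNull() ?: 0f
    return UserIntent(action, entities, confidence.coerceIn(0f, 1f))
}
```

Falling back to a zero-confidence intent, rather than throwing, lets the caller's confidence threshold turn any malformed model reply into a clarification request.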

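2.2 Task Planning

Task planning is the agent's "brain": it breaks a high-level intent into an executable sequence of subtasks. It follows the ReAct (Reasoning + Acting) pattern:

Thought: The user wants a ride to the airport; the pickup location must be confirmed first
Action:  call(get_current_location)
Observation: current location = Zhongguancun, Haidian District, Beijing

Thought: Pickup and destination are both known, so the ride service can be called
Action:  call(order_ride, from="Zhongguancun", to="Capital Airport")
Observation: order created, estimated arrival in 15 minutes

Thought: The task is complete; the user should be notified
Action:  call(notify_user, message="Ride booked")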
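
The Thought → Action → Observation cycle in this trace can be sketched as a small loop. This is an illustrative sketch, not the article's engine: `Step`, `runReAct`, and the `nextStep` callback (standing in for the model call) are hypothetical names, and tools are reduced to plain functions from string arguments to a string observation.

```kotlin
// One model turn: either call a tool or finish with a final answer.
sealed interface Step {
    data class Act(val thought: String, val tool: String, val args: Map<String, String>) : Step
    data class Finish(val thought: String, val answer: String) : Step
}

// Runs the Thought -> Action -> Observation cycle until the model finishes
// or maxSteps is reached. `nextStep` stands in for the LLM call and receives
// the transcript so far; `tools` maps tool names to plain functions.
fun runReAct(
    goal: String,
    tools: Map<String, (Map<String, String>) -> String>,
    maxSteps: Int = 8,
    nextStep: (transcript: List<String>) -> Step
): Pair<String?, List<String>> {
    val transcript = mutableListOf("Goal: $goal")
    repeat(maxSteps) {
        when (val step = nextStep(transcript)) {
            is Step.Finish -> {
                transcript += "Thought: ${step.thought}"
                return step.answer to transcript
            }
            is Step.Act -> {
                transcript += "Thought: ${step.thought}"
                transcript += "Action: call(${step.tool}, ${step.args})"
                // Unknown tools produce an error observation instead of crashing the loop.
                val observation = tools[step.tool]?.invoke(step.args)
                    ?: "error: unknown tool ${step.tool}"
                transcript += "Observation: $observation"
            }
        }
    }
    return null to transcript // step budget exhausted without a final answer
}
```

The step budget is the important design choice here: without it, a model that keeps choosing actions would loop indefinitely.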

Task planning engine implementation:

class TaskPlanner {

    data class SubTask(
        val id: String,
        val type: String,          // "api_call", "app_action", "user_confirm"
        val toolName: String,
        val params: Map<String, Any>,
        val dependsOn: List<String> = emptyList()
    )

    data class ExecutionPlan(
        val tasks: List<SubTask>,
        val requiresUserConfirmation: Boolean
    )

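    // Entry point called by the agent. Dispatching on the action name is an
    // assumption; only the ride-ordering planner is shown here.
    fun plan(intent: IntentRecognizer.UserIntent): ExecutionPlan = when (intent.action) {
        "order_ride" -> planRideOrder(intent.entities)
        else -> ExecutionPlan(tasks = emptyList(), requiresUserConfirmation = true)
    }

    private fun planRideOrder(entities: Map<String, String>): ExecutionPlan {
        val tasks = mutableListOf<SubTask>()

        // Step 1: get the current location (if the user gave no pickup point)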
        if (!entities.containsKey("from")) {
            tasks.add(SubTask(id = "t1", type = "api_call",
                toolName = "get_current_location", params = emptyMap()))
        }
        // Step 2: call the ride service
        tasks.add(SubTask(id = "t2", type = "app_action",
            toolName = "ride_service",
            params = mapOf(
                "from" to (entities["from"] ?: "{t1.result}"),
                "to" to (entities["destination"] ?: "unknown")),
            dependsOn = if (!entities.containsKey("from")) listOf("t1") else emptyList()))
        // Step 3: notify the user
        tasks.add(SubTask(id = "t3", type = "user_confirm",
            toolName = "notify_result",
            params = mapOf("summary" to "{t2.result}"),
            dependsOn = listOf("t2")))

        return ExecutionPlan(tasks = tasks, requiresUserConfirmation = true)
    }
}

2.3 Tool Invocation (Function Calling)

Tool invocation is the agent's "hands". On Android there are three tool invocation modes:

┌────────────────────┬──────────────────────────┬──────────────────────────┐
│ Mode A             │ Mode B                   │ Mode C                   │
│ Direct API call    │ AccessibilityService     │ Intent / Deep Link       │
│                    │ control                  │ protocol call            │
├────────────────────┼──────────────────────────┼──────────────────────────┤
│ Calls backend APIs │ Simulates user taps      │ Navigates via Intent     │
│ Structured data    │ Drives third-party UIs   │ Uses deep link entry     │
│ Most reliable      │ General but brittle      │ points exposed by apps   │
└────────────────────┴──────────────────────────┴──────────────────────────┘

Core implementation of Function Calling registration and dispatch:

// Tool interface definition
interface AgentTool {
    val name: String
    val description: String
    val parameters: Map<String, ParameterDef>
    suspend fun execute(params: Map<String, Any>): ToolResult
}

data class ToolResult(val success: Boolean, val data: Any?, val errorMessage: String? = null)

// Ride service tool, invoked through a deep link
class RideServiceTool(private val context: Context) : AgentTool {
    override val name = "ride_service"
    override val description = "Calls a ride-hailing service to order a ride"
    override val parameters = mapOf(
        "from" to ParameterDef("string", "Pickup address"),
        "to" to ParameterDef("string", "Destination address"))

    override suspend fun execute(params: Map<String, Any>): ToolResult {
        val from = params["from"] as? String ?: return ToolResult(false, null, "Missing pickup location")
        val to = params["to"] as? String ?: return ToolResult(false, null, "Missing destination")

        val deepLink = Uri.parse("rideapp://order?from=${Uri.encode(from)}&to=${Uri.encode(to)}")
        val intent = Intent(Intent.ACTION_VIEW, deepLink).apply {
            addFlags(Intent.FLAG_ACTIVITY_NEW_TASK)
        }
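        // On Android 11+, resolveActivity only sees the target app if the manifest
        // declares a matching <queries> entry (package visibility).
        return if (intent.resolveActivity(context.packageManager) != null) {
            context.startActivity(intent)
            ToolResult(true, mapOf("status" to "app_launched", "from" to from, "to" to to))
        } else {
            callRideApi(from, to) // fallback: place the order directly over the HTTP API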
        }
    }
}

// Tool dispatcher: orders tasks topologically by dependency and runs independent tasks in parallel
class ToolDispatcher {
    private val registry = mutableMapOf<String, AgentTool>()

    fun register(tool: AgentTool) { registry[tool.name] = tool }

    suspend fun executePlan(plan: TaskPlanner.ExecutionPlan): List<ToolResult> {
        val results = mutableMapOf<String, ToolResult>()
        val completed = mutableSetOf<String>()
        val pending = plan.tasks.toMutableList()

        while (pending.isNotEmpty()) {
            val ready = pending.filter { task -> task.dependsOn.all { it in completed } }
            if (ready.isEmpty()) throw IllegalStateException("Circular dependency detected")

            coroutineScope {
                ready.map { task ->
                    async {
                        val resolvedParams = resolveParams(task.params, results)
                        val tool = registry[task.toolName] ?: return@async task.id to
                            ToolResult(false, null, "Tool ${task.toolName} is not registered")
                        task.id to tool.execute(resolvedParams)
                    }
                }.awaitAll()
            }.forEach { (id, result) ->
                results[id] = result; completed.add(id)
            }
            pending.removeAll(ready)
        }
        return results.values.toList()
    }
}
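
The dispatcher relies on a `resolveParams` helper that is not shown. Given the `"{t1.result}"` placeholders the planner emits, one way it could work is sketched below. This is an assumption about the helper's behavior: it handles only values that are exactly one placeholder, and it chooses to raise an error when the upstream task is missing or failed rather than pass the raw token on to a tool.

```kotlin
// Stand-in for the article's ToolResult so the sketch compiles on its own.
data class ToolResult(val success: Boolean, val data: Any?, val errorMessage: String? = null)

// Matches a value that is exactly one placeholder, e.g. "{t1.result}".
val PLACEHOLDER = Regex("""\{(\w+)\.result\}""")

// Replaces "{<taskId>.result}" values with the data produced by that task.
// Non-placeholder values pass through unchanged.
fun resolveParams(
    params: Map<String, Any>,
    results: Map<String, ToolResult>
): Map<String, Any> = params.mapValues { (_, value) ->
    val taskId = (value as? String)?.let { PLACEHOLDER.matchEntire(it) }?.groupValues?.get(1)
        ?: return@mapValues value
    // Only a successful upstream task with non-null data can be substituted.
    val data = results[taskId]?.takeIf { it.success }?.data
        ?: error("Task $taskId has no usable result")
    data
}
```

Because the dispatcher marks a task as completed even when its tool failed, failing fast here keeps a broken upstream step from silently feeding `"{t1.result}"` into the ride service as a literal address.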

2.4 Result Feedback and Context Memory

class AgentMemory {
    private val conversationHistory = mutableListOf<MemoryEntry>()
    private val preferences = mutableMapOf<String, Any>()

    data class MemoryEntry(val role: String, val content: String, val timestamp: Long = System.currentTimeMillis())

    fun addEntry(entry: MemoryEntry) {
        conversationHistory.add(entry)
        if (conversationHistory.size > 20) summarizeEarlyEntries() // summarize automatically beyond 20 entries
    }

    fun buildContext(): String = buildString {
        appendLine("=== User preferences ===")
        preferences.forEach { (k, v) -> appendLine("$k: $v") }
        appendLine("=== Conversation history ===")
        conversationHistory.takeLast(10).forEach { appendLine("[${it.role}] ${it.content}") }
    }

    fun learnFromResults(results: List<ToolResult>) {
        // e.g., record the user's frequent destinations to improve the next recommendation
        results.filter { it.success }.forEach { result ->
            val data = result.data as? Map<*, *> ?: return@forEach
            data["to"]?.let { dest ->
                val freq = (preferences["frequent_destinations"] as? MutableList<String>)
                    ?: mutableListOf<String>().also { preferences["frequent_destinations"] = it }
                if (dest.toString() !in freq) freq.add(dest.toString())
            }
        }
    }
}
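
`summarizeEarlyEntries` is referenced above but not defined. A possible shape is sketched below as a standalone function, under the assumption that older entries are folded into a single summary entry while the most recent ones are kept verbatim. `compactHistory` and its `summarize` parameter are hypothetical names; the default summarizer just truncates and joins, where a real agent would presumably call the model.

```kotlin
// Stand-in for AgentMemory.MemoryEntry so the sketch compiles on its own.
data class MemoryEntry(val role: String, val content: String, val timestamp: Long = 0L)

// Folds everything except the most recent `keepRecent` entries into a single
// "summary" entry placed at the front. `summarize` stands in for an LLM call.
fun compactHistory(
    history: MutableList<MemoryEntry>,
    keepRecent: Int = 10,
    summarize: (List<MemoryEntry>) -> String = { old ->
        old.joinToString(" | ") { "${it.role}: ${it.content.take(40)}" }
    }
) {
    if (history.size <= keepRecent) return
    val old = history.subList(0, history.size - keepRecent).toList() // copy before mutating
    val recent = history.takeLast(keepRecent)
    history.clear()
    history += MemoryEntry(role = "summary", content = summarize(old), timestamp = old.last().timestamp)
    history += recent
}
```

Keeping ten recent entries matches the `takeLast(10)` window that `buildContext` already uses, so the summary entry only enters the prompt once that window is widened or the summary is added separately.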

3. Case Study: Assembling a "Smart Travel Assistant" Agent

class SmartTravelAgent(private val context: Context) {
    private val intentRecognizer = IntentRecognizer(context)
    private val taskPlanner = TaskPlanner()
    private val toolDispatcher = ToolDispatcher()
    private val memory = AgentMemory()
    private val security = AgentSecurityGuard()

    init {
        toolDispatcher.register(LocationTool(context))
        toolDispatcher.register(RideServiceTool(context))
        toolDispatcher.register(WeatherTool())
        toolDispatcher.register(NotificationTool(context))
    }

    suspend fun handleUserInput(input: String): AgentResponse {
        memory.addEntry(AgentMemory.MemoryEntry(role = "user", content = input))

        // ① Intent recognition
        val intent = intentRecognizer.recognize(input)
        if (intent.confidence < 0.6f)
            return AgentResponse.Clarify("I'm not quite sure what you mean. Could you clarify?")

        // ② Security assessment
        val risk = security.evaluateRisk(intent.action)

        // ③ Task planning
        val plan = taskPlanner.plan(intent)

        // ④ High-risk operations require user confirmation
        if (security.requiresExplicitConfirmation(risk))
            return AgentResponse.ConfirmPlan(describePlan(plan), plan)

        // ⑤ Execute and report back (executePlan and describePlan are helpers not shown here)
        return executePlan(plan)
    }
}
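
The `AgentResponse` type and the `describePlan` helper are used above without a definition. One possible shape is sketched here, assuming a sealed hierarchy with the three cases the agent returns. `PlannedStep` is a hypothetical stand-in for `TaskPlanner.SubTask` so the example runs on its own, and the numbered-list format of the summary is an illustrative choice.

```kotlin
// Stand-in for TaskPlanner.SubTask so the sketch compiles on its own.
data class PlannedStep(
    val id: String,
    val toolName: String,
    val dependsOn: List<String> = emptyList()
)

// The three outcomes the agent can hand back to the UI.
sealed interface AgentResponse {
    data class Success(val message: String) : AgentResponse
    data class Clarify(val question: String) : AgentResponse
    data class ConfirmPlan(val summary: String, val steps: List<PlannedStep>) : AgentResponse
}

// Renders a plan as a numbered list the user can read before confirming.
fun describePlan(steps: List<PlannedStep>): String =
    steps.withIndex().joinToString("\n") { (i, step) ->
        val deps = if (step.dependsOn.isEmpty()) "" else " (after ${step.dependsOn.joinToString()})"
        "${i + 1}. ${step.toolName}$deps"
    }
```

A sealed hierarchy lets the compiler check that a `when` over the response covers every case, which is useful once further outcomes (for example a failure case) are added.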

Integration in an Activity:

class MainActivity : AppCompatActivity() {
    private lateinit var agent: SmartTravelAgent

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        agent = SmartTravelAgent(applicationContext)

        binding.sendButton.setOnClickListener {
            lifecycleScope.launch {
                when (val resp = agent.handleUserInput(binding.inputField.text.toString())) {
                    is AgentResponse.Success -> showResult(resp.message)
                    is AgentResponse.Clarify -> showMessage(resp.question)
                    is AgentResponse.ConfirmPlan -> showConfirmDialog(resp.summary)
                }
            }
        }
    }
}

4. Key Design Points for Security Isolation

4.1 Overall Security Architecture

┌──────────────────────────────────────────────────────────────────┐
│  User interaction layer (UI layer)                               │
│  User confirmation / permission grants / action audit            │
├──────────────────────────────────────────────────────────────────┤
│  Agent isolation sandbox                                         │
│ ┌───────────────┐ ┌────────────────┐ ┌─────────────────────────┐ │
│ │ Agent process │ │ Tool allowlist │ │ Data access control     │ │
│ │ (separate)    │ │ (Tool ACL)     │ │ (DAC), least privilege  │ │
│ └───────────────┘ └────────────────┘ └─────────────────────────┘ │
├──────────────────────────────────────────────────────────────────┤
│  Private Compute Core (PCC)                                      │
│  On-device AI inference environment / data stays on device       │
├──────────────────────────────────────────────────────────────────┤
│  Android OS Kernel                                               │
│  SELinux / App Sandbox / permission system                       │
└──────────────────────────────────────────────────────────────────┘

4.2 Six Security Design Principles

Principle 1: User-initiated actions with tiered confirmation

class AgentSecurityGuard {
    enum class RiskLevel { LOW, MEDIUM, HIGH, CRITICAL }

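    private val riskMap = mapOf(
        "search_info" to RiskLevel.LOW,       // information lookup → run silently
        "set_reminder" to RiskLevel.LOW,
        "send_message" to RiskLevel.MEDIUM,    // sending a message → Toast notice
        "order_ride" to RiskLevel.HIGH,        // ordering a ride (paid) → confirmation dialog
        "make_payment" to RiskLevel.CRITICAL   // direct payment → biometric authentication
    )

    // Unknown actions fail closed as CRITICAL (this default is an assumption).
    fun evaluateRisk(action: String): RiskLevel = riskMap[action] ?: RiskLevel.CRITICAL

    fun requiresExplicitConfirmation(risk: RiskLevel) = risk >= RiskLevel.HIGH
    fun requiresBiometricAuth(risk: RiskLevel) = risk >= RiskLevel.CRITICAL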
}

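Principle 2: Sandboxed execution

The agent runs in a separate process and communicates with the main app over AIDL. Because an isolated process holds no permissions of its own, permission-gated tools have to be brokered by the main app process: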

<service
    android:name=".agent.SandboxedAgentService"
    android:process=":agent_sandbox"
    android:isolatedProcess="true"
    android:exported="false" />

class SandboxedAgentService : Service() {
    private val agentBinder = object : IAgentService.Stub() {
        override fun requestAction(request: AgentActionRequest): AgentActionResult {
            val callingUid = Binder.getCallingUid()
            if (!isAuthorizedCaller(callingUid))
                return AgentActionResult.denied("Unauthorized caller")
            if (!isAllowedAction(request.actionType))
                return AgentActionResult.denied("Action is not on the allowlist")
            return executeInSandbox(request)
        }
    }

    // Service.onBind is abstract; hand the AIDL stub to binding clients.
    override fun onBind(intent: Intent): IBinder = agentBinder
}

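Principle 3: Tool access allowlist (Tool ACL)

class ToolAccessControl {
    private val toolPermissions = mapOf(
        "get_current_location" to ToolPermission(
            androidPermissions = listOf(Manifest.permission.ACCESS_FINE_LOCATION),
            userAuthLevel = AuthLevel.ONE_TIME),
        "make_payment" to ToolPermission(
            androidPermissions = listOf(Manifest.permission.USE_BIOMETRIC),
            userAuthLevel = AuthLevel.BIOMETRIC))

    suspend fun checkAccess(toolName: String, context: Context): AccessDecision {
        val permission = toolPermissions[toolName]
            ?: return AccessDecision.Denied("Unknown tool")
        // 1. Check the Android permissions the tool needs
        val missing = permission.androidPermissions.filter {
            ContextCompat.checkSelfPermission(context, it) != PackageManager.PERMISSION_GRANTED
        }
        if (missing.isNotEmpty())
            return AccessDecision.Denied("Missing permissions: $missing")
        // 2. Return the decision along with the required user authorization level
        //    (AccessDecision.Allowed is an assumed counterpart to Denied)
        return AccessDecision.Allowed(permission.userAuthLevel)
    }
}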

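Principle 4: On-device privacy with Private Compute Core (PCC)

Introduced in Android 12, PCC is a system-level isolated environment for data processing:

Sensor, GPS, and microphone data is processed only inside PCC.
Inference results are emitted as structured Intents; no app can access the raw data directly.
PCC communicates with the outside world only through Private Compute Services, which is open source and auditable.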

Principle 5: Auditable operations

@Entity(tableName = "audit_events")
data class AuditEvent(
    @PrimaryKey(autoGenerate = true) val id: Long = 0,
    val timestamp: Long,
    val agentAction: String,
    val toolName: String,
    val riskLevel: String,
    val userConfirmed: Boolean,
    val reversible: Boolean   // the user can undo with one tap
)

Principle 6: Rate limiting and circuit breaking

class AgentRateLimiter {
    private val limits = mapOf(
        "order_ride" to RateLimit(maxCount = 3, windowMinutes = 60),
        "send_message" to RateLimit(maxCount = 10, windowMinutes = 5),
        "make_payment" to RateLimit(maxCount = 1, windowMinutes = 10))
    // Over the limit → reject; repeatedly over the limit → trip the circuit breaker for that action type
}
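
The limiter above only declares its limits. The reject-then-trip behavior described in its comment might be implemented along the following lines. This is a sketch under stated assumptions: `RateLimit` is given the two fields used above, the window is a sliding one, the breaker trips after three consecutive rejections (`tripAfterRejections` is a made-up threshold), actions without a configured limit are allowed, and the clock is passed in as a parameter so the logic is easy to exercise.

```kotlin
data class RateLimit(val maxCount: Int, val windowMinutes: Long)

// Sliding-window limiter with a simple circuit breaker.
class SlidingWindowLimiter(
    private val limits: Map<String, RateLimit>,
    private val tripAfterRejections: Int = 3   // assumed threshold
) {
    private val history = mutableMapOf<String, ArrayDeque<Long>>()
    private val rejections = mutableMapOf<String, Int>()
    private val tripped = mutableSetOf<String>()

    @Synchronized
    fun tryAcquire(action: String, nowMillis: Long): Boolean {
        if (action in tripped) return false
        val limit = limits[action] ?: return true   // no limit configured → allowed
        val calls = history.getOrPut(action) { ArrayDeque() }

        // Drop timestamps that have fallen out of the window.
        val windowStart = nowMillis - limit.windowMinutes * 60_000
        while (calls.isNotEmpty() && calls.first() <= windowStart) calls.removeFirst()

        if (calls.size < limit.maxCount) {
            calls.addLast(nowMillis)
            rejections[action] = 0
            return true
        }
        // Over the limit: count the rejection and trip once the threshold is hit.
        val count = (rejections[action] ?: 0) + 1
        rejections[action] = count
        if (count >= tripAfterRejections) tripped += action
        return false
    }

    // Manual reset, e.g. after the user has reviewed the audit log.
    @Synchronized
    fun reset(action: String) {
        tripped -= action
        rejections.remove(action)
        history.remove(action)
    }
}
```

Once tripped, an action stays blocked even after the window expires; requiring an explicit `reset` is a deliberate choice here so that a misbehaving agent cannot simply wait out the limit.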

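5. End-to-End Data Flow Recap

User: "Get me a ride to Capital Airport in half an hour"
  ↓
[Intent recognition] Gemini Nano (inside PCC) → {action: "order_ride", entities: {destination: "Capital Airport", time: "+30min"}}
  ↓
[Security assessment] HIGH → user confirmation required
  ↓
[Task planning] t1: get_current_location → t2: ride_service (depends on t1) → t3: notify_result (depends on t2)
  ↓
[User confirmation] "Order a ride to Capital Airport, estimated ¥120. Confirm?" → user taps Confirm
  ↓
[Sandboxed execution] t1 → "Zhongguancun" → t2 → deep link launches the ride app → t3 → notification
  ↓
[Result feedback] "Booked an express ride from Zhongguancun to Capital Airport in 30 minutes, estimated ¥118"
  ↓
[Memory update] Record "Capital Airport" as a frequent destination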

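6. Best-Practice Checklist

#	Key point	Description
1	On-device first	Intent recognition and privacy-sensitive data are handled in Gemini Nano + PCC
2	Tiered confirmation	LOW runs silently, MEDIUM shows a Toast, HIGH shows a dialog, CRITICAL requires biometric authentication
3	Sandbox isolation	`isolatedProcess` separate process + AIDL communication
4	Tool allowlist	Explicit registration, unregistered tools always rejected, least privilege
5	Auditable operations	Recorded in a local database; users can review and undo
6	Rate limiting and circuit breaking	Frequency limits on sensitive operations, automatic circuit breaking on anomalies
7	Graceful degradation	DeepLink → API → on-device, falling back layer by layer
8	Context compression	Automatic summarization beyond 20 history entries to prevent context overflow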