Dissecting OpenHands (11) --- Main Runtime Components

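- For browsing actions, ActionExecutor delegates to BrowserEnv.
- If a plugin is involved, ActionExecutor goes through the plugin system.
AgentSkillsPlugin: the plugin that provides agent skills
- AgentSkillsPlugin is a plugin that inherits from the Plugin base class.
- When the Runtime is initialized, plugins are loaded into a plugin dictionary. Plugins are registered with the system through the PluginRequirement mechanism.
- When a specific action is triggered, the corresponding plugin functionality is called.
BrowserEnv: a wrapper around the browser environment, built on the BrowserGym library.
- At initialization, ActionExecutor decides from the configuration whether to enable the browser environment.
- When a browsing-related action needs to run, ActionExecutor calls BrowserEnv's methods.
- BrowserEnv runs in a separate multiprocess environment.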

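0x02 Data Flow

The Runtime's data flow is as follows:

The Runtime issues an action request → ActionExecutor.run_action()
ActionExecutor calls the handler that matches the action type;
if a plugin is involved, the plugin system handles it;
if the browser is involved, BrowserEnv handles it;
the resulting observation is returned to the agent.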
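
The flow above can be sketched as a small dispatcher. This is a minimal illustration under stated assumptions, not OpenHands's actual ActionExecutor: the Action and Observation dataclasses, the ToyExecutor class, and the kind field are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Action:
    kind: str      # hypothetical discriminator, e.g. 'browse' or 'run_ipython'
    payload: str


@dataclass
class Observation:
    content: str


class ToyExecutor:
    """Hypothetical stand-in for ActionExecutor: routes an action by its type."""

    def __init__(self) -> None:
        # Map each action kind to the handler that should process it.
        self._handlers: dict[str, Callable[[Action], Observation]] = {
            'browse': self._browse,
            'run_ipython': self._run_plugin,
        }

    def _browse(self, action: Action) -> Observation:
        # In the article's flow, this is where BrowserEnv would be called.
        return Observation(f'browser visited {action.payload}')

    def _run_plugin(self, action: Action) -> Observation:
        # In the article's flow, this is where the plugin system would be called.
        return Observation(f'plugin executed {action.payload}')

    def run_action(self, action: Action) -> Observation:
        handler = self._handlers.get(action.kind)
        if handler is None:
            return Observation(f'unsupported action: {action.kind}')
        return handler(action)
```

Whatever the handler, the caller always gets an observation back, which matches the last step of the flow.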

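0x03 Plugin System

A runtime runs into the following problems: adding a new module (such as a custom tool or a new LLM model) requires changing core code, so extensibility is poor; when multiple tasks run concurrently, modules interact frequently and performance bottlenecks appear easily; and deployment and operations are complex, making it hard to adapt to different environments (local, cloud, edge).

For this reason, most of the industry adopts a microservice architecture or a plugin-based design: modules communicate through standardized interfaces, and adding a feature only requires writing a plugin and registering it.

3.1 sandbox_plugins

sandbox_plugins plays a key role in OpenHands's CodeActAgent. It defines and configures the tools and capabilities the agent can use inside the sandbox, and these plugins form the basic toolset the agent relies on to interact with the environment and complete tasks.

Definition and role of sandbox_plugins

In the CodeActAgent class, sandbox_plugins is a class attribute that declares the plugins the agent needs in the sandbox: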

sandbox_plugins: list[PluginRequirement] = [
    AgentSkillsRequirement(),
    JupyterRequirement(),
]

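These plugins give the agent the tools and capabilities it needs to carry out tasks in the sandbox.

What each plugin does

AgentSkillsRequirement and JupyterRequirement are two plugin requirement classes.

AgentSkillsRequirement: provides a set of Python functions and tools that let the agent perform basic operations such as file operations, directory browsing, and code execution. It must be initialized before JupyterRequirement, because Jupyter uses these functions.
JupyterRequirement: provides an interactive Python interpreter environment in which the agent can run Python code; it depends on the functions provided by AgentSkillsRequirement.

How the plugins are used in the system

The code shows these plugins being used in several places:

When the Runtime is initialized:

In agent_session.py

self.runtime = runtime_cls(
    plugins=agent.sandbox_plugins,
)

When the Runtime stores its plugins:

In base.py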

self.plugins = copy.deepcopy(plugins) if plugins is not None and len(plugins) > 0 else []
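
One plausible reason for the deepcopy is that sandbox_plugins is a class attribute, so every agent instance shares the same list object. The sketch below shows what the copy protects against. ToyAgent and resolve_plugins are hypothetical names for this demo; only the copy expression itself is taken from the line above.

```python
import copy
from dataclasses import dataclass


@dataclass
class PluginRequirement:
    name: str


class ToyAgent:
    # Class attribute: one list object shared by every ToyAgent instance.
    sandbox_plugins = [PluginRequirement('agent_skills'), PluginRequirement('jupyter')]


def resolve_plugins(plugins):
    # Same expression as in base.py, wrapped in a function for the demo.
    return copy.deepcopy(plugins) if plugins is not None and len(plugins) > 0 else []


runtime_plugins = resolve_plugins(ToyAgent.sandbox_plugins)
runtime_plugins.append(PluginRequirement('vscode'))  # mutate the runtime's copy
runtime_plugins[0].name = 'renamed'                  # mutate a copied element

# The agent's class-level list is untouched by either change.
print([p.name for p in ToyAgent.sandbox_plugins])
```

A shallow copy would protect the list but not its elements; the deep copy isolates both.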

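These plugins give the agent the following capabilities:

Running Bash commands: through the command-execution functions in AgentSkills
Running Python code: through the IPython environment provided by the Jupyter plugin
File system operations: reading, writing, and editing files
Directory browsing: viewing and navigating the file system
Other utilities: assorted helper functions and tools

Next we look in detail at the Plugin base class, AgentSkillsRequirement, and JupyterPlugin.

3.2 Plugin Base Class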

class Plugin:
    """Base class for a plugin.

    This will be initialized by the runtime client, which will run inside docker.
    """

    name: str

    @abstractmethod
    async def initialize(self, username: str) -> None:
        """Initialize the plugin."""
        pass

    @abstractmethod
    async def run(self, action: Action) -> Observation:
        """Run the plugin for a given action."""
        pass


@dataclass
class PluginRequirement:
    """Requirement for a plugin."""

    name: str

The available plugins are:

ALL_PLUGINS = {
    'jupyter': JupyterPlugin,
    'agent_skills': AgentSkillsPlugin,
    'vscode': VSCodePlugin,
}
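
A name-to-class dictionary like this makes it easy to turn a list of requirements into initialized plugin instances. The following is a minimal sketch of that idea, not the actual OpenHands loader: TOY_PLUGINS, EchoPlugin, and load_plugins are hypothetical names, and the Plugin and PluginRequirement classes are simplified stand-ins.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class PluginRequirement:
    name: str


class Plugin:
    name: str

    async def initialize(self, username: str) -> None:
        raise NotImplementedError


class EchoPlugin(Plugin):
    # Hypothetical plugin used only for this demo.
    name = 'echo'

    async def initialize(self, username: str) -> None:
        self.owner = username


# Hypothetical registry in the same shape as ALL_PLUGINS: name -> plugin class.
TOY_PLUGINS: dict[str, type[Plugin]] = {'echo': EchoPlugin}


async def load_plugins(
    requirements: list[PluginRequirement], username: str
) -> dict[str, Plugin]:
    loaded: dict[str, Plugin] = {}
    for req in requirements:  # iterating the list preserves declaration order
        plugin_cls = TOY_PLUGINS.get(req.name)
        if plugin_cls is None:
            raise ValueError(f'Plugin {req.name} not found')
        plugin = plugin_cls()
        await plugin.initialize(username)
        loaded[req.name] = plugin
    return loaded
```

Because the requirements are processed in list order, a plugin declared earlier is initialized before one declared later.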

3.3 JupyterPlugin

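JupyterPlugin is the Jupyter kernel plugin in the OpenHands framework, implemented on top of the Plugin base class. Its core job is to start the Jupyter Kernel Gateway service and provide asynchronous execution of IPython code cells, including running code, capturing output (text and images), and reporting the Python interpreter path. It is the central component for Jupyter-related features such as interactive data analysis and code debugging.

Key features

Cross-platform support: works on Windows, Linux, and macOS, and starts the process differently per platform (subprocess.Popen on Windows, asyncio.create_subprocess_shell on Unix-like systems).
Flexible runtime support: distinguishes the local runtime (LocalRuntime) from non-local runtimes to fit different deployment scenarios (such as a sandbox or a local development environment), and configures the working directory and environment variables automatically.
Automatic port allocation: finds a free TCP port in the 40000-49999 range automatically to avoid port conflicts.
Asynchronous code execution: wraps asynchronous execution in JupyterKernel, supports timeouts, and captures structured results such as text output and image URLs.
Environment isolation and compatibility: keeps dependencies consistent through the micromamba virtual environment or local environment variables, supports path configuration for Poetry projects, and fits OpenHands's engineered deployment.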
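
To make the port-allocation feature concrete, here is one way a free port could be found in a range. This is a sketch under assumptions: find_free_port is a hypothetical helper written for this article, and the project's own find_available_tcp_port may use a different strategy.

```python
import socket


def find_free_port(min_port: int, max_port: int) -> int:
    """Return the first port in [min_port, max_port] that can be bound locally."""
    for port in range(min_port, max_port + 1):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind(('127.0.0.1', port))
            except OSError:
                continue  # port is in use (or not permitted); try the next one
            return port
    raise RuntimeError(f'no free TCP port in {min_port}-{max_port}')


port = find_free_port(40000, 49999)
print(port)
```

Note that any check of this kind is inherently racy: another process can take the port between the check and the moment the real service binds it.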

[Figure: JupyterPlugin flowchart]

Code

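# Standard-library imports used below. OpenHands-internal names (Plugin,
# PluginRequirement, JupyterKernel, find_available_tcp_port, should_continue,
# SU_TO_USER, Action, IPythonRunCellAction, IPythonRunCellObservation) come
# from the project itself and are omitted from this excerpt.
import asyncio
import os
import subprocess
import sys
import time
from dataclasses import dataclass


@dataclass
class JupyterRequirement(PluginRequirement):
    """Requirement declaration for the Jupyter plugin, used by the framework to identify the dependency."""
    name: str = 'jupyter'  # requirement name, fixed to 'jupyter'

class JupyterPlugin(Plugin):
    """Jupyter plugin: starts the Jupyter Kernel Gateway and executes IPython code."""
    name: str = 'jupyter'  # plugin name, fixed to 'jupyter'
    kernel_gateway_port: int  # port of the Jupyter Kernel Gateway service
    kernel_id: str  # Jupyter kernel ID
    gateway_process: asyncio.subprocess.Process | subprocess.Popen  # gateway process object
    python_interpreter_path: str  # path of the Python interpreter

    async def initialize(
        self, username: str, kernel_id: str = 'openhands-default'
    ) -> None:
        """Initialize the Jupyter plugin: start the Kernel Gateway and set up the environment.

        Args:
            username: user to run as (used for non-local runtimes)
            kernel_id: Jupyter kernel ID (default: openhands-default)
        """
        # Find a free TCP port in the 40000-49999 range to avoid conflicts
        self.kernel_gateway_port = find_available_tcp_port(40000, 49999)
        self.kernel_id = kernel_id
        # Local runtime is signalled by the LOCAL_RUNTIME_MODE environment variable
        is_local_runtime = os.environ.get('LOCAL_RUNTIME_MODE') == '1'
        # Detect Windows
        is_windows = sys.platform == 'win32'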

        if not is_local_runtime:
            # Non-local runtime: set up the user-switch prefix and the Poetry virtual environment
            # If SU_TO_USER is enabled, prepend 'su - <username> -s ' to run as that user
            prefix = f'su - {username} -s ' if SU_TO_USER else ''
            # Command prefix: cd to the code repo, export env vars, run inside the micromamba env
            poetry_prefix = (
                'cd /openhands/code\n'
                'export POETRY_VIRTUALENVS_PATH=/openhands/poetry;\n'
                'export PYTHONPATH=/openhands/code:$PYTHONPATH;\n'
                'export MAMBA_ROOT_PREFIX=/openhands/micromamba;\n'
                '/openhands/micromamba/bin/micromamba run -n openhands '
            )
        else:
            # Local runtime: no user switch, use the local environment directly
            prefix = ''
            # Read the code repo path from the environment (required for the local runtime)
            code_repo_path = os.environ.get('OPENHANDS_REPO_PATH')
            if not code_repo_path:
                raise ValueError(
                    'OPENHANDS_REPO_PATH environment variable is not set. '
                    'This is required for the jupyter plugin to work with LocalRuntime.'
                )
            # Command prefix: cd to the code repo (the local PATH is relied on for the right environment)
            poetry_prefix = f'cd {code_repo_path}\n'

        if is_windows:
            # Windows: build a CMD-style launch command
            # Note: code_repo_path is only assigned in the local-runtime branch above
            jupyter_launch_command = (
                f'cd /d "{code_repo_path}" && '  # cd to the code repo (/d allows changing drives)
                f'"{sys.executable}" -m jupyter kernelgateway '  # start the Jupyter Kernel Gateway
                '--KernelGatewayApp.ip=0.0.0.0 '  # bind to all network interfaces
                f'--KernelGatewayApp.port={self.kernel_gateway_port}'  # set the port
            )

            # On Windows, start the process with synchronous subprocess.Popen
            # (asyncio subprocess support has compatibility limits on Windows)
            self.gateway_process = subprocess.Popen(  # type: ignore[ASYNC101] # noqa: ASYNC101
                jupyter_launch_command,
                stdout=subprocess.PIPE,  # capture stdout
                stderr=subprocess.STDOUT,  # redirect stderr to stdout
                shell=True,  # run the command through the shell
                text=True,  # return output as text
            )

            # Wait synchronously for the Kernel Gateway to start:
            # read output until a line contains 'at', which is taken as the ready signal
            output = ''
            while should_continue():
                if self.gateway_process.stdout is None:
                    time.sleep(1)  # no output stream yet; wait 1 second
                    continue

                line = self.gateway_process.stdout.readline()  # read one line of output
                if not line:
                    time.sleep(1)
                    continue

                output += line
                if 'at' in line:  # ready marker (a line containing 'at', e.g. 'Listening at...')
                    break

                time.sleep(1)
        else:
            # Unix-like systems (Linux/macOS): build a Bash-style launch command
            jupyter_launch_command = (
                f"{prefix}/bin/bash << 'EOF'\n"  # run under bash; the quoted EOF prevents variable expansion
                f'{poetry_prefix}'  # environment prefix (virtual environment / working directory)
                f'"{sys.executable}" -m jupyter kernelgateway '  # start the Kernel Gateway
                '--KernelGatewayApp.ip=0.0.0.0 '  # bind to all network interfaces
                f'--KernelGatewayApp.port={self.kernel_gateway_port}\n'  # set the port
                'EOF'
            )

            # On Unix-like systems, create an async subprocess so the event loop is not blocked
            self.gateway_process = await asyncio.create_subprocess_shell(
                jupyter_launch_command,
                stderr=asyncio.subprocess.STDOUT,  # redirect stderr to stdout
                stdout=asyncio.subprocess.PIPE,  # capture stdout
            )
            # Wait asynchronously for the Kernel Gateway to start (read until a line contains 'at')
            output = ''
            while should_continue() and self.gateway_process.stdout is not None:
                line_bytes = await self.gateway_process.stdout.readline()  # read one line asynchronously
                line = line_bytes.decode('utf-8')  # bytes to str
                output += line
                if 'at' in line:
                    break
                await asyncio.sleep(1)  # wait 1 second

        # Run a test cell to obtain the current Python interpreter path (sanity check of the environment)
        _obs = await self.run(
            IPythonRunCellAction(code='import sys; print(sys.executable)')
        )
        self.python_interpreter_path = _obs.content.strip()  # extract and store the interpreter path

    async def _run(self, action: Action) -> IPythonRunCellObservation:
        """Internal method: execute a code cell in the Jupyter kernel.

        Args:
            action: the action to execute (must be an IPythonRunCellAction)

        Returns:
            IPythonRunCellObservation: the observation for the execution result (text content, image URLs, etc.)
        """
        # Validate the action type: only IPythonRunCellAction is supported
        if not isinstance(action, IPythonRunCellAction):
            raise ValueError(
                f'Jupyter plugin only supports IPythonRunCellAction, but got {action}'
            )

        # Create the JupyterKernel on first use
        if not hasattr(self, 'kernel'):
            self.kernel = JupyterKernel(
                f'localhost:{self.kernel_gateway_port}',  # gateway address (localhost + port)
                self.kernel_id  # kernel ID
            )

        # If the kernel is not initialized yet, initialize it (establish the connection)
        if not self.kernel.initialized:
            await self.kernel.initialize()

        # Execute the code asynchronously, with the timeout taken from the action
        output = await self.kernel.execute(action.code, timeout=action.timeout)

        # Extract text content and image URLs from the structured output
        text_content = output.get('text', '')  # text output (stdout/stderr)
        image_urls = output.get('images', [])  # list of image URLs (e.g. matplotlib plots)

        # Return the wrapped observation
        return IPythonRunCellObservation(
            content=text_content,  # text content
            code=action.code,  # the code that was executed
            image_urls=image_urls if image_urls else None,  # image URLs (None if there are none)
        )

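    async def run(self, action: Action) -> IPythonRunCellObservation:
        """Public interface: execute an IPython code action and return the observation.

        Args:
            action: the IPythonRunCellAction to execute

        Returns:
            IPythonRunCellObservation: the result of the code execution
        """
        # Delegate to the internal _run method and return its result
        obs = await self._run(action)
        return obs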
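
The startup wait in initialize reads the child's output line by line until a marker appears. The sketch below isolates that pattern so it can be run on its own. It is an illustration under assumptions, not the plugin's code: the child is a stand-in Python one-liner rather than the Kernel Gateway, READY_MARKER and wait_until_ready are hypothetical names, and the EOF check and overall timeout are additions that the code above does not have.

```python
import asyncio
import sys

READY_MARKER = 'READY'  # hypothetical marker; the plugin code above checks for 'at'


async def wait_until_ready(timeout: float = 10.0) -> str:
    # Stand-in child process: prints a startup line, then a ready line, then exits.
    child_code = "print('booting', flush=True); print('READY on port 1234', flush=True)"
    proc = await asyncio.create_subprocess_exec(
        sys.executable, '-c', child_code,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT,
    )
    assert proc.stdout is not None

    async def read_until_marker() -> str:
        output = ''
        while True:
            line_bytes = await proc.stdout.readline()
            if not line_bytes:  # EOF: the process exited without printing the marker
                raise RuntimeError(f'process exited before ready:\n{output}')
            line = line_bytes.decode('utf-8')
            output += line
            if READY_MARKER in line:
                return output

    try:
        return await asyncio.wait_for(read_until_marker(), timeout=timeout)
    finally:
        await proc.wait()  # the demo child exits by itself; reap it


print(asyncio.run(wait_until_ready()))
```

A short substring such as 'at' can match unrelated log lines (any word containing those two letters), so a more specific marker is the safer choice when the log format is known.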

3.4 AgentSkillsPlugin

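Overview

AgentSkillsPlugin is the core plugin that manages agent skills in the OpenHands framework. It brings together the basic skill modules for file operations (file_ops), file reading (file_reader), and repository operations (repo_ops), exposes the scattered skill functions to the framework through dynamic imports, and also provides the plugin requirement declaration and automatic documentation generation. It is the key component through which the agent gains core abilities such as file handling and repository management.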

class AgentSkillsPlugin(Plugin):
    name: str = 'agent_skills'

    async def initialize(self, username: str) -> None:
        """Initialize the plugin."""
        pass

    async def run(self, action: Action) -> Observation:
        """Run the plugin for a given action."""
        raise NotImplementedError('AgentSkillsPlugin does not support run method')

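Key features

Modular skill aggregation: dynamic imports gather the skill functions from independent modules such as file_ops, file_reader, and repo_ops into one place, which simplifies how the framework calls and manages skills.
Automatic documentation: scans all imported skill functions, extracts their signatures and docstrings (__doc__), and generates standardized documentation, which improves maintainability.
Tolerant dependency handling: repo_ops is imported optionally; if the import fails, only that module is skipped and the other skills keep working, which improves compatibility.
Minimal initialization: the plugin's initialize method is an empty implementation and needs no extra configuration; the plugin focuses on aggregating and exposing skill functions, which lowers the barrier to use.
Explicit interface constraint: the run method is disabled (it raises NotImplementedError), making it clear that the plugin exists to aggregate skills rather than execute actions directly, which prevents misuse.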
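
The optional import and the documentation generation described above can be sketched as follows. This is a hedged illustration, not the agentskills module itself: read_text, word_count, and the module name hypothetical_repo_ops_module are invented for the example, and the exact documentation format in OpenHands may differ.

```python
import importlib
import inspect


def read_text(path: str) -> str:
    """Hypothetical skill: return the contents of a text file."""
    with open(path, encoding='utf-8') as f:
        return f.read()


def word_count(text: str) -> int:
    """Hypothetical skill: count whitespace-separated words."""
    return len(text.split())


skills = {'read_text': read_text, 'word_count': word_count}

# Optional import: a missing module is skipped instead of breaking the rest.
try:
    optional = importlib.import_module('hypothetical_repo_ops_module')
    skills.update({n: getattr(optional, n) for n in getattr(optional, '__all__', [])})
except ImportError:
    pass

# Build documentation from each function's signature and docstring.
DOCUMENTATION = ''
for name, fn in skills.items():
    DOCUMENTATION += f'{name}{inspect.signature(fn)}:\n    {inspect.getdoc(fn)}\n\n'

print(DOCUMENTATION)
```

Generating the text from the functions themselves keeps the documentation in step with the code, since there is no separate file to update.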

AgentSkillsRequirement

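AgentSkillsRequirement is a plugin requirement class. It defines the basic set of skills the agent needs in the sandbox, provided mainly as Python functions, so that the agent can perform a range of operations:

It gives the agent basic abilities to interact with the file system
It provides tools for running commands and scripts
It supplies the base functions that higher-level plugins (such as Jupyter) build on
It ensures the agent can complete most common development tasks in the sandbox

The main functions of AgentSkillsRequirement are:

File system operations
- Reading, writing, and editing files
- Directory browsing and file management
- Letting the agent view and manipulate files in the workspace
Command execution
- Running shell commands
- Letting the agent run bash commands in the sandbox
- Supporting various operations that interact with the operating system
Utility functions
- A collection of practical Python functions
- These functions can be used by other plugins (such as Jupyter)
- Includes helpers such as string processing and data manipulation

In CodeActAgent, AgentSkillsRequirement is declared in the sandbox_plugins list: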

sandbox_plugins: list[PluginRequirement] = [
    AgentSkillsRequirement(),
    JupyterRequirement(),
]

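How AgentSkillsRequirement relates to other components:

Relationship with JupyterRequirement
- AgentSkillsRequirement must be initialized before JupyterRequirement
- The Python functions provided by AgentSkillsRequirement are used by the Jupyter environment
- This ordering ensures Jupyter can access all the utility functions it needs
Relationship with the Runtime
- In LocalRuntime and the other runtime environments, these plugins are loaded and initialized

In short, AgentSkillsRequirement is the foundation for the agent's work in OpenHands: it provides a core set of functions that let the agent interact with the file system, the command line, and the runtime environment.

Framework registration and skill discovery

OpenHands uses a plugin registration mechanism to recognize AgentSkillsPlugin and to discover the concrete skills it aggregates. The steps are as follows:

Plugin registration and requirement declaration

AgentSkillsPlugin inherits from the framework's Plugin base class and declares its requirement through AgentSkillsRequirement; when the framework starts, it automatically scans for and loads the plugin: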

@dataclass
class AgentSkillsRequirement(PluginRequirement):
    name: str = "agent_skills"  # requirement name, identical to the plugin name
    documentation: str = agentskills.DOCUMENTATION

class AgentSkillsPlugin(Plugin):
    name: str = "agent_skills"  # plugin name, used by the framework to identify it

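How the framework parses the skill list

After loading AgentSkillsPlugin, the framework reads the __all__ variable and the global namespace of the skills module and extracts the key information for every skill function:

Function name (for example create_file): the unique identifier of the skill;
Function signature (parameters, return value): parsed with inspect.signature and used by the agent to build call arguments;
Docstring (__doc__): used to generate the skill documentation that the agent consults.

Global skill registration

The framework registers the parsed skill information in a global skill registry, a key-value mapping (key: skill function name; value: skill function object plus metadata), so the agent can look up and call a skill quickly by its function name.

How the agent calls a concrete skill

The agent obtains the concrete skills aggregated by AgentSkillsPlugin through the interface the framework provides and triggers them. In the runtime, when both the agent_skills and jupyter plugins are loaded, every skill is imported into the IPython kernel so that agent code can call it by name: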

logger.debug('Initializing AgentSkills')
if 'agent_skills' in self.plugins and 'jupyter' in self.plugins:
    obs = await self.run_ipython(
        IPythonRunCellAction(
            code='from openhands.runtime.plugins.agent_skills.agentskills import *\n'
        )
    )
    logger.debug(f'AgentSkills initialized: {obs}')
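
The star import in the snippet above is what makes the skills callable by bare name inside the kernel. The sketch below reproduces the mechanism with plain Python: it builds a throwaway module in memory and star-imports it into a dictionary that stands in for the kernel's global namespace. The module name toy_agentskills and its functions are hypothetical; only the behavior of `from ... import *` with `__all__` is standard Python.

```python
import sys
import types

# Build a throwaway module in memory (hypothetical name) with two functions.
mod = types.ModuleType('toy_agentskills')
exec(
    "def open_file(path):\n"
    "    return f'opened {path}'\n"
    "def _internal_helper():\n"
    "    return 'hidden'\n"
    "__all__ = ['open_file']\n",
    mod.__dict__,
)
sys.modules['toy_agentskills'] = mod

# Simulate the kernel's global namespace receiving the star import.
kernel_namespace: dict = {}
exec('from toy_agentskills import *', kernel_namespace)

# Only the names listed in __all__ are now available by bare name.
print(kernel_namespace['open_file']('a.txt'))
```

In this view, the "registry" is simply the kernel's namespace: a name in `__all__` becomes a function the agent's code can call directly.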
