LLM: Training a FrozenLake Agent with RL Using AgentScope-Tuner

njsgcs 2026-02-09 21:31

agentscope-samples/tuner/frozen_lake at main · agentscope-ai/agentscope-samples

At least 2 NVIDIA GPUs with CUDA 12.8 or newer


An example of agent output is given below:



From the current observation, let's analyze the situation. The player (P) is at: (4, 0), and the goal (G) is at: (2, 3). There is also a hole (O) at (4, 4). Given this, I can move towards the goal without worrying about slippery tiles right now.

The shortest path from P to G involves moving left (4 steps) followed by moving down (1 step), since going directly would bypass the hole or move us further from the goal. Let's move left first.

Let's take the action ```Left```.
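The sample output above ends with the chosen move wrapped in triple backticks. A minimal sketch of how such a reply could be turned into an environment action is shown below. The helper name `extract_action` and the action set `Up`/`Down`/`Left`/`Right` are assumptions for illustration (the source only shows `Left`), and this is not taken from the AgentScope-Tuner code.

```python
import re
from typing import Optional

# Assumed action vocabulary; only "Left" appears in the sample output.
VALID_ACTIONS = ("Up", "Down", "Left", "Right")

# Build the triple-backtick delimiter programmatically so it can be
# embedded safely in this snippet.
FENCE = "`" * 3
_ACTION_RE = re.compile(re.escape(FENCE) + r"\s*(\w+)\s*" + re.escape(FENCE))


def extract_action(response: str) -> Optional[str]:
    """Return the last fenced action in the reply, or None if absent or invalid."""
    matches = _ACTION_RE.findall(response)
    if not matches:
        return None
    # Use the final fenced token: the reasoning may mention other moves earlier.
    action = matches[-1].capitalize()
    return action if action in VALID_ACTIONS else None


if __name__ == "__main__":
    reply = "Let's take the action " + FENCE + "Left" + FENCE + "."
    print(extract_action(reply))
```

Returning `None` for a missing or unrecognized action leaves it to the caller to decide how to handle a malformed reply, for example by treating the step as invalid during training.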
