MetaGPT: Meta Programming For A Multi-Agent Collaborative Framework
GitHub: https://github.com/geekan/MetaGPT
Standard Operating Procedures (SOPs): These SOPs play a critical role in supporting task decomposition and effective coordination. Furthermore, SOPs outline the responsibilities of each team member, while establishing standards for intermediate outputs. Well-defined SOPs improve the consistent and accurate execution of tasks that align with defined roles and quality standards.
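This paper proposes MetaGPT, an LLM-based meta-programming framework (programming-to-program, borrowing the learning-to-learn idea from meta-learning). It aims to generate structured outputs, including high-quality requirements documents, design diagrams, flowcharts, and concrete interface specifications.
To achieve this, MetaGPT relies on Agents built according to multiple standards, so that each Agent can autonomously follow the relevant standard to carry out problem analysis, system design, code generation, modification, execution, and debugging.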
1. Method
MetaGPT is an LLM-based meta-programming multi-agent framework.
Agent in SOPs
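A fairly complex engineering task (for example, asking an LLM to write a Flappy Bird game) has to be completed through collaboration among LLMs playing different roles. Each role handles one specific sub-stage of the work, as illustrated in the figure below: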
In MetaGPT, we specify the agent's profile, which includes their name, profile, goal, and constraints for each role. We also initialize the specific context and skills for each role.
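As a rough illustration of what such a profile might look like, the sketch below bundles the four fields named above into a small dataclass and renders them as a prompt prefix. `RoleProfile` and `system_prompt` are hypothetical names chosen for this example and are not part of MetaGPT's API.

```python
from dataclasses import dataclass, field


@dataclass
class RoleProfile:
    """Hypothetical container for the profile fields named in the text."""

    name: str
    profile: str
    goal: str
    constraints: str
    skills: list[str] = field(default_factory=list)  # role-specific skills

    def system_prompt(self) -> str:
        # Render the profile as a prompt prefix for the underlying LLM.
        return (
            f"You are {self.name}, and your role is {self.profile}. "
            f"Your goal is to {self.goal}. "
            f"Constraints: {self.constraints}."
        )


engineer = RoleProfile(
    name="Alex",
    profile="Engineer",
    goal="write elegant, readable, extensible, efficient code",
    constraints="follow google-style conventions and keep the code modular",
    skills=["WriteCode"],
)
print(engineer.system_prompt())
```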
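Once the role of each Agent is defined, the workflow has to be set up:
- Product Manager Agent: analyzes the user's product requirements in depth, produces a preliminary functional breakdown, and writes a structured Product Requirements Document (PRD);
- Architect Agent: receives the PRD generated by the Product Manager and turns it into a concrete system design, data structure design, file list, and so on;
- Project Manager Agent: breaks the system designed by the Architect and the pending development work down into tasks (the functionality of specific code interfaces);
- Software Engineer Agent: receives the concrete interfaces and functions and generates the code, including classes and methods;
- QA Engineer Agent: tests the implemented functions and classes;
- Finally, the result is delivered to the user, who evaluates it and gives feedback.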
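A concrete example is shown below:
User requirement: build a 2048 game.
- Product Manager: analyzes the 2048 game from several angles and writes the PRD;
- Architect: takes the structured PRD, works out the implementation details, and builds the code skeleton and interfaces;
- Project Manager: breaks the work down into specific tasks and assigns them;
- Engineer: develops the concrete functions (code generation);
- QA: reviews and tests the code;
- Finally, the result is delivered to the user.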
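The workflow above can be sketched as a fixed pipeline in which each role consumes the document produced by the previous one. This is a toy illustration only: every stage function here is a hard-coded stand-in for an LLM call, and the dictionary keys are assumptions made for the example, not MetaGPT's document schema.

```python
# Each stage takes the upstream document (a dict) and returns its own document.
# All stage bodies are hard-coded stand-ins for LLM calls (hypothetical).


def product_manager(doc: dict) -> dict:
    return {"type": "PRD", "requirement": doc["idea"], "features": ["tile merging", "score display"]}


def architect(doc: dict) -> dict:
    return {"type": "SystemDesign", "files": ["main.py", "game.py"], "prd": doc}


def project_manager(doc: dict) -> dict:
    return {"type": "Tasks", "tasks": [f"implement {name}" for name in doc["files"]]}


def engineer(doc: dict) -> dict:
    # One placeholder source file per task; the file name is the last word of the task.
    return {"type": "Code", "code": {task.split()[-1]: "# TODO" for task in doc["tasks"]}}


def qa_engineer(doc: dict) -> dict:
    return {"type": "TestReport", "tested": sorted(doc["code"]), "passed": True}


SOP = [product_manager, architect, project_manager, engineer, qa_engineer]


def run_sop(idea: str) -> list[dict]:
    """Run every stage in order and keep the trail of intermediate documents."""
    doc = {"type": "UserRequirement", "idea": idea}
    trail = []
    for stage in SOP:
        doc = stage(doc)
        trail.append(doc)
    return trail


for document in run_sop("build a 2048 game"):
    print(document["type"])
```

The point of the sketch is the shape of the hand-off: the order of stages is fixed in advance, and each stage only sees a structured document, never a free-form chat history.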
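The whole process resembles a multi-turn conversation, but unlike conventional text dialogue, the Agents exchange structured documents or diagrams as the content of the conversation. This avoids losing key information, which can easily happen in open-ended dialogue.
Shared message pool: in the workflow, an Agent normally only communicates with the next Agent, so a downstream Agent may be unable to see information produced by Agents further upstream. MetaGPT therefore sets up a shared message pool: every Agent can store the structured messages it produces in the pool, and can also retrieve relevant messages from it to support its own generation.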
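To make the benefit of structured documents concrete, here is a minimal sketch in which a PRD is a typed record and a message with missing fields is rejected instead of being silently passed downstream. The `PRD` fields and the `parse_prd` helper are assumptions made for this example and do not reproduce MetaGPT's actual PRD schema.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class PRD:
    """Hypothetical, heavily simplified PRD schema."""

    original_requirement: str
    user_stories: list[str]
    requirement_pool: list[str]


def parse_prd(raw: str) -> PRD:
    """Parse a JSON message into a PRD, failing loudly if a field is missing."""
    data = json.loads(raw)
    missing = [f.name for f in fields(PRD) if f.name not in data]
    if missing:
        raise ValueError(f"PRD is missing required fields: {missing}")
    return PRD(**{f.name: data[f.name] for f in fields(PRD)})


raw_message = json.dumps(
    {
        "original_requirement": "build a 2048 game",
        "user_stories": ["As a player, I can merge equal tiles"],
        "requirement_pool": ["P0: tile merging", "P1: score display"],
    }
)
print(parse_prd(raw_message).requirement_pool)
```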
Sharing all information with every agent can lead to information overload. During task execution, an agent typically prefers to receive only task-related information and avoid distractions through irrelevant details.
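A shared pool combined with per-role subscriptions can be sketched as follows. This is a simplified model under stated assumptions: `MessagePool`, `subscribe`, and `fetch` are hypothetical names, and filtering on a `cause_by` label is only one possible retrieval strategy.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    content: str
    cause_by: str  # label of the action that produced this message


@dataclass
class MessagePool:
    """Hypothetical shared pool: everyone publishes, each role reads a filtered view."""

    messages: list[Message] = field(default_factory=list)
    subscriptions: dict[str, set[str]] = field(default_factory=dict)

    def subscribe(self, role: str, *actions: str) -> None:
        self.subscriptions.setdefault(role, set()).update(actions)

    def publish(self, message: Message) -> None:
        self.messages.append(message)

    def fetch(self, role: str) -> list[Message]:
        # Return only the messages whose producing action the role subscribed to.
        wanted = self.subscriptions.get(role, set())
        return [m for m in self.messages if m.cause_by in wanted]


pool = MessagePool()
pool.subscribe("Engineer", "WriteDesign", "WriteTasks")
pool.publish(Message("PRD for a 2048 game", cause_by="WritePRD"))
pool.publish(Message("files: main.py, game.py", cause_by="WriteDesign"))
pool.publish(Message("implement game.py", cause_by="WriteTasks"))
print([m.content for m in pool.fetch("Engineer")])
```

Here the Engineer can reach the Architect's design even though it is not the directly preceding role, while the PRD it did not subscribe to stays out of its context.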
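Self-reflection: generated code may contain hallucinations, so an executor and debugging tools are used to refine the code iteratively.
The Engineer Agent generates code from the PRD and the system design and then runs it through the executor. If errors occur, it debugs, revises the code, and executes it again; if execution succeeds, it reports the result and adds the generated code, together with the associated PRD and system design, to the message pool.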
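The generate, execute, and debug cycle described above can be sketched as a bounded retry loop. This is a minimal sketch, not MetaGPT's implementation: `fake_llm` is a stand-in for the model, `run_candidate` and `generate_with_feedback` are hypothetical helpers, and running code with `exec` in-process is only acceptable for a toy example.

```python
import traceback
from typing import Callable, Optional


def run_candidate(source: str) -> Optional[str]:
    """Execute the code; return None on success or the traceback text on failure."""
    try:
        exec(compile(source, "generated_code", "exec"), {})
        return None
    except Exception:
        return traceback.format_exc()


def generate_with_feedback(
    generate: Callable[[Optional[str]], str], max_attempts: int = 3
) -> tuple[str, int]:
    """Regenerate code until it runs cleanly, feeding the last error back in."""
    feedback = None
    for attempt in range(1, max_attempts + 1):
        source = generate(feedback)
        feedback = run_candidate(source)
        if feedback is None:
            return source, attempt
    raise RuntimeError(f"code still failing after {max_attempts} attempts")


def fake_llm(feedback: Optional[str]) -> str:
    # Stand-in for the LLM: the first draft has a bug, the revision fixes it.
    if feedback is None:
        return "result = 1 / 0"
    return "result = 1 / 1"


code, attempts = generate_with_feedback(fake_llm)
print(attempts, code)
```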
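2. Experiments
Datasets:
- HumanEval: 164 hand-written programming tasks;
- MBPP: 427 Python programming tasks;
- SoftwareDev: the benchmark proposed in this paper, containing 70 representative software development tasks.
Each task in SoftwareDev has a corresponding prompt, as shown in the table below:
The resulting apps are shown in the figure below:
Human evaluation is carried out along the following five dimensions: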
(A) Executability: this metric rates code from 1 (failure/non-functional) to 4 (flawless). '1' is for non-functional, '2' for runnable but imperfect, '3' for nearly perfect, and '4' for flawless code.
(B) Cost: the cost evaluations here include the (1) running time, (2) token usage, and (3) expenses.
(C) Code Statistics: this includes (1) code files, (2) lines of code per file, and (3) total code lines.
(D) Productivity: basically, it is defined as the token usage divided by the number of lines of code, i.e. the number of tokens consumed per line of code.
(E) Human Revision Cost: quantified by the number of rounds of revision needed to ensure the smooth running of the code, this indicates the frequency of human interventions, such as debugging or importing packages.
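As a worked example of metric (D), the snippet below computes tokens per line of code. The input numbers are invented for illustration and are not results from the paper.

```python
def productivity(total_tokens: int, total_code_lines: int) -> float:
    """Tokens consumed per generated line of code, following definition (D)."""
    if total_code_lines <= 0:
        raise ValueError("total_code_lines must be positive")
    return total_tokens / total_code_lines


# Hypothetical run: 30,000 tokens spent to produce 240 lines of code.
print(productivity(30_000, 240))  # 125.0
```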
Comparison of evaluation results:
3. Reproduction
python
import asyncio
from pathlib import Path

from metagpt.utils.project_repo import ProjectRepo


def generate_repo(
    idea,
    investment=3.0,
    n_round=5,
    code_review=True,
    run_tests=False,
    implement=True,
    project_name="",
    inc=False,
    project_path="",
    reqa_file="",
    max_auto_summarize_code=0,
    recover_path=None,
) -> ProjectRepo:
    """Run the startup logic. Can be called from CLI or other Python scripts."""
    from metagpt.config2 import config
    from metagpt.context import Context
    from metagpt.roles import (
        Architect,
        Engineer,
        ProductManager,
        ProjectManager,
        QaEngineer,
    )
    from metagpt.team import Team

    config.update_via_cli(project_path, project_name, inc, reqa_file, max_auto_summarize_code)
    ctx = Context(config=config)

    if not recover_path:
        company = Team(context=ctx)
        # Hire the software-development roles, i.e. add every role to the environment
        company.hire(
            [
                ProductManager(),
                Architect(),
                ProjectManager(),
            ]
        )

        if implement or code_review:
            company.hire([Engineer(n_borg=5, use_code_review=code_review)])

        if run_tests:
            company.hire([QaEngineer()])
    else:
        stg_path = Path(recover_path)
        if not stg_path.exists() or not str(stg_path).endswith("team"):
            raise FileNotFoundError(f"{recover_path} not exists or not endswith `team`")

        company = Team.deserialize(stg_path=stg_path, context=ctx)
        idea = company.idea

    company.invest(investment)
    # Publish the user's task (idea) so that the pipeline of roles can start working on it
    company.run_project(idea)
    # Run the team for up to n_round rounds
    asyncio.run(company.run(n_round=n_round))

    return ctx.repo
python
# Publish the user's idea as a message
def run_project(self, idea, send_to: str = ""):
    """Run a project from publishing user requirement."""
    self.idea = idea

    # Human requirement.
    self.env.publish_message(
        Message(role="Human", content=idea, cause_by=UserRequirement, send_to=send_to or MESSAGE_ROUTE_TO_ALL),
        peekable=False,
    )
python
# Starting from the user's idea, run all roles round by round
@serialize_decorator
async def run(self, n_round=3, idea="", send_to="", auto_archive=True):
    """Run company until target round or no money"""
    if idea:
        self.run_project(idea=idea, send_to=send_to)

    while n_round > 0:
        n_round -= 1
        self._check_balance()
        await self.env.run()

        logger.debug(f"max {n_round=} left.")
    self.env.archive(auto_archive)
    return self.env.history
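The round loop above stops either when `n_round` is used up or when the budget check fails. A toy model of that behaviour is sketched below. `ToyTeam`, `NoMoneyError`, and `cost_per_round` are hypothetical names, and the rule that the check fails once the accumulated cost reaches the investment is an assumption made for this example, since the excerpt does not show the body of `_check_balance`.

```python
import asyncio


class NoMoneyError(Exception):
    """Hypothetical error raised when the budget is exhausted."""


class ToyTeam:
    def __init__(self, investment: float, cost_per_round: float) -> None:
        self.investment = investment
        self.cost_per_round = cost_per_round
        self.total_cost = 0.0
        self.rounds_run = 0

    def _check_balance(self) -> None:
        # Assumed rule: stop as soon as the spend reaches the investment.
        if self.total_cost >= self.investment:
            raise NoMoneyError(f"spent {self.total_cost}, budget {self.investment}")

    async def _run_env_once(self) -> None:
        await asyncio.sleep(0)  # stand-in for letting every role act once
        self.total_cost += self.cost_per_round
        self.rounds_run += 1

    async def run(self, n_round: int = 3) -> int:
        while n_round > 0:
            n_round -= 1
            self._check_balance()
            await self._run_env_once()
        return self.rounds_run


print(asyncio.run(ToyTeam(investment=3.0, cost_per_round=1.0).run(n_round=2)))
```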
For example, the Agent for the Engineer role:
python
class Engineer(Role):
    """
    Represents an Engineer role responsible for writing and possibly reviewing code.

    Attributes:
        name (str): Name of the engineer.
        profile (str): Role profile, default is 'Engineer'.
        goal (str): Goal of the engineer.
        constraints (str): Constraints for the engineer.
        n_borg (int): Number of borgs.
        use_code_review (bool): Whether to use code review.
    """

    name: str = "Alex"
    profile: str = "Engineer"
    goal: str = "write elegant, readable, extensible, efficient code"
    constraints: str = (
        "the code should conform to standards like google-style and be modular and maintainable. "
        "Use same language as user requirement"
    )
    n_borg: int = 1
    use_code_review: bool = False
    code_todos: list = []
    summarize_todos: list = []
    next_todo_action: str = ""
    n_summarize: int = 0

    def __init__(self, **kwargs) -> None:
        super().__init__(**kwargs)

        self.set_actions([WriteCode])
        self._watch([WriteTasks, SummarizeCode, WriteCode, WriteCodeReview, FixBug, WriteCodePlanAndChange])
        self.code_todos = []
        self.summarize_todos = []
        self.next_todo_action = any_to_name(WriteCode)

    @staticmethod
    def _parse_tasks(task_msg: Document) -> list[str]:
        m = json.loads(task_msg.content)
        return m.get(TASK_LIST.key) or m.get(REFINED_TASK_LIST.key)

    async def _act_sp_with_cr(self, review=False) -> Set[str]:
        changed_files = set()
        for todo in self.code_todos:
            """
            # Select essential information from the historical data to reduce the length of the prompt (summarized from human experience):
            1. All from Architect
            2. All from ProjectManager
            3. Do we need other codes (currently needed)?
            TODO: The goal is not to need it. After clear task decomposition, based on the design idea, you should be able to write a single file without needing other codes. If you can't, it means you need a clearer definition. This is the key to writing longer code.
            """
            coding_context = await todo.run()
            # Code review
            if review:
                action = WriteCodeReview(i_context=coding_context, context=self.context, llm=self.llm)
                self._init_action(action)
                coding_context = await action.run()

            dependencies = {coding_context.design_doc.root_relative_path, coding_context.task_doc.root_relative_path}
            if self.config.inc:
                dependencies.add(coding_context.code_plan_and_change_doc.root_relative_path)
            await self.project_repo.srcs.save(
                filename=coding_context.filename,
                dependencies=list(dependencies),
                content=coding_context.code_doc.content,
            )
            msg = Message(
                content=coding_context.model_dump_json(),
                instruct_content=coding_context,
                role=self.profile,
                cause_by=WriteCode,
            )
            self.rc.memory.add(msg)

            changed_files.add(coding_context.code_doc.filename)
        if not changed_files:
            logger.info("Nothing has changed.")
        return changed_files

    async def _act(self) -> Message | None:
        """Determines the mode of action based on whether code review is used."""
        if self.rc.todo is None:
            return None
        if isinstance(self.rc.todo, WriteCodePlanAndChange):
            self.next_todo_action = any_to_name(WriteCode)
            return await self._act_code_plan_and_change()
        if isinstance(self.rc.todo, WriteCode):
            self.next_todo_action = any_to_name(SummarizeCode)
            return await self._act_write_code()
        if isinstance(self.rc.todo, SummarizeCode):
            self.next_todo_action = any_to_name(WriteCode)
            return await self._act_summarize()
        return None

    async def _act_write_code(self):
        changed_files = await self._act_sp_with_cr(review=self.use_code_review)
        return Message(
            content="\n".join(changed_files),
            role=self.profile,
            cause_by=WriteCodeReview if self.use_code_review else WriteCode,
            send_to=self,
            sent_from=self,
        )

    async def _act_summarize(self):
        tasks = []
        for todo in self.summarize_todos:
            summary = await todo.run()
            summary_filename = Path(todo.i_context.design_filename).with_suffix(".md").name
            dependencies = {todo.i_context.design_filename, todo.i_context.task_filename}
            for filename in todo.i_context.codes_filenames:
                rpath = self.project_repo.src_relative_path / filename
                dependencies.add(str(rpath))
            await self.project_repo.resources.code_summary.save(
                filename=summary_filename, content=summary, dependencies=dependencies
            )
            is_pass, reason = await self._is_pass(summary)
            if not is_pass:
                todo.i_context.reason = reason
                tasks.append(todo.i_context.model_dump())

                await self.project_repo.docs.code_summary.save(
                    filename=Path(todo.i_context.design_filename).name,
                    content=todo.i_context.model_dump_json(),
                    dependencies=dependencies,
                )
            else:
                await self.project_repo.docs.code_summary.delete(filename=Path(todo.i_context.design_filename).name)

        logger.info(f"--max-auto-summarize-code={self.config.max_auto_summarize_code}")
        if not tasks or self.config.max_auto_summarize_code == 0:
            return Message(
                content="",
                role=self.profile,
                cause_by=SummarizeCode,
                sent_from=self,
                send_to="Edward",  # The name of QaEngineer
            )
        # The maximum number of times the 'SummarizeCode' action is automatically invoked, with -1 indicating unlimited.
        # This parameter is used for debugging the workflow.
        self.n_summarize += 1 if self.config.max_auto_summarize_code > self.n_summarize else 0
        return Message(
            content=json.dumps(tasks), role=self.profile, cause_by=SummarizeCode, send_to=self, sent_from=self
        )

    async def _act_code_plan_and_change(self):
        """Write code plan and change that guides subsequent WriteCode and WriteCodeReview"""
        node = await self.rc.todo.run()
        code_plan_and_change = node.instruct_content.model_dump_json()
        dependencies = {
            REQUIREMENT_FILENAME,
            str(self.project_repo.docs.prd.root_path / self.rc.todo.i_context.prd_filename),
            str(self.project_repo.docs.system_design.root_path / self.rc.todo.i_context.design_filename),
            str(self.project_repo.docs.task.root_path / self.rc.todo.i_context.task_filename),
        }
        code_plan_and_change_filepath = Path(self.rc.todo.i_context.design_filename)
        await self.project_repo.docs.code_plan_and_change.save(
            filename=code_plan_and_change_filepath.name, content=code_plan_and_change, dependencies=dependencies
        )
        await self.project_repo.resources.code_plan_and_change.save(
            filename=code_plan_and_change_filepath.with_suffix(".md").name,
            content=node.content,
            dependencies=dependencies,
        )

        return Message(
            content=code_plan_and_change,
            role=self.profile,
            cause_by=WriteCodePlanAndChange,
            send_to=self,
            sent_from=self,
        )

    async def _is_pass(self, summary) -> (str, str):
        rsp = await self.llm.aask(msg=IS_PASS_PROMPT.format(context=summary), stream=False)
        logger.info(rsp)
        if "YES" in rsp:
            return True, rsp
        return False, rsp

    async def _think(self) -> Action | None:
        if not self.src_workspace:
            self.src_workspace = self.git_repo.workdir / self.git_repo.workdir.name
        write_plan_and_change_filters = any_to_str_set([WriteTasks, FixBug])
        write_code_filters = any_to_str_set([WriteTasks, WriteCodePlanAndChange, SummarizeCode])
        summarize_code_filters = any_to_str_set([WriteCode, WriteCodeReview])
        if not self.rc.news:
            return None
        msg = self.rc.news[0]
        if self.config.inc and msg.cause_by in write_plan_and_change_filters:
            logger.debug(f"TODO WriteCodePlanAndChange:{msg.model_dump_json()}")
            await self._new_code_plan_and_change_action(cause_by=msg.cause_by)
            return self.rc.todo
        if msg.cause_by in write_code_filters:
            logger.debug(f"TODO WriteCode:{msg.model_dump_json()}")
            await self._new_code_actions()
            return self.rc.todo
        if msg.cause_by in summarize_code_filters and msg.sent_from == any_to_str(self):
            logger.debug(f"TODO SummarizeCode:{msg.model_dump_json()}")
            await self._new_summarize_actions()
            return self.rc.todo
        return None

    async def _new_coding_context(self, filename, dependency) -> CodingContext:
        old_code_doc = await self.project_repo.srcs.get(filename)
        if not old_code_doc:
            old_code_doc = Document(root_path=str(self.project_repo.src_relative_path), filename=filename, content="")
        dependencies = {Path(i) for i in await dependency.get(old_code_doc.root_relative_path)}
        task_doc = None
        design_doc = None
        code_plan_and_change_doc = await self._get_any_code_plan_and_change() if await self._is_fixbug() else None
        for i in dependencies:
            if str(i.parent.as_posix()) == TASK_FILE_REPO:
                task_doc = await self.project_repo.docs.task.get(i.name)
            elif str(i.parent.as_posix()) == SYSTEM_DESIGN_FILE_REPO:
                design_doc = await self.project_repo.docs.system_design.get(i.name)
            elif str(i.parent.as_posix()) == CODE_PLAN_AND_CHANGE_FILE_REPO:
                code_plan_and_change_doc = await self.project_repo.docs.code_plan_and_change.get(i.name)
        if not task_doc or not design_doc:
            logger.error(f'Detected source code "{filename}" from an unknown origin.')
            raise ValueError(f'Detected source code "{filename}" from an unknown origin.')
        context = CodingContext(
            filename=filename,
            design_doc=design_doc,
            task_doc=task_doc,
            code_doc=old_code_doc,
            code_plan_and_change_doc=code_plan_and_change_doc,
        )
        return context

    async def _new_coding_doc(self, filename, dependency):
        context = await self._new_coding_context(filename, dependency)
        coding_doc = Document(
            root_path=str(self.project_repo.src_relative_path), filename=filename, content=context.model_dump_json()
        )
        return coding_doc

    async def _new_code_actions(self):
        bug_fix = await self._is_fixbug()
        # Prepare file repos
        changed_src_files = self.project_repo.srcs.all_files if bug_fix else self.project_repo.srcs.changed_files
        changed_task_files = self.project_repo.docs.task.changed_files
        changed_files = Documents()
        # Recode caused by upstream changes.
        for filename in changed_task_files:
            design_doc = await self.project_repo.docs.system_design.get(filename)
            task_doc = await self.project_repo.docs.task.get(filename)
            code_plan_and_change_doc = await self.project_repo.docs.code_plan_and_change.get(filename)
            task_list = self._parse_tasks(task_doc)
            for task_filename in task_list:
                old_code_doc = await self.project_repo.srcs.get(task_filename)
                if not old_code_doc:
                    old_code_doc = Document(
                        root_path=str(self.project_repo.src_relative_path), filename=task_filename, content=""
                    )
                if not code_plan_and_change_doc:
                    context = CodingContext(
                        filename=task_filename, design_doc=design_doc, task_doc=task_doc, code_doc=old_code_doc
                    )
                else:
                    context = CodingContext(
                        filename=task_filename,
                        design_doc=design_doc,
                        task_doc=task_doc,
                        code_doc=old_code_doc,
                        code_plan_and_change_doc=code_plan_and_change_doc,
                    )
                coding_doc = Document(
                    root_path=str(self.project_repo.src_relative_path),
                    filename=task_filename,
                    content=context.model_dump_json(),
                )
                if task_filename in changed_files.docs:
                    logger.warning(
                        f"Log to expose potential conflicts: {coding_doc.model_dump_json()} & "
                        f"{changed_files.docs[task_filename].model_dump_json()}"
                    )
                changed_files.docs[task_filename] = coding_doc
        self.code_todos = [
            WriteCode(i_context=i, context=self.context, llm=self.llm) for i in changed_files.docs.values()
        ]
        # Code directly modified by the user.
        dependency = await self.git_repo.get_dependency()
        for filename in changed_src_files:
            if filename in changed_files.docs:
                continue
            coding_doc = await self._new_coding_doc(filename=filename, dependency=dependency)
            changed_files.docs[filename] = coding_doc
            self.code_todos.append(WriteCode(i_context=coding_doc, context=self.context, llm=self.llm))

        if self.code_todos:
            self.set_todo(self.code_todos[0])

    async def _new_summarize_actions(self):
        src_files = self.project_repo.srcs.all_files
        # Generate a SummarizeCode action for each pair of (system_design_doc, task_doc).
        summarizations = defaultdict(list)
        for filename in src_files:
            dependencies = await self.project_repo.srcs.get_dependency(filename=filename)
            ctx = CodeSummarizeContext.loads(filenames=list(dependencies))
            summarizations[ctx].append(filename)
        for ctx, filenames in summarizations.items():
            ctx.codes_filenames = filenames
            new_summarize = SummarizeCode(i_context=ctx, context=self.context, llm=self.llm)
            for i, act in enumerate(self.summarize_todos):
                if act.i_context.task_filename == new_summarize.i_context.task_filename:
                    self.summarize_todos[i] = new_summarize
                    new_summarize = None
                    break
            if new_summarize:
                self.summarize_todos.append(new_summarize)
        if self.summarize_todos:
            self.set_todo(self.summarize_todos[0])
            self.summarize_todos.pop(0)

    async def _new_code_plan_and_change_action(self, cause_by: str):
        """Create a WriteCodePlanAndChange action for subsequent to-do actions."""
        files = self.project_repo.all_files
        options = {}
        if cause_by != any_to_str(FixBug):
            requirement_doc = await self.project_repo.docs.get(REQUIREMENT_FILENAME)
            options["requirement"] = requirement_doc.content
        else:
            fixbug_doc = await self.project_repo.docs.get(BUGFIX_FILENAME)
            options["issue"] = fixbug_doc.content
        code_plan_and_change_ctx = CodePlanAndChangeContext.loads(files, **options)
        self.rc.todo = WriteCodePlanAndChange(i_context=code_plan_and_change_ctx, context=self.context, llm=self.llm)

    @property
    def action_description(self) -> str:
        """AgentStore uses this attribute to display to the user what actions the current role should take."""
        return self.next_todo_action

    async def _is_fixbug(self) -> bool:
        fixbug_doc = await self.project_repo.docs.get(BUGFIX_FILENAME)
        return bool(fixbug_doc and fixbug_doc.content)

    async def _get_any_code_plan_and_change(self) -> Optional[Document]:
        changed_files = self.project_repo.docs.code_plan_and_change.changed_files
        for filename in changed_files.keys():
            doc = await self.project_repo.docs.code_plan_and_change.get(filename)
            if doc and doc.content:
                return doc
        return None
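The core of `_think` in the listing above is a dispatch on which action caused the incoming message. Stripped of repository access and LLM calls, that decision can be sketched as a pure function. Action classes are replaced by plain strings here, and `next_action` is a hypothetical helper written for this example, not a MetaGPT function.

```python
from typing import Optional

# Trigger sets mirroring the filters built in Engineer._think (names as plain strings).
PLAN_TRIGGERS = {"WriteTasks", "FixBug"}
WRITE_CODE_TRIGGERS = {"WriteTasks", "WriteCodePlanAndChange", "SummarizeCode"}
SUMMARIZE_TRIGGERS = {"WriteCode", "WriteCodeReview"}


def next_action(cause_by: str, sent_from_self: bool, incremental: bool = False) -> Optional[str]:
    """Pick the Engineer's next action from the action that caused the message."""
    if incremental and cause_by in PLAN_TRIGGERS:
        return "WriteCodePlanAndChange"
    if cause_by in WRITE_CODE_TRIGGERS:
        return "WriteCode"
    if cause_by in SUMMARIZE_TRIGGERS and sent_from_self:
        return "SummarizeCode"
    return None


# A task list from the Project Manager triggers coding; the Engineer's own
# WriteCode message then triggers a summary pass.
print(next_action("WriteTasks", sent_from_self=False))
print(next_action("WriteCode", sent_from_self=True))
```

Because `SummarizeCode` is itself in the write-code trigger set, a failed summary sends the Engineer back to writing code, which gives the write, summarize, rewrite cycle visible in `_act`.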
The run method inherited from Role:
python
@role_raise_decorator
async def run(self, with_message=None) -> Message | None:
    """Observe, and think and act based on the results of the observation"""
    if with_message:
        msg = None
        if isinstance(with_message, str):
            msg = Message(content=with_message)
        elif isinstance(with_message, Message):
            msg = with_message
        elif isinstance(with_message, list):
            msg = Message(content="\n".join(with_message))
        if not msg.cause_by:
            msg.cause_by = UserRequirement
        self.put_message(msg)
    if not await self._observe():
        # If there is no new information, suspend and wait
        logger.debug(f"{self._setting}: no news. waiting.")
        return

    rsp = await self.react()

    # Reset the next action to be taken.
    self.set_todo(None)
    # Send the response message to the Environment object to have it relay the message to the subscribers.
    self.publish_message(rsp)
    return rsp
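The observe, then react pattern in `run` can be reduced to a few lines. The sketch below is a synchronous toy model under stated assumptions: `ToyRole` and its methods are hypothetical, messages are plain dictionaries, and the reaction is a string template instead of an LLM call.

```python
from typing import Optional


class ToyRole:
    """Hypothetical role that only reacts to messages caused by watched actions."""

    def __init__(self, name: str, watch: list[str]) -> None:
        self.name = name
        self.watch = set(watch)
        self.buffer: list[dict] = []
        self.news: list[dict] = []

    def put_message(self, msg: dict) -> None:
        self.buffer.append(msg)

    def _observe(self) -> int:
        # Keep only the buffered messages this role is watching for.
        self.news = [m for m in self.buffer if m["cause_by"] in self.watch]
        self.buffer.clear()
        return len(self.news)

    def _react(self) -> dict:
        handled = ", ".join(m["content"] for m in self.news)
        return {"content": f"{self.name} handled: {handled}", "cause_by": self.name}

    def run(self, with_message: Optional[dict] = None) -> Optional[dict]:
        if with_message is not None:
            self.put_message(with_message)
        if not self._observe():
            return None  # nothing relevant arrived, so stay idle
        return self._react()


role = ToyRole("Engineer", watch=["WriteTasks"])
print(role.run({"content": "implement game.py", "cause_by": "WriteTasks"}))
```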
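4. Summary and Limitations
The Agent roles in MetaGPT are fixed, and so are the action space and the order of actions: the Actions executed at each stage are defined in advance, and tools are only invoked after fixed Actions. As a result, the LLM's entire thinking and generation process follows a predetermined path.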