Transforming LLMs into Engineers, Architects, and Managers


Multi-agent systems based on Large Language Models (LLMs) offer exceptional opportunities for mimicking and improving human workflows. However, as recent studies have demonstrated, current systems often fall short of the complexity present in real-world applications. In particular, they struggle to sustain constructive collaboration through verbal and tool-based exchanges, which makes it difficult to generate coherent conversations, rein in counterproductive feedback loops, and facilitate fruitful collaborative interactions. Well-structured Standardized Operating Procedures (SOPs) are necessary for multifaceted processes to be effective, so a thorough awareness and integration of real-world practice is crucial.

Addressing these common constraints and incorporating these insights is important for improving the design and structure of LLM-based multi-agent systems and increasing their efficacy and applicability. Moreover, through extensive collective practice, people have developed SOPs that are widely recognized across various fields. These SOPs are essential for effective work breakdown and coordination. For instance, the waterfall process in software engineering establishes a logical sequence of requirements analysis, system design, coding, testing, and delivery.

This consensus workflow lets several engineers work together productively. Moreover, human jobs carry specialized knowledge suited to their duties: software engineers apply their programming skills to write code, while product managers use market research to identify customer demands. Without such structure, collaboration drifts from the expected outputs and becomes disorganized. For instance, product managers must conduct thorough competitive analyses of user needs, market trends, and rival products to drive development, and then distill these analyses into Product Requirements Documents (PRDs) with a clear, standardized format and prioritized goals.

These normative artifacts crystallize communal understanding and are essential for advancing complicated, multi-role undertakings that call for interdependent contributions; structured documentation, reports, and graphics showing dependencies are therefore crucial. In this study, researchers from DeepWisdom, Xiamen University, The Chinese University of Hong Kong, Shenzhen, Nanjing University, the University of Pennsylvania, and the University of California, Berkeley introduce MetaGPT, a ground-breaking multi-agent framework that builds in practical knowledge based on SOPs. First, each agent is identified by a job title that describes its duties, which lets the system initialize with the proper role-specific prompt prefix; rather than relying on clumsy role-playing cues, this bakes domain knowledge into the agent definitions. Second, the authors examine effective human processes to extract SOPs carrying the procedural knowledge necessary for group projects.
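As a rough illustration of the role-definition idea, the sketch below shows how a job title and duties could be compiled into a role-specific prompt prefix. The class and field names (`Role`, `profile`, `goal`, `constraints`) are assumptions made for this example, not MetaGPT's actual API.

```python
from dataclasses import dataclass

@dataclass
class Role:
    """Hypothetical role-based agent: the job title and duties are baked
    into a prompt prefix instead of ad-hoc role-playing cues."""
    name: str         # e.g. "Alice"
    profile: str      # job title, e.g. "Product Manager"
    goal: str         # what this role is responsible for delivering
    constraints: str  # standards the role's outputs must satisfy

    def prompt_prefix(self) -> str:
        # Role-specific prefix prepended to every LLM call this agent makes.
        return (
            f"You are a {self.profile} named {self.name}.\n"
            f"Your goal: {self.goal}\n"
            f"Constraints: {self.constraints}\n"
        )

pm = Role(
    name="Alice",
    profile="Product Manager",
    goal="Produce a clear, prioritized Product Requirements Document (PRD)",
    constraints="Follow the team's standardized PRD template",
)
print(pm.prompt_prefix())
```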

These SOPs are codified as role-based action specifications in the agent architecture. Third, to facilitate information exchange, agents produce standardized action outputs; by formalizing the artifacts that human experts exchange, MetaGPT streamlines coordination between interdependent tasks. Agents are connected by a shared environment that offers visibility into one another's activities along with shared access to tools and resources, and all inter-agent communication flows through it. The framework also provides a global memory pool that stores every collaboration record, so any agent can subscribe to or search for the information it needs and retrieve previous messages for additional context.
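A minimal sketch of such a shared memory pool follows, assuming a simple publish/subscribe design keyed by the role that produced each message; the names (`MessagePool`, `publish`, `subscribe`, `search`) are illustrative, not MetaGPT's real interface.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable

@dataclass
class Message:
    sender: str   # role that produced the artifact, e.g. "Architect"
    content: str  # standardized action output, e.g. a PRD or design doc

class MessagePool:
    """Hypothetical global memory pool: every collaboration record is stored
    once, and agents pull what they need rather than receiving everything."""

    def __init__(self) -> None:
        self._records: list[Message] = []
        self._subscribers: dict[str, list[Callable[[Message], None]]] = defaultdict(list)

    def publish(self, msg: Message) -> None:
        self._records.append(msg)
        for callback in self._subscribers[msg.sender]:
            callback(msg)  # notify agents watching this role's outputs

    def subscribe(self, sender: str, callback: Callable[[Message], None]) -> None:
        # An agent registers interest in messages from a given role.
        self._subscribers[sender].append(callback)

    def search(self, keyword: str) -> list[Message]:
        # Any agent can retrieve past records for additional context.
        return [m for m in self._records if keyword in m.content]
```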

In contrast to passively absorbing information through dialogue, this architecture lets agents observe and actively pull relevant information; the environment mimics the systems that encourage teamwork in real workplaces. To illustrate the efficacy of the architecture, the authors present collaborative software development workflows and accompanying code-generation experiments, spanning small games as well as more intricate, larger systems. Measured by lines of generated code, MetaGPT handles far more software complexity than GPT-3.5 or open-source frameworks such as AutoGPT and AgentVerse.
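Continuing the hypothetical pool above, this pull-based style might look as follows: an agent watches only the roles it depends on, yet can still search the full record on demand.

```python
pool = MessagePool()
engineer_inbox: list[Message] = []

# The engineer watches only the architect's standardized outputs ("pull"),
# instead of being handed every message in the conversation ("push").
pool.subscribe("Architect", engineer_inbox.append)

pool.publish(Message("Architect", "System design: REST interface spec"))
pool.publish(Message("ProductManager", "PRD v2: prioritized goals"))

assert len(engineer_inbox) == 1  # only the subscribed role's message arrived
print(pool.search("PRD"))        # yet everything stays searchable on demand
```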

Additionally, MetaGPT generates high-quality requirements documents, design artifacts, flowcharts, and interface specifications throughout the automated end-to-end process. These standardized intermediate outputs greatly increase the success rate of final code execution. Thanks to the automatically generated documentation, human developers can quickly absorb the relevant domain knowledge and further refine the requirements, designs, and code, which also enables more sophisticated human-AI interaction. Finally, the authors validate MetaGPT through extensive research on varied software projects.

The possibilities opened up by MetaGPT's role-based expert-agent collaboration paradigm are demonstrated through quantitative code-generation benchmarks and qualitative assessments of whole-process outputs. In summary, their main contributions are the following:

• They design a novel meta-programming mechanism that includes role definition, task decomposition, process standardization, and other technical designs (a rough sketch follows this list).

• They propose MetaGPT, an LLM-based multi-agent collaborative framework that encodes human SOPs into LLM agents and fundamentally extends their capability for complex problem-solving.

• They conduct extensive experiments on developing CRUD code, basic data analysis tasks, and Python games, comparing AutoGPT, AgentVerse, LangChain, and MetaGPT.
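As promised after the first bullet, here is a minimal end-to-end sketch of how the meta-programming mechanism might tie together, reusing the hypothetical `Role` class from earlier; the stage names, `call_llm` stub, and `run_pipeline` function are all illustrative assumptions, not MetaGPT's real interface.

```python
# Hypothetical waterfall-style SOP: each stage is a (role, action) pair whose
# standardized output becomes the next stage's input, mirroring requirements
# analysis -> design -> coding -> testing.
SOP = [
    ("ProductManager", "write_prd"),
    ("Architect", "write_design"),
    ("Engineer", "write_code"),
    ("QAEngineer", "write_tests"),
]

def call_llm(prefix: str, action: str, context: str) -> str:
    """Stub standing in for a real LLM call; returns a placeholder artifact."""
    return f"[{action} artifact derived from: {context[:40]}...]"

def run_pipeline(idea: str, agents: dict[str, Role]) -> dict[str, str]:
    artifacts = {"idea": idea}
    context = idea
    for role_name, action in SOP:
        prefix = agents[role_name].prompt_prefix()  # role-specific prompt prefix
        context = call_llm(prefix, action, context)
        artifacts[action] = context  # keep every standardized intermediate output
    return artifacts
```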

In this way, MetaGPT can create complex software by following the SOP. The overall findings show that MetaGPT significantly outperforms its competitors in both code quality and adherence to the expected process.