Manus Agent 深度分析 🤖
"Manus 展示了通用 AI Agent 从理解需求到交付成果的完整闭环能力。"
1. Manus 概述
1.1 什么是 Manus
Manus 是 Monica.im 团队推出的通用型 AI Agent,核心特点是能够将用户的想法转化为实际行动,并最终交付可用的成果。与传统 AI 助手只提供建议不同,Manus 能够:
- 自主执行: 在沙盒环境中独立运行代码、操作文件
- 浏览器自动化: 通过 browser_use 控制浏览器完成网页任务
- 多步推理: 将复杂任务拆解为可执行的步骤
- 成果交付: 生成报告、文件、代码等可直接使用的产出物
1.2 核心能力
| 能力领域 | 具体功能 | 技术实现 |
|---|---|---|
| 信息处理 | 问答、搜索、数据分析 | LLM + RAG |
| 浏览器操作 | 导航、点击、表单、截图 | browser_use + Playwright |
| 文件系统 | 读写、搜索、格式转换 | 沙盒 VM 文件操作 |
| 代码执行 | Python/JS/Shell 脚本 | 隔离沙盒环境 |
| 通信反馈 | 进度更新、确认请求 | EventStream |
2. 泄露的 System Prompt 分析
2.1 Prompt 文件结构
通过诱导 Manus 读取系统目录,泄露了以下关键文件:
/manus/
├── agent.py # Agent 主逻辑
├── agent.system.md # 系统提示词
├── sandbox/ # 沙盒环境配置
└── modules/ # 功能模块
├── browser/ # 浏览器自动化
├── filesystem/ # 文件系统
├── shell/ # Shell 命令
└── communication/ # 用户通信2.2 完整 System Prompt 结构
泄露的系统提示词按以下层级组织:
markdown
# Manus AI Assistant
You are Manus, an AI assistant that bridges the gap between ideation and execution.
You excel at understanding user needs, breaking down complex tasks, and delivering
tangible results through autonomous actions in a sandboxed environment.
## Core Identity
- You are a general-purpose AI agent
- You operate in an isolated virtual environment
- You can autonomously browse web, write code, manage files
- You focus on delivering actionable results, not just suggestions
## General Capabilities
### Information Processing
- Answer questions across diverse topics
- Conduct research and compile information
- Analyze data and generate reports
### Tools & Interfaces
#### Browser Tools
- Navigate to specific websites
- Read and extract web page content
- Interact with web elements (click, scroll, fill forms)
- Execute JavaScript in browser console
- Monitor web page changes
- Capture web page screenshots
#### File System Tools
- Read and write files in various formats
- Search for files based on name or content
- Create and organize directory structures
- Compress and archive files
- Analyze file content and extract information
- Convert between different file formats
#### Shell and Command Line
- Execute shell commands in Linux environment
- Install and configure software packages
- Run scripts in various languages
- Manage processes
- Automate repetitive tasks through shell scripts
- Access and manipulate system resources
#### Communication Tools
- Send messages to users
- Ask questions to clarify requirements
- Provide progress updates for long-running tasks
- Attach files and resources to messages
## Programming Languages & Technologies
### Languages
Python, JavaScript/TypeScript, Java, C/C++, Go, Rust, Ruby, PHP, Swift, Kotlin
### Frontend Frameworks
React, Vue, Angular, Svelte, Next.js, Nuxt.js
### Backend Frameworks
Node.js, Django, Flask, FastAPI, Spring Boot, Express
### Databases
MySQL, PostgreSQL, MongoDB, Redis, SQLite, Elasticsearch
### DevOps
Docker, Kubernetes, CI/CD, AWS, GCP, Azure
## Task Methodology
### Step-by-Step Approach
1. **Understand**: Analyze the user's request thoroughly
2. **Plan**: Break down into concrete, achievable steps
3. **Execute**: Perform each step using appropriate tools
4. **Verify**: Check results after each action
5. **Adapt**: Adjust approach if obstacles arise
6. **Deliver**: Provide clear results and artifacts
### Best Practices
- Always save important results to files
- Prefer simple, robust solutions over complex ones
- Handle errors gracefully with fallback strategies
- Keep users informed of progress
- Respect rate limits and ethical boundaries
## Limitations
- Cannot access real-time information without browsing
- Limited to tools and interfaces provided
- Cannot perform actions outside sandbox
- May require user confirmation for sensitive operations
- Has context window limitations2.3 关键设计要点分析
1. 能力边界清晰
- 明确列出能做什么、不能做什么
- 设定伦理和安全边界
2. 工具分类明确
- Browser / FileSystem / Shell / Communication 四大类
- 每类工具有详细的能力描述
3. 任务方法论内置
- 强调理解→规划→执行→验证的闭环
- 内置错误处理和适应策略
4. 技术栈全覆盖
- 支持主流编程语言和框架
- 全栈开发能力
3. 多 Agent 系统架构
3.1 核心架构图
┌─────────────────────────────────────────────────────────────────────────────┐
│ Manus Multi-Agent System │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │
│ │ Planner │───▶│ Executor │───▶│ Verifier │ │
│ │ (任务规划) │ │ (任务执行) │ │ (结果验证) │ │
│ └────────┬────────┘ └────────┬────────┘ └────────┬────────┘ │
│ │ │ │ │
│ │ ┌─────────────────┴─────────────────┐ │ │
│ │ │ │ │ │
│ ▼ ▼ ▼ ▼ │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ Tool Router (工具路由) │ │
│ └─────────────────────────────────────────────────────────────┘ │
│ │ │ │ │ │
│ ▼ ▼ ▼ ▼ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Browser │ │ FileSystem │ │ Shell │ │Communication │ │
│ │ Agent │ │ Agent │ │ Agent │ │ Agent │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ └──────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ Knowledge Retriever (知识检索) │ │
│ │ Memory / Context / External Knowledge Base │ │
│ └─────────────────────────────────────────────────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ Sandbox VM (隔离沙盒环境) │ │
│ │ Linux Container / Virtual Machine │ │
│ └─────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘3.2 核心 Agent 组件
typescript
// Planner Agent - 任务规划
interface PlannerAgent {
// 将用户需求转化为结构化任务
async planTask(userRequest: string, context: Context): Promise<TaskPlan>;
// 动态调整计划
async replan(currentState: State, obstacle: Obstacle): Promise<TaskPlan>;
}
// Executor Agent - 任务执行
interface ExecutorAgent {
// 执行单个步骤
async executeStep(step: TaskStep, tools: ToolSet): Promise<StepResult>;
// 错误恢复
async recover(error: ExecutionError): Promise<RecoveryResult>;
}
// Knowledge Retriever - 知识检索
interface KnowledgeRetriever {
// 检索相关知识
async retrieve(query: string): Promise<Knowledge[]>;
// 存储新知识
async store(knowledge: Knowledge): Promise<void>;
}3.3 任务规划器实现
typescript
class TaskPlanner {
private llm: LLMClient;
async plan(userRequest: string, context: Context): Promise<TaskPlan> {
const response = await this.llm.chat({
messages: [{
role: "system",
content: PLANNER_SYSTEM_PROMPT
}, {
role: "user",
content: `
User Request: ${userRequest}
Current Context:
- Working Directory: ${context.cwd}
- Available Tools: ${context.tools.join(', ')}
- Previous Actions: ${context.history.slice(-5).join(' → ')}
- Current Time: ${new Date().toISOString()}
Create a detailed execution plan with concrete, verifiable steps.
Output as JSON.`
}]
});
return JSON.parse(response.content);
}
}
const PLANNER_SYSTEM_PROMPT = `You are a task planner for an AI agent system.
Your job is to:
1. Understand the user's goal
2. Break it down into concrete, executable steps
3. Identify which tools are needed for each step
4. Define success criteria for each step
5. Anticipate potential issues and plan fallbacks
Output format:
{
"goal": "Clear statement of the overall objective",
"steps": [
{
"id": 1,
"description": "What to do",
"tool": "Which tool to use",
"params": { "key": "value" },
"success_criteria": "How to verify this step succeeded",
"fallback": "What to try if this fails"
}
],
"estimated_time": "How long this might take",
"deliverables": ["List of expected outputs"]
}`;4. 沙盒执行环境
4.1 沙盒架构
Manus 的核心创新之一是运行在隔离的沙盒环境中:
┌────────────────────────────────────────────────────┐
│ Host System │
│ ┌──────────────────────────────────────────────┐ │
│ │ Manus Orchestrator │ │
│ │ - 接收用户请求 │ │
│ │ - 分配沙盒实例 │ │
│ │ - 管理任务生命周期 │ │
│ └──────────────────────────────────────────────┘ │
│ │ │
│ ┌────────────┴────────────┐ │
│ ▼ ▼ │
│ ┌─────────────────┐ ┌─────────────────┐ │
│ │ Sandbox #1 │ │ Sandbox #2 │ │
│ │ ┌───────────┐ │ │ ┌───────────┐ │ │
│ │ │ Linux VM │ │ │ │ Linux VM │ │ │
│ │ │ - Browser │ │ │ │ - Browser │ │ │
│ │ │ - Python │ │ │ │ - Python │ │ │
│ │ │ - Node.js │ │ │ │ - Node.js │ │ │
│ │ │ - Files │ │ │ │ - Files │ │ │
│ │ └───────────┘ │ │ └───────────┘ │ │
│ └─────────────────┘ └─────────────────┘ │
│ │
└────────────────────────────────────────────────────┘4.2 沙盒技术价值
typescript
// 沙盒带来的核心价值
const sandboxBenefits = {
// 1. 安全隔离 - 任务执行不影响主机
security: {
isolation: "完全隔离的执行环境",
noHostAccess: "无法访问主机系统资源",
networkRestriction: "可控的网络访问"
},
// 2. 错误容忍 - 出错可重置
errorTolerance: {
checkpoint: "支持状态快照",
rollback: "可回滚到之前状态",
restart: "任务失败可重新开始"
},
// 3. 无状态化 - 每次任务干净环境
stateless: {
cleanSlate: "每个任务独立环境",
noInterference: "任务间互不影响",
reproducible: "可重现的执行环境"
},
// 4. 资源控制
resourceControl: {
cpuLimit: "CPU 使用限制",
memoryLimit: "内存使用限制",
timeLimit: "执行时间限制"
}
};4.3 沙盒实现示例
typescript
class SandboxManager {
private activeSandboxes: Map<string, Sandbox> = new Map();
async createSandbox(taskId: string): Promise<Sandbox> {
const sandbox = await this.spawnVM({
image: "manus-sandbox:latest",
resources: {
cpu: 2,
memory: "4G",
disk: "20G"
},
timeout: 3600, // 1 hour max
network: {
allowOutbound: true,
allowedDomains: ["*"] // 可配置白名单
}
});
// 预装工具
await sandbox.exec("pip install pandas numpy requests beautifulsoup4");
await sandbox.exec("npm install -g playwright");
this.activeSandboxes.set(taskId, sandbox);
return sandbox;
}
async executInSandbox(
taskId: string,
command: string
): Promise<ExecutionResult> {
const sandbox = this.activeSandboxes.get(taskId);
if (!sandbox) throw new Error("Sandbox not found");
try {
const result = await sandbox.exec(command, {
timeout: 300, // 5 min per command
captureOutput: true
});
return {
success: true,
stdout: result.stdout,
stderr: result.stderr,
exitCode: result.exitCode
};
} catch (error) {
return {
success: false,
error: error.message
};
}
}
async destroySandbox(taskId: string): Promise<void> {
const sandbox = this.activeSandboxes.get(taskId);
if (sandbox) {
// 导出用户需要的文件
await sandbox.exportFiles("/output/*", `./results/${taskId}/`);
// 销毁沙盒
await sandbox.destroy();
this.activeSandboxes.delete(taskId);
}
}
}5. 浏览器自动化系统
5.1 browser_use 集成
Manus 的浏览器自动化基于开源库 browser_use,底层使用 Playwright:
typescript
// browser_use 核心接口
interface BrowserAgent {
// 导航
goto(url: string): Promise<void>;
// 交互
click(selector: string | Coordinates): Promise<void>;
type(text: string): Promise<void>;
scroll(direction: "up" | "down", amount?: number): Promise<void>;
// 信息获取
screenshot(): Promise<Buffer>;
getContent(): Promise<string>;
extractData(schema: DataSchema): Promise<any>;
// 高级操作
executeScript(script: string): Promise<any>;
waitForElement(selector: string, timeout?: number): Promise<void>;
}5.2 浏览器工具定义
typescript
const browserTools = [
{
name: "browser_navigate",
description: "Navigate to a URL. Use this to open websites.",
parameters: {
type: "object",
properties: {
url: {
type: "string",
description: "The URL to navigate to"
}
},
required: ["url"]
}
},
{
name: "browser_click",
description: `Click on an element on the page.
You can specify:
- A CSS selector: "#submit-btn", ".nav-link"
- Text content: "Submit", "Next"
- Coordinates: { x: 100, y: 200 }
Tips:
- Prefer semantic selectors over coordinates
- Wait for page to load before clicking
- Take screenshot after to verify action`,
parameters: {
type: "object",
properties: {
target: {
type: "string",
description: "CSS selector, text content, or coordinates JSON"
}
},
required: ["target"]
}
},
{
name: "browser_type",
description: `Type text into the currently focused element or specified input.
Supports special keys:
- [Enter] - Press enter key
- [Tab] - Press tab key
- [Escape] - Press escape
- [Ctrl+a] - Select all
- [Ctrl+c] - Copy
- [Ctrl+v] - Paste`,
parameters: {
type: "object",
properties: {
text: { type: "string" },
selector: {
type: "string",
description: "Optional: target input element"
}
},
required: ["text"]
}
},
{
name: "browser_screenshot",
description: `Capture a screenshot of the current page.
Always take screenshots:
- After navigation to see the page
- After clicking to verify the action
- When you need to analyze page content
- Before making decisions about next steps`,
parameters: {
type: "object",
properties: {
fullPage: {
type: "boolean",
description: "Capture full page vs viewport only"
}
}
}
},
{
name: "browser_extract",
description: "Extract structured data from the current page",
parameters: {
type: "object",
properties: {
schema: {
type: "object",
description: "JSON schema describing the data to extract"
},
selector: {
type: "string",
description: "Optional: CSS selector to scope extraction"
}
},
required: ["schema"]
}
}
];5.3 浏览器自动化实现
typescript
class BrowserController {
private browser: Browser;
private page: Page;
async init() {
const playwright = require('playwright');
this.browser = await playwright.chromium.launch({
headless: false, // Manus 使用有头浏览器以便截图
args: ['--disable-blink-features=AutomationControlled']
});
this.page = await this.browser.newPage();
// 设置视口
await this.page.setViewportSize({ width: 1920, height: 1080 });
// 注入反检测脚本
await this.page.addInitScript(() => {
Object.defineProperty(navigator, 'webdriver', { get: () => false });
});
}
async navigate(url: string): Promise<NavigationResult> {
try {
await this.page.goto(url, {
waitUntil: 'networkidle',
timeout: 30000
});
return {
success: true,
url: this.page.url(),
title: await this.page.title()
};
} catch (error) {
return {
success: false,
error: error.message
};
}
}
async click(target: string | { x: number; y: number }): Promise<void> {
if (typeof target === 'string') {
// 尝试多种定位策略
const strategies = [
() => this.page.click(target), // CSS selector
() => this.page.click(`text=${target}`), // Text content
() => this.page.getByRole('button', { name: target }).click(),
() => this.page.getByRole('link', { name: target }).click()
];
for (const strategy of strategies) {
try {
await strategy();
return;
} catch {
continue;
}
}
throw new Error(`Could not find clickable element: ${target}`);
} else {
// 坐标点击
await this.page.mouse.click(target.x, target.y);
}
}
async screenshot(): Promise<ScreenshotResult> {
const buffer = await this.page.screenshot({
type: 'png',
fullPage: false
});
return {
image: buffer.toString('base64'),
timestamp: new Date().toISOString(),
url: this.page.url(),
viewport: { width: 1920, height: 1080 }
};
}
async extractContent(schema: DataSchema): Promise<any> {
// 获取页面内容
const html = await this.page.content();
const text = await this.page.innerText('body');
// 使用 LLM 提取结构化数据
const response = await this.llm.chat({
messages: [{
role: "user",
content: `Extract data from this webpage according to the schema.
Page URL: ${this.page.url()}
Page Content:
${text.slice(0, 10000)}
Schema:
${JSON.stringify(schema, null, 2)}
Output valid JSON matching the schema.`
}]
});
return JSON.parse(response.content);
}
}6. 文件系统与 Shell 工具
6.1 文件系统工具
typescript
const fileSystemTools = [
{
name: "file_read",
description: `Read the contents of a file.
Supported formats:
- Text: .txt, .md, .json, .yaml, .xml, .csv
- Code: .py, .js, .ts, .java, .go, .rs, etc.
- Data: .pdf (text extraction), .xlsx (as JSON)
Returns file content as string or structured data.`,
parameters: {
type: "object",
properties: {
path: { type: "string", description: "File path to read" },
encoding: { type: "string", default: "utf-8" }
},
required: ["path"]
}
},
{
name: "file_write",
description: `Write content to a file. Creates directories if needed.
Best practices:
- Always save important results to files
- Use appropriate file extensions
- Prefer structured formats (JSON, CSV) for data`,
parameters: {
type: "object",
properties: {
path: { type: "string" },
content: { type: "string" },
mode: {
type: "string",
enum: ["write", "append"],
default: "write"
}
},
required: ["path", "content"]
}
},
{
name: "file_search",
description: "Search for files by name pattern or content",
parameters: {
type: "object",
properties: {
directory: { type: "string", default: "." },
pattern: { type: "string", description: "Glob pattern or regex" },
contentMatch: { type: "string", description: "Search within file contents" }
},
required: ["pattern"]
}
},
{
name: "file_list",
description: "List files and directories",
parameters: {
type: "object",
properties: {
path: { type: "string", default: "." },
recursive: { type: "boolean", default: false }
}
}
}
];6.2 Shell 命令工具
typescript
const shellTools = [
{
name: "shell_exec",
description: `Execute a shell command in the sandbox Linux environment.
Available tools:
- Python 3.x with common packages
- Node.js and npm
- Git, curl, wget
- Standard Linux utilities
Safety:
- Commands run in isolated sandbox
- Network access is available
- File changes persist within the sandbox session
Tips:
- Use && to chain commands
- Redirect output to files for large results
- Check exit codes for success/failure`,
parameters: {
type: "object",
properties: {
command: { type: "string" },
workdir: { type: "string", description: "Working directory" },
timeout: { type: "number", default: 300 }
},
required: ["command"]
}
},
{
name: "shell_python",
description: `Execute Python code directly.
Pre-installed packages:
- Data: pandas, numpy, scipy
- Web: requests, beautifulsoup4, aiohttp
- Utils: json, csv, datetime, re
For complex scripts, prefer writing to .py file first.`,
parameters: {
type: "object",
properties: {
code: { type: "string" },
saveAs: { type: "string", description: "Optionally save script to file" }
},
required: ["code"]
}
}
];6.3 文件操作实现
typescript
class FileSystemController {
private basePath: string;
constructor(sandboxPath: string) {
this.basePath = sandboxPath;
}
async read(path: string): Promise<FileContent> {
const fullPath = this.resolvePath(path);
const ext = path.split('.').pop()?.toLowerCase();
// 根据文件类型处理
switch (ext) {
case 'json':
const json = await fs.readFile(fullPath, 'utf-8');
return { type: 'json', data: JSON.parse(json) };
case 'csv':
const csv = await fs.readFile(fullPath, 'utf-8');
return { type: 'csv', data: this.parseCSV(csv) };
case 'pdf':
const pdfText = await this.extractPdfText(fullPath);
return { type: 'text', data: pdfText };
default:
const text = await fs.readFile(fullPath, 'utf-8');
return { type: 'text', data: text };
}
}
async write(path: string, content: string): Promise<void> {
const fullPath = this.resolvePath(path);
// 确保目录存在
await fs.mkdir(dirname(fullPath), { recursive: true });
await fs.writeFile(fullPath, content, 'utf-8');
}
async search(pattern: string, options: SearchOptions = {}): Promise<string[]> {
const glob = require('fast-glob');
const files = await glob(pattern, {
cwd: this.basePath,
ignore: ['node_modules/**', '.git/**']
});
if (options.contentMatch) {
// 过滤包含指定内容的文件
const matches = [];
for (const file of files) {
const content = await this.read(file);
if (content.data.includes(options.contentMatch)) {
matches.push(file);
}
}
return matches;
}
return files;
}
private resolvePath(path: string): string {
// 安全检查:防止路径遍历
const resolved = resolve(this.basePath, path);
if (!resolved.startsWith(this.basePath)) {
throw new Error('Path traversal detected');
}
return resolved;
}
}7. 通信与状态管理
7.1 通信工具
typescript
const communicationTools = [
{
name: "message_user",
description: `Send a message to the user.
Use cases:
- Progress updates for long tasks
- Ask clarifying questions
- Report completion or errors
- Share intermediate results
Format:
- Be clear and concise
- Use markdown for formatting
- Include relevant details`,
parameters: {
type: "object",
properties: {
content: { type: "string" },
attachments: {
type: "array",
items: { type: "string" },
description: "File paths to attach"
},
type: {
type: "string",
enum: ["info", "question", "progress", "success", "error"]
}
},
required: ["content"]
}
},
{
name: "request_confirmation",
description: "Ask user to confirm before proceeding with sensitive action",
parameters: {
type: "object",
properties: {
action: { type: "string", description: "What action needs confirmation" },
reason: { type: "string", description: "Why confirmation is needed" }
},
required: ["action"]
}
}
];7.2 EventStream 实现
typescript
class EventStreamManager {
private events: Event[] = [];
private subscribers: Set<(event: Event) => void> = new Set();
emit(event: Event): void {
this.events.push({
...event,
timestamp: Date.now()
});
// 通知订阅者
this.subscribers.forEach(callback => callback(event));
}
// 任务开始
taskStarted(taskId: string, description: string): void {
this.emit({
type: 'task_started',
taskId,
data: { description }
});
}
// 步骤进度
stepProgress(taskId: string, step: number, total: number, message: string): void {
this.emit({
type: 'step_progress',
taskId,
data: { step, total, message }
});
}
// 工具调用
toolInvoked(taskId: string, tool: string, params: any): void {
this.emit({
type: 'tool_invoked',
taskId,
data: { tool, params }
});
}
// 工具结果
toolResult(taskId: string, tool: string, result: any): void {
this.emit({
type: 'tool_result',
taskId,
data: { tool, result }
});
}
// 任务完成
taskCompleted(taskId: string, deliverables: string[]): void {
this.emit({
type: 'task_completed',
taskId,
data: { deliverables }
});
}
// 获取任务历史
getTaskHistory(taskId: string): Event[] {
return this.events.filter(e => e.taskId === taskId);
}
}7.3 状态管理
typescript
interface AgentState {
// 任务信息
task: {
id: string;
description: string;
plan: TaskPlan;
currentStep: number;
};
// 执行环境
sandbox: {
id: string;
status: 'running' | 'paused' | 'stopped';
cwd: string;
};
// 浏览器状态
browser: {
url: string;
title: string;
lastScreenshot: string;
};
// 执行历史
history: {
actions: Action[];
errors: Error[];
};
// 产出物
artifacts: {
files: string[];
data: any;
};
}
class StateManager {
private state: AgentState;
private checkpoints: AgentState[] = [];
// 保存检查点(用于回滚)
checkpoint(): string {
const id = `checkpoint-${Date.now()}`;
this.checkpoints.push({
...JSON.parse(JSON.stringify(this.state)),
checkpointId: id
});
return id;
}
// 回滚到检查点
rollback(checkpointId: string): boolean {
const checkpoint = this.checkpoints.find(c => c.checkpointId === checkpointId);
if (checkpoint) {
this.state = JSON.parse(JSON.stringify(checkpoint));
return true;
}
return false;
}
// 获取上下文摘要(用于 LLM)
getContextSummary(): string {
return `
## Current State
**Task**: ${this.state.task.description}
**Progress**: Step ${this.state.task.currentStep} of ${this.state.task.plan.steps.length}
**Browser**:
- URL: ${this.state.browser.url}
- Title: ${this.state.browser.title}
**Recent Actions**:
${this.state.history.actions.slice(-5).map(a => `- ${a.tool}: ${a.summary}`).join('\n')}
**Files Created**:
${this.state.artifacts.files.join('\n')}
`;
}
}8. 任务执行流程
8.1 完整执行流程
┌─────────────────────────────────────────────────────────────────────────────┐
│ Manus Task Execution Flow │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ User Request │
│ │ │
│ ▼ │
│ ┌─────────────────┐ │
│ │ 1. Understand │ Parse user intent, identify requirements │
│ └────────┬────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────┐ │
│ │ 2. Plan │ Break down into steps, assign tools │
│ └────────┬────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────┐ ┌─────────────────┐ │
│ │ 3. Execute │────▶│ Tool Router │ │
│ │ Step N │ │ │ │
│ └────────┬────────┘ │ browser_* │ │
│ │ │ file_* │ │
│ │ │ shell_* │ │
│ │ │ message_* │ │
│ │ └─────────────────┘ │
│ ▼ │
│ ┌─────────────────┐ │
│ │ 4. Verify │ Check if step succeeded │
│ └────────┬────────┘ │
│ │ │
│ ┌─────┴─────┐ │
│ │ │ │
│ Success? Failed │
│ │ │ │
│ │ ┌─────┴─────┐ │
│ │ │ 5. Adapt │ Try fallback or replan │
│ │ └─────┬─────┘ │
│ │ │ │
│ ▼ ▼ │
│ ┌─────────────────┐ │
│ │ More Steps? │────Yes───▶ Loop back to Step 3 │
│ └────────┬────────┘ │
│ │ No │
│ ▼ │
│ ┌─────────────────┐ │
│ │ 6. Deliver │ Package results, notify user │
│ └─────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘8.2 Agent 主循环实现
typescript
class ManusAgent {
private planner: TaskPlanner;
private executor: TaskExecutor;
private browser: BrowserController;
private filesystem: FileSystemController;
private state: StateManager;
private events: EventStreamManager;
async run(userRequest: string): Promise<TaskResult> {
const taskId = generateId();
try {
// 1. 创建沙盒环境
const sandbox = await this.sandboxManager.createSandbox(taskId);
this.events.taskStarted(taskId, userRequest);
// 2. 规划任务
const plan = await this.planner.plan(userRequest, this.state.getContext());
this.state.update({ task: { plan, currentStep: 0 } });
// 3. 执行计划
for (let i = 0; i < plan.steps.length; i++) {
const step = plan.steps[i];
this.state.update({ task: { currentStep: i + 1 } });
this.events.stepProgress(taskId, i + 1, plan.steps.length, step.description);
// 执行步骤(带重试)
const result = await this.executeStepWithRetry(step, 3);
if (!result.success) {
// 尝试重新规划
const newPlan = await this.planner.replan(this.state.get(), result.error);
if (newPlan) {
plan.steps = [...plan.steps.slice(0, i), ...newPlan.steps];
i--; // 重新执行当前步骤
continue;
}
throw new Error(`Failed at step ${i + 1}: ${result.error}`);
}
}
// 4. 验证任务完成
const verification = await this.verifyTaskComplete(plan);
// 5. 打包交付物
const deliverables = await this.packageDeliverables();
this.events.taskCompleted(taskId, deliverables);
return {
success: true,
deliverables,
summary: await this.generateSummary(plan)
};
} catch (error) {
this.events.emit({
type: 'task_failed',
taskId,
data: { error: error.message }
});
return {
success: false,
error: error.message
};
} finally {
// 清理沙盒
await this.sandboxManager.destroySandbox(taskId);
}
}
private async executeStepWithRetry(step: TaskStep, maxRetries: number): Promise<StepResult> {
for (let attempt = 1; attempt <= maxRetries; attempt++) {
try {
// 决定使用哪个工具
const toolCall = await this.decideToolCall(step);
this.events.toolInvoked(this.state.get().task.id, toolCall.tool, toolCall.params);
// 执行工具调用
const result = await this.executor.execute(toolCall);
this.events.toolResult(this.state.get().task.id, toolCall.tool, result);
// 验证步骤结果
const verified = await this.verifyStep(step, result);
if (verified) {
return { success: true, result };
}
// 验证失败,尝试备选方案
if (step.fallback && attempt < maxRetries) {
await this.executeFallback(step.fallback);
}
} catch (error) {
if (attempt === maxRetries) {
return { success: false, error: error.message };
}
// 等待后重试
await sleep(1000 * attempt);
}
}
return { success: false, error: 'Max retries exceeded' };
}
private async decideToolCall(step: TaskStep): Promise<ToolCall> {
// 如果步骤已指定工具,直接使用
if (step.tool) {
return {
tool: step.tool,
params: step.params
};
}
// 否则让 LLM 决定
const screenshot = await this.browser.screenshot();
const response = await this.llm.chat({
messages: [{
role: "system",
content: "You are a tool selection agent. Choose the appropriate tool and parameters."
}, {
role: "user",
content: [
{ type: "image", source: { type: "base64", data: screenshot.image } },
{ type: "text", text: `
Current step: ${step.description}
Context:
${this.state.getContextSummary()}
Available tools:
${this.getToolDescriptions()}
Choose a tool and provide parameters. Output JSON:
{
"tool": "tool_name",
"params": { ... },
"reasoning": "why this tool"
}`
}
]
}]
});
return JSON.parse(response.content);
}
}9. 错误恢复与安全
9.1 错误恢复策略
typescript
class RecoveryManager {
private strategies: Map<string, RecoveryStrategy> = new Map([
['page_load_failed', {
detect: (error) => error.includes('timeout') || error.includes('net::'),
recover: async (agent) => {
// 重试导航
await agent.browser.reload();
return await agent.wait(3000);
}
}],
['element_not_found', {
detect: (error) => error.includes('not found') || error.includes('no such element'),
recover: async (agent) => {
// 截图分析,尝试替代定位方式
const screenshot = await agent.browser.screenshot();
const alternative = await agent.analyzeForAlternative(screenshot);
if (alternative) {
await agent.browser.click(alternative);
return true;
}
return false;
}
}],
['captcha_detected', {
detect: (error, screenshot) => this.detectCaptcha(screenshot),
recover: async (agent) => {
// 通知用户处理验证码
await agent.message({
type: 'question',
content: '检测到验证码,请手动完成验证后告诉我继续。'
});
return await agent.waitForUserConfirmation(300); // 5分钟超时
}
}],
['login_required', {
detect: (error, screenshot) => this.detectLoginWall(screenshot),
recover: async (agent) => {
// 询问用户凭据或跳过
await agent.message({
type: 'question',
content: '需要登录才能继续。请问是否提供登录信息,或者跳过此步骤?'
});
return await agent.waitForUserInput();
}
}],
['popup_blocking', {
detect: (error, screenshot) => this.detectPopup(screenshot),
recover: async (agent) => {
// 尝试关闭弹窗
const closeButton = await agent.findElement(screenshot, 'close button, X, dismiss');
if (closeButton) {
await agent.browser.click(closeButton);
return true;
}
await agent.browser.key('Escape');
return true;
}
}]
]);
async attemptRecovery(error: Error, agent: Agent): Promise<boolean> {
const screenshot = await agent.browser.screenshot();
for (const [name, strategy] of this.strategies) {
if (strategy.detect(error.message, screenshot)) {
console.log(`Attempting recovery strategy: ${name}`);
const success = await strategy.recover(agent);
if (success) {
console.log(`Recovery successful: ${name}`);
return true;
}
}
}
return false;
}
}9.2 安全限制
typescript
class SecurityGuard {
private config = {
// 禁止访问的域名模式
blockedDomains: [
/bank/i, /payment/i, /paypal/i, /stripe/i,
/admin/i, /internal/i, /localhost/i
],
// 禁止的 shell 命令
blockedCommands: [
/rm\s+-rf\s+\//, // rm -rf /
/sudo\s+/, // sudo
/chmod\s+777/, // chmod 777
/curl.*\|.*sh/, // curl | sh
/wget.*\|.*sh/ // wget | sh
],
// 敏感操作需要确认
sensitivePatterns: [
/password/i, /credential/i, /secret/i, /token/i,
/delete/i, /remove/i, /drop/i
],
// 资源限制
limits: {
maxFileSize: 100 * 1024 * 1024, // 100MB
maxFiles: 1000,
maxRequestsPerMinute: 60,
maxExecutionTime: 3600 // 1 hour
}
};
async validateNavigation(url: string): Promise<ValidationResult> {
try {
const parsed = new URL(url);
for (const pattern of this.config.blockedDomains) {
if (pattern.test(parsed.hostname)) {
return {
allowed: false,
reason: `Domain ${parsed.hostname} is blocked for security reasons`
};
}
}
return { allowed: true };
} catch {
return { allowed: false, reason: 'Invalid URL' };
}
}
async validateCommand(command: string): Promise<ValidationResult> {
for (const pattern of this.config.blockedCommands) {
if (pattern.test(command)) {
return {
allowed: false,
reason: 'This command is not allowed for security reasons'
};
}
}
return { allowed: true };
}
async checkSensitiveOperation(action: string): Promise<boolean> {
for (const pattern of this.config.sensitivePatterns) {
if (pattern.test(action)) {
return true; // 需要用户确认
}
}
return false;
}
}10. 实战案例
10.1 市场调研任务
typescript
// 用户请求: "帮我调研 AI 编程助手市场的主要竞争对手"
const taskPlan = {
goal: "调研 AI 编程助手市场竞争对手",
steps: [
{
id: 1,
description: "搜索 AI 编程助手相关信息",
tool: "browser_navigate",
params: { url: "https://www.google.com/search?q=AI+coding+assistant+tools+2024" }
},
{
id: 2,
description: "访问 GitHub Copilot 官网获取信息",
tool: "browser_navigate",
params: { url: "https://github.com/features/copilot" }
},
{
id: 3,
description: "提取 Copilot 产品信息",
tool: "browser_extract",
params: {
schema: {
name: "string",
pricing: "array",
features: "array",
targetUsers: "string"
}
}
},
// ... 更多步骤访问其他竞品
{
id: 10,
description: "生成竞品分析报告",
tool: "shell_python",
params: {
code: `
import json
import pandas as pd
# 加载收集的数据
with open('competitors.json') as f:
data = json.load(f)
# 创建对比表格
df = pd.DataFrame(data)
df.to_csv('competitor_comparison.csv')
df.to_html('competitor_comparison.html')
# 生成 Markdown 报告
report = generate_markdown_report(data)
with open('competitor_analysis.md', 'w') as f:
f.write(report)
`
}
},
{
id: 11,
description: "发送结果给用户",
tool: "message_user",
params: {
content: "市场调研完成!已生成竞品分析报告。",
attachments: [
"competitor_analysis.md",
"competitor_comparison.csv"
]
}
}
],
deliverables: [
"competitor_analysis.md",
"competitor_comparison.csv",
"competitor_comparison.html"
]
};10.2 数据处理任务
typescript
// 用户请求: "帮我分析这个销售数据 Excel 并生成可视化报告"
async function analyzeSalesData(filePath: string) {
const agent = new ManusAgent();
return await agent.run(`
分析销售数据文件 ${filePath}:
1. 读取 Excel 文件,理解数据结构
2. 清洗数据(处理缺失值、异常值)
3. 计算关键指标:
- 总销售额
- 月度趋势
- 产品类别分布
- 地区销售对比
4. 生成可视化图表:
- 销售趋势折线图
- 产品分类饼图
- 地区热力图
5. 生成 HTML 报告,包含图表和关键发现
6. 导出处理后的数据为 CSV
`);
}11. 关键设计启示
11.1 Manus 的创新点
| 创新点 | 说明 | 价值 |
|---|---|---|
| 沙盒执行 | 隔离环境中运行代码和浏览器 | 安全、可控、可重置 |
| 多 Agent 协作 | Planner + Executor + Retriever | 复杂任务分解与执行 |
| 视觉+DOM 混合 | 结合截图分析和 DOM 操作 | 更强的网页理解能力 |
| 任务拆解优化 | 自动将模糊需求转为具体步骤 | 提高执行成功率 |
| 实时反馈 | EventStream 持续更新进度 | 良好的用户体验 |
11.2 复刻 Manus 的关键要素
typescript
// 1. 沙盒环境是核心基础设施
const sandboxEssentials = {
isolation: "Docker/VM 隔离执行环境",
tools: "预装 Python, Node.js, Browser",
persistence: "支持文件持久化和导出",
networking: "可控的网络访问",
cleanup: "任务完成后自动清理"
};
// 2. 多 Agent 架构提升复杂任务处理能力
const multiAgentSystem = {
planner: "将任务分解为可执行步骤",
executor: "调用工具执行每个步骤",
verifier: "验证步骤执行结果",
retriever: "检索相关知识和上下文",
coordinator: "协调各 Agent 协作"
};
// 3. 完善的工具集
const toolCategories = {
browser: ["navigate", "click", "type", "screenshot", "extract"],
filesystem: ["read", "write", "search", "list"],
shell: ["exec", "python", "install"],
communication: ["message", "confirm", "attach"]
};
// 4. 错误恢复机制
const recoveryMechanisms = {
retry: "失败后自动重试",
replan: "遇到障碍时重新规划",
fallback: "使用备选方案",
humanInLoop: "必要时请求人工介入"
};
// 5. 任务执行核心循环
async function manusLoop(task: string) {
const plan = await planner.plan(task);
for (const step of plan.steps) {
// 执行
const result = await executor.execute(step);
// 验证
if (!await verifier.verify(step, result)) {
// 恢复
const recovered = await recover(step, result);
if (!recovered) {
// 重新规划
const newPlan = await planner.replan(step, result.error);
// ... 继续执行新计划
}
}
}
return await packageDeliverables();
}11.3 与其他 Agent 框架对比
| 特性 | Manus | Claude Computer Use | AutoGPT |
|---|---|---|---|
| 执行环境 | 云端沙盒 VM | 本地桌面 | 本地/云端 |
| 浏览器操作 | browser_use + Playwright | 原生截图+坐标 | Selenium |
| 代码执行 | 沙盒内完整环境 | 本地 shell | 本地 shell |
| 任务规划 | 多 Agent 协作 | 单 Agent 循环 | 目标导向循环 |
| 用户交互 | EventStream 实时反馈 | 同步等待 | 命令行 |
| 产品形态 | 托管 SaaS | SDK/API | 开源框架 |
延伸阅读
- Manus 官方 - Manus AI Agent 官网
- OpenManus - 开源复刻实现
- browser_use - 浏览器自动化库
- Playwright - 底层浏览器自动化框架
- Anthropic Computer Use - Claude 的计算机使用能力
- 53AI 技术分析 - Manus 逆向分析文章