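edit_file: edits file contents. Surprisingly, this core editing feature is implemented in an extremely basic way: it is nothing more than string replacement!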
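Yet this seemingly "crude" agent successfully completed tasks in the experiments such as creating a JavaScript file, modifying code logic, and decoding a string, demonstrating its ability to plan on its own and call tools.
5. Implement OpenAI's Tool Calling specification, including function definitions, argument passing, and result return.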
$ export OPENAI_API_KEY=<your_deepseek_api_key>
$ export OPENAI_API_BASE=https://api.deepseek.com
$ export OPENAI_MODEL=deepseek-chat

Experiment 0: Basic chat (verifying the connection)
// Simplified flow within Agent.Run for basic chat
func (a *Agent) Run(ctx context.Context) error {
    // ... setup ...
    conversation := []OpenAIChatCompletionMessage{ /* system prompt */ }

    for { // Outer loop for user input
        // ... get userInput from console ...
        conversation = append(conversation, OpenAIChatCompletionMessage{Role: "user", Content: userInput})

        // --- Call API ---
        resp, err := a.callOpenAICompletion(ctx, conversation)
        if err != nil {
            fmt.Printf("\u001b[91mAPI Error\u001b[0m: %s\n", err.Error())
            continue // Let user try again
        }
        if len(resp.Choices) == 0 {
            /* handle no choices */
            continue
        }

        assistantMessage := resp.Choices[0].Message
        conversation = append(conversation, assistantMessage) // Add response to history

        // --- Print Text Response ---
        if assistantMessage.Content != "" {
            fmt.Printf("\u001b[93mAI\u001b[0m: %s\n", assistantMessage.Content)
        }

        // --- Tool Handling Logic would go here, but skipped for basic chat ---
        // In a basic chat without tool calls, the inner loop (if any) breaks immediately.
    } // End of outer loop
    return nil
}

Explanation: This step mainly tests whether the callOpenAICompletion function can package the conversation history, send the HTTP request to the API endpoint, and receive a valid text response, which the Run method then prints.
$ ./agent
Chat with AI (use 'ctrl-c' to quit)
You: Hey! I'm Tony! How are you?
AI: Hi Tony! I'm just a program, so I don't have feelings, but I'm here and ready to help you with anything you need. How can I assist you today?

Experiments 1 & 2: The read_file tool (reading files)
type ReadFileInput struct { // Defines the input structure for the tool
    Path string `json:"path" jsonschema_description:"The relative path..." jsonschema:"required"`
}

var ReadFileDefinition = ToolDefinition{
    Name:        "read_file",
    Description: "Read the contents of a given relative file path...",
    // Generates {"type": "object", "properties": {"path": {"type": "string", ...}}, "required": ["path"]}
    InputSchema: GenerateSchema[ReadFileInput]("ReadFileInput"),
    Function:    ReadFile, // Links to the Go function below
}

Tool execution function (ReadFile): This Go function receives the argument supplied by the AI (the file path) and uses the standard library's os.ReadFile to actually read the file.
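func ReadFile(input json.RawMessage) (string, error) {
    readFileInput := ReadFileInput{}
    err := json.Unmarshal(input, &readFileInput) // Parse the JSON arguments from AI
    if err != nil || readFileInput.Path == "" { /* handle parse error or missing path */ }

    content, err := os.ReadFile(readFileInput.Path) // Use Go standard library to read file
    if err != nil { /* handle file read error */ }

    return string(content), nil // Return file content as a string
}

Explanation: When a user request involves file contents, the AI decides, based on the description in ReadFileDefinition, to call the read_file tool and supplies the path argument. The agent's Run loop catches this tool call request, looks up the corresponding ReadFile function, passes in the arguments, and executes it. The function reads the file and returns its contents as a string; that string is wrapped in a role: tool message and sent back to the AI, which then generates the final answer from the file contents.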
You: buddy, help me solve the riddle in the secret-file.txt file
Tool Call: list_files({})
Tool Call: read_file({"path":"secret-file.txt"})
AI: The answer to the riddle is a **horse**, because it always says "neigh" (which sounds like "nay," meaning disagreement). Let me know if you need help with anything else, Tony!

Steps (Experiment 2 - reading main.go):
You: What's going on in main.go? Be brief!
Tool Call: read_file({"path":"main.go"})
AI: The `main.go` file is a Go program that sets up an **AI agent** capable of interacting with the local filesystem (reading, listing, and editing files). Here's a brief breakdown:

1. **Purpose**:
   - The agent acts as a helper, responding to user requests by either providing text answers or using tools to interact with files.
2. **Key Features**:
   - **Tools**: It has three built-in tools:
     - `read_file`: Reads file contents.
     - `list_files`: Lists files/directories.
     - `edit_file`: Edits or creates files.
   - **OpenAI Integration**: Uses the OpenAI API (like GPT-4) to process user input and decide when to use tools.
   - **Interactive CLI**: Takes user input from the command line and displays responses.
3. **Workflow**:
   - The agent maintains a conversation history with the user.
   - If a tool is needed, it calls the OpenAI API, executes the tool, and updates the conversation.
4. **Dependencies**:
   - Requires an `OPENAI_API_KEY` environment variable to work with the OpenAI API.

In short, it's a **file-system assistant powered by OpenAI**, designed to help with file operations via natural language commands. Let me know if you'd like more details!

Experiment 3: The list_files tool
type ListFilesInput struct { // Input structure, path is optional
    Path string `json:"path,omitempty" jsonschema_description:"Optional relative path..."`
}

var ListFilesDefinition = ToolDefinition{
    Name:        "list_files",
    Description: "List files and directories at a given path. If no path...",
    InputSchema: GenerateSchema[ListFilesInput]("ListFilesInput"),
    Function:    ListFiles, // Links to the Go function below
}

Tool execution function (ListFiles): Uses filepath.WalkDir to walk the directory and formats the result as a JSON array string.
func ListFiles(input json.RawMessage) (string, error) {
    // ... Parse optional path from input ...
    dir := "." // Default to current directory
    if listFilesInput.Path != "" {
        dir = listFilesInput.Path
    }

    var files []string
    err := filepath.WalkDir(dir, func(...) error { // Use standard library WalkDir
        // ... build relative path ...
        // ... append path to files slice (add "/" for directories) ...
        return nil
    })
    if err != nil { /* handle walk error */ }

    result, err := json.Marshal(files) // Return as JSON array string
    if err != nil { /* handle marshal error */ }
    return string(result), nil
}

Explanation: When the AI is asked about the contents of a directory, it calls the list_files tool (usually without arguments, so the current directory is used by default). The Go code runs the ListFiles function, walks the directory, and packs the list of file and directory names (directories carry a trailing /) into a JSON string that is returned to the AI. The AI then presents this list to the user.
You: what do you see in this directory?
Tool Call: list_files({})
AI: Here’s what’s in the current directory:

1. **Files**:
   - `.main.go.swp` (likely a temporary swap file for `main.go`).
   - `go.mod` and `go.sum` (Go module files for dependency management).
   - `main.go` (the main Go program file).
   - `secret-file.txt` (the file with the riddle you solved earlier).
2. **Directory**:
   - `agent/` (a subdirectory, possibly containing agent-related code or resources).

Let me know if you'd like to explore any of these further!

Experiments 4 & 5: Combining tools (list_files + read_file)
// Inside Agent.Run method
for { // Outer loop for user input
    // ... get user input, add to conversation ...

    for { // <--- INNER LOOP: Handles multi-turn tool calls ---
        resp, err := a.callOpenAICompletion(ctx, conversation) // Call API
        // ... handle response ...
        assistantMessage := resp.Choices[0].Message
        conversation = append(conversation, assistantMessage) // Add assistant's response

        // Check for tool calls in the response
        if len(assistantMessage.ToolCalls) == 0 {
            // No tools called by AI in this turn. Print text response (if any)
            // and break the INNER loop to wait for next user input.
            if assistantMessage.Content != "" { /* print content */ }
            break // Exit INNER loop
        }

        // --- AI requested tools, execute them ---
        toolResults := []OpenAIChatCompletionMessage{}
        for _, toolCall := range assistantMessage.ToolCalls {
            // ... find tool definition by toolCall.Function.Name ...
            // ... execute the tool's Go function with toolCall.Function.Arguments ...
            // ... prepare resultMsg (role: "tool", content: output/error) ...
            toolResults = append(toolResults, resultMsg)
        }
        conversation = append(conversation, toolResults...) // Add tool results to history

        // DO NOT BREAK! Continue the INNER loop immediately.
        // The conversation now includes the tool results,
        // so the next call to callOpenAICompletion will send them back to the AI.
    } // <--- End of INNER LOOP ---
} // End of OUTER loop

Explanation: The key is the inner loop (INNER LOOP). When the AI's reply contains ToolCalls, the agent executes each requested tool, appends the results to the conversation as role "tool" messages, and immediately calls the API again without waiting for user input. The inner loop ends only when a reply contains no tool calls.
You: Tell me about all the Go files in here. Be brief!
AI: Here’s a quick rundown of the Go files in the directory:

1. **`main.go`**:
   - The core program file. It sets up an **AI agent** that interacts with the local filesystem using tools (`read_file`, `list_files`, `edit_file`).
   - Integrates with the **OpenAI API** for natural language processing.
   - Handles user input via the command line.
2. **`go.mod` & `go.sum`**:
   - **`go.mod`**: Defines the Go module (project) and its dependencies (e.g., `github.com/invopop/jsonschema` for JSON schema generation).
   - **`go.sum`**: Locks dependency versions for reproducibility.

That’s it! The other files (like `.main.go.swp`) are temporary or non-Go files. Let me know if you want details on any specific part!

Note: here the DeepSeek model did not call read_file again! This differs from what was expected and may be related to how the model is implemented. For example, the model may have cached the contents of these files (the results of the earlier list_files and read_file calls are still in the conversation history sent with every request).
You: What go version are we using in this project?
Tool Call: read_file({"path":"go.mod"})
AI: The project is using **Go version 1.24.0**, as specified in the `go.mod` file. Let me know if you'd like help with anything else!

Experiments 6, 7, & 8: The edit_file tool (creating and editing files)
type EditFileInput struct {
    Path   string `json:"path" jsonschema_description:"The path..." jsonschema:"required"`
    OldStr string `json:"old_str" jsonschema_description:"Text to search for..."`
    NewStr string `json:"new_str" jsonschema_description:"Text to replace with..." jsonschema:"required"`
}

var EditFileDefinition = ToolDefinition{
    Name:        "edit_file",
    Description: "Make edits to a text file. Replaces ALL occurrences...",
    InputSchema: GenerateSchema[EditFileInput]("EditFileInput"),
    Function:    EditFile, // Links to the Go function below
}

Tool execution function (EditFile and its helper createNewFile): Handles the file creation and modification logic.
func EditFile(input json.RawMessage) (string, error) {
    editFileInput := EditFileInput{}
    // ... parse input path, old_str, new_str ...

    content, err := os.ReadFile(editFileInput.Path)
    if err != nil {
        // Key logic: If file doesn't exist AND old_str is empty, try creating it.
        if os.IsNotExist(err) && editFileInput.OldStr == "" {
            return createNewFile(editFileInput.Path, editFileInput.NewStr)
        }
        return "", err // Other read error
    }

    // File exists, perform replacement
    oldContent := string(content)
    newContent := strings.Replace(oldContent, editFileInput.OldStr, editFileInput.NewStr, -1) // Replace all
    // ... check if replacement happened ...

    err = os.WriteFile(editFileInput.Path, []byte(newContent), 0644) // Write back
    // ... handle write error ...
    return "OK", nil
}

// Helper to create a new file (and parent directories if needed)
func createNewFile(filePath, content string) (string, error) {
    dir := path.Dir(filePath)
    if dir != "." && dir != "" { // Ensure parent directories exist
        if err := os.MkdirAll(dir, 0755); err != nil { /* handle error */ }
    }
    err := os.WriteFile(filePath, []byte(content), 0644) // Write the new file
    // ... handle error ...
    return fmt.Sprintf("Successfully created file %s", filePath), nil
}

Explanation: edit_file is the most powerful tool. The AI calls it according to the user's intent (create or modify) and supplies the necessary arguments. The core logic of the Go code is this: if the file does not exist and old_str is empty, it creates the file with new_str as its content (creating parent directories if needed); if the file exists, it replaces every occurrence of old_str with new_str and writes the result back.
You: hey buddy, create fizzbuzz.js that I can run with Nodejs and that has fizzbuzz in it and executes it
Tool Call: edit_file({"path":"fizzbuzz.js","old_str":"","new_str":"// FizzBuzz in Node.js\nfunction fizzbuzz(n) {\n for (let i = 1; i <= n; i++) {\n if (i % 15 === 0) console.log(\"FizzBuzz\");\n else if (i % 3 === 0) console.log(\"Fizz\");\n else if (i % 5 === 0) console.log(\"Buzz\");\n else console.log(i);\n }\n}\n\n// Execute FizzBuzz for numbers 1 to 100\nfizzbuzz(100);"})
AI: Done! I've created **`fizzbuzz.js`** with a FizzBuzz implementation that runs in Node.js. Here's what it does:

1. **Function**:
   - `fizzbuzz(n)` prints numbers from 1 to `n`, replacing multiples of 3 with "Fizz", multiples of 5 with "Buzz", and multiples of both with "FizzBuzz".
2. **Execution**:
   - It runs for numbers **1 to 100** by default.

### How to Run:
node fizzbuzz.js

Let me know if you'd like any tweaks!

The contents of the generated fizzbuzz.js are as follows:
$ cat fizzbuzz.js
// FizzBuzz in Node.js
function fizzbuzz(n) {
    for (let i = 1; i <= n; i++) {
        if (i % 15 === 0) console.log("FizzBuzz");
        else if (i % 3 === 0) console.log("Fizz");
        else if (i % 5 === 0) console.log("Buzz");
        else console.log(i);
    }
}

// Execute FizzBuzz for numbers 1 to 100
fizzbuzz(100);

Running fizzbuzz.js with node produces the following output:
$ node fizzbuzz.js
1
2
Fizz
4
Buzz
Fizz
7
8
Fizz
Buzz
... ...
97
98
Fizz
Buzz

Steps (Experiment 7 - editing):
You: Please edit fizzbuzz.js so that it only prints until 15
Tool Call: edit_file({"path":"fizzbuzz.js","old_str":"// Execute FizzBuzz for numbers 1 to 100\nfizzbuzz(100);","new_str":"// Execute FizzBuzz for numbers 1 to 15\nfizzbuzz(15);"})
AI: Got it! I updated **`fizzbuzz.js`** to print FizzBuzz only up to **15** instead of 100.

### How to Run:
node fizzbuzz.js

Now it’ll output:
1 2 Fizz 4 Buzz Fizz 7 8 Fizz Buzz 11 Fizz 13 14 FizzBuzz

Let me know if you'd like further changes!

Viewing the updated fizzbuzz.js:
$ cat fizzbuzz.js
// FizzBuzz in Node.js
function fizzbuzz(n) {
    for (let i = 1; i <= n; i++) {
        if (i % 15 === 0) console.log("FizzBuzz");
        else if (i % 3 === 0) console.log("Fizz");
        else if (i % 5 === 0) console.log("Buzz");
        else console.log(i);
    }
}

// Execute FizzBuzz for numbers 1 to 15
fizzbuzz(15);

Running the updated fizzbuzz.js gives the following result:
$ node fizzbuzz.js
1
2
Fizz
4
Buzz
Fizz
7
8
Fizz
Buzz
11
Fizz
13
14
FizzBuzz

Steps (Experiment 8 - creating):
You: Create a congrats.js script that rot13-decodes the following string 'Pbatenghyngvbaf ba ohvyqvat n pbqr-rqvgvat ntrag!' and prints it
AI: Here's the plan:
1. I'll create a **`congrats.js`** script.
2. It will **ROT13-decode** the string `'Pbatenghyngvbaf ba ohvyqvat n pbqr-rqvgvat ntrag!'`.
3. The decoded result will be printed to the console.

Let me implement this for you.
Tool Call: edit_file({"path":"congrats.js","old_str":"","new_str":"// ROT13 decoder in Node.js\nfunction rot13(str) {\n return str.replace(/[a-zA-Z]/g, function(c) {\n const base = c <= 'Z' ? 65 : 97;\n return String.fromCharCode((c.charCodeAt(0) - base + 13) % 26 + base);\n });\n}\n\n// Encoded string\nconst encoded = 'Pbatenghyngvbaf ba ohvyqvat n pbqr-rqvgvat ntrag!';\n\n// Decode and print\nconsole.log(rot13(encoded));"})
AI: Done! I've created **`congrats.js`** for you. Here's what it does:

1. **ROT13 Decoding**:
   - The script decodes the string `'Pbatenghyngvbaf ba ohvyqvat n pbqr-rqvgvat ntrag!'` using ROT13.
2. **Output**:
   - Running it will print the decoded message to the console.

### How to Run:
node congrats.js

The decoded result should appear. Let me know if you'd like to tweak anything!

The generated congrats.js looks like this:
$ cat congrats.js
// ROT13 decoder in Node.js
function rot13(str) {
    return str.replace(/[a-zA-Z]/g, function(c) {
        const base = c <= 'Z' ? 65 : 97;
        return String.fromCharCode((c.charCodeAt(0) - base + 13) % 26 + base);
    });
}

// Encoded string
const encoded = 'Pbatenghyngvbaf ba ohvyqvat n pbqr-rqvgvat ntrag!';

// Decode and print
console.log(rot13(encoded));

Running the generated congrats.js gives the following result:
$ node congrats.js
Congratulations on building a code-editing agent!
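With these steps, which pair code snippets with explanations, you should have a clearer picture of how the agent uses the tools it has been given and its core loop to complete the task in each experiment. This once again confirms Thorsten Ball's point: the core is simple, but combined, the pieces produce powerful results.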