社交来源: V2EX2026年6月14日重要度: 4/5

Claude Code v2.1.177 将工具输出误认为用户指令，导致自动写入 Memory 文件

英文摘要

A user on macOS running Claude Code 2.1.177 with Opus 4.8 and acceptEdits enabled observed that during a long session the model repeatedly attributed tool_results and its own reasoning as actual user messages. For example, a shell output containing a version string was later quoted as “your original words,” even though the user never typed it. The agent then autonomously wrote a memory file based on these phantom inputs. The user verified via the raw JSONL transcript that the messages were from tool_results, not user turns. No evidence of system compromise was found; similar issues are reported in GitHub issues #58671, #68159, #63871, and others, indicating a harness boundary bug. Temporary mitigations include downgrading to 2.1.174, disabling acceptEdits, removing global write permissions, and not resuming affected sessions.

中文摘要

macOS 用户使用 Claude Code 2.1.177 + Opus 4.8 并开启 acceptEdits 时，在长会话中发现模型多次将工具输出结果（tool_result）或自身推理当作“用户原话”。例如 shell 输出中的版本号被引用为“你刚才说的”，并据此自动创建 memory 文件。通过对照 JSONL 确认这些内容实为工具输出，非真实用户消息。未发现被入侵证据，GitHub 上已有多个类似 issue（#58671、#68159、#63871 等），指向 harness 层 user/tool 边界错乱。临时方案包括降级至 2.1.174、关闭 acceptEdits、移除全局写入权限以及不复用旧会话。

关键要点

Claude Code 2.1.177 with Opus 4.8 conflated tool_result content with actual user messages, causing the agent to treat its own reasoning or shell outputs as user directives.
Claude Code 2.1.177 搭配 Opus 4.8 时，将 tool_result 内容误认为用户消息，导致模型把自身推理或 shell 输出当作用户指令。
With acceptEdits enabled, the agent automatically wrote a memory file based on a phantom input, demonstrating a concrete safety risk: agent actions derived from misattributed messages can modify the filesystem.
在 acceptEdits 开启的情况下，模型基于虚假输入自动写入了 memory 文件，表明误认消息可直接引发文件系统修改，存在实际安全风险。
The user verified via raw JSONL transcripts that the cited messages were originally tool_results, not user turns, and found no evidence of MitM or system compromise.
用户通过原始 JSONL 日志确认被引用的内容是 tool_result，而非用户真实消息，同时排除了中间人攻击或系统被黑的可能。
Similar behavior is documented in multiple GitHub issues (e.g., #58671, #68159), suggesting a harness-level bug where user/tool/system boundaries are lost in long sessions.
多个 GitHub issue（如 #58671、#68159）存在类似报告，暗示 harness 层在长会话中失去了 user/tool/system 消息边界。

打开原文