Agent Input Tokens Balloon from ~300 to ~7,000 per Turn Over 20 Turns, a Roughly 20× Increase
Summary
A user measured the input token cost of each turn for an AI agent browsing similar pages over 20 turns. Turn 1 consumed roughly 300 input tokens, while turn 20 consumed roughly 7,000, an increase of about 20×. The per-turn cost grows because every new request resends all previous context, so the input keeps accumulating even though the pages themselves are similar in size. The observation highlights a hidden “context tax” that drives up inference costs in multi-turn agent workflows.
Key points
Turn 1 input token count: ~300 tokens.
Turn 20 input token count: ~7,000 tokens.
Per-turn input cost increase: roughly 20× from the first turn to the last (7,000 / 300 ≈ 23× by the quoted figures).
The growth comes from the agent resending all previous context with each new request, so input size accumulates turn over turn.
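The arithmetic behind these figures can be sketched in a few lines of Python. This is an illustration, not the author's measurement script. The source reports only the turn 1 and turn 20 values, so the sketch assumes that per-turn input grows linearly between those two endpoints. The function name `per_turn_input_tokens` is hypothetical.

```python
def per_turn_input_tokens(first: float, last: float, turns: int) -> list[float]:
    """Interpolate input tokens per turn, ASSUMING linear growth.

    The source gives only the endpoints (turn 1 and turn 20); the
    intermediate values here are an assumption, not measured data.
    """
    step = (last - first) / (turns - 1)
    return [first + step * i for i in range(turns)]


# Endpoints taken from the summary: ~300 tokens at turn 1, ~7,000 at turn 20.
tokens = per_turn_input_tokens(300, 7000, 20)

# Ratio of last-turn input to first-turn input.
growth_factor = tokens[-1] / tokens[0]

# Total input tokens billed across all turns under the linear assumption.
total = sum(tokens)

# Hypothetical baseline: every turn costs the same as turn 1 (no resent context).
flat_total = tokens[0] * len(tokens)

print(f"growth factor: {growth_factor:.1f}x")
print(f"total input tokens over {len(tokens)} turns: {total:,.0f}")
print(f"flat baseline (no accumulated context): {flat_total:,.0f}")
```

Under the linear assumption, the 20 turns add up to about 73,000 input tokens, compared with 6,000 if every turn stayed at the turn 1 size. The cumulative bill therefore grows faster than the per-turn figure alone suggests.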