EurekAgent: Agent Environment Engineering is All You Need For Autonomous Scientific Discovery
English summary
The paper presents EurekAgent, an environment-engineered agent system for metric-driven autonomous scientific discovery. It argues that the key bottleneck has shifted from designing agent workflows to engineering the environments agents operate in, so as to amplify productive behaviors (such as open-ended exploration, systematic artifact management, and agent collaboration) and suppress harmful ones (such as reward hacking and high-friction human oversight). EurekAgent engineers its environment along four dimensions: permissions engineering for bounded execution and isolated evaluation, artifact engineering for filesystem- and Git-based collaboration, budget engineering for budget-aware exploration, and human-in-the-loop engineering for low-friction human oversight and intervention. The system achieves new state-of-the-art results on multiple mathematics, kernel engineering, and machine learning tasks, including a novel 26-circle packing solution discovered with under $11 in total API cost. The code and results are open-sourced, and the authors call for environment engineering to be treated as a core research direction for reliable autonomous research agents.
Chinese summary
本文提出EurekAgent,一个面向指标驱动的自主科学发现的环境工程化智能体系统。作者指出,关键瓶颈正从设计智能体工作流转向工程化智能体所处的环境,以放大有益行为(如开放式探索、系统化工件管理、智能体协作)并抑制有害行为(如奖励作弊、高摩擦人工监督)。EurekAgent从四个维度进行环境工程:权限工程实现有界执行与隔离评估,工件工程实现文件系统与Git协作,预算工程实现预算感知探索,以及人在回路工程便于人类监督与干预。该系统在多个数学、内核工程和机器学习任务上取得新的最先进结果,包括以不到11美元总API成本发现的一个新的26圆堆叠结果。作者已将代码与结果开源,并倡议将环境工程作为可靠自主科研智能体的核心研究方向。
Key points
Reframes the bottleneck in autonomous scientific discovery from prescribing agent workflows to engineering agent environments (shaping resources, constraints, and interfaces).
将自主科学发现的瓶颈从设计智能体工作流重新定义为环境工程(塑造资源、约束和接口)。
Engineers environments along four dimensions: permissions, artifacts (filesystem- and Git-based collaboration), budget, and human-in-the-loop.
沿四个维度进行环境工程:权限、工件(文件系统与Git协作)、预算和人在回路。
Achieves new state-of-the-art results on multiple mathematics, kernel engineering, and machine learning tasks.
在多个数学、内核工程和机器学习任务上取得新的最先进结果。
Discovers a new 26-circle packing solution with less than $11 in total API cost.
以不到11美元的总API成本发现一个新的26圆堆叠方案。
Open-sources the code and results, advocating environment engineering as a core research direction.
开源代码与结果,倡议将环境工程作为核心研究方向。
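Illustrative sketch
The summary names budget engineering for budget-aware exploration but does not describe how it is implemented. The following is a minimal sketch of one way an environment could expose a hard spending cap to an agent, not the authors' code. The names `BudgetLedger`, `BudgetExceeded`, `charge`, and `status_line` are hypothetical, and the cost values passed in are assumed to come from whatever API billing data the environment has access to.

```python
from dataclasses import dataclass, field


class BudgetExceeded(RuntimeError):
    """Raised when a charge would push total spending past the cap."""


@dataclass
class BudgetLedger:
    """Hypothetical ledger the environment owns; the agent can only read it."""

    cap_usd: float
    spent_usd: float = 0.0
    log: list = field(default_factory=list)  # (label, cost_usd) entries

    @property
    def remaining_usd(self) -> float:
        return self.cap_usd - self.spent_usd

    def charge(self, label: str, cost_usd: float) -> None:
        """Record a cost, refusing any charge that would exceed the cap."""
        if cost_usd < 0:
            raise ValueError("cost_usd must be non-negative")
        if self.spent_usd + cost_usd > self.cap_usd:
            raise BudgetExceeded(
                f"{label}: ${cost_usd:.2f} exceeds remaining ${self.remaining_usd:.2f}"
            )
        self.spent_usd += cost_usd
        self.log.append((label, cost_usd))

    def status_line(self) -> str:
        """Short text the environment could show the agent before each step."""
        return (
            f"spent ${self.spent_usd:.2f} of ${self.cap_usd:.2f}; "
            f"${self.remaining_usd:.2f} remaining"
        )
```

For example, `BudgetLedger(cap_usd=11.0)` followed by `charge("run-1", 4.0)` and `charge("run-2", 6.5)` leaves $0.50, so a further `charge("run-3", 1.0)` raises `BudgetExceeded`. The 11.0 cap only echoes the cost figure reported above and is not a documented setting of the system. Keeping the ledger on the environment side, rather than asking the agent to track its own spending, is the design choice this sketch is meant to illustrate.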