Hardest Agent Benchmark Tier Yields Zero Score for All Agents
English summary
In a recent agent evaluation, every tested agent scored zero on the highest difficulty tier. No model earned any points at that level, which highlights how far the benchmark's hardest tier lies beyond the capabilities of current agents.
Chinese summary
在近期的一项智能体测评中,最高难度档位无任何智能体得分,全部零分。该档位的难度让所有参测模型均无法取得任何分数,凸显了该测评对现有智能体能力的极端挑战。
Key points
All agents scored zero on the hardest difficulty setting of the benchmark.
在基准测试的最高难度档位,所有智能体均得零分。