ICML 2026 paper introduces a fixed hallucination abstention gate; training-free ntkMirror tool released for open-weight models
Summary
The paper 'Predictable Compression Failures' (ICML 2026) addresses hallucinations in evidence-grounded QA by modeling sensitivity to evidence order as permutation dispersion and deriving an Expectation-level Decompression Law (EDFL). From this it defines a fixed ISR=1 answer/abstain gate that requires no threshold tuning. In pre-registered held-out audits, the gate reaches a 0.0–0.7% hallucination rate at about 24% abstention, with 80.5% accuracy on attempted answers. The newly released ntkMirror implements the gate for local open-weight models without any training, using order-marginal verification across multiple evidence permutations. A fused kernel speeds up the permutation forward passes by 2.6–10× with bit-identical fp32 results. New hallucination-detection benchmarks on small models such as Qwen2.5 and Gemma show AUROC up to 0.96 on SciFact, and the gate raises the grounded fraction of claims from 50% to 75–90% at the cost of dropping 10–20% of valid claims.
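The summary does not give the formulas behind order-marginal verification or the ISR gate, so the following Python sketch only illustrates the general idea under stated assumptions. A hypothetical `score_fn` stands in for a model forward pass that returns a support score for one evidence ordering. The score is averaged over orderings (the order marginal), and its standard deviation serves as a simple stand-in for permutation dispersion. `isr_gate` assumes that ISR is a ratio of available to required information. None of these names come from the paper or from ntkMirror.

```python
import itertools
import math
import random
import statistics
from typing import Callable, Sequence


def order_marginal_score(
    evidence: Sequence[str],
    score_fn: Callable[[Sequence[str]], float],
    n_perms: int = 8,
    seed: int = 0,
) -> tuple[float, float]:
    """Return (mean, population std) of score_fn over evidence orderings.

    score_fn is a hypothetical stand-in for one model forward pass.
    If the number of distinct orderings fits within n_perms, all of them
    are enumerated; otherwise n_perms random orderings are sampled.
    """
    if math.factorial(len(evidence)) <= n_perms:
        orderings = [list(p) for p in itertools.permutations(evidence)]
    else:
        rng = random.Random(seed)
        orderings = []
        for _ in range(n_perms):
            order = list(evidence)
            rng.shuffle(order)
            orderings.append(order)
    scores = [score_fn(order) for order in orderings]
    return statistics.fmean(scores), statistics.pstdev(scores)


def isr_gate(info_budget: float, info_required: float) -> bool:
    """Assumed gate: answer (True) iff info_budget / info_required >= 1.

    Written as a comparison, which is equivalent for info_required > 0.
    """
    return info_budget >= info_required


# Toy scorer that only "trusts" the claim when passage "a" comes first,
# i.e. a deliberately order-sensitive model.
def order_sensitive(ev: Sequence[str]) -> float:
    return 1.0 if ev[0] == "a" else 0.0


mean, dispersion = order_marginal_score(["a", "b", "c"], order_sensitive)
# All 6 orderings are enumerated: mean = 1/3, dispersion ≈ 0.471.
```

A scorer that ignores evidence order yields zero dispersion under this sketch, which is the behaviour an order-marginal check is meant to reward.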
Key points
The ICML 2026 paper treats hallucination as a matter of order sensitivity (permutation dispersion) and information budgeting, deriving EDFL and a fixed ISR=1 abstention gate that needs no threshold tuning.
Audit results: 0.0–0.7% hallucination rate at about 24% abstention in pre-registered held-out audits, with 80.5% accuracy on attempted answers.
ntkMirror released: training-free order-marginal verification for local open-weight models, with a fused kernel that speeds up permutation forward passes by 2.6–10× (bit-identical fp32 results).
Hallucination detection AUROC reaches up to 0.96 (Gemma E4B on SciFact); the gate raises the grounded fraction of claims from 50% to 75–90% while dropping 10–20% of valid claims.
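To make the last trade-off concrete, the short Python sketch below works through the arithmetic of how filtering changes the grounded fraction. The claim counts and the per-class keep rates are hypothetical values chosen to land inside the reported ranges; they are not figures from the paper, and the pairing of a particular loss rate with a particular grounded fraction is arbitrary.

```python
def grounded_fraction_after_gate(
    n_grounded: float,
    n_ungrounded: float,
    keep_grounded: float,
    keep_ungrounded: float,
) -> float:
    """Fraction of surviving claims that are grounded after gating.

    keep_grounded and keep_ungrounded are the (hypothetical) shares of
    each class that the gate lets through.
    """
    kept_g = n_grounded * keep_grounded
    kept_u = n_ungrounded * keep_ungrounded
    return kept_g / (kept_g + kept_u)


# Start from a 50% grounded pool: 50 grounded and 50 ungrounded claims.
# Losing 10% of valid claims while passing 30% of ungrounded ones:
# 45 / (45 + 15) = 0.75
low = grounded_fraction_after_gate(50, 50, keep_grounded=0.9, keep_ungrounded=0.3)

# Losing 20% of valid claims while passing 10% of ungrounded ones:
# 40 / (40 + 5) ≈ 0.889
high = grounded_fraction_after_gate(50, 50, keep_grounded=0.8, keep_ungrounded=0.1)
```

Under these assumed rates, reaching a 75–90% grounded fraction requires the gate to reject most ungrounded claims (roughly 70–90% of them) while rejecting only a small share of valid ones.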