A user debugging code with glm-5.1 via litellm found that a request was rejected because the debug log it included contained the date 'June 4'. The resulting AnthropicException indicated that the system had detected potentially unsafe or sensitive content. The log was merely a historical record of prior errors, but the presence of the date was enough to trigger the censorship filter. The incident shows how safety filters in Chinese LLMs can unexpectedly interfere with routine technical tasks when dates associated with sensitive events appear in the input.
The developer of OpenLumara, an AI agent, set up a public Discord bot challenge to test its sandbox security against real hackers. Despite initial claims of robust protection, three distinct vulnerabilities were quickly found: a path traversal flaw in the coder module that allowed unintended file access; an authorization bypass triggered by appending a public command to restricted ones; and a third exploit whose details were not disclosed. The developer acknowledged all three issues and published the corresponding fixes as GitHub commits.
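The two disclosed bug classes can be illustrated with a minimal Python sketch. This is not OpenLumara's code or its actual fix: the function names, the command sets, and the "!" prefix are all hypothetical, chosen only to show the general shape of a containment check and a command authorization check.

```python
from pathlib import Path

# Hypothetical command sets, for illustration only.
PUBLIC_COMMANDS = {"help", "ping"}
RESTRICTED_COMMANDS = {"shell", "write"}


def resolve_in_sandbox(root: str, user_path: str) -> Path:
    """Resolve user_path under root and reject anything that escapes it."""
    base = Path(root).resolve()
    # resolve() collapses ".." segments and follows symlinks, so the
    # comparison below is made on the real target path.
    candidate = (base / user_path).resolve()
    if not candidate.is_relative_to(base):  # requires Python 3.9+
        raise PermissionError(f"path escapes sandbox: {user_path}")
    return candidate


def is_allowed(message: str, is_admin: bool) -> bool:
    """Authorize on the command that will actually run (the first token),
    not on whether a public command appears somewhere in the message."""
    tokens = message.split()
    if not tokens:
        return False
    command = tokens[0].lstrip("!")
    if command in PUBLIC_COMMANDS:
        return True
    if command in RESTRICTED_COMMANDS:
        return is_admin
    return False  # unknown commands are denied by default
```

With these definitions, a request for "../../etc/passwd" raises PermissionError, and a message such as "!shell ls !help" from a non-admin is refused even though it ends with a public command.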
The paper 'Predictable Compression Failures' (ICML 2026) addresses hallucinations in evidence-grounded QA by modeling order sensitivity as permutation dispersion and deriving an Expectation-level Decompression Law (EDFL). It defines a fixed ISR=1 answer/abstain gate that requires no threshold tuning and achieves a 0.0–0.7% hallucination rate at ~24% abstention, with 80.5% accuracy on held-out tests. The newly released ntkMirror implements this gate for local open-weight models without any training, using order-marginal verification across multiple permutations of the evidence. A fused kernel speeds up the forward passes over those permutations by 2.6–10× while producing bit-identical fp32 results. New hallucination detection benchmarks on Qwen2.5 and Gemma models show AUROC of up to 0.96 on SciFact, and the gate raises the grounded fraction from 50% to 75–90% at the cost of dropping 10–20% of valid claims.
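The idea of order-marginal verification can be sketched in a few lines of Python. This is an illustration under stated assumptions, not ntkMirror's implementation: score_fn is a hypothetical stand-in for a model forward pass that returns the probability that a claim is supported by the evidence in a given order, dispersion is measured here as a plain standard deviation, and the gate assumes that an ISR of at least 1 means "answer". The ISR computation itself is defined in the paper and is not reproduced here.

```python
import random
import statistics
from typing import Callable, Sequence, Tuple

# Hypothetical scorer: (claim, ordered evidence) -> support probability.
ScoreFn = Callable[[str, Sequence[str]], float]


def order_marginal(
    score_fn: ScoreFn,
    claim: str,
    evidence: Sequence[str],
    n_perms: int = 6,
    seed: int = 0,
) -> Tuple[float, float]:
    """Score the claim under several random evidence orderings.

    Returns the mean score (the order-marginal estimate) and the
    population standard deviation (a simple dispersion measure).
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(n_perms):
        perm = list(evidence)
        rng.shuffle(perm)  # one random ordering of the same passages
        scores.append(score_fn(claim, perm))
    return statistics.fmean(scores), statistics.pstdev(scores)


def gate(isr: float) -> str:
    """Fixed gate at 1: no tunable threshold (direction is an assumption)."""
    return "answer" if isr >= 1.0 else "abstain"


def toy_score(claim: str, evidence: Sequence[str]) -> float:
    """Toy order-sensitive scorer: confident only if the key passage is first."""
    return 0.9 if evidence and evidence[0] == "key passage" else 0.5


mean, dispersion = order_marginal(
    toy_score, "some claim", ["key passage", "filler A", "filler B"]
)
```

A scorer that ignores order yields zero dispersion, while the toy scorer above produces a spread whenever the sampled orderings disagree about which passage comes first.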