Infogap feed

AI signal, minus the noise.

Curated items are read from the processed items table and served as a bilingual feed.

Papers | Source: ARXIV | Jun 12, 2026 | Importance: 4/5

The paper proposes Retrieval-Augmented Reinforcement Fine-Tuning (RA-RFT), a post-training framework that teaches language models to reason by analogy. It first trains a reasoning-aware retriever via gold-relevance distillation, so that contexts are ranked by expected reasoning benefit rather than semantic overlap. The policy model is then fine-tuned using reinforcement learning on retrieved analogous demonstrations under verifiable outcome rewards, enabling it to leverage reasoning traces. Analysis shows that reasoning-aware retrieval surfaces complementary solution strategies that provide distinct scaffolding per problem. On AIME 2025, RA-RFT improves average@32 accuracy over GRPO by 7.1 points for Qwen3-1.7B and 2.8 points for Qwen3-4B, demonstrating that reasoning-aware retrieval is an orthogonal improvement to reward design or training curricula.
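As a rough illustration of the reported metric, the sketch below computes an average@k score under the common reading that k responses are sampled per problem, the fraction correct is taken per problem, and those fractions are averaged across problems. The summary does not define average@32, so this reading is an assumption. The function name `average_at_k` and the toy data are hypothetical.

```python
from statistics import mean


def average_at_k(correct_flags, k):
    """Mean per-problem accuracy over k sampled responses.

    correct_flags: one list per problem, each holding k booleans
    (True if that sampled response was judged correct).
    Assumed reading of "average@k"; not taken from the paper.
    """
    per_problem = []
    for flags in correct_flags:
        if len(flags) != k:
            raise ValueError("every problem needs exactly k samples")
        per_problem.append(sum(flags) / k)
    return mean(per_problem)


# Hypothetical toy data: 2 problems, k = 4 samples each.
toy = [
    [True, False, True, False],  # 2 of 4 correct
    [True, True, True, True],    # 4 of 4 correct
]
score = average_at_k(toy, k=4)
```

Under this reading, a 7.1-point gain means the averaged per-problem accuracy rose by 0.071 on the 0 to 1 scale.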

Papers | Source: ARXIV | Jun 12, 2026 | Importance: 4/5

Researchers evaluated an LLM pipeline on 76 published social and behavioral science studies with predefined claims. Excluding 7 studies where the LLM failed to produce a viable effect size estimate, the pipeline recovered original effect sizes within ±0.05 Cohen's d in 41% of the remaining studies. It reached the same qualitative conclusion as the original study in 96% of cases, outperforming human reanalysts who achieved 34% effect-size recovery and 74% conclusion agreement. These findings suggest LLMs can automate and scale reproducibility assessments, providing a foundation for systematic auditing of empirical results.
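The recovery criterion can be sketched as a simple tolerance check. With 7 of the 76 studies excluded, the percentages refer to the remaining 69. The snippet below is a minimal sketch, assuming each study yields an (original, reproduced) pair of Cohen's d values. The function name `recovery_rate` and the example numbers are hypothetical, not data from the paper.

```python
def recovery_rate(pairs, tol=0.05):
    """Share of studies whose reproduced Cohen's d lies within
    +/- tol of the original estimate.

    pairs: list of (original_d, reproduced_d) tuples.
    """
    if not pairs:
        raise ValueError("need at least one study")
    hits = sum(1 for original, reproduced in pairs
               if abs(original - reproduced) <= tol)
    return hits / len(pairs)


# Hypothetical effect sizes for four studies (illustration only).
example = [
    (0.30, 0.32),  # within tolerance
    (0.50, 0.70),  # outside tolerance
    (0.10, 0.08),  # within tolerance
    (0.40, 0.41),  # within tolerance
]
rate = recovery_rate(example)
```

Conclusion agreement would be scored separately, for example by comparing whether each reanalysis supports the same qualitative claim as the original.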

Papers | Source: ARXIV | Jun 12, 2026 | Importance: 3/5

Large language model training data curation requires data attribution methods to identify how individual samples influence model outputs. Traditional influence functions, though effective, are too slow and memory-intensive for large-scale use. The paper proposes Influcoder, which distills gradient influence rankings from decoder models into a dedicated encoder. This yields a quick, cost-effective approach to influence-based data attribution at scale.

HyperTool: Beyond Step-Wise Tool Calls for Tool-Augmented Agents

EurekAgent: Agent Environment Engineering is All You Need For Autonomous Scientific Discovery

Dense Supervision, Sparse Updates: On the Sparsity and Geometry of On-Policy Distillation