Doc-to-Atom: Learning to Compile and Compose Memory Atoms
English summary
Researchers propose Doc-to-Atom (Doc2Atom), a parametric memory framework that compresses long documents into semantically typed knowledge atoms. Each atom is compiled into an independent micro-LoRA adapter and a provenance retrieval key. At inference, a lightweight query router assembles only relevant atoms into a query-specific adapter, which is injected into a frozen base model. The system is trained end-to-end via multi-objective distillation. Experiments on six QA benchmarks show Doc2Atom outperforms Doc-to-LoRA baselines while reducing the memory cost of document internalization.
Key points
Doc-to-Atom decomposes documents into distinct knowledge atoms, each with a micro-LoRA adapter and retrieval key.
A query router selectively assembles relevant atoms into a query-specific adapter, avoiding interference from irrelevant content.
The system is trained end-to-end via multi-objective distillation and outperforms Doc-to-LoRA baselines on six QA benchmarks with a lower memory footprint.
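The key points above describe atoms that each carry a micro-LoRA adapter and a retrieval key, plus a router that merges only the relevant ones. A minimal sketch of that idea follows. The summary does not say how the router scores atoms or how adapters are merged, so everything concrete here is an assumption made for illustration: the `Atom` class, the `route` and `compose_delta` helpers, cosine similarity over keys, softmax weights over the top-k atoms, and a weighted sum of low-rank updates.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Matrix = List[List[float]]


@dataclass
class Atom:
    """Hypothetical knowledge atom: a typed unit with its own low-rank adapter."""
    atom_type: str      # semantic type label (assumed, e.g. "entity")
    key: List[float]    # retrieval key, here a plain embedding vector (assumed)
    A: Matrix           # LoRA down-projection, shape rank x d_in
    B: Matrix           # LoRA up-projection, shape d_out x rank


def matmul(X: Matrix, Y: Matrix) -> Matrix:
    """Plain matrix product, kept dependency-free for the sketch."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def route(query: List[float], atoms: List[Atom], top_k: int = 2) -> List[Tuple[Atom, float]]:
    """Assumed router: keep the top_k atoms by key similarity, softmax their scores."""
    ranked = sorted(atoms, key=lambda a: cosine(query, a.key), reverse=True)[:top_k]
    exps = [math.exp(cosine(query, a.key)) for a in ranked]
    total = sum(exps)
    return [(a, e / total) for a, e in zip(ranked, exps)]


def compose_delta(selected: List[Tuple[Atom, float]], d_out: int, d_in: int) -> Matrix:
    """Assumed merge rule: delta_W = sum_i w_i * (B_i @ A_i)."""
    delta = [[0.0] * d_in for _ in range(d_out)]
    for atom, weight in selected:
        update = matmul(atom.B, atom.A)
        for r in range(d_out):
            for c in range(d_in):
                delta[r][c] += weight * update[r][c]
    return delta


def apply_adapter(W: Matrix, delta: Matrix, x: List[float]) -> List[float]:
    """Compute y = (W + delta) x; the base weights W are never modified."""
    return [
        sum((w + d) * xi for w, d, xi in zip(w_row, d_row, x))
        for w_row, d_row in zip(W, delta)
    ]


# Toy example: three rank-1 atoms over a 2x2 frozen layer.
atoms = [
    Atom("entity", [1.0, 0.0], A=[[1.0, 0.0]], B=[[1.0], [0.0]]),
    Atom("event", [0.0, 1.0], A=[[0.0, 1.0]], B=[[0.0], [1.0]]),
    Atom("relation", [-1.0, 0.0], A=[[1.0, 1.0]], B=[[1.0], [1.0]]),
]
W_frozen = [[1.0, 0.0], [0.0, 1.0]]

selected = route([1.0, 0.0], atoms, top_k=1)
delta = compose_delta(selected, d_out=2, d_in=2)
print(apply_adapter(W_frozen, delta, [2.0, 3.0]))  # [4.0, 3.0]
```

With `top_k=1` only the "entity" atom is merged, so the unrelated "relation" adapter never touches the output. That is the interference-avoidance property the summary attributes to the router, shown here on toy numbers rather than on a trained model.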