Papers | Source: arXiv | June 12, 2026 | Importance: 4/5

Agents-K1: Towards Agent-native Knowledge Orchestration

Summary

The paper presents Agents-K1, an end-to-end pipeline that transforms raw documents into agent-native scientific knowledge graphs. It combines three components: a multimodal parser that uses a five-module schema to capture entities, multimodal evidence, citations, and typed cross-entity relations from full papers; a 4B-parameter information-extraction backbone trained with GRPO under a rule-based reward; and a GraphAnything CLI that unifies web search, multimodal graph retrieval, and cross-document traversal. The authors process 2.46 million scientific papers across six subjects to construct Scholar-KG and release a one-million-paper subset. They report superior performance on scientific information extraction, knowledge graph construction, and multi-hop scientific reasoning, and state that the pipeline extends to general-domain corpora and schema-conformant data synthesis.
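
To make "trained with GRPO under a rule-based reward" concrete, here is a minimal sketch of the group-relative advantage step that GRPO is built on, paired with a toy rule-based reward. The summary does not give the paper's actual reward rules or name the five schema modules, so `REQUIRED_KEYS`, `rule_based_reward`, and the scoring rule below are illustrative assumptions, not the authors' implementation.

```python
import json
import statistics

# Hypothetical schema fields, taken from the four item types the summary names.
# The paper's real five-module schema is not spelled out in this summary.
REQUIRED_KEYS = {"entities", "evidence", "citations", "relations"}


def rule_based_reward(output: str) -> float:
    """Assumed rule: 0 for unparseable output, else the share of required keys present."""
    try:
        parsed = json.loads(output)
    except json.JSONDecodeError:
        return 0.0
    if not isinstance(parsed, dict):
        return 0.0
    return len(REQUIRED_KEYS.intersection(parsed)) / len(REQUIRED_KEYS)


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the mean and std of its own sampling group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


# One prompt, several sampled completions (toy strings standing in for model outputs).
samples = [
    '{"entities": [], "evidence": [], "citations": [], "relations": []}',
    "not json",
    '{"entities": []}',
]
rewards = [rule_based_reward(s) for s in samples]
advantages = group_relative_advantages(rewards)
```

Completions that score above their group's mean get a positive advantage and are reinforced; those below it are penalized. No learned value model is needed, which is one reason a cheap rule-based reward pairs naturally with this setup.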

Key points

Agents-K1 integrates a multimodal parser, a 4B-parameter IE model trained with GRPO, and the GraphAnything CLI for search and graph traversal.
The pipeline processes 2.46M papers to build Scholar-KG; a 1M-paper subset is publicly released.
The system captures entities, multimodal evidence, citations, and typed relations from full paper content, not just abstracts.
The authors report superior results on scientific IE, KG construction, and multi-hop reasoning tasks.
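
As a rough illustration of what multi-hop traversal over typed relations involves, the sketch below runs a bounded breadth-first search over a toy graph. The node names, relation types, and the `multi_hop_paths` helper are hypothetical; the summary does not describe the GraphAnything CLI's commands or the Scholar-KG storage format, so this is not its interface.

```python
from collections import deque

# Hypothetical toy graph: node -> list of (relation_type, neighbor).
GRAPH = {
    "paper:A": [("cites", "paper:B"), ("proposes", "method:X")],
    "paper:B": [("evaluates_on", "dataset:D")],
    "method:X": [("evaluates_on", "dataset:D")],
    "dataset:D": [],
}


def multi_hop_paths(graph, start, goal, max_hops=3):
    """Return every simple path from start to goal that uses at most max_hops edges."""
    paths = []
    queue = deque([(start, [])])  # each item: (current node, edges taken so far)
    while queue:
        node, path = queue.popleft()
        if node == goal and path:
            paths.append(path)
            continue
        if len(path) >= max_hops:
            continue
        # Nodes already on this path; skipping them keeps paths cycle-free.
        visited = {start} | {dst for _, _, dst in path}
        for relation, neighbor in graph.get(node, []):
            if neighbor not in visited:
                queue.append((neighbor, path + [(node, relation, neighbor)]))
    return paths


paths = multi_hop_paths(GRAPH, "paper:A", "dataset:D")
```

Each returned path is a list of `(source, relation, target)` triples, so an agent can cite the chain of typed edges that links two entities rather than only the endpoints.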
