Valid Inference with Synthetic Data via Task Exchangeability
English summary
The paper proposes a statistical framework for using synthetic data in scientific research with provable validity guarantees. Its key idea is a new technical condition, task exchangeability, which requires the current task to be exchangeable, in an appropriate mathematical sense, with historical tasks for which real data exist. The authors develop inference methods that are valid under task exchangeability and provide extended guarantees for settings beyond it. The framework is demonstrated on public opinion surveys that use LLM-generated silicon samples and on AI evaluation that uses autoraters. The work addresses fundamental concerns about bias, noise, and misspecification in synthetic data.
Chinese summary
该论文提出了一个统计框架,用于在科学研究中使用合成数据,并提供可证明的有效性保证。关键思路是引入了一个新的技术条件——任务可交换性,要求当前任务与有真实数据的历史任务在适当的数学意义下可交换。作者发展了在任务可交换性下的有效推断方法,并提供了超出可交换性的扩展保证。该框架在基于大语言模型生成硅样本的民意调查和基于自动评分器的AI评估上进行了演示,解决了合成数据中偏差、噪声和误设的根本担忧。
Key points
Synthetic data can support a wider range of research, but it raises concerns about bias, noise, and misspecification.
合成数据能支持更多研究,但带来了偏差、噪声和模型误设的担忧。
The core contribution is a condition called task exchangeability: the current task must be exchangeable with historical tasks that have real data.
核心贡献是任务可交换性条件:当前任务必须与拥有真实数据的历史任务在数学意义上可交换。
The authors provide inference methods with provable validity, including extensions beyond strict exchangeability.
作者提供了具有可证明有效性的推断方法,包括超出严格可交换性的扩展。
The framework is demonstrated on public opinion surveys with silicon samples and AI evaluation with autoraters.
该框架在硅样本民意调查和AI自动评分评估上得到了实证展示。
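Illustrative sketch
As a rough illustration of how exchangeability with historical tasks can yield a valid interval, the sketch below applies a standard split-conformal construction. It is not the authors' procedure, which this summary does not specify. It assumes that each historical task has one synthetic estimate and one real-data value treated as the truth, and that the absolute error on the current task is exchangeable with the historical errors. The function name `conformal_interval` and all of its parameters are hypothetical.

```python
import math


def conformal_interval(historical_synthetic, historical_real, current_synthetic, alpha=0.1):
    """Split-conformal interval for the real-data value of the current task.

    Assumption (not from the paper): the absolute error |synthetic - real| on the
    current task is exchangeable with the errors on the historical tasks.
    """
    if len(historical_synthetic) != len(historical_real):
        raise ValueError("each historical task needs one synthetic and one real value")
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")

    n = len(historical_real)
    # Conformity scores: absolute errors of the synthetic estimates on historical tasks.
    scores = sorted(abs(s - r) for s, r in zip(historical_synthetic, historical_real))

    # Rank of the conformal quantile with the usual finite-sample correction.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few historical tasks for this level: only the trivial interval is valid.
        return (-math.inf, math.inf)

    q = scores[k - 1]
    return (current_synthetic - q, current_synthetic + q)


# Toy numbers, made up purely for illustration.
interval = conformal_interval([10, 12, 9, 15], [11, 12, 7, 14], 20, alpha=0.5)
print(interval)
```

Under the stated exchangeability assumption, this interval contains the current task's real-data value with probability at least 1 - alpha. With few historical tasks and a small alpha, the function returns the trivial infinite interval, which reflects how much the guarantee depends on the number of historical tasks available.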