Giving AI a Headache: Acoustic Adversarial Attacks to Computer Vision Applications
English summary
The paper investigates acoustic adversarial attacks on AI-based computer vision systems using audible frequencies (<20 kHz). Unlike prior ultrasonic attacks, which are limited to short range, this work demonstrates that lower-frequency sound can drive commercially available cameras into resonance, inducing physical motion that introduces artifacts into the captured images. In physical experiments, the attack caused an off-the-shelf object detection model (YOLO11) to misclassify objects, miss detections, and hallucinate objects. The study also analyzes how image and object features influence attack effectiveness, identifying vulnerability factors to inform future mitigation strategies.
Chinese summary
该论文研究了利用可听频率(<20 kHz)对基于AI的计算机视觉系统进行声学对抗攻击。与以往受限于短距离的超声波攻击不同,本工作表明更低频率的声音可以引起商用摄像头的共振,从而产生物理运动并引入伪影。在现成的目标检测模型(YOLO11)上进行的物理实验导致误分类、漏检和物体幻觉。研究分析了不同图像和物体特征如何影响攻击效果,并提供了关于易受攻击因素的见解,以指导未来的缓解策略。
Key points
Demonstrates an acoustic adversarial attack that uses audible frequencies (<20 kHz) to disrupt computer vision models, overcoming the range limitations of ultrasound.
展示了使用可听频率(<20 kHz)干扰计算机视觉模型的声学对抗攻击,克服了超声波的距离限制。
In physical experiments, resonating a commercial camera caused YOLO11 to misclassify objects, miss detections, and hallucinate objects.
通过使商用摄像头共振,物理实验导致YOLO11出现误分类、漏检和物体幻觉。
Analyzes which image and object features increase vulnerability to such attacks, offering guidance for defense development.
分析了哪些图像和物体特征会增加对此类攻击的易感性,为防御开发提供指导。
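Illustrative sketch

The mechanism summarized above (sound-induced camera motion that introduces image artifacts) can be illustrated with a toy simulation. The sketch below is not taken from the paper. It assumes the camera moves sinusoidally along one axis during a single exposure and models the result as an average of shifted copies of the scene. The function names (`vibration_blur`, `edge_contrast`) and all parameter values (amplitude in pixels, tone frequency, exposure time) are hypothetical choices for illustration only.

```python
import numpy as np


def vibration_blur(image, amplitude_px, freq_hz, exposure_s, n_samples=64):
    """Average horizontally shifted copies of `image` over one exposure.

    Toy model (an assumption, not the paper's method): the camera is
    displaced by amplitude_px * sin(2*pi*freq_hz*t) pixels at time t.
    """
    t = np.linspace(0.0, exposure_s, n_samples, endpoint=False)
    shifts = np.rint(amplitude_px * np.sin(2 * np.pi * freq_hz * t)).astype(int)
    acc = np.zeros(image.shape, dtype=float)
    for s in shifts:
        # np.roll wraps around at the borders; the test scene keeps its
        # bright region away from the edges so wrapping has no effect.
        acc += np.roll(image, s, axis=1)
    return acc / n_samples


def edge_contrast(image):
    """Largest intensity jump between horizontally adjacent pixels."""
    return float(np.abs(np.diff(image, axis=1)).max())


# Synthetic scene: a bright vertical bar on a dark background.
scene = np.zeros((32, 64))
scene[:, 28:36] = 1.0

# Hypothetical values: a 500 Hz (audible) tone, 10 ms exposure, 4 px swing.
blurred = vibration_blur(scene, amplitude_px=4.0, freq_hz=500.0, exposure_s=0.01)

print("edge contrast before:", edge_contrast(scene))
print("edge contrast after: ", edge_contrast(blurred))
```

In this toy model the total brightness is preserved while sharp edges are smeared, which is the kind of degradation that could plausibly make object boundaries harder for a detector to use. Real cameras involve effects this sketch ignores, so it should be read only as an intuition aid.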