Source: arXiv | Date: June 17, 2026 | Importance: 3/5

Multi-Source Cybersecurity Logs: An ATT&CK-Labeled Dataset and SLM Evaluation

Abstract

The paper introduces a dataset that combines system, network, and browser activity logs: 2.3 million events from 870 sessions (70 attack, 800 benign). Every malicious event is labeled with a MITRE ATT&CK technique ID, covering 12 tactics and 53 techniques, and the attacks were generated with real tools, including remote access trojans (RATs), C2 tunnels, and cloud exfiltration tools. The authors fine-tuned three small language models (SLMs: Qwen2.5-1.5B, Llama-3.2-3B, Phi-4-Mini) with LoRA and evaluated them on two tasks: chunk classification and ATT&CK technique identification. Fine-tuning raised chunk classification accuracy from about 8% for the base models to 90–97%. Technique identification remained hard: the best exact-match accuracy was 42%, though high partial-match scores suggest the models captured most of the underlying reasoning.

Key Points

Built a new multi-source log dataset with 2.3M events and per-event ATT&CK technique labels (12 tactics, 53 techniques), with attack data generated by real tools.
Fine-tuned Qwen2.5-1.5B, Llama-3.2-3B, and Phi-4-Mini with LoRA, raising chunk classification accuracy from about 8% to 90–97%.
ATT&CK technique identification peaked at 42% exact-match accuracy, but high partial-match scores suggest the models captured much of the reasoning.
Fills the gap left by the absence of a public dataset that combines system, network, and browser logs with fine-grained ATT&CK labels.
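
The gap between the 42% exact-match figure and the high partial-match scores is easier to see with a small scoring sketch. This summary does not say how the paper computes partial match, so the Jaccard overlap below is an assumption chosen for illustration, and the function names and sample technique IDs are hypothetical rather than taken from the paper.

```python
from typing import Iterable


def exact_match(predicted: Iterable[str], gold: Iterable[str]) -> bool:
    """True only if the predicted technique IDs equal the gold set exactly."""
    return set(predicted) == set(gold)


def partial_match(predicted: Iterable[str], gold: Iterable[str]) -> float:
    """Jaccard overlap between predicted and gold technique IDs.

    Assumption: the paper's partial-match metric may be defined differently;
    Jaccard is used here only to illustrate the idea of partial credit.
    """
    p, g = set(predicted), set(gold)
    if not p and not g:
        return 1.0  # nothing to find and nothing predicted
    return len(p & g) / len(p | g)


# Illustrative chunk: three gold techniques, the model recovers two of them.
gold_ids = ["T1059", "T1071", "T1567"]
predicted_ids = ["T1059", "T1071"]

print(exact_match(predicted_ids, gold_ids))    # the chunk fails exact match
print(partial_match(predicted_ids, gold_ids))  # but earns partial credit
```

Under this assumed definition, a model that names most but not all techniques in a chunk scores zero on exact match while still scoring well on partial match, which is consistent with the pattern the summary reports.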
