llama.cpp Adds Post-Decode Callback to mtmd for Multimodal Processing
English summary
The llama.cpp release b9654 adds a post-decode callback to the mtmd (multimodal) module, implemented in PR #24645. Development of the change was assisted by the Qwen3.6-27B language model. Pre-built binaries are provided for macOS Apple Silicon, Linux x64/arm64, Windows x64/arm64, and Android, covering several GPU backends (Vulkan, CUDA 12/13, ROCm, SYCL, HIP); some builds are disabled because of platform issues.
Chinese summary
llama.cpp 版本 b9654 为 mtmd(多模态文本解码)模块新增了一个解码后回调,该功能由 PR #24645 实现,并得到了 Qwen3.6-27B 模型的支持。此次发布提供了针对 macOS Apple Silicon、Linux x64/arm64、Windows x64/arm64 和 Android 平台的预编译二进制文件,涵盖多种 GPU 后端(Vulkan、CUDA 12/13、ROCm、SYCL、HIP),部分构建因平台问题被禁用。
Key points
The mtmd component now includes a post-decode callback mechanism.
mtmd 组件现包含解码后回调机制。
The feature implementation was assisted by the Qwen3.6-27B model.
该功能的实现得到了 Qwen3.6-27B 模型的辅助。
The release supplies binaries for a wide range of CPU and GPU configurations across macOS, Linux, Windows, and Android.
发布版提供了适用于 macOS、Linux、Windows 和 Android 上多种 CPU 与 GPU 配置的二进制文件。
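The summary does not give the callback's signature or say exactly when it fires, so the following is only a minimal sketch of the general post-decode callback pattern, not the mtmd API. Every name in it (`decode_chunks`, `PostDecodeCallback`, the chunk representation) is hypothetical, and the idea that the callback can stop decoding early is an assumption made for illustration.

```python
from typing import Callable, List, Optional

# Hypothetical callback type: receives the index of the chunk that was just
# decoded and its token count, and returns True to continue or False to stop.
PostDecodeCallback = Callable[[int, int], bool]


def decode_chunks(
    chunks: List[List[int]],
    post_decode: Optional[PostDecodeCallback] = None,
) -> int:
    """Decode chunks in order and invoke post_decode after each one.

    Returns the number of chunks that were decoded.
    """
    decoded = 0
    for index, tokens in enumerate(chunks):
        # Stand-in for the real decode step (assumption: one call per chunk).
        _ = sum(tokens)
        decoded += 1
        # The hook runs only after the chunk has been decoded.
        if post_decode is not None and not post_decode(index, len(tokens)):
            break
    return decoded
```

A caller could pass a function that records progress after each chunk, or one that returns `False` to stop once enough chunks have been processed.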