Repos | Source: GitHub | June 16, 2026 | Importance: 2/5

llama.cpp Adds Post-Decode Callback to mtmd for Multimodal Processing

English summary

llama.cpp release b9654 adds a post-decode callback to mtmd, the project's multimodal support library. The change was implemented in PR #24645, and its development was assisted by the Qwen3.6-27B language model. Pre-built binaries are provided for macOS (Apple Silicon), Linux (x64 and arm64), Windows (x64 and arm64), and Android, covering several GPU backends (Vulkan, CUDA 12/13, ROCm, SYCL, HIP). A few build configurations are disabled because of platform issues.

Key points

The mtmd component now includes a post-decode callback mechanism.
The implementation of the feature was assisted by the Qwen3.6-27B model.
The release supplies binaries for a wide range of CPU and GPU configurations across macOS, Linux, Windows, and Android.
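
For readers unfamiliar with the pattern, the following is a minimal sketch of what a post-decode callback generally looks like: the caller registers a function, and the decoder invokes it after each unit of work has been decoded. It is an illustration only. Every name here (`decode_chunks`, `post_decode`, the chunk layout) is hypothetical, and nothing in it reflects the actual mtmd API or the signature added in PR #24645.

```python
from typing import Callable, List, Optional

# Hypothetical callback type: receives the chunk index and that chunk's output.
# This is NOT the mtmd signature; it only illustrates the general pattern.
PostDecodeCallback = Callable[[int, List[float]], None]


def decode_chunks(
    chunks: List[List[float]],
    post_decode: Optional[PostDecodeCallback] = None,
) -> int:
    """Decode each chunk in order and notify the caller after each one."""
    decoded = 0
    for index, chunk in enumerate(chunks):
        # Stand-in for a real decode step (assumption: doubling the values).
        output = [value * 2.0 for value in chunk]
        decoded += 1
        # The callback runs only after the chunk has been fully decoded.
        if post_decode is not None:
            post_decode(index, output)
    return decoded


# Usage: collect per-chunk outputs as they become available.
seen = []
total = decode_chunks(
    [[1.0, 2.0], [3.0]],
    post_decode=lambda index, output: seen.append((index, output)),
)
```

A hook of this kind lets calling code inspect or log intermediate results without changing the decode loop itself, which is the usual motivation for adding one.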
