llama.cpp adds a dedicated chat parser for Cohere2MoE (North Code)
Summary
Release b9637 of llama.cpp adds a dedicated chat parser for the Cohere2MoE model architecture, also referred to as North Code. The parser, implemented in PR #24615, ensures correct conversation formatting for Cohere's mixture-of-experts variant. The release ships pre-built binaries for macOS, Linux, Windows, and Android across CPU, CUDA, Vulkan, ROCm, SYCL, and other backends. Apart from the parser and some internal renames, the release notes list no other functional changes.
Key points
Adds a dedicated chat parser for Cohere2MoE (North Code) to handle conversation formatting.
The implementation is in pull request #24615.
The release includes pre-built binaries for multiple platforms and hardware backends.
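As a rough illustration of what a model-specific chat parser does in general, the sketch below splits raw assistant text into a reasoning part and a user-visible reply. It is a toy example under stated assumptions, not the code from PR #24615: the `<think>` delimiters, the `ParsedMessage` type, and the `parse_assistant_output` function are all hypothetical, and the actual Cohere2MoE (North Code) format is not described in the release notes.

```python
import re
from dataclasses import dataclass

# Hypothetical delimiters, chosen for illustration only.
# They are NOT the real Cohere2MoE (North Code) markers.
THINK_OPEN = "<think>"
THINK_CLOSE = "</think>"


@dataclass
class ParsedMessage:
    role: str
    reasoning: str  # text found between the delimiters, if any
    content: str    # remaining user-visible text


def parse_assistant_output(raw: str) -> ParsedMessage:
    """Split raw model text into a reasoning part and a visible reply."""
    pattern = re.escape(THINK_OPEN) + r"(.*?)" + re.escape(THINK_CLOSE)
    match = re.search(pattern, raw, flags=re.DOTALL)
    if match is None:
        # No reasoning block: the whole output is the reply.
        return ParsedMessage("assistant", "", raw.strip())
    reasoning = match.group(1).strip()
    # Remove the delimited block and keep the text around it.
    content = (raw[:match.start()] + raw[match.end():]).strip()
    return ParsedMessage("assistant", reasoning, content)
```

A dedicated parser matters because each model family uses its own markers; a generic parser that does not recognize them would leave the delimiters and reasoning text mixed into the reply.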