The paper analyzes on-policy distillation (OPD), a post-training method that combines on-policy student trajectories with dense teacher supervision. The study finds that OPD-style updates are small and coordinate-sparse, spread across layers, and concentrated in FFN weights. Training only the discovered sparse subnetwork recovers nearly full OPD performance, but the sparsity-inducing SGD optimizer underperforms AdamW because dense supervision preserves heterogeneous gradient scales that benefit from adaptive scaling. Geometrically, the updates are numerically full-rank but spectrally concentrated: they lie away from the principal singular subspaces of the source weights and fall disproportionately on coordinates where the source weights are near zero. The results show that OPD retains the geometric signatures of on-policy post-training rather than behaving as dense parameter rewriting.
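A minimal sketch of how coordinate sparsity and the near-zero-weight tendency could be measured on a flattened weight vector. The function names, the tolerances `tol` and `zero_thresh`, and the toy weights are illustrative assumptions, not the paper's code or settings.

```python
def update_sparsity(source, tuned, tol=1e-6):
    """Fraction of coordinates whose value changed by no more than tol."""
    if len(source) != len(tuned) or not source:
        raise ValueError("weight vectors must be non-empty and of equal length")
    unchanged = sum(1 for s, t in zip(source, tuned) if abs(t - s) <= tol)
    return unchanged / len(source)


def near_zero_share(source, tuned, tol=1e-6, zero_thresh=1e-2):
    """Among changed coordinates, the share whose source weight is near zero."""
    changed = [s for s, t in zip(source, tuned) if abs(t - s) > tol]
    if not changed:
        return 0.0
    return sum(1 for s in changed if abs(s) < zero_thresh) / len(changed)


# Toy weights (hypothetical): three of eight coordinates move,
# and two of those three start near zero.
source = [0.50, 0.001, -0.30, 0.002, 0.80, -0.004, 0.10, 0.0]
tuned = [0.50, 0.011, -0.30, 0.002, 0.80, -0.014, 0.12, 0.0]

sparsity = update_sparsity(source, tuned)
share = near_zero_share(source, tuned)
print(f"unchanged fraction: {sparsity:.3f}, near-zero share of updates: {share:.3f}")
```

On real checkpoints the same two ratios would be computed per weight matrix, which is what allows a per-layer or FFN-versus-attention comparison.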
Papers | Source: ARXIV | Importance: 4/5
The paper introduces SkMTEB, the first comprehensive MTEB-style text embedding benchmark for Slovak, comprising 31 datasets across 7 task types. An evaluation of 31 embedding models shows that large instruction-tuned multilingual models perform best, while existing Slovak-specific NLU models transfer poorly to embedding tasks. The authors develop e5-sk-small (45M parameters) and e5-sk-large (365M parameters) by vocabulary trimming and fine-tuning Multilingual E5 models. Despite size reductions of up to 62%, these open-source models perform competitively with proprietary APIs and are suitable for local deployment in semantic search and RAG. The benchmark, models, datasets, and code are released openly, offering a replicable path for other under-resourced languages.
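A minimal sketch of the general idea behind vocabulary trimming, assuming it means keeping only the embedding rows for tokens that occur in a target-language corpus. The function name `trim_vocabulary`, the toy vocabulary, and the two-dimensional embeddings are hypothetical; the authors' actual procedure may differ.

```python
def trim_vocabulary(vocab, embeddings, corpus_tokens, always_keep=("<pad>", "<unk>")):
    """Keep only tokens seen in the corpus (plus special tokens) and reindex them.

    vocab maps token -> row index; embeddings is a list of rows.
    Returns the reduced vocab and the matching embedding rows.
    """
    keep = set(always_keep) | set(corpus_tokens)
    # Preserve the original ordering of the surviving tokens.
    kept_tokens = [tok for tok in sorted(vocab, key=vocab.get) if tok in keep]
    new_vocab = {tok: i for i, tok in enumerate(kept_tokens)}
    new_embeddings = [embeddings[vocab[tok]] for tok in kept_tokens]
    return new_vocab, new_embeddings


# Toy multilingual vocabulary (hypothetical) with one embedding row per token.
vocab = {"<pad>": 0, "<unk>": 1, "dobrý": 2, "hello": 3, "deň": 4, "bonjour": 5}
embeddings = [[0.0, 0.0], [0.1, 0.1], [0.2, 0.3], [0.4, 0.5], [0.6, 0.7], [0.8, 0.9]]
corpus_tokens = ["dobrý", "deň", "dobrý"]

new_vocab, new_embeddings = trim_vocabulary(vocab, embeddings, corpus_tokens)
print(new_vocab)
print(f"rows kept: {len(new_embeddings)} of {len(embeddings)}")
```

Because the embedding matrix accounts for a large share of the parameters in small multilingual encoders, dropping unused rows is one plausible route to the size reductions the summary reports.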
Tutorials | Source: MEDIUM LARGE LANGUAGE MODELS | Importance: 1/5
The provided article excerpt only contains a metaphor comparing a pre-trained model to a professional pianist who can play various music styles. No specific information about fine-tuning methods, steps, or examples is included. The full content is not accessible.
Pyrecall is a new open-source tool built to address the lack of practical tooling for continual learning research. It snapshots skill scores before and after fine-tuning, flags performance regressions, and supports rolling back LoRA adapters by name. The tool runs fully locally, is released under the MIT license at v0.1.0, and can be installed via pip. The developer is seeking community feedback on the benchmark design.
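The snapshot-and-compare idea can be sketched in a few lines. This is a generic illustration, not Pyrecall's API: the function name `flag_regressions`, the `tolerance` parameter, and the skill scores are all hypothetical.

```python
def flag_regressions(before, after, tolerance=0.01):
    """Return {skill: (old, new)} for skills whose score dropped by more than tolerance."""
    regressions = {}
    for skill, old in before.items():
        new = after.get(skill)
        if new is None:
            continue  # skill was not re-evaluated after fine-tuning
        if old - new > tolerance:
            regressions[skill] = (old, new)
    return regressions


# Hypothetical skill scores snapshotted before and after a fine-tuning run.
before = {"arithmetic": 0.82, "summarization": 0.74, "code": 0.61}
after = {"arithmetic": 0.70, "summarization": 0.75, "code": 0.66}

regressed = flag_regressions(before, after)
for skill, (old, new) in regressed.items():
    print(f"regression in {skill}: {old:.2f} -> {new:.2f}")
```

A non-empty result is the kind of signal that would justify rolling back the adapter that produced the "after" snapshot.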
Social | Source: REDDIT ARTIFICIAL | Importance: 2/5
Deploying an initial AI model is rarely the hard part; real users introduce internal terminology, incomplete queries, and messy documents that benchmarks never capture. Most production systems do not connect inference logs, dataset curation, fine-tuning, and evaluation within a single loop, turning every model improvement into a separate one-off project. The core bottleneck is model iteration: the ability to convert production traffic into failure patterns, create or curate datasets, re-train or fine-tune, and redeploy consistently. The post describes an insurance chatbot use case where a continuous feedback loop from production logs to post-training and redeployment improved the model, and notes that platforms like Data Lab treat logs, datasets, post-training, and deployment as parts of the same iteration cycle.
Social | Source: X | Importance: 2/5
On the NVIDIA AI Podcast, Mistral AI CTO and co-founder Timothée Lacroix discussed the company's open-model philosophy, its Forge customization framework, and its collaboration with NVIDIA through the Nemotron Coalition, a partnership aimed at advancing AI capabilities. The conversation also covered bringing open models to enterprise environments and Mistral's approach to model adaptation.