Kuawei, a Chinese embodied-intelligence company or research group, has adapted Bird's Eye View (BEV) techniques, originally developed for autonomous driving, to robot data representation. The method aims to unify spatial perception so that robotic systems can be trained at large scale. According to the announcement, a BEV representation lets robot learning data scale efficiently, which it likens to the scaling laws observed in large language models. This reflects a trend of cross-domain technology transfer from autonomous vehicles to general-purpose robotics. The announcement provides few technical details.
Releases | Source: QBITAI | Importance: 1/5
A QuantumBit (QbitAI) article, authorized for reposting from Zhixiang Future, carries a title asserting that the HiDream-O1-Image-1.5 model ranks first in China and second globally on a text-to-image generation leaderboard, surpassing Google and NVIDIA. The article body consists solely of a copyright notice and offers no technical details, benchmark results, or verification of the claim. As a result, the report lacks substantive content to support its headline.
Releases | Source: QBITAI | Importance: 1/5
The article body contains only the cryptic phrase 'Stand on tiptoe.' The title suggests that Douyin (the Chinese counterpart of TikTok) is recruiting AI video talent, but no concrete details, such as program scope, requirements, or incentives, are provided. The content offers no actionable facts.
Releases | Source: DEEPMIND BLOG | Importance: 4/5
Google DeepMind announced Gemini 3.5 Live Translate, a feature that provides near real-time, natural voice translation. The capability is now available in Google AI Studio, Google Translate, and Google Meet. It delivers fluid, conversational translations with less robotic-sounding output and reduced lag. This integration brings live speech translation directly into Google's widely used communication and development platforms.
Google DeepMind released Gemma 4 12B, a 12-billion-parameter open multimodal model in the Gemma family of open models. The model uses a unified architecture to handle text and images without a separate vision encoder. The announcement highlights the encoder-free design but provides no further performance or capability details.