robotics

[Paper Express] 2026 Week 03 (Jan-11-17) (Robotics/Embodied AI/LLM)
Chinese text is translated with googletrans; where the translation is wrong, the English original prevails.
In real-world video question answering scenarios, videos often provide only localized visual cues, while verifiable answers are distributed across the open web; models therefore need to jointly perform cross-frame clue extr

[Paper Express] 2026 Week 02 (Jan-04-10) (Robotics/Embodied AI/LLM)
As language models become increasingly capable, users expect them to provide not only accurate responses but also behaviors aligned with diverse human preferences across a variety of scenarios. To achieve this, Reinforcemen

[Paper Express] 2026 Week 01 (Dec-28-Jan-03) (Robotics/Embodied AI/LLM)
Recently, studies exemplified by Hyper-Connections (HC) have extended the ubiquitous residual connection paradigm established over the past decade by expanding the residual stream width and diversifying connectivity pattern

[Paper Express] 2025 Week 50 (Dec-07-13) (Robotics/Embodied AI/LLM)
We present Wan-Move, a simple and scalable framework that brings motion control to video generative models. Existing motion-controllable methods typically suffer from coarse control granularity and limited scalability, leav

[Paper Express] 2025 Week 47 (Nov-16-22) (Robotics/Embodied AI/LLM)
This report introduces Kandinsky 5.0, a family of state-of-the-art foundation models for high-resolution image and 10-second video synthesis. The framework comprises three core line-up of models: Kandinsky 5.0 Image Lite -

[Paper Express] 2025 Week 51 (Dec-14-20) (Robotics/Embodied AI/LLM)
We present Kling-Omni, a generalist generative framework designed to synthesize high-fidelity videos directly from multimodal visual language inputs. Adopting an end-to-end perspective, Kling-Omni bridges the functional sep

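[EAI-035] How Far Is Robotics from Its "ChatGPT Moment"? The Open-World Generalization Breakthrough Seen Through the VLA Model π0.5
π0.5 is a vision-language-action (VLA) model built on π0 that targets a core challenge in robot learning: open-world generalization. By co-training on heterogeneous data sources (robots of different embodiments, high-level semantic prediction, web data, and verbal instructions), π0.5 can control a mobile manipulator to carry out long-horizon, multi-stage household chores, such as tidying a kitchen or making a bed, in homes it has never seen before, with tasks lasting 10 to 15 minutes.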

[Paper Express] 2025 Week 37 (Sep-07-13) (Robotics/Embodied AI/LLM)
Post-training language models (LMs) with reinforcement learning (RL) can enhance their complex reasoning capabilities without supervised fine-tuning, as demonstrated by DeepSeek-R1-Zero. However, effectively utilizing RL fo

[Paper Express] 2025 Week 42 (Oct-12-18) (Robotics/Embodied AI/LLM)
We propose QeRL, a Quantization-enhanced Reinforcement Learning framework for large language models (LLMs). While RL is essential for LLMs’ reasoning capabilities, it is resource-intensive, requiring substantial GPU memory

[Paper Express] 2025 Week 41 (Oct-05-11) (Robotics/Embodied AI/LLM)
Hierarchical Reasoning Model (HRM) is a novel approach using two small neural networks recursing at different frequencies. This biologically inspired method beats Large Language models (LLMs) on hard puzzle tasks such as Su

[Paper Express] 2025 Week 33 (Aug-10-16) (Robotics/Embodied AI/LLM)
Title: GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

[Paper Express] 2025 Week 32 (Aug-03-09) (Robotics/Embodied AI/LLM)
We present Qwen-Image, an image generation foundation model in the Qwen series that achieves significant advances in complex text rendering and precise image editing. To address the challenges of complex text rendering, we

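Fundamentals of 3D Spatial Representation
Unless stated otherwise, the representations in 3D space discussed in this article are Euclidean transformations expressed in a right-handed coordinate system spanned by an orthonormal basis. A Euclidean (rigid) transformation changes an object's position in space without changing its shape or size; it consists of rotation and translation.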
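To make the definition above concrete, here is a minimal sketch of a rigid transformation p' = R p + t. The helper names (`rotation_z`, `apply_rigid`, `det3`) and the sample points are illustrative assumptions, not taken from the original post. The sketch checks two properties implied by the text: distances between points are preserved, and the rotation has determinant +1, so a right-handed frame stays right-handed.

```python
import math

def rotation_z(theta):
    """3x3 rotation matrix for a rotation of theta radians about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]

def apply_rigid(R, t, p):
    """Apply the rigid transformation p' = R p + t to a 3D point p."""
    return [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]

def det3(M):
    """Determinant of a 3x3 matrix (cofactor expansion along the first row)."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

# Hypothetical example: rotate 90 degrees about z, then translate.
R = rotation_z(math.pi / 2)
t = [1.0, 2.0, 3.0]
a, b = [1.0, 0.0, 0.0], [0.0, 2.0, 0.0]
a2, b2 = apply_rigid(R, t, a), apply_rigid(R, t, b)

# Shape and size are unchanged: the distance between the points is preserved.
print(math.isclose(math.dist(a, b), math.dist(a2, b2)))
# A proper rotation has determinant +1, which keeps the frame right-handed.
print(math.isclose(det3(R), 1.0))
```

Plain lists are used here to keep the sketch dependency-free; in practice a matrix library would be the more common choice.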

[Paper Express] 2025 Week 30 (Jul-20-26) (Robotics/Embodied AI/LLM)
This paper introduces Group Sequence Policy Optimization (GSPO), our stable, efficient, and performant reinforcement learning algorithm for training large language models. Unlike previous algorithms that adopt token-level i

[Paper Express] 2025 Week 28 (Jul-06-12) (Robotics/Embodied AI/LLM)
We introduce a full-stack framework that scales up reasoning in vision-language models (VLMs) to long videos, leveraging reinforcement learning. We address the unique challenges of long video reasoning by integrating three

[Paper Express] 2025 Week 29 (Jul-13-19) (Robotics/Embodied AI/LLM)
Title: A Survey of Context Engineering for Large Language Models

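Visual SLAM 14 Lectures in Depth: Cameras and Images 3.1
The camera is the core sensor in VSLAM. This chapter covers camera-related topics and some fundamentals of 3D computer vision. The techniques involved include calibration of camera intrinsics and extrinsics, projection, and 3D reconstruction.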

[AI Vision · Today's Robot Paper Digest, Issue 78] Wed, 17 Jan 2024
AI Vision · Today's CS.Robotics paper digest, Wed, 17 Jan 2024. 49 papers in total.

[AI Vision · Today's Robot Paper Digest, Issue 49] Fri, 6 Oct 2023
AI Vision · Today's CS.Robotics paper digest, Fri, 6 Oct 2023. 29 papers in total.

[AI Vision · Today's Robot Paper Digest, Issue 47] Wed, 4 Oct 2023
AI Vision · Today's CS.Robotics paper digest, Wed, 4 Oct 2023. 40 papers in total.