Related papers: CoFL: Continuous Flow Fields for Language-Conditio…

200 papers

In this study, we address vision-language-guided multi-robot cooperative transport, where each robot grounds natural-language instructions from onboard camera observations. A key challenge in this decentralized setting is perceptual…

Robotics · Computer Science 2026-02-10 Joachim Yann Despature, Kazuki Shibata, Takamitsu Matsubara

Generative models have emerged as a promising paradigm for offline multi-agent reinforcement learning (MARL), but existing approaches require many iterative sampling steps. Recent few-step acceleration methods either distill a joint teacher…

Artificial Intelligence · Computer Science 2026-05-14 Guowei Zou, Haitao Wang, Beiwen Zhang, Boning Zhang, Hejun Wu

Existing rectified flow models are based on linear trajectories between data and noise distributions. This linearity enforces zero curvature, which can inadvertently force the image generation process through low-probability regions of the…

Computer Vision and Pattern Recognition · Computer Science 2025-08-26 Yan Luo, Drake Du, Hao Huang, Yi Fang, Mengyu Wang

Predicting program behavior without execution is a critical task in software engineering. Existing models often fall short in capturing the dynamic dependencies among program elements. To address this, we present CodeFlow, a novel machine…

Software Engineering · Computer Science 2025-02-11 Cuong Chi Le, Hoang Nhat Phan, Huy Nhat Phan, Tien N. Nguyen, Nghi D. Q. Bui

Vision-and-Language Navigation (VLN) tasks require an agent to follow textual instructions to navigate through 3D environments. Traditional approaches use supervised learning methods, relying heavily on domain-specific datasets to train VLN…

Robotics · Computer Science 2025-02-12 Yanyuan Qiao, Wenqi Lyu, Hui Wang, Zixu Wang, Zerui Li, Yuan Zhang, Mingkui Tan, Qi Wu

A few recent studies have demonstrated that leveraging centrally pre-trained models can offer advantageous initializations for federated learning (FL). However, existing pre-training methods do not generalize well when faced with an…

Machine Learning · Computer Science 2024-12-12 Yun-Wei Chu, Dong-Jun Han, Seyyedali Hosseinalipour, Christopher G. Brinton

Visual target navigation is a critical capability for autonomous robots operating in unknown environments, particularly in human-robot interaction scenarios. While classical and learning-based methods have shown promise, most existing…

Robotics · Computer Science 2025-05-07 Bangguo Yu, Qihao Yuan, Kailai Li, Hamidreza Kasaei, Ming Cao

Visual navigation in unknown environments based solely on natural language descriptions is a key capability for intelligent robots. In this work, we propose a navigation framework built upon off-the-shelf Visual Language Models (VLMs),…

Robotics · Computer Science 2025-08-08 Weifan Zhang, Tingguang Li, Yuzhen Liu

Recent Vision-Language-Action (VLA) models built on pre-trained Vision-Language Models (VLMs) require extensive post-training, resulting in high computational overhead that limits scalability and deployment. We propose CogVLA, a…

Computer Vision and Pattern Recognition · Computer Science 2026-05-28 Wei Li, Renshan Zhang, Rui Shao, Jie He, Liqiang Nie

We introduce Correspondence-Oriented Imitation Learning (COIL), a conditional policy learning framework for visuomotor control with a flexible task representation in 3D. At the core of our approach, each task is defined by the intended…

Robotics · Computer Science 2025-12-08 Yunhao Cao, Zubin Bhaumik, Jessie Jia, Xingyi He, Kuan Fang

Multiple federated learning (FL) methods are proposed for traffic flow forecasting (TFF) to avoid heavy-transmission and privacy-leaking concerns resulting from the disclosure of raw data in centralized methods. However, these FL methods…

Machine Learning · Computer Science 2024-11-22 Qingxiang Liu, Sheng Sun, Yuxuan Liang, Xiaolong Xu, Min Liu, Muhammad Bilal, Yuwei Wang, Xujing Li, Yu Zheng

UAV vision-language navigation (VLN) requires an agent to navigate complex 3D environments from an egocentric perspective while following ambiguous multi-step instructions over long horizons. Existing zero-shot methods remain limited, as…

Computer Vision and Pattern Recognition · Computer Science 2026-04-20 Dian Shao, Zhengzheng Xu, Peiyang Wang, Like Liu, Yule Wang, Jieqi Shi, Jing Huo

We present a flow-matching planner for autonomous driving that directly outputs actionable control trajectories defined by acceleration and curvature profiles. The model is conditioned on a bird's-eye-view (BEV) raster of the surrounding…

Collaborative perception allows agents to enhance their perceptual capabilities by exchanging intermediate features. Existing methods typically organize these intermediate features as 2D bird's-eye-view (BEV) representations, which discard…

Computer Vision and Pattern Recognition · Computer Science 2025-08-28 Yang Li, Quan Yuan, Guiyang Luo, Xiaoyuan Fu, Rui Pan, Yujia Yang, Congzhang Shao, Yuewen Liu, Jinglin Li

The interpretation of ego motion and scene change is a fundamental task for mobile robots. Optical flow information can be employed to estimate motion in the surroundings. Recently, unsupervised optical flow estimation has become a research…

Computer Vision and Pattern Recognition · Computer Science 2020-11-05 Hengli Wang, Rui Fan, Ming Liu

In this paper, we propose a training-free framework for vision-and-language navigation (VLN). Existing zero-shot VLN methods are mainly designed for discrete environments or involve unsupervised training in continuous simulator…

Robotics · Computer Science 2025-09-15 Hang Yin, Haoyu Wei, Xiuwei Xu, Wenxuan Guo, Jie Zhou, Jiwen Lu

Vision-language navigation (VLN) requires intelligent agents to navigate environments by interpreting linguistic instructions alongside visual observations, serving as a cornerstone task in Embodied AI. Current VLN research for unmanned…

This paper explores the problem of continual learning (CL) of vision-language models (VLMs) in open domains, where the models need to perform continual updating and inference on a streaming of datasets from diverse seen and unseen domains…

Computer Vision and Pattern Recognition · Computer Science 2024-03-18 Yukun Li, Guansong Pang, Wei Suo, Chenchen Jing, Yuling Xi, Lingqiao Liu, Hao Chen, Guoqiang Liang, Peng Wang

Local navigation in cluttered environments often suffers from dense obstacles and frequent local minima. Conventional local planners rely on heuristics and are prone to failure, while deep reinforcement learning (DRL)-based approaches provide…

Robotics · Computer Science 2026-03-18 Jiwon Park, Dongkyu Lee, I Made Aswin Nahrendra, Jaeyoung Lim, Hyun Myung

Autonomous driving, particularly navigating complex and unanticipated scenarios, demands sophisticated reasoning and planning capabilities. While Multi-modal Large Language Models (MLLMs) offer a promising avenue for this, their use has…

Computer Vision and Pattern Recognition · Computer Science 2025-10-15 Hidehisa Arai, Keita Miwa, Kento Sasaki, Yu Yamaguchi, Kohei Watanabe, Shunsuke Aoki, Issei Yamamoto