Related papers: ProTo: Program-Guided Transformer for Program-Guid…

Learning Neuro-symbolic Programs for Language Guided Robot Manipulation

Given a natural language instruction and an input scene, our goal is to train a model to output a manipulation program that can be executed by the robot. Prior approaches for this task possess one of the following limitations: (i) rely on…

Robotics · Computer Science 2024-04-03 Namasivayam Kalithasan, Himanshu Singh, Vishal Bindal, Arnav Tuli, Vishwajeet Agrawal, Rahul Jain, Parag Singla, Rohan Paul

Instruction-Following Agents with Multimodal Transformer

Humans are excellent at understanding language and vision to accomplish a wide range of tasks. In contrast, creating general instruction-following embodied agents remains a difficult challenge. Prior work that uses pure language-only models…

Computer Vision and Pattern Recognition · Computer Science 2023-03-28 Hao Liu, Lisa Lee, Kimin Lee, Pieter Abbeel

Prospection: Interpretable Plans From Language By Predicting the Future

High-level human instructions often correspond to behaviors with multiple implicit steps. In order for robots to be useful in the real world, they must be able to reason over both motions and intermediate goals implied by human…

Artificial Intelligence · Computer Science 2019-03-21 Chris Paxton, Yonatan Bisk, Jesse Thomason, Arunkumar Byravan, Dieter Fox

ProToM: Promoting Prosocial Behaviour via Theory of Mind-Informed Feedback

While humans are inherently social creatures, the challenge of identifying when and how to assist and collaborate with others - particularly when pursuing independent goals - can hinder cooperation. To address this challenge, we aim to…

Artificial Intelligence · Computer Science 2025-09-08 Matteo Bortoletto, Yichao Zhou, Lance Ying, Tianmin Shu, Andreas Bulling

Grounding Spatio-Temporal Language with Transformers

Language is an interface to the outside world. In order for embodied agents to use it, language must be grounded in other, sensorimotor modalities. While there is an extended literature studying how machines can learn grounded language, the…

Artificial Intelligence · Computer Science 2021-10-12 Tristan Karch, Laetitia Teodorescu, Katja Hofmann, Clément Moulin-Frier, Pierre-Yves Oudeyer

Cognitive maps are generative programs

Making sense of the world and acting in it relies on building simplified mental representations that abstract away aspects of reality. This principle of cognitive mapping is universal to agents with limited resources. Living organisms,…

Artificial Intelligence · Computer Science 2025-04-30 Marta Kryven, Cole Wyeth, Aidan Curtis, Kevin Ellis

Read to Play (R2-Play): Decision Transformer with Multimodal Game Instruction

Developing a generalist agent is a longstanding objective in artificial intelligence. Previous efforts utilizing extensive offline datasets from various tasks demonstrate remarkable performance in multitasking scenarios within Reinforcement…

Artificial Intelligence · Computer Science 2024-11-19 Yonggang Jin, Ge Zhang, Hao Zhao, Tianyu Zheng, Jarvi Guo, Liuyu Xiang, Shawn Yue, Stephen W. Huang, Zhaofeng He, Jie Fu

Translating Natural Language Instructions to Computer Programs for Robot Manipulation

It is highly desirable for robots that work alongside humans to be able to understand instructions in natural language. Existing language conditioned imitation learning models directly predict the actuator commands from the image…

Robotics · Computer Science 2021-03-23 Sagar Gubbi Venkatesh, Raviteja Upadrashta, Bharadwaj Amrutur

Can Transformers Learn to Solve Problems Recursively?

Neural networks have in recent years shown promise for helping software engineers write programs and even formally verify them. While semantic information plays a crucial part in these processes, it remains unclear to what degree popular…

Machine Learning · Computer Science 2023-06-27 Shizhuo Dylan Zhang, Curt Tigges, Stella Biderman, Maxim Raginsky, Talia Ringer

Learning Semantic-Geometric Task Graph-Representations from Human Demonstrations

Learning structured task representations from human demonstrations is essential for understanding long-horizon manipulation behaviors, particularly in bimanual settings where action ordering, object involvement, and interaction geometry can…

Robotics · Computer Science 2026-01-19 Franziska Herbert, Vignesh Prasad, Han Liu, Dorothea Koert, Georgia Chalvatzaki

Programmatically Grounded, Compositionally Generalizable Robotic Manipulation

Robots operating in the real world require both rich manipulation skills as well as the ability to semantically reason about when to apply those skills. Towards this goal, recent works have integrated semantic representations from…

Artificial Intelligence · Computer Science 2023-04-28 Renhao Wang, Jiayuan Mao, Joy Hsu, Hang Zhao, Jiajun Wu, Yang Gao

PyTOD: Programmable Task-Oriented Dialogue with Execution Feedback

Programmable task-oriented dialogue (TOD) agents enable language models to follow structured dialogue policies, but their effectiveness hinges on accurate state tracking. We present PyTOD, an agent that generates executable code to track…

Computation and Language · Computer Science 2025-08-22 Alexandru Coca, Bo-Hsiang Tseng, Pete Boothroyd, Jianpeng Cheng, Mark Gaynor, Zhenxing Zhang, Joe Stacey, Tristan Guigue, Héctor Martinez Alonso, Diarmuid Ó Séaghdha, Anders Johannsen

Prototype Transformer: Towards Language Model Architectures Interpretable by Design

While state-of-the-art language models (LMs) surpass the vast majority of humans in certain domains, their reasoning remains largely opaque, undermining trust in their output. Furthermore, while autoregressive LMs can output explicit…

Artificial Intelligence · Computer Science 2026-02-13 Yordan Yordanov, Matteo Forasassi, Bayar Menzat, Ruizhi Wang, Chang Qi, Markus Kaltenberger, Amine M'Charrak, Tommaso Salvatori, Thomas Lukasiewicz

Functional Transparency for Structured Data: a Game-Theoretic Approach

We provide a new approach to training neural models to exhibit transparency in a well-defined, functional manner. Our approach naturally operates over structured data and tailors the predictor, functionally, towards a chosen family of…

Machine Learning · Computer Science 2019-02-27 Guang-He Lee, Wengong Jin, David Alvarez-Melis, Tommi S. Jaakkola

Imitation-Projected Programmatic Reinforcement Learning

We study the problem of programmatic reinforcement learning, in which policies are represented as short programs in a symbolic language. Programmatic policies can be more interpretable, generalizable, and amenable to formal verification…

Machine Learning · Computer Science 2021-01-21 Abhinav Verma, Hoang M. Le, Yisong Yue, Swarat Chaudhuri

Emergent Representations of Program Semantics in Language Models Trained on Programs

We present evidence that language models (LMs) of code can learn to represent the formal semantics of programs, despite being trained only to perform next-token prediction. Specifically, we train a Transformer model on a synthetic corpus of…

Machine Learning · Computer Science 2024-08-06 Charles Jin, Martin Rinard

Mechanistic Unveiling of Transformer Circuits: Self-Influence as a Key to Model Reasoning

Transformer-based language models have achieved significant success; however, their internal mechanisms remain largely opaque due to the complexity of non-linear interactions and high-dimensional operations. While previous studies have…

Artificial Intelligence · Computer Science 2025-02-17 Lin Zhang, Lijie Hu, Di Wang

TransCoder: Towards Unified Transferable Code Representation Learning Inspired by Human Skills

Code pre-trained models (CodePTMs) have recently demonstrated a solid capacity to process various software intelligence tasks, e.g., code clone detection, code translation, and code summarization. The current mainstream method that deploys…

Software Engineering · Computer Science 2024-05-10 Qiushi Sun, Nuo Chen, Jianing Wang, Xiang Li, Ming Gao

Complexity of Symbolic Representation in Working Memory of Transformer Correlates with the Complexity of a Task

Even though Transformers are extensively used for Natural Language Processing tasks, especially for machine translation, they lack an explicit memory to store key concepts of processed texts. This paper explores the properties of the…

Computation and Language · Computer Science 2024-06-21 Alsu Sagirova, Mikhail Burtsev

A Survey on Natural Language Processing for Programming

Natural language processing for programming aims to use NLP techniques to assist programming. It is increasingly prevalent for its effectiveness in improving productivity. Distinct from natural language, a programming language is highly…

Computation and Language · Computer Science 2023-08-08 Qingfu Zhu, Xianzhen Luo, Fang Liu, Cuiyun Gao, Wanxiang Che