Ruochen Zhao — Scifaro

Unified-MAS: Universally Generating Domain-Specific Nodes for Empowering Automatic Multi-Agent Systems

Automatic Multi-Agent Systems (MAS) generation has emerged as a promising paradigm for solving complex reasoning tasks. However, existing frameworks are fundamentally bottlenecked when applied to knowledge-intensive domains (e.g.,…

Artificial Intelligence · Computer Science · 2026-03-24 · Hehai Lin, Yu Yan, Zixuan Wang, Bo Xu, Sudong Wang, Weiquan Huang, Ruochen Zhao, Minzhi Li, Chengwei Qin

Training Multi-Turn Search Agent via Contrastive Dynamic Branch Sampling

Agentic reinforcement learning has enabled large language models to perform complex multi-turn planning and tool use. However, learning in long-horizon settings remains challenging due to sparse, trajectory-level outcome rewards. While…

Computation and Language · Computer Science · 2026-02-04 · Yubao Zhao, Weiquan Huang, Sudong Wang, Ruochen Zhao, Chen Chen, Yao Shu, Chengwei Qin

DR-Arena: an Automated Evaluation Framework for Deep Research Agents

As Large Language Models (LLMs) increasingly operate as Deep Research (DR) Agents capable of autonomous investigation and information synthesis, reliable evaluation of their task performance has become a critical bottleneck. Current…

Computation and Language · Computer Science · 2026-01-16 · Yiwen Gao, Ruochen Zhao, Yang Deng, Wenxuan Zhang

AgREE: Agentic Reasoning for Knowledge Graph Completion on Emerging Entities

Open-domain Knowledge Graph Completion (KGC) faces significant challenges in an ever-changing world, especially when considering the continual emergence of new entities in daily news. Existing approaches for KGC mainly rely on pretrained…

Artificial Intelligence · Computer Science · 2025-08-07 · Ruochen Zhao, Simone Conia, Eric Peng, Min Li, Saloni Potdar

A Comprehensive Survey of Contamination Detection Methods in Large Language Models

With the rise of Large Language Models (LLMs) in recent years, abundant new opportunities are emerging, but also new challenges, among which contamination is quickly becoming critical. Business applications and fundraising in Artificial…

Computation and Language · Computer Science · 2025-07-11 · Mathieu Ravaut, Bosheng Ding, Fangkai Jiao, Hailin Chen, Xingxuan Li, Ruochen Zhao, Chengwei Qin, Caiming Xiong, Shafiq Joty

Chain of Ideas: Revolutionizing Research Via Novel Idea Development with LLM Agents

Effective research ideation is a critical step for scientific research. However, the exponential increase in scientific literature makes it challenging for researchers to stay current with recent advances and identify meaningful research…

Artificial Intelligence · Computer Science · 2024-10-31 · Long Li, Weiwen Xu, Jiayan Guo, Ruochen Zhao, Xingxuan Li, Yuqian Yuan, Boqiang Zhang, Yuming Jiang, Yifei Xin, Ronghao Dang, Deli Zhao, Yu Rong, Tian Feng, Lidong Bing

Auto-Arena: Automating LLM Evaluations with Agent Peer Battles and Committee Discussions

As LLMs continuously evolve, there is an urgent need for a reliable evaluation method that delivers trustworthy results promptly. Currently, static benchmarks suffer from inflexibility and unreliability, leading users to prefer human voting…

Computation and Language · Computer Science · 2024-10-08 · Ruochen Zhao, Wenxuan Zhang, Yew Ken Chia, Weiwen Xu, Deli Zhao, Lidong Bing

Can We Further Elicit Reasoning in LLMs? Critic-Guided Planning with Retrieval-Augmentation for Solving Challenging Tasks

State-of-the-art large language models (LLMs) exhibit impressive problem-solving capabilities but may struggle with complex reasoning and factual correctness. Existing methods harness the strengths of chain-of-thought and…

Computation and Language · Computer Science · 2024-10-03 · Xingxuan Li, Weiwen Xu, Ruochen Zhao, Fangkai Jiao, Shafiq Joty, Lidong Bing

Data Augmentation using Large Language Models: Data Perspectives, Learning Paradigms and Challenges

In the rapidly evolving field of large language models (LLMs), data augmentation (DA) has emerged as a pivotal technique for enhancing model performance by diversifying training examples without the need for additional data collection. This…

Computation and Language · Computer Science · 2024-07-03 · Bosheng Ding, Chengwei Qin, Ruochen Zhao, Tianze Luo, Xinze Li, Guizhen Chen, Wenhan Xia, Junjie Hu, Anh Tuan Luu, Shafiq Joty

Lifelong Event Detection with Embedding Space Separation and Compaction

To mitigate forgetting, existing lifelong event detection methods typically maintain a memory module and replay the stored memory data during the learning of a new task. However, the simple combination of memory data and new-task samples…

Computation and Language · Computer Science · 2024-04-04 · Chengwei Qin, Ruirui Chen, Ruochen Zhao, Wenhan Xia, Shafiq Joty

Chain-of-Knowledge: Grounding Large Language Models via Dynamic Knowledge Adapting over Heterogeneous Sources

We present chain-of-knowledge (CoK), a novel framework that augments large language models (LLMs) by dynamically incorporating grounding information from heterogeneous sources. It results in more factual rationales and reduced hallucination…

Computation and Language · Computer Science · 2024-02-22 · Xingxuan Li, Ruochen Zhao, Yew Ken Chia, Bosheng Ding, Shafiq Joty, Soujanya Poria, Lidong Bing

ChatGPT's One-year Anniversary: Are Open-Source Large Language Models Catching up?

Upon its release in late 2022, ChatGPT has brought a seismic shift in the entire landscape of AI, both in research and commerce. Through instruction-tuning a large language model (LLM) with supervised fine-tuning and reinforcement learning…

Computation and Language · Computer Science · 2024-01-17 · Hailin Chen, Fangkai Jiao, Xingxuan Li, Chengwei Qin, Mathieu Ravaut, Ruochen Zhao, Caiming Xiong, Shafiq Joty

Retrieving Multimodal Information for Augmented Generation: A Survey

As Large Language Models (LLMs) become popular, there emerged an important trend of using multimodality to augment the LLMs' generation ability, which enables LLMs to better interact with the world. However, there lacks a unified perception…

Computation and Language · Computer Science · 2023-12-04 · Ruochen Zhao, Hailin Chen, Weishi Wang, Fangkai Jiao, Xuan Long Do, Chengwei Qin, Bosheng Ding, Xiaobao Guo, Minzhi Li, Xingxuan Li, Shafiq Joty

Learning to Initialize: Can Meta Learning Improve Cross-task Generalization in Prompt Tuning?

Prompt tuning (PT) which only tunes the embeddings of an additional sequence of tokens per task, keeping the pre-trained language model (PLM) frozen, has shown remarkable performance in few-shot learning. Despite this, PT has been shown to…

Computation and Language · Computer Science · 2023-11-21 · Chengwei Qin, Qian Li, Ruochen Zhao, Shafiq Joty

Verify-and-Edit: A Knowledge-Enhanced Chain-of-Thought Framework

As large language models (LLMs) have become the norm in NLP, demonstrating good performance in generation and reasoning tasks, one of its most fatal disadvantages is the lack of factual correctness. Generating unfactual texts not only leads…

Computation and Language · Computer Science · 2023-10-10 · Ruochen Zhao, Xingxuan Li, Shafiq Joty, Chengwei Qin, Lidong Bing

PromptSum: Parameter-Efficient Controllable Abstractive Summarization

Prompt tuning (PT), a parameter-efficient technique that only tunes the additional prompt embeddings while keeping the backbone pre-trained language model (PLM) frozen, has shown promising results in language understanding tasks, especially…

Computation and Language · Computer Science · 2023-08-08 · Mathieu Ravaut, Hailin Chen, Ruochen Zhao, Chengwei Qin, Shafiq Joty, Nancy Chen

Randomized Smoothing with Masked Inference for Adversarially Robust Text Classifications

Large-scale pre-trained language models have shown outstanding performance in a variety of NLP tasks. However, they are also known to be significantly brittle against specifically crafted adversarial examples, leading to increasing interest…

Computation and Language · Computer Science · 2023-05-12 · Han Cheol Moon, Shafiq Joty, Ruochen Zhao, Megh Thakkar, Xu Chi

Explaining Language Models' Predictions with High-Impact Concepts

The emergence of large-scale pretrained language models has posed unprecedented challenges in deriving explanations of why the model has made some predictions. Stemmed from the compositional nature of languages, spurious correlations have…

Computation and Language · Computer Science · 2023-05-04 · Ruochen Zhao, Shafiq Joty, Yongjie Wang, Tan Wang

Can ChatGPT-like Generative Models Guarantee Factual Accuracy? On the Mistakes of New Generation Search Engines

Although large conversational AI models such as OpenAI's ChatGPT have demonstrated great potential, we question whether such models can guarantee factual accuracy. Recently, technology companies such as Microsoft and Google have announced…

Computation and Language · Computer Science · 2023-04-24 · Ruochen Zhao, Xingxuan Li, Yew Ken Chia, Bosheng Ding, Lidong Bing