Ruochen Wang — Scifaro

QG-CoC: Question-Guided Chain-of-Captions for Large Multimodal Models

Recently, Multimodal Large Language Models (MLLMs) encounter two key issues in multi-image contexts: (1) a lack of fine-grained perception across disparate images, and (2) a diminished capability to effectively reason over and synthesize…

Computer Vision and Pattern Recognition · Computer Science 2025-11-06 Kuei-Chun Kao, Hsu Tzu-Yin, Yunqi Hong, Ruochen Wang, Cho-Jui Hsieh

Concepts or Skills? Rethinking Instruction Selection for Multi-modal Models

Vision-language instruction tuning achieves two main purposes: learning visual concepts and learning visual skills. In this paper, we found that vision-language benchmarks fall into the dichotomy of mainly benefiting from training on…

Computer Vision and Pattern Recognition · Computer Science 2025-08-15 Andrew Bai, Justin Cui, Ruochen Wang, Cho-Jui Hsieh

Don't Think Longer, Think Wisely: Optimizing Thinking Dynamics for Large Reasoning Models

While recent success of large reasoning models (LRMs) significantly advanced LLMs' reasoning capability by optimizing the final answer accuracy using reinforcement learning, they may also drastically increase the output length due to…

Artificial Intelligence · Computer Science 2025-05-29 Sohyun An, Ruochen Wang, Tianyi Zhou, Cho-Jui Hsieh

Addressing Challenges in Time Series Forecasting: A Comprehensive Comparison of Machine Learning Techniques

The explosion of Time Series (TS) data, driven by advancements in technology, necessitates sophisticated analytical methods. Modern management systems increasingly rely on analyzing this data, highlighting the importance of efficient…

Machine Learning · Computer Science 2025-03-27 Seyedeh Azadeh Fallah Mortezanejad, Ruochen Wang

Physics-Informed Neural Networks with Unknown Partial Differential Equations: an Application in Multivariate Time Series

A significant advancement in Neural Network (NN) research is the integration of domain-specific knowledge through custom loss functions. This approach addresses a crucial challenge: how can models utilize physics or mathematical principles…

Machine Learning · Computer Science 2025-03-27 Seyedeh Azadeh Fallah Mortezanejad, Ruochen Wang, Ali Mohammad-Djafari

Signed Rank Chart For Tied Observations: An Application of Deep Learning Models

Shewhart Control Charts (SCCs) are constructed under the assumption of normality and are widely recognized in statistical quality control by numerous researchers. Problems arise when the distribution of process data does not conform to a…

Applications · Statistics 2025-03-27 Seyedeh Azadeh Fallah Mortezanejad, Ruochen Wang

R1-Zero's "Aha Moment" in Visual Reasoning on a 2B Non-SFT Model

Recently DeepSeek R1 demonstrated how reinforcement learning with simple rule-based incentives can enable autonomous development of complex reasoning in large language models, characterized by the "aha moment", in which the model manifest…

Artificial Intelligence · Computer Science 2025-03-11 Hengguang Zhou, Xirui Li, Ruochen Wang, Minhao Cheng, Tianyi Zhou, Cho-Jui Hsieh

Solving for X and Beyond: Can Large Language Models Solve Complex Math Problems with More-Than-Two Unknowns?

Large Language Models (LLMs) have demonstrated remarkable performance in solving math problems, a hallmark of human intelligence. Despite high success rates on current benchmarks, however, these often feature simple problems with only one…

Artificial Intelligence · Computer Science 2024-11-19 Kuei-Chun Kao, Ruochen Wang, Cho-Jui Hsieh

DrAttack: Prompt Decomposition and Reconstruction Makes Powerful LLM Jailbreakers

The safety alignment of Large Language Models (LLMs) is vulnerable to both manual and automated jailbreak attacks, which adversarially trigger LLMs to output harmful content. However, current methods for jailbreaking LLMs, which nest entire…

Cryptography and Security · Computer Science 2024-11-13 Xirui Li, Ruochen Wang, Minhao Cheng, Tianyi Zhou, Cho-Jui Hsieh

Mitigating Bias in Dataset Distillation

Dataset Distillation has emerged as a technique for compressing large datasets into smaller synthetic counterparts, facilitating downstream training tasks. In this paper, we study the impact of bias inside the original dataset on the…

Machine Learning · Computer Science 2024-07-11 Justin Cui, Ruochen Wang, Yuanhao Xiong, Cho-Jui Hsieh

On Discrete Prompt Optimization for Diffusion Models

This paper introduces the first gradient-based framework for prompt optimization in text-to-image diffusion models. We formulate prompt engineering as a discrete optimization problem over the language space. Two major challenges arise in…

Machine Learning · Computer Science 2024-07-03 Ruochen Wang, Ting Liu, Cho-Jui Hsieh, Boqing Gong

One Prompt is not Enough: Automated Construction of a Mixture-of-Expert Prompts

Large Language Models (LLMs) exhibit strong generalization capabilities to novel tasks when prompted with language instructions and in-context demos. Since this ability sensitively depends on the quality of prompts, various methods have…

Artificial Intelligence · Computer Science 2024-07-02 Ruochen Wang, Sohyun An, Minhao Cheng, Tianyi Zhou, Sung Ju Hwang, Cho-Jui Hsieh

MOSSBench: Is Your Multimodal Language Model Oversensitive to Safe Queries?

Humans are prone to cognitive distortions -- biased thinking patterns that lead to exaggerated responses to specific stimuli, albeit in very different contexts. This paper demonstrates that advanced Multimodal Large Language Models (MLLMs)…

Computation and Language · Computer Science 2024-06-27 Xirui Li, Hengguang Zhou, Ruochen Wang, Tianyi Zhou, Minhao Cheng, Cho-Jui Hsieh

Large Language Models are Interpretable Learners

The trade-off between expressiveness and interpretability remains a core challenge when building human-centric predictive models for classification and decision-making. While symbolic rules offer interpretability, they often lack…

Artificial Intelligence · Computer Science 2024-06-26 Ruochen Wang, Si Si, Felix Yu, Dorothea Wiesmann, Cho-Jui Hsieh, Inderjit Dhillon

Understanding the Impact of Negative Prompts: When and How Do They Take Effect?

The concept of negative prompts, emerging from conditional generation models like Stable Diffusion, allows users to specify what to exclude from the generated images. Despite the widespread…

Computer Vision and Pattern Recognition · Computer Science 2024-06-06 Yuanhao Ban, Ruochen Wang, Tianyi Zhou, Minhao Cheng, Boqing Gong, Cho-Jui Hsieh

The Crystal Ball Hypothesis in diffusion models: Anticipating object positions from initial noise

Diffusion models have achieved remarkable success in text-to-image generation tasks; however, the role of initial noise has been rarely explored. In this study, we identify specific regions within the initial noise image, termed trigger…

Computer Vision and Pattern Recognition · Computer Science 2024-06-05 Yuanhao Ban, Ruochen Wang, Tianyi Zhou, Boqing Gong, Cho-Jui Hsieh, Minhao Cheng

MuLan: Multimodal-LLM Agent for Progressive and Interactive Multi-Object Diffusion

Existing text-to-image models still struggle to generate images of multiple objects, especially in handling their spatial positions, relative sizes, overlapping, and attribute bindings. To efficiently address these challenges, we develop a…

Computer Vision and Pattern Recognition · Computer Science 2024-05-27 Sen Li, Ruochen Wang, Cho-Jui Hsieh, Minhao Cheng, Tianyi Zhou

Dependence control chart using maximum copula entropy

Statistical quality control methods are important for producing standard products in manufacturing processes. In this regard, there are many classical methods to control the process. Many of them have a global assumption around the…

Applications · Statistics 2024-01-25 Seyedeh Azadeh Fallah Mortezanejad, Ruochen Wang, Gholamreza Mohtashami Borzadaran, Kim Phuc Tran

Scaling Up Dataset Distillation to ImageNet-1K with Constant Memory

Dataset Distillation is a newly emerging area that aims to distill large datasets into much smaller and highly informative synthetic ones to accelerate training and reduce storage. Among various dataset distillation methods,…

Computer Vision and Pattern Recognition · Computer Science 2023-11-02 Justin Cui, Ruochen Wang, Si Si, Cho-Jui Hsieh

Profile control chart based on maximum entropy

Monitoring a process over time is important in manufacturing processes to reduce the waste of money and time. Some charts, such as Shewhart, CUSUM, and EWMA, are commonly used to monitor a process with a single intended attribute which is used in…

Applications · Statistics 2023-11-02 Seyedeh Azadeh Fallah Mortezanejad, Ruochen Wang, Gholamreza Mohtashami Borzadaran, Renkai Ding, Kim Phuc Tran