Bill Howe — Scifaro

STOP: Structured On-Policy Pruning of Long-Form Reasoning in Low-Data Regimes

Long chain-of-thought (Long CoT) reasoning improves performance on multi-step problems, but it also induces overthinking: models often generate low-yield reasoning that increases inference cost and latency. This inefficiency is especially…

Computation and Language · Computer Science 2026-05-14 Chenjun Xu, Zhennan Zhou, Zhan Su, Bill Howe, Lucy Lu Wang, Bingbing Wen

MixAtlas: Uncertainty-aware Data Mixture Optimization for Multimodal LLM Midtraining

Domain reweighting can improve sample efficiency and downstream generalization, but data-mixture optimization for multimodal midtraining remains largely unexplored. Current multimodal training recipes tune mixtures along a single dimension,…

Machine Learning · Computer Science 2026-04-17 Bingbing Wen, Sirajul Salekin, Feiyang Kang, Bill Howe, Lucy Lu Wang, Javier Movellan, Manjot Bilkhu

Understanding Privacy Norms Around LLM-Based Chatbots: A Contextual Integrity Perspective

LLM-driven chatbots like ChatGPT have created large volumes of conversational data, but little is known about how user privacy expectations are evolving with this technology. We conduct a survey experiment with 300 US ChatGPT users to…

Computers and Society · Computer Science 2025-08-12 Sarah Tran, Hongfan Lu, Isaac Slaughter, Bernease Herman, Aayushi Dangol, Yue Fu, Lufei Chen, Biniyam Gebreyohannes, Bill Howe, Alexis Hiniker, Nicholas Weber, Robert Wolfe

Can Large Language Models Integrate Spatial Data? Empirical Insights into Reasoning Strengths and Computational Weaknesses

We explore the application of large language models (LLMs) to empower domain experts in integrating large, heterogeneous, and noisy urban spatial datasets. Traditional rule-based integration methods are unable to cover all edge cases,…

Artificial Intelligence · Computer Science 2025-08-08 Bin Han, Robert Wolfe, Anat Caspi, Bill Howe

Do Language Models Mirror Human Confidence? Exploring Psychological Insights to Address Overconfidence in LLMs

Psychology research has shown that humans are poor at estimating their performance on tasks, tending towards underconfidence on easy tasks and overconfidence on difficult tasks. We examine three LLMs, Llama-3-70B-instruct, Claude-3-Sonnet,…

Artificial Intelligence · Computer Science 2025-07-29 Chenjun Xu, Bingbing Wen, Bin Han, Robert Wolfe, Lucy Lu Wang, Bill Howe

Escaping the SpuriVerse: Can Large Vision-Language Models Generalize Beyond Seen Spurious Correlations?

Finetuning can cause spurious correlations to arise between non-essential features and the target labels, but benchmarks to study these effects involve contrived settings and narrow tasks. In contrast, we consider spurious correlations in…

Computer Vision and Pattern Recognition · Computer Science 2025-06-24 Yiwei Yang, Chung Peng Lee, Shangbin Feng, Dora Zhao, Bingbing Wen, Anthony Z. Liu, Yulia Tsvetkov, Bill Howe

Fragments to Facts: Partial-Information Fragment Inference from LLMs

Large language models (LLMs) can leak sensitive training data through memorization and membership inference attacks. Prior work has primarily focused on strong adversarial assumptions, including attacker access to entire samples or long,…

Machine Learning · Computer Science 2025-05-21 Lucas Rosenblatt, Bin Han, Robert Wolfe, Bill Howe

Epistemic Alignment: A Mediating Framework for User-LLM Knowledge Delivery

LLMs increasingly serve as tools for knowledge acquisition, yet users cannot effectively specify how they want information presented. When users request that LLMs "cite reputable sources," "express appropriate uncertainty," or "include…

Human-Computer Interaction · Computer Science 2025-04-03 Nicholas Clark, Hua Shen, Bill Howe, Tanushree Mitra

Know Your Limits: A Survey of Abstention in Large Language Models

Abstention, the refusal of large language models (LLMs) to provide an answer, is increasingly recognized for its potential to mitigate hallucinations and enhance safety in LLM systems. In this survey, we introduce a framework to examine…

Computation and Language · Computer Science 2025-02-13 Bingbing Wen, Jihan Yao, Shangbin Feng, Chenjun Xu, Yulia Tsvetkov, Bill Howe, Lucy Lu Wang

Are Data Experts Buying into Differentially Private Synthetic Data? Gathering Community Perspectives

Data privacy is a core tenet of responsible computing, and in the United States, differential privacy (DP) is the dominant technical operationalization of privacy-preserving data analysis. With this study, we qualitatively examine one class…

Human-Computer Interaction · Computer Science 2024-12-18 Lucas Rosenblatt, Bill Howe, Julia Stoyanovich

Reliable, Routable, and Reproducible: Collection of Pedestrian Pathways at Statewide Scale

While advances in mobility technology including autonomous vehicles and multi-modal navigation systems can improve mobility equity for people with disabilities, these technologies depend crucially on accurate, standardized, and complete…

Computer Vision and Pattern Recognition · Computer Science 2024-10-29 Yuxiang Zhang, Bill Howe, Anat Caspi

Characterizing LLM Abstention Behavior in Science QA with Context Perturbations

The correct model response in the face of uncertainty is to abstain from answering a question so as not to mislead the user. In this work, we study the ability of LLMs to abstain from answering context-dependent science questions when…

Computation and Language · Computer Science 2024-10-08 Bingbing Wen, Bill Howe, Lucy Lu Wang

ML-EAT: A Multilevel Embedding Association Test for Interpretable and Transparent Social Science

This research introduces the Multilevel Embedding Association Test (ML-EAT), a method designed for interpretable and transparent measurement of intrinsic bias in language technologies. The ML-EAT addresses issues of ambiguity and difficulty…

Computation and Language · Computer Science 2024-08-29 Robert Wolfe, Alexis Hiniker, Bill Howe

Dataset Scale and Societal Consistency Mediate Facial Impression Bias in Vision-Language AI

Multimodal AI models capable of associating images and text hold promise for numerous domains, ranging from automated image captioning to accessibility applications for blind and low-vision users. However, uncertainty about bias has in some…

Computer Vision and Pattern Recognition · Computer Science 2024-08-29 Robert Wolfe, Aayushi Dangol, Alexis Hiniker, Bill Howe

Representation Bias of Adolescents in AI: A Bilingual, Bicultural Study

Popular and news media often portray teenagers with sensationalism, as both a risk to society and at risk from society. As AI begins to absorb some of the epistemic functions of traditional media, we study how teenagers in two countries…

Computers and Society · Computer Science 2024-08-06 Robert Wolfe, Aayushi Dangol, Bill Howe, Alexis Hiniker

Towards Zero-Shot Annotation of the Built Environment with Vision-Language Models (Vision Paper)

Equitable urban transportation applications require high-fidelity digital representations of the built environment: not just streets and sidewalks, but bike lanes, marked and unmarked crossings, curb ramps and cuts, obstructions, traffic…

Computer Vision and Pattern Recognition · Computer Science 2024-08-05 Bin Han, Yiwei Yang, Anat Caspi, Bill Howe

SARN: Structurally-Aware Recurrent Network for Spatio-Temporal Disaggregation

Open data is frequently released spatially aggregated, usually to comply with privacy policies. But coarse, heterogeneous aggregations complicate learning and integration for downstream AI/ML systems. In this work, we consider models to…

Machine Learning · Computer Science 2024-08-05 Bin Han, Bill Howe

PathwayBench: Assessing Routability of Pedestrian Pathway Networks Inferred from Multi-City Imagery

Applications to support pedestrian mobility in urban areas require a complete, and routable graph representation of the built environment. Globally available information, including aerial imagery provides a scalable source for constructing…

Computer Vision and Pattern Recognition · Computer Science 2024-07-25 Yuxiang Zhang, Bill Howe, Sachin Mehta, Nicholas-J Bolten, Anat Caspi

Laboratory-Scale AI: Open-Weight Models are Competitive with ChatGPT Even in Low-Resource Settings

The rapid proliferation of generative AI has raised questions about the competitiveness of lower-parameter, locally tunable, open-weight models relative to high-parameter, API-guarded, closed-weight models in terms of performance, domain…

Machine Learning · Computer Science 2024-05-28 Robert Wolfe, Isaac Slaughter, Bin Han, Bingbing Wen, Yiwei Yang, Lucas Rosenblatt, Bernease Herman, Eva Brown, Zening Qu, Nic Weber, Bill Howe

InfoVisDial: An Informative Visual Dialogue Dataset by Bridging Large Multimodal and Language Models

In this paper, we build a visual dialogue dataset, named InfoVisDial, which provides rich informative answers in each round even with external knowledge related to the visual content. Different from existing datasets where the answer is…

Computer Vision and Pattern Recognition · Computer Science 2023-12-22 Bingbing Wen, Zhengyuan Yang, Jianfeng Wang, Zhe Gan, Bill Howe, Lijuan Wang