Yu
Prompt optimization improves language models without updating their weights by searching for a better system prompt, but its effectiveness varies widely across tasks. We study what makes a task amenable to prompt optimization. We show that…
The Mixture-of-Experts (MoE) architecture offers enhanced efficiency for Large Language Models (LLMs) through modularized computation, yet its inherent sparsity poses significant hardware deployment challenges, including memory locality issues,…
Post hoc explainers such as SHAP and LIME are used widely in business research to interpret complex machine learning models. Although they were designed to explain model predictions, there has been an increasing trend in which the…
We present a hardware realization and measurements of a tetron qubit device in a superconductor-semiconductor heterostructure. The device architecture contains two parallel superconducting nanowires, which support four Majorana zero modes…
Mixture-of-experts (MoE) architectures can achieve impressive computational efficiency with expert parallelism, which relies heavily on all-to-all communication across devices. Unfortunately, such communication overhead typically…
To develop trustworthy Vision-Language Models (VLMs), it is essential to address adversarial robustness and hallucination mitigation, both of which impact factual accuracy in high-stakes applications such as defense and healthcare. Existing…
Modern artificial intelligence (AI) systems are powered by foundation models. This paper presents a new set of foundation models, called Llama 3. It is a herd of language models that natively support multilinguality, coding, reasoning, and…
Vehicle-based mobile sensing, also known as drive-by sensing, efficiently surveys urban environments at low cost by leveraging the mobility of urban vehicles. While recent studies have focused on drive-by sensing for fleets of a single…
Bayesian Optimization (BO) is a foundational strategy in the field of engineering design optimization for efficiently handling black-box functions with many constraints and expensive evaluations. This paper introduces a fast and accurate BO…
In this paper, we consider the Cauchy problem for the 2D tropical climate model without thermal diffusion and construct global smooth solutions by choosing a class of special initial data whose $L^{\infty}$ norm can be arbitrarily large.
Transformer models have recently emerged as one of the foundational models in natural language processing, and as a byproduct, there is significant recent interest and investment in scaling these models. However, the training and inference…
The general linear model (GLM) is a widely popular and convenient tool for estimating the functional brain response and identifying areas of significant activation during a task or stimulus. However, the classical GLM is based on a massive…
Figgie is a card game that approximates open-outcry commodities trading. We design strategies for Figgie and study their performance and the resulting market behavior. To do this, we develop a flexible agent-based discrete-event market…
In this paper we study coupled dynamical systems and investigate dimension properties of the subspace spanned by solutions of each individual system. Relevant problems on \textit{collinear dynamical systems} and their variations are…
As very large studies of complex neuroimaging phenotypes become more common, human quality assessment of MRI-derived data remains one of the last major bottlenecks. Few attempts have so far been made to address this issue with machine…
Solid hydrogen sulfide is well known as a typical molecular crystal, but its stability under pressure is still under debate. In particular, Eremets et al. found high-pressure superconductivity with $T_{c}\approx 190$ K in an H$_{2}$S…
In recent years, open systems with balanced loss and gain that are invariant under the combined parity and time-reversal ($\mathcal{PT}$) operations have been studied via asymmetries of their solutions. They represent systems as diverse…
Traditional anomaly detection on social media mostly focuses on individual point anomalies, whereas anomalous phenomena usually occur in groups. Therefore, it is valuable to study the collective behavior of individuals and detect group…