Boxuan Li
Query-based image retrieval (QBIR) requires retrieving relevant images given diverse and often stylistically heterogeneous queries, such as sketches, artworks, or low-resolution previews. While large-scale vision-language representation…
Unified multimodal models that couple visual understanding with image generation have advanced rapidly, yet most systems still focus on visual grounding (aligning language with image regions), while their generative counterpart,…
AI agents may soon become capable of autonomously completing valuable, long-horizon tasks in diverse domains. Current benchmarks either do not measure real-world tasks, or are not sufficiently difficult to meaningfully measure frontier…
The approval of the Bitcoin Spot ETF in January 2024 marked a transformative event in cryptocurrency markets, signaling increased institutional adoption and integration into traditional finance. This study examines Bitcoin's changing…
We interact with computers on a daily basis, whether in everyday life or at work, and many aspects of work can be done entirely with access to a computer and the Internet. At the same time, thanks to improvements in large language models…
Autonomous Graphical User Interface (GUI) navigation agents can enhance user experience in communication, entertainment, and productivity by streamlining workflows and reducing manual intervention. However, prior GUI agents often trained…
Ultra-wide bandgap (UWBG) semiconductors promise to revolutionize power electronics, yet a fundamental understanding of their interfacial electronic structure has been hindered by the absence of direct experimental observation. Here, we…
In this study, we introduce the safety human preference dataset, PKU-SafeRLHF, designed to promote research on safety alignment in large language models (LLMs). As a sibling project to SafeRLHF and BeaverTails, we separate annotations of…
Modern human labor is characterized by specialization; we train for years and develop particular tools that allow us to perform well across a variety of tasks. In addition, AI agents have been specialized for domains such as software…
Systematically simulating specular light transport requires an exhaustive search for primitive tuples containing admissible paths. Given the extreme inefficiency of enumerating all combinations, we propose to significantly reduce the search…
We report an infrared spectroscopy study of the antiferromagnetic (AFM) insulator EuZn$_2$As$_2$ over a broad frequency range, spanning temperatures both above and below the AFM transition $T_{\rm N} \simeq$ 20 K. The optical response…
Software is one of the most powerful tools that we humans have at our disposal; it allows a skilled programmer to interact with the world in complex and profound ways. At the same time, thanks to improvements in large language models…
Contrastive learning, a prominent approach within self-supervised learning, has demonstrated significant effectiveness in developing generalizable models for various applications involving natural images. However, recent research indicates…
A single material hosting multiple topological phases could enable applications in topological spintronics, but candidate materials are very limited. Here, we report the structure, physical properties, and possible…
This paper is devoted to the classification problems concerning extended deformations of convex polyhedra and real hyperplane arrangements in the following senses: combinatorial equivalence of face posets, normal equivalence on normal fans…
Over the last century, risk scores have been the most popular form of predictive model used in healthcare and criminal justice. Risk scores are sparse linear models with integer coefficients; often these models can be memorized or placed on…
In this paper we study social exclusion in social (information) networks using a game-theoretic approach, and examine the stability of a certain class of community structures that form a Nash equilibrium. The main result of our analysis shows…
Modulating heterogeneous microstructure in room temperature ionic liquids (RTILs) by external stimuli is an important approach for understanding and designing the external field induced chemical reactions in natural and applicable systems.…