Cheng Yu
High-repetition-rate, fully coherent extreme-ultraviolet (EUV) and X-ray free-electron lasers (FELs) are essential for advanced time-resolved ultrafast spectroscopies. While external seeding serves as the standard technique to achieve…
Articulated objects used in simulation and embodied AI are typically specified by geometry and kinematic structure, but lack the fine-grained dynamical effects that govern realistic mechanical behavior, such as frictional holding, detents,…
Processing long-form videos with Video Large Language Models (Video-LLMs) is computationally prohibitive. Current efficiency methods often compromise fine-grained perception through irreversible information disposal or inhibit long-range…
Recent image editing models have achieved strong visual fidelity but often struggle with tasks requiring complex reasoning. To investigate and enhance the reasoning-grounded planning for image editing, we propose DDA-Thinker, a…
This paper investigates the collisionless quantum hydrodynamic, or quantum Euler, system in \(\mathbb{T}^3\) with the linear pressure law \(P(\rho)=\rho\). Since this pressure is associated with the logarithmic internal energy…
Product attribute extraction in e-commerce is bottlenecked by ontologies that are inconsistent, incomplete, and costly to maintain. We present AutoPKG, a multi-agent Large Language Model (LLM) framework that automatically constructs a…
Despite impressive progress in high-fidelity image synthesis, generative models still struggle with logic-intensive instruction following, exposing a persistent reasoning--execution gap. Meanwhile, closed-source systems (e.g., Nano Banana)…
Recent advances in text-to-image (T2I) generation via reinforcement learning (RL) have benefited from reward models that assess semantic alignment and visual quality. However, most existing reward models pay limited attention to…
Diffusion Large Language Models (dLLMs) break the rigid left-to-right constraint of traditional LLMs, enabling token generation in arbitrary orders. Intuitively, this flexibility implies a solution space that strictly supersets the fixed…
We investigate the inertial limit of the compressible Navier--Stokes system posed on the $3$-dimensional torus and allowing for regions of vacuum. Considering global-in-time finite-energy weak solutions of a scaled system, we rigorously…
Agentic crafting requires LLMs to operate in real-world environments over multiple turns by taking actions, observing outcomes, and iteratively refining artifacts. Despite its importance, the open-source community lacks a principled,…
Orofacial clefts are among the most common congenital craniofacial abnormalities, yet accurate prenatal detection remains challenging due to the scarcity of experienced specialists and the relative rarity of the condition. Early and…
Compact Free-Electron Lasers (FELs) offering broad, continuous spectral tunability are traditionally constrained by fixed-parameter magnetic structures and the necessity for high-energy electron beams. High-gain Harmonic Lasing (HL) has…
AI safety benchmarks are pivotal for ensuring the safety of advanced AI systems; however, they have significant technical, epistemic, and sociotechnical shortcomings. We present a review of 210 safety benchmarks that maps out common challenges in safety…
We study the low Mach number limit of the compressible Euler equations through the lens of convex integration. For any prescribed $L^2$ weak solution of the incompressible Euler equations, we construct a corresponding family of weak…
Laser manipulation plays a critical role in precisely tailoring relativistic electron beams through energy modulation, enabling the generation of coherent, intense, and ultrashort radiation in accelerator-based light sources such as…
Diffusion models have achieved remarkable performance in generative modeling, yet their theoretical foundations are often intricate, and the gap between mathematical formulations in papers and practical open-source implementations can be…
Despite the growing integration of retrieval-enabled AI agents into society, their safety and ethical behavior remain inadequately understood. In particular, the integration of LLMs and AI agents with external information sources and…
While language models (LMs) paired with residual vector quantization (RVQ) tokenizers have shown promise in text-to-audio (T2A) generation, they still lag behind diffusion-based models by a non-trivial margin. We identify a critical dilemma…
Context: Due to the demand for strong algorithmic reasoning, complex logic implementation, and strict adherence to input/output formats and resource constraints, competitive programming generation by large language models (LLMs) is…