Jiaming Li
Reinforcement learning (RL) has shown extraordinary potential in aligning diffusion models to downstream tasks, yet most existing methods still suffer from significant reward hacking, which degrades generative diversity and quality by inducing…
Streaming video reasoning requires models to operate in a setting where history grows without bound while meaningful evidence remains scarce. In such a landscape, relevant signal is like an oasis: small, critical, and easily lost in a desert…
This paper studies a ground-segment implementation problem in 5G non-terrestrial networks (NTN): once UE-side geometric pre-compensation has produced a coarse timing/frequency prior, can an edge-side residual loop keep the uplink inside an…
Existing text-guided image editing methods primarily rely on an end-to-end pixel-level inpainting paradigm. Despite its success in simple scenarios, this paradigm still struggles significantly with compositional editing tasks that require…
Fine-grained open-vocabulary object detection (FG-OVD) aims to detect novel object categories described by attribute-rich texts. While existing open-vocabulary detectors show promise at the base-category level, they underperform in…
This study presents a Secure Multi-Tenant Architecture (SMTA) combined with a novel Burn-After-Use (BAU) mechanism for enterprise LLM environments to effectively prevent data leakage. As institutions increasingly adopt LLMs across…
We propose an experimental scheme to load ultracold Fermi gases from the ground orbital band of a one-dimensional optical lattice into the first excited orbital band. Unlike the narrow momentum distribution of a Bose-Einstein Condensate,…
Reinforcement Learning with Verifiable Rewards (RLVR) has emerged as a prevailing paradigm for enhancing reasoning in Multimodal Large Language Models (MLLMs). However, relying solely on outcome supervision risks reward hacking, where…
We report the observation of a dimensional crossover of a narrow $p$-wave Feshbach resonance in an ultracold, spin-polarized $^6$Li Fermi gas confined by a one-dimensional optical lattice. In the three-dimensional limit, atom loss near the…
Reward models are crucial for aligning large language models (LLMs) with human values and intentions. Existing approaches follow either Generative (GRMs) or Discriminative (DRMs) paradigms, yet both suffer from limitations: GRMs typically…
Recent advances in Reinforcement Learning with Verifiable Rewards (RLVR) have empowered large language models (LLMs) to tackle challenging reasoning tasks such as mathematics and programming. Despite its promise, the RLVR paradigm poses…
High-fidelity coherent population transfer plays a vital role in the realization of quantum memories. However, population transfer with high performance across a broad frequency range is still challenging due to the finite Rabi coupling…
A semi-implicit Lax-Wendroff scheme is developed for the electron-phonon coupling process in metals based on the two-temperature kinetic equations. The core of this method is to integrate the evolution information of physical equations into the…
We present a rapid evaporative cooling scheme for a strongly interacting $^{6}\mathrm{Li}$ Fermi gas in an optical dipole trap. The method uses a magnetic-field-gradient--induced tilt of the trapping potential to accelerate cooling in the…
We report precision, orbital-resolved measurements of three-body recombination near the 159~G $p$-wave Feshbach resonance in an ultracold gas of $^{6}$Li atoms prepared in their lowest hyperfine state. Using a radio-frequency gated protocol…
Widespread clinical deployment of computer-aided diagnosis (CAD) systems is hindered by the challenge of integrating with existing hospital IT infrastructure. Here, we introduce VisionCAD, a vision-based radiological assistance framework…
Existing efforts in building Graphical User Interface (GUI) agents largely rely on the training paradigm of supervised fine-tuning on Large Vision-Language Models (LVLMs). However, this approach not only demands extensive amounts of…
Intellectual Property (IP) is a highly specialized domain that integrates technical and legal knowledge, making it inherently complex and knowledge-intensive. Recent advancements in LLMs have demonstrated their potential to handle…
Black-box optimization (BBO) involves functions that are unknown, inexact and/or expensive-to-evaluate. Existing BBO algorithms face several challenges, including high computational cost from extensive evaluations, difficulty in handling…
Recent advancements in omnimodal learning have significantly improved understanding and generation across images, text, and speech, yet these developments remain predominantly confined to proprietary models. The lack of high-quality…