Jiaming Luo
Chiral carbon nanotubes (CNTs) are direct-gap semiconductors with optical properties governed by one-dimensional excitons with enormous oscillator strengths. Each species of chiral CNTs has an enantiomeric pair of left- and right-handed…
Wikipedia's perceived high quality and broad language coverage have established it as a fundamental resource in NLP. However, in recent years, such assumptions of high quality have become the subject of scrutiny in low-resource and…
Psychiatric comorbidity is clinically significant yet challenging due to the complexity of multiple co-occurring disorders. To address this, we develop a novel approach integrating synthetic patient electronic medical record (EMR)…
We present TranslateGemma, a suite of open machine translation models based on the Gemma 3 foundation models. To enhance the inherent multilingual capabilities of Gemma 3 for the translation task, we employ a two-stage fine-tuning process.…
In this report, we introduce the Gemini 2.X model family: Gemini 2.5 Pro and Gemini 2.5 Flash, as well as our earlier Gemini 2.0 Flash and Flash-Lite models. Gemini 2.5 Pro is our most capable model yet, achieving SoTA performance on…
We open-source SMOL (Set of Maximal Overall Leverage), a suite of training data to unlock machine translation for low-resource languages. SMOL has been translated into 124 (and growing) under-resourced languages (125 language pairs),…
In this paper, we establish an innovative framework in logarithmic Hodge theory for toroidal varieties, introducing weighted toroidal structures and developing a systematic obstruction theory for Hodge classes. Building upon recent advances…
In this paper, we define the stratified metric $\infty$-category $\mathbf{StratMet}_{\infty}$ and the middle perversity moduli stack $\mathscr{M}^{\mathrm{mid}}$. We construct a universal truncation complex…
Spin-lattice coupling is crucial for understanding the spin transport and dynamics for spintronics and magnonics applications. Recently, cobalt titanate (CoTiO3), an easy-plane antiferromagnet, has been found to host axial phonons with a…
In this paper, we develop the stratified de Rham theory on singular spaces using modern tools including derived geometry and stratified structures. This work unifies and extends the de Rham theory, Hodge theory, and deformation theory of…
In this paper, we mainly build up the theory of sheaf-correspondence filtered spaces and stratified de Rham complexes for studying singular spaces. We prove the finiteness of a stratified de Rham cohomology and obtain its isomorphism to…
Designing effective debt collection systems is crucial for improving operational efficiency and reducing costs in the financial industry. However, the challenges of maintaining script diversity, contextual relevance, and coherence make this…
We introduce Gemma 3, a multimodal addition to the Gemma family of lightweight open models, ranging in scale from 1 to 27 billion parameters. This version introduces vision understanding abilities, a wider coverage of languages and longer…
While large language models (LLMs) have been increasingly adopted for machine translation (MT), their performance for specialist domains such as medicine and law remains an open challenge. Prior work has shown that LLMs can be…
Data contamination -- the accidental consumption of evaluation examples within the pre-training data -- can undermine the validity of evaluation benchmarks. In this paper, we present a rigorous analysis of the effects of contamination on…
Despite growing interest in incorporating feedback to improve language models, most efforts focus only on sequence-level annotations. In this work, we explore the potential of utilizing fine-grained span-level annotations from offline…
In this paper we present a step-by-step approach to long-form text translation, drawing on established processes in translation studies. Instead of viewing machine translation as a single, monolithic task, we propose a framework that…
We conduct a large-scale fine-grained comparative analysis of machine translations (MT) against human translations (HT) through the lens of morphosyntactic divergence. Across three language pairs and two types of divergence defined as the…
Time-reversal symmetry (TRS) is pivotal for materials' optical, magnetic, topological, and transport properties. Chiral phonons, characterized by atoms rotating unidirectionally around their equilibrium positions, generate dynamic lattice…
The evaluation of abstractive summarization models typically uses test data drawn from the same distribution as the training data. In real-world practice, documents to be summarized may contain input noise caused by text extraction artifacts or…