Ge Yu — Scifaro

GeoGauss: Strongly Consistent and Light-Coordinated OLTP for Geo-Replicated SQL Database

Multinational enterprises conduct global business that has a demand for geo-distributed transactional databases. Existing state-of-the-art databases adopt a sharded master-follower replication architecture. However, the single-master…

Databases · Computer Science 2026-05-15 Weixing Zhou, Qi Peng, Zijie Zhang, Yanfeng Zhang, Yang Ren, Sihao Li, Guo Fu, Yulong Cui, Qiang Li, Caiyi Wu, Shangjun Han, Shengyi Wang, Guoliang Li, Ge Yu

UNIKIE-BENCH: Benchmarking Large Multimodal Models for Key Information Extraction in Visual Documents

Key Information Extraction (KIE) from real-world documents remains challenging due to substantial variations in layout structures, visual quality, and task-specific information requirements. Recent Large Multimodal Models (LMMs) have shown…

Computer Vision and Pattern Recognition · Computer Science 2026-04-27 Yifan Ji, Zhipeng Xu, Zhenghao Liu, Zulong Chen, Qian Zhang, Zhibo Yang, Junyang Lin, Yu Gu, Ge Yu, Maosong Sun

Mitigating Judgment Preference Bias in Large Language Models through Group-Based Polling

Large Language Models (LLMs) as automatic evaluators, commonly referred to as LLM-as-a-Judge, have also attracted growing attention. This approach plays a vital role in aligning LLMs with human judgments, providing accurate and reliable…

Computation and Language · Computer Science 2026-04-22 Shuliang Liu, Zhipeng Xu, Zhenghao Liu, Yukun Yan, Minghe Yu, Yu Gu, Chong Chen, Huiyuan Xie, Ge Yu

MetaMem: Evolving Meta-Memory for Knowledge Utilization through Self-Reflective Symbolic Optimization

Existing memory systems enable Large Language Models (LLMs) to support long-horizon human-LLM interactions by persisting historical interactions beyond limited context windows. However, while recent approaches have succeeded in constructing…

Computation and Language · Computer Science 2026-04-21 Haidong Xin, Xinze Li, Zhenghao Liu, Yukun Yan, Shuo Wang, Cheng Yang, Yu Gu, Ge Yu, Maosong Sun

EvoRAG: Making Knowledge Graph-based RAG Automatically Evolve through Feedback-driven Backpropagation

Knowledge Graph-based Retrieval-Augmented Generation (KG-RAG) has emerged as a promising paradigm for enhancing LLM reasoning by retrieving multi-hop paths from KGs. However, existing KG-RAG frameworks often underperform in real-world…

Databases · Computer Science 2026-04-20 Zhenbo Fu, Yuanzhe Zhang, Qiange Wang, Hao Yuan, Yuehao Xu, Enze Yi, Yanfeng Zhang, Ge Yu

DGAI: Decoupled On-Disk Graph-Based ANN Index for Efficient Updates and Queries

On-disk graph-based indexes are favored for billion-scale Approximate Nearest Neighbor Search (ANNS) due to their high performance and cost-efficiency. However, existing systems typically rely on a coupled storage architecture that…

Databases · Computer Science 2026-04-14 Jiahao Lou, Shufeng Gong, Quan Yu, Hao Guo, Youyou Lu, Song Yu, Yanfeng Zhang, Tiezheng Nie, Ge Yu

ReAlign: Optimizing the Visual Document Retriever with Reasoning-Guided Fine-Grained Alignment

Visual document retrieval aims to retrieve a set of document pages relevant to a query from visually rich collections. Existing methods often employ Vision-Language Models (VLMs) to encode queries and visual pages into a shared embedding…

Information Retrieval · Computer Science 2026-04-10 Hao Yang, Yifan Ji, Zhipeng Xu, Zhenghao Liu, Yukun Yan, Zulong Chen, Shuo Wang, Yu Gu, Ge Yu

Lang2Act: Fine-Grained Visual Reasoning through Self-Emergent Linguistic Toolchains

Visual Retrieval-Augmented Generation (VRAG) enhances Vision-Language Models (VLMs) by incorporating external visual documents to address a given query. Existing VRAG frameworks usually depend on rigid, pre-defined external tools to extend…

Artificial Intelligence · Computer Science 2026-04-10 Yuqi Xiong, Chunyi Peng, Zhipeng Xu, Zhenghao Liu, Zulong Chen, Yukun Yan, Shuo Wang, Yu Gu, Ge Yu

Chunks as Arms: Multi-Armed Bandit-Guided Sampling for Long-Context LLM Preference Optimization

Long-context modeling is critical for a wide range of real-world tasks, including long-context question answering, summarization, and complex reasoning tasks. Recent studies have explored fine-tuning Large Language Models (LLMs) with…

Computation and Language · Computer Science 2026-04-10 Shaohua Duan, Pengcheng Huang, Xinze Li, Zhenghao Liu, Xiaoyuan Yi, Yukun Yan, Shuo Wang, Yu Gu, Ge Yu, Maosong Sun

Mixture-of-Retrieval Experts for Reasoning-Guided Multimodal Knowledge Exploitation

Multimodal Retrieval-Augmented Generation (MRAG) has shown promise in mitigating hallucinations in Multimodal Large Language Models (MLLMs) by incorporating external knowledge. However, existing methods typically adhere to rigid retrieval…

Computation and Language · Computer Science 2026-04-07 Chunyi Peng, Zhipeng Xu, Zhenghao Liu, Yishan Li, Yukun Yan, Shuo Wang, Yu Gu, Minghe Yu, Ge Yu, Maosong Sun

On the Vulnerability of FHE Computation to Silent Data Corruption

Fully Homomorphic Encryption (FHE) is rapidly emerging as a promising foundation for privacy-preserving cloud services, enabling computation directly on encrypted data. As FHE implementations mature and begin moving toward practical…

Cryptography and Security · Computer Science 2026-03-25 Jianan Mu, Ge Yu, Zhaoxuan Kan, Song Bian, Liang Kong, Zizhen Liu, Cheng Liu, Jing Ye, Huawei Li

Automated Formalization via Conceptual Retrieval-Augmented LLMs

Interactive theorem provers (ITPs) require manual formalization, which is labor-intensive and demands expert knowledge. While automated formalization offers a potential solution, it faces two major challenges: model hallucination (e.g.,…

Artificial Intelligence · Computer Science 2026-03-24 Wangyue Lu, Lun Du, Sirui Li, Ke Weng, Haozhe Sun, Hengyu Liu, Minghe Yu, Tiancheng Zhang, Ge Yu

DIAL-KG: Schema-Free Incremental Knowledge Graph Construction via Dynamic Schema Induction and Evolution-Intent Assessment

Knowledge Graphs (KGs) are foundational to applications such as search, question answering, and recommendation. Conventional knowledge graph construction methods are predominantly static, relying on a single-step construction from a fixed…

Artificial Intelligence · Computer Science 2026-03-23 Weidong Bao, Yilin Wang, Ruyu Gao, Fangling Leng, Yubin Bao, Ge Yu

Tau-BNO: Brain Neural Operator for Tau Transport Model

Mechanistic modeling provides a biophysically grounded framework for studying the spread of pathological tau protein in tauopathies like Alzheimer's disease. Existing approaches typically model tau propagation as a diffusive process on the…

Computational Engineering, Finance, and Science · Computer Science 2026-03-18 Nuutti Barron, Heng Rao, Urmi Saha, Yu Gu, Zhenghao Liu, Ge Yu, Defu Yang, Ashish Raj, Minghan Chen

ATCC: Adaptive Concurrency Control for Unforeseen Agentic Transactions

Data agents, empowered by Large Language Models (LLMs), introduce a new paradigm in transaction processing. Unlike traditional applications with fixed patterns, data agents run online-generated workflows that repeatedly issue SQL…

Databases · Computer Science 2026-03-17 Weixing Zhou, Zhiyou Wang, Zeshun Peng, Hetian Chen, Yanfeng Zhang, Ge Yu

Concurrency Control as a Service

Existing disaggregated databases separate execution and storage layers, enabling independent and elastic scaling of resources. In most cases, this design makes transaction concurrency control (CC) a critical bottleneck, which demands…

Databases · Computer Science 2026-03-17 Weixing Zhou, Yanfeng Zhang, Xinji Zhou, Zhiyou Wang, Zeshun Peng, Yang Ren, Sihao Li, Huanchen Zhang, Guoliang Li, Ge Yu

Towards Autonomous Graph Data Analytics with Analytics-Augmented Generation

This paper argues that reliable end-to-end graph data analytics cannot be achieved by retrieval- or code-generation-centric LLM agents alone. Although large language models (LLMs) provide strong reasoning capabilities, practical graph…

Databases · Computer Science 2026-02-26 Qiange Wang, Chaoyi Chen, Jingqi Gao, Zihan Wang, Yanfeng Zhang, Ge Yu

HIPPO: Enhancing the Table Understanding Capability of LLMs through Hybrid-Modal Preference Optimization

Tabular data contains rich structural semantics and plays a crucial role in organizing and manipulating information. Recent methods employ Multi-modal Large Language Models (MLLMs) to address table-related tasks across various modalities of…

Computation and Language · Computer Science 2026-02-17 Haolan Wang, Zhenghao Liu, Xinze Li, Xiaocui Yang, Yu Gu, Yukun Yan, Qi Shi, Fangfang Li, Chong Chen, Ge Yu

Legal$\Delta$: Enhancing Legal Reasoning in LLMs via Reinforcement Learning with Chain-of-Thought Guided Information Gain

Legal Artificial Intelligence (LegalAI) has achieved notable advances in automating judicial decision-making with the support of Large Language Models (LLMs). However, existing legal LLMs still struggle to generate reliable and…

Computation and Language · Computer Science 2026-02-10 Xin Dai, Buqiang Xu, Zhenghao Liu, Yukun Yan, Huiyuan Xie, Xiaoyuan Yi, Shuo Wang, Ge Yu

ThinkNote: Enhancing Knowledge Integration and Utilization of Large Language Models via Constructivist Cognition Modeling

Large Language Models (LLMs) have demonstrated strong performance across a wide range of NLP tasks. However, they often exhibit suboptimal behaviors and inconsistencies when exposed to unfamiliar external information, underscoring their…

Computation and Language · Computer Science 2026-01-28 Zhipeng Xu, Zhenghao Liu, Yukun Yan, Shuo Wang, Shi Yu, Zheni Zeng, Chaojun Xiao, Zhiyuan Liu, Ge Yu, Chenyan Xiong