Related papers: Efficient Genomic Interval Queries Using Augmented…

200 papers

Many applications require efficient management of large sets of intervals because many objects are associated with intervals (e.g., time and price intervals). In such interval management systems, range search is a primitive operator for…

Databases · Computer Science 2024-05-24 Daichi Amagata

Regression trees are a popular machine learning algorithm that fit piecewise constant models by recursively partitioning the predictor space. This paper focuses on statistical inference for a data-dependent model obtained from a fitted…

Methodology · Statistics 2025-12-17 Soham Bakshi, Yiling Huang, Snigdha Panigrahi, Walter Dempsey

Genomics has revolutionized biology, enabling the interrogation of whole transcriptomes, genome-wide binding sites for proteins, and many other molecular processes. However, individual genomic assays measure elements that interact in vivo…

Machine Learning · Statistics 2022-06-08 Sumanta Basu, Karl Kumbier, James B. Brown, Bin Yu

Tree data structures, such as red-black trees, quad trees, treaps, or tries, are fundamental tools in computer science. A classical problem in concurrency is to obtain expressive, efficient, and scalable versions of practical tree data…

Databases · Computer Science 2023-10-10 Ilya Kokorin, Dan Alistarh, Vitaly Aksenov

Recent advancements in Retrieval-Augmented Generation (RAG) have enabled Large Language Models to answer financial questions using external knowledge bases of U.S. SEC filings, earnings reports, and regulatory documents. However, existing…

Although Large Language Models (LLMs) demonstrate significant capabilities, their reliance on parametric knowledge often leads to inaccuracies. Retrieval Augmented Generation (RAG) mitigates this by incorporating external knowledge, but…

Artificial Intelligence · Computer Science 2025-11-04 Hailong Yin, Bin Zhu, Jingjing Chen, Chong-Wah Ngo

A Comparison of Independent and Joint Fine-tuning Strategies for Retrieval-Augmented Generation. Published: 20 Aug 2025. Retrieval augmented generation (RAG) is a popular…

Computation and Language · Computer Science 2025-10-21 Neal Gregory Lawton, Alfy Samuel, Anoop Kumar, Daben Liu

Natural language text corpora are often available as sets of syntactically parsed trees. A wide range of expressive tree queries are possible over such parsed trees that open a new avenue in searching over natural language text. They not…

Databases · Computer Science 2012-08-02 Pirooz Chubak, Davood Rafiei

We initiate a study of a query-driven approach to designing partition trees for range-searching problems. Our model assumes that a data structure is to be built for an unknown query distribution that we can access through a sampling oracle,…

Data Structures and Algorithms · Computer Science 2025-02-20 Dimitris Fotakis, Andreas Kalavas, Ioannis Psarros

Quantum computing is a popular topic in computer science, which has recently attracted many studies in various areas such as machine learning and networking. However, the topic of quantum data structures seems neglected. There is an open…

Databases · Computer Science 2024-06-03 Hao Liu, Xiaotian You, Raymond Chi-Wing Wong

In concurrent data structures, the efficiency of set operations can vary significantly depending on the workload characteristics. Numerous concurrent set implementations are optimized and fine-tuned to excel in scenarios characterized by…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-07-29 Daniel Manor, Mor Perry, Moshe Sulamy

Indexing intervals is a fundamental problem, finding a wide range of applications. Recent work on managing large collections of intervals in main memory focused on overlap joins and temporal aggregation problems. In this paper, we propose…

Databases · Computer Science 2022-03-08 George Christodoulou, Panagiotis Bouros, Nikos Mamoulis

The log-det distance between two aligned DNA sequences was introduced as a tool for statistically consistent inference of a gene tree under simple non-mixture models of sequence evolution. Here we prove that the log-det distance, coupled…

Populations and Evolution · Quantitative Biology 2018-06-14 Elizabeth S. Allman, Colby Long, John A. Rhodes

Geosocial reachability queries (RangeReach) determine whether a given vertex in a geosocial network can reach any spatial vertex within a query region. The state-of-the-art 3DReach method answers such queries by encoding graph…

Databases · Computer Science 2026-02-06 Rick van der Heijden, Nikolay Yakovets, Thekla Hamm

The need for scalable concurrent ordered set data structures with linearizable range query support is increasing due to the rise of multicore computers, data processing platforms and in-memory databases. This paper presents a new concurrent…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-09-05 Kjell Winblad

This paper proposes an efficient data structure, ikd-Tree, for dynamic space partition. The ikd-Tree incrementally updates a k-d tree with new coming points only, leading to much lower computation time than existing static k-d trees.…

Robotics · Computer Science 2021-02-23 Yixi Cai, Wei Xu, Fu Zhang

Processing graphs with temporal information (the temporal graphs) has become increasingly important in the real world. In this paper, we study efficient solutions to temporal graph applications using new algorithms for Incremental Minimum…

Data Structures and Algorithms · Computer Science 2025-05-13 Xiangyun Ding, Yan Gu, Yihan Sun

We propose generalized random forests, a method for non-parametric statistical estimation based on random forests (Breiman, 2001) that can be used to fit any quantity of interest identified as the solution to a set of local moment…

Methodology · Statistics 2018-04-06 Susan Athey, Julie Tibshirani, Stefan Wager

The range, segment and rectangle query problems are fundamental problems in computational geometry, and have extensive applications in many domains. Despite the significant theoretical work on these problems, efficient implementations can…

Computational Geometry · Computer Science 2018-08-08 Yihan Sun, Guy E. Blelloch

Introduction: Epigenomic datasets from high-throughput sequencing experiments are commonly summarized as genomic intervals. As the volume of this data grows, so does interest in analyzing it through deep learning. However, the heterogeneity…
