Related papers: Binary Interval Search (BITS): A Scalable Algorith…

200 papers

Variable selection in ultra-high dimensional linear regression is often preceded by a screening step to significantly reduce the dimension. Here we develop a Bayesian variable screening method (BITS) guided by the posterior model…

Methodology · Statistics 2025-02-28 Run Wang, Somak Dutta, Vivekananda Roy

The identification of genetic signal regions in the human genome is critical for understanding the genetic architecture of complex traits and diseases. Numerous methods based on scan algorithms (e.g., QSCAN, SCANG, SCANG-STARR) have been…

Applications · Statistics 2025-01-24 Wei Zhang, Fan Wang, Fang Yao

Identification of functional elements of a genome often requires dividing a sequence of measurements along a genome into segments differing from adjacent segments. In many applications, the mean of the measured values at multiple genomic…

Applications · Statistics 2015-06-30 S. B. Girimurugan, Jonathan Dennis, Jinfeng Zhang

We present COBS, a COmpact Bit-sliced Signature index, which is a cross-over between an inverted index and Bloom filters. Our target application is to index $k$-mers of DNA samples or $q$-grams from text documents and process approximate…

Databases · Computer Science 2019-07-29 Timo Bingmann, Phelim Bradley, Florian Gauger, Zamin Iqbal

In this paper, a novel data structure is proposed to dynamically insert and delete segments. Unlike the standard segment trees [3], the proposed data structure permits insertion of a segment with interval range beyond the interval…

Computational Geometry · Computer Science 2015-01-15 K. S. Easwarakumar, T. Hema

In recent years, there has been an increasing demand for efficient algorithms for large-scale change point detection problems. To this end, we propose seeded binary segmentation, an approach relying on a deterministic construction of…

Methodology · Statistics 2023-03-13 Solt Kovács, Housen Li, Peter Bühlmann, Axel Munk

String matching algorithms play a vital role in computational biology. The functional and structural relationships of a biological sequence are determined by similarities within that sequence. For that, the researcher is supposed to be aware…

Data Structures and Algorithms · Computer Science 2014-01-30 Pandiselvam. P, Marimuthu. T, Lawrance. R

The exponential growth of DNA sequencing data has outpaced traditional heuristic-based methods, which struggle to scale effectively. Efficient computational approaches are urgently needed to support large-scale similarity search, a…

The Jaccard similarity index is an important measure of the overlap of two sets, widely used in machine learning, computational genomics, information retrieval, and many other areas. We design and implement SimilarityAtScale, the first…

Computational Engineering, Finance, and Science · Computer Science 2020-11-12 Maciej Besta, Raghavendra Kanakagiri, Harun Mustafa, Mikhail Karasikov, Gunnar Rätsch, Torsten Hoefler, Edgar Solomonik

Genetic information is encoded in a linear sequence of nucleotides, represented by strings of letters ranging in length from thousands to billions. Mutations refer to changes in the DNA or RNA nucleotide sequence. Thus, mutation detection is vital in all areas…

Inferring concerted changes among biological traits along an evolutionary history remains an important yet challenging problem. Besides adjusting for spurious correlation induced from the shared history, the task also requires sufficient…

Indexing is an effective way to support efficient query processing in large databases. Recently the concept of learned index, which replaces or complements traditional index structures with machine learning models, has been actively…

Databases · Computer Science 2022-08-01 Yao Tian, Tingyun Yan, Xi Zhao, Kai Huang, Xiaofang Zhou

A computational framework that uses traditional similarity measures to mine significant relationships in biological annotations was recently proposed by Tatiana V. Karpinets et al. [2]. In this paper, an improved approximation…

Databases · Computer Science 2015-07-21 Shuliang Wang, Yiping Zhao

Scientific practice typically involves repeatedly studying a system, each time trying to unravel a different perspective. In each study, the scientist may take measurements under different experimental conditions (interventions,…

Machine Learning · Statistics 2014-03-11 Sofia Triantafillou, Ioannis Tsamardinos

Practical use of neural networks often involves requirements on latency, energy and memory among others. A popular approach to find networks under such requirements is through constrained Neural Architecture Search (NAS). However, previous…

Machine Learning · Computer Science 2022-04-28 Niv Nayman, Yonathan Aflalo, Asaf Noy, Rong Jin, Lihi Zelnik-Manor

Similarity search finds objects that are similar to a given query object based on a similarity metric. As the amount and variety of data continue to grow, similarity search in metric spaces has gained significant attention. Metric spaces…

Databases · Computer Science 2024-10-08 Yifan Zhu, Chengyang Luo, Tang Qian, Lu Chen, Yunjun Gao, Baihua Zheng

Identifying similar protein sequences is a core step in many computational biology pipelines such as detection of homologous protein sequences, generation of similarity protein graphs for downstream analysis, functional annotation and gene…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-10-01 Oguz Selvitopi, Saliya Ekanayake, Giulia Guidi, Georgios Pavlopoulos, Ariful Azad, Aydin Buluc

Brain-computer interfaces (BCIs) are ways for electronic devices to communicate directly with the brain. For most medical-type brain-computer interface tasks, the activity of multiple units of neurons or local field potentials is sufficient…

Machine Learning · Computer Science 2022-05-25 Lang Qian, Shengjie Zheng, Chunshan Deng, Cheng Yang, Xiaojian Li

High-throughput sequencing (HTS) is revolutionizing biological research by enabling scientists to quickly and cheaply query variation at a genomic scale. Despite the increasing ease of obtaining such data, using these data effectively still…

Genomics · Quantitative Biology 2012-11-09 Sonal Singhal

Discovering patterns in data that best describe the differences between classes allows us to hypothesize and reason about class-specific mechanisms. In molecular biology, for example, this bears promise of advancing the understanding of…

Machine Learning · Computer Science 2023-12-08 Nils Philipp Walter, Jonas Fischer, Jilles Vreeken