Related papers: Alignment-free sequence comparison using absent wo…

200 papers

Sequence comparison is a prerequisite to virtually all comparative genomic analyses. It is often realized by sequence alignment techniques, which are computationally expensive. This has led to increased research into alignment-free…

Data Structures and Algorithms · Computer Science 2015-12-23 Maxime Crochemore, Gabriele Fici, Robert Mercaş, Solon P. Pissis

This paper presents a new approach to statistical similarity assessment based on sequence alignment. The algorithm performs mutual matching of two random sequences by successively searching for common elements and by applying sequence…

Signal Processing · Electrical Eng. & Systems 2021-06-09 Jakub Nikonowicz, Łukasz Matuszewski, Paweł Kubczak

An absent word of a word y of length n is a word that does not occur in y. It is a minimal absent word if all its proper factors occur in y. Minimal absent words have been computed in genomes of organisms from all domains of life; their…

Data Structures and Algorithms · Computer Science 2014-07-01 Carl Barton, Alice Heliou, Laurent Mouchard, Solon P. Pissis
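
The definition in the abstract above can be illustrated with a small brute-force sketch. This is not the algorithm of any paper listed here; it enumerates all factors of y, so it is only suitable for short words. The function names `factors` and `minimal_absent_words` and the explicit `alphabet` parameter are assumptions made for illustration.

```python
def factors(y):
    """Return every contiguous substring (factor) of y, including the empty word."""
    return {y[i:j] for i in range(len(y) + 1) for j in range(i, len(y) + 1)}


def minimal_absent_words(y, alphabet):
    """Brute-force set of minimal absent words of y over the given alphabet.

    A word w is a minimal absent word of y if w does not occur in y but
    every proper factor of w does. It suffices to check the longest proper
    prefix w[:-1] and the longest proper suffix w[1:], because every proper
    factor of w is a factor of one of these two.
    """
    occurring = factors(y)
    result = set()
    for u in occurring:            # u plays the role of w[:-1], so it occurs in y
        for letter in alphabet:
            w = u + letter
            if w not in occurring and w[1:] in occurring:
                result.add(w)
    return result


print(sorted(minimal_absent_words("abaab", "ab")))
```

For y = abaab over the alphabet {a, b}, the sketch reports aaa, aaba, bab, and bb. The running time is polynomial but far from the linear-time methods studied in the literature.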

Genome and metagenome comparisons based on large amounts of next-generation sequencing (NGS) data pose significant challenges for alignment-based approaches due to the huge data size and the relatively short length of the reads.…

Quantitative Methods · Quantitative Biology 2018-03-28 Jie Ren, Xin Bai, Yang Young Lu, Kujin Tang, Ying Wang, Gesine Reinert, Fengzhu Sun

A minimal absent word of a sequence x is a sequence y that is not a factor of x, but all of its proper factors are factors of x as well. The set of minimal absent words uniquely defines the sequence itself. In recent times minimal absent…

Formal Languages and Automata Theory · Computer Science 2021-06-01 Giuseppa Castiglione, Jia Gao, Sabrina Mantaci, Antonio Restivo

In this paper we introduce a method to detect words or phrases in a given sequence of alphabets without knowing the lexicon. Our linear time unsupervised algorithm relies entirely on statistical relationships among alphabets in the input…

Computation and Language · Computer Science 2013-12-31 Tamal Chowdhury, Rabindra Rakshit, Arko Banerjee

Sequence classification algorithms, such as SVM, require a definition of distance (similarity) measure between two sequences. A commonly used notion of similarity is the number of matches between $k$-mers ($k$-length subsequences) in the…

Data Structures and Algorithms · Computer Science 2017-12-13 Muhammad Farhan, Juvaria Tariq, Arif Zaman, Mudassir Shabbir, Imdad Ullah Khan

A subsequence of a word $w$ is a word $u$ that can be obtained by deleting some letters from $w$ while maintaining the relative order of the remaining letters, e.g., $\mathtt{lala}$ is a subsequence of $\mathtt{alfalfa}$. A word, over…

Formal Languages and Automata Theory · Computer Science 2025-09-01 Duncan Adamson, Pamela Fleischmann, Annika Huch, Florin Manea, Paul Sarnighausen-Cahn, Max Wiedenhöft

Given a string $w$, another string $v$ is said to be a subsequence of $w$ if $v$ can be obtained from $w$ by removing some of its letters; on the other hand, $v$ is called an absent subsequence of $w$ if $v$ is not a subsequence of $w$. The…

Data Structures and Algorithms · Computer Science 2025-05-01 Florin Manea, Tina Ringleb, Stefan Siemer, Maximilian Winkler

Alignment-based sequence similarity searches, while accurate for some types of sequences, can produce incorrect results when used on more divergent but functionally related sequences that have undergone the sequence rearrangements observed…

Genomics · Quantitative Biology 2015-01-21 Ivan Borozan, Stuart Watt, Vincent Ferretti

This paper proposes an algorithm for alignment-free sequence comparison using Logical Match. Here, we compute the score using fuzzy membership values which are generated automatically from the number of matches and mismatches. We demonstrate the…

Computational Engineering, Finance, and Science · Computer Science 2014-07-09 Sanil Shanker KP, Elizabeth Sherly, Jim Austin

An absent factor of a string $w$ is a string $u$ which does not occur as a contiguous substring (a.k.a. factor) inside $w$. We extend this well-studied notion and define absent subsequences: a string $u$ is an absent subsequence of a string…

Formal Languages and Automata Theory · Computer Science 2026-04-08 Maria Kosche, Tore Koß, Florin Manea, Stefan Siemer

Background: Sequence comparison is essential in bioinformatics, serving various purposes such as taxonomy, functional inference, and drug discovery. The traditional method of aligning sequences for comparison is time-consuming, especially…

Quantitative Methods · Quantitative Biology 2023-11-23 Saeedeh Akbari Rokn Abadi, Melika Honarmand, Ali Hajialinaghi, Somayyeh Koohi

Given a set of sequences, the distance between pairs of them helps us to find their similarity and derive structural relationship amongst them. For genomic sequences such measures make it possible to construct the evolution tree of…

Information Theory · Computer Science 2012-08-29 Sandeep Hosangadi

Sequence comparison is a widely used computational technique in modern molecular biology. In spite of the frequent use of sequence comparisons the important problem of assigning statistical significance to a given degree of similarity is…

Quantitative Methods · Quantitative Biology 2007-05-23 Ralf Bundschuh, Nicholas Chia

Simon's congruence $\sim_k$ is defined as follows: two words are $\sim_k$-equivalent if they have the same set of subsequences of length at most $k$. We propose an algorithm which computes, given two words $s$ and $t$, the largest $k$ for…

Formal Languages and Automata Theory · Computer Science 2021-03-16 Pawel Gawrychowski, Maria Kosche, Tore Koss, Florin Manea, Stefan Siemer
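
The definition of Simon's congruence quoted above can be checked directly on short words. The following is a naive, exponential-time sketch of the definition only, not the algorithm proposed in the paper; the helper names `subsequences_up_to`, `simon_equivalent`, and `largest_k` are hypothetical.

```python
from itertools import combinations


def subsequences_up_to(w, k):
    """Return the set of all subsequences of w with length at most k."""
    return {"".join(c) for n in range(k + 1) for c in combinations(w, n)}


def simon_equivalent(s, t, k):
    """True if s and t have the same set of subsequences of length at most k."""
    return subsequences_up_to(s, k) == subsequences_up_to(t, k)


def largest_k(s, t):
    """Largest k with s ~_k t, or None if s == t (equivalent for every k).

    For distinct words the loop stops by k = max(len(s), len(t)), because
    at that length the subsequence sets must differ.
    """
    if s == t:
        return None
    k = 0
    while simon_equivalent(s, t, k + 1):
        k += 1
    return k


print(largest_k("abab", "baba"))
```

Here abab and baba share all subsequences of length at most 2 but differ at length 3 (for example, aab occurs only in abab), so the sketch reports 2.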

This paper describes a new alignment algorithm for sequences that can be used for determination of deletions and substitutions. It provides several solutions out of which the best one can be chosen on the basis of minimization of gaps or…

Information Theory · Computer Science 2012-11-01 Sandeep Hosangadi, Subhash Kak

Enormous volumes of short reads data from next-generation sequencing (NGS) technologies have posed new challenges to the area of genomic sequence comparison. The multiple sequence alignment approach is hardly applicable to NGS data due to…

Genomics · Quantitative Biology 2020-03-25 Ngoc Hieu Tran, Xin Chen

Sequence Alignment is the process of aligning biological sequences in order to identify similarities between multiple sequences. In this paper, a Quantum Algorithm for finding the optimal alignment between DNA sequences has been…

Data Structures and Algorithms · Computer Science 2025-09-05 Md. Rabiul Islam Khan, Shadman Shahriar, Shaikh Farhan Rafid

We present an algorithm that takes an unannotated corpus as its input, and returns a ranked list of probable morphologically related pairs as its output. The algorithm tries to discover morphologically related pairs by looking for pairs…

Computation and Language · Computer Science 2007-05-23 Marco Baroni, Johannes Matiasek, Harald Trost