Related papers: SimPlot++: a Python application for representing s…

SIMLR: A Tool for Large-Scale Genomic Analyses by Multi-Kernel Learning

We here present SIMLR (Single-cell Interpretation via Multi-kernel LeaRning), an open-source tool that implements a novel framework to learn a sample-to-sample similarity measure from expression data observed for heterogeneous samples. SIMLR…

Genomics · Quantitative Biology 2018-01-22 Bo Wang , Daniele Ramazzotti , Luca De Sano , Junjie Zhu , Emma Pierson , Serafim Batzoglou

PatchWorkPlot: simultaneous visualization of local alignments across multiple sequences

Motivation: Revealing structural variations across sequences of closely related individuals or species is crucial for understanding their diversification mechanisms and roles. Results: We developed PatchWorkPlot, a tool for visualization of…

Quantitative Methods · Quantitative Biology 2025-03-27 Mariia Pospelova , Yana Safonova

Synonymous and Nonsynonymous Distances Help Untangle Convergent Evolution and Recombination

When estimating a phylogeny from a multiple sequence alignment, researchers often assume the absence of recombination. However, if recombination is present, then tree estimation and all downstream analyses will be impacted, because…

Methodology · Statistics 2014-10-08 Peter B. Chi , Sujay Chattopadhyay , Philippe Lemey , Evgeni V. Sokurenko , Vladimir N. Minin

BioKlustering: a web app for semi-supervised learning of maximally imbalanced genomic data

Summary: Accurate phenotype prediction from genomic sequences is a highly coveted task in biological and medical research. While machine-learning holds the key to accurate prediction in a variety of fields, the complexity of biological data…

Genomics · Quantitative Biology 2024-12-17 Samuel Ozminkowski , Yuke Wu , Hailey Bruzzone , Liule Yang , Zhiwen Xu , Luke Selberg , Chunrong Huang , Helena Jaramillo-Mesa , Claudia Solis-Lemus

SPRINT: A fast, new software tool for reconstructing the evolutionary past of polyploid datasets

Polyploidization is an important evolutionary process which affects organisms ranging from plants to fish and fungi. The signal left behind by it is in the form of a species' ploidy level (number of complete chromosome sets found in a cell)…

Populations and Evolution · Quantitative Biology 2022-11-10 Liam J. Maher , Taoyang Wu , Katharina T. Huber

A biological sequence comparison algorithm using quantum computers

Genetic information is encoded in a linear sequence of nucleotides, represented by letters ranging from thousands to billions. Mutations refer to changes in the DNA or RNA nucleotide sequence. Thus, mutation detection is vital in all areas…

Quantum Physics · Physics 2024-03-14 Büsra Kösoglu-Kind , Robert Loredo , Michele Grossi , Christian Bernecker , Jody M Burks , Rudiger Buchkremer

snpQT: flexible, reproducible, and comprehensive quality control and imputation of genomic data

Motivation: Quality control of genomic data is an essential but complicated multi-step procedure, often requiring separate installation and expert familiarity with a combination of disparate bioinformatics tools. Results: To provide an…

Genomics · Quantitative Biology 2021-05-06 Christina Vasilopoulou , Benjamin Wingfield , Andrew P. Morris , William Duddy

Small Coupling Expansion for Multiple Sequence Alignment

The alignment of biological sequences, such as DNA, RNA, and proteins, is one of the basic tools for detecting evolutionary patterns, as well as functional/structural characterizations between homologous sequences in different…

Quantitative Methods · Quantitative Biology 2023-05-01 Louise Budzynski , Andrea Pagnani

Genetic Programming for Evolving Similarity Functions for Clustering: Representations and Analysis

Clustering is a difficult and widely-studied data mining task, with many varieties of clustering algorithms proposed in the literature. Nearly all algorithms use a similarity measure such as a distance metric (e.g. Euclidean distance) to…

Neural and Evolutionary Computing · Computer Science 2019-10-24 Andrew Lensen , Bing Xue , Mengjie Zhang

SIMPT: Process Improvement Using Interactive Simulation of Time-aware Process Trees

Process mining techniques including process discovery, conformance checking, and process enhancement provide extensive knowledge about processes. Discovering running processes and deviations as well as detecting performance problems and…

Other Computer Science · Computer Science 2021-08-05 Mahsa Pourbafrani , Shuai Jiao , Wil M. P. van der Aalst

SAMBLASTER: fast duplicate marking and structural variant read extraction

Motivation: Illumina DNA sequencing is now the predominant source of raw genomic data, and data volumes are growing rapidly. Bioinformatic analysis pipelines are having trouble keeping pace. A common bottleneck in such pipelines is the…

Genomics · Quantitative Biology 2014-09-09 Gregory G. Faust , Ira M. Hall

Scikit-fingerprints: easy and efficient computation of molecular fingerprints in Python

In this work, we present scikit-fingerprints, a Python package for computation of molecular fingerprints for applications in chemoinformatics. Our library offers an industry-standard scikit-learn interface, allowing intuitive usage and easy…

Software Engineering · Computer Science 2025-08-12 Jakub Adamczyk , Piotr Ludynia

forqs: Forward-in-time Simulation of Recombination, Quantitative Traits, and Selection

forqs is a forward-in-time simulation of recombination, quantitative traits, and selection. It was designed to investigate haplotype patterns resulting from scenarios where substantial evolutionary change has taken place in a small number…

Populations and Evolution · Quantitative Biology 2013-10-14 Darren Kessner , John Novembre

Superplot: a graphical interface for plotting and analysing MultiNest output

We present an application, Superplot, for calculating and plotting statistical quantities relevant to parameter inference from a "chain" of samples drawn from a parameter space, produced by e.g. MultiNest. A simple graphical interface…

Data Analysis, Statistics and Probability · Physics 2016-12-06 Andrew Fowlie , Michael Hugh Bardsley

SimClone: Detecting Tabular Data Clones using Value Similarity

Data clones are defined as multiple copies of the same data among datasets. Presence of data clones between datasets can cause issues such as difficulties in managing data assets and data license violations when using datasets with clones…

Databases · Computer Science 2024-07-19 Xu Yang , Gopi Krishnan Rajbahadur , Dayi Lin , Shaowei Wang , Zhen Ming Jiang

SIM2E: Benchmarking the Group Equivariant Capability of Correspondence Matching Algorithms

Correspondence matching is a fundamental problem in computer vision and robotics applications. Solving correspondence matching problems using neural networks has been on the rise recently. Rotation-equivariance and scale-equivariance are…

Computer Vision and Pattern Recognition · Computer Science 2022-08-23 Shuai Su , Zhongkai Zhao , Yixin Fei , Shuda Li , Qijun Chen , Rui Fan

Merlin++, a flexible and feature-rich accelerator physics and particle tracking library

Merlin++ is a C++ charged-particle tracking library developed for the simulation and analysis of complex beam dynamics within high energy particle accelerators. Accurate simulation and analysis of particle dynamics is an essential part of…

Accelerator Physics · Physics 2020-11-10 Robert Appleby , Roger Barlow , Dirk Kruecker , James Molson , Haroon Rafique , Scott Rowan , Sam Tygier , Nicholas Walker , Andrzej Wolski

SCIMAP: A Python Toolkit for Integrated Spatial Analysis of Multiplexed Imaging Data

Multiplexed imaging data are revolutionizing our understanding of the composition and organization of tissues and tumors. A critical aspect of such tissue profiling is quantifying the spatial relationships among cells at…

Quantitative Methods · Quantitative Biology 2024-05-06 Ajit J. Nirmal , Peter K. Sorger

Semantic-embedded Similarity Prototype for Scene Recognition

Due to the high inter-class similarity caused by the complex composition and the co-existing objects across scenes, numerous studies have explored object semantic knowledge within scenes to improve scene recognition. However, a resulting…

Computer Vision and Pattern Recognition · Computer Science 2024-08-06 Chuanxin Song , Hanbo Wu , Xin Ma , Yibin Li

seqme: a Python library for evaluating biological sequence design

Recent advances in computational methods for designing biological sequences have sparked the development of metrics to evaluate these methods' performance in terms of the fidelity of the designed sequences to a target distribution and their…

Machine Learning · Computer Science 2025-11-07 Rasmus Møller-Larsen , Adam Izdebski , Jan Olszewski , Pankhil Gawade , Michal Kmicikiewicz , Wojciech Zarzecki , Ewa Szczurek