Related papers: BioStatFlow - Statistical Analysis Workflow for "Om…
Metabolomic data sets provide a direct read-out of cellular phenotypes and are increasingly generated to study biological questions. Our previous work revealed the potential of analyzing extracellular metabolomic data in the context of the…
AstroStat is an easy-to-use tool for performing statistical analysis on data. It has been designed to be compatible with Virtual Observatory (VO) standards, thus enabling it to become an integral part of the currently available collection of…
Academic Clinical Trial Units frequently face fragmented statistical workflows, leading to duplicated effort, limited collaboration, and inconsistent analytical practices. To address these challenges within an oncology Clinical Trial Unit,…
SimOmics is an R package designed to generate realistic, multivariate, and multi-omics synthetic datasets. It is intended for use in benchmarking, method development, and reproducibility in bioinformatics, particularly in the context of…
Statistical practices such as building regression models or running hypothesis tests rely on following rigorous procedures of steps and verifying assumptions on data to produce valid results. However, common statistical tools do not verify…
For researchers in electromyography (EMG) and similar biosignals, signal processing is naturally an essential topic. There are a number of excellent tools available. To these one may add the freely available open-source statistical…
Concerning NMR-based metabolomics, 1D spectra processing often requires an expert eye for disentangling the intertwined peaks, and so far the best way is to proceed interactively with a spectra viewer. NMRProcFlow is a graphical and…
High-throughput metabolomics investigations, when conducted in large human cohorts, represent a potentially powerful tool for elucidating the biochemical diversity and mechanisms underlying human health and disease. Large-scale metabolomics…
Managing data and code in open scientific research is complicated by two key problems: large datasets often cannot be stored alongside code in repository platforms like GitHub, and iterative analysis can lead to unnoticed changes to data,…
Hospitals around the world collect massive amounts of physiological data from their patients every day. Recently, there has been an increase in research interest to subject this data to statistical analysis to gain more insights and provide…
Metabolomics is a key approach in modern functional genomics and systems biology. Due to the complexity of metabolomics data, the variety of experimental designs, and the variety of existing bioinformatics tools, providing experimenters…
Optimizing a stateful dataflow language is a challenging task. There are strict correctness constraints for preserving properties expected by downstream consumers, a large space of possible optimizations, and complex analyses that must…
Data leakage remains a recurrent source of optimistic bias in biomedical machine learning studies. Standard row-wise cross-validation and globally estimated preprocessing steps are often inappropriate for data with repeated measurements,…
Research increasingly relies on computational methods to analyze experimental data and predict molecular properties. Current approaches often require researchers to use a variety of tools for statistical analysis and machine learning,…
Background: Mathematical models based on ordinary differential equations (ODEs) are essential tools across various scientific disciplines, including biology, ecology, and healthcare informatics. They are used to simulate complex dynamic…
In this paper we present the development of a modulated web-based statistical system, hereafter MWStat, which shifts the statistical paradigm of analyzing data into a real-time structure. The MWStat system is useful for both online storage…
Legacy scientific workflows, and the services within them, often present scarce and unstructured (i.e. textual) descriptions. This makes it difficult to find, share and reuse them, thus dramatically reducing their value to the community.…
Modeling dynamical systems and unraveling their underlying causal relationships is central to many domains in the natural sciences. Various physical systems, such as those arising in cell biology, are inherently high-dimensional and…
The stochastic simulation of large-scale biochemical reaction networks is of great importance for systems biology since it enables the study of inherently stochastic biological mechanisms at the whole cell scale. Stochastic Simulation…
This review presents how R, the popular statistical environment and programming language, can be used in the frame of proteomics data analysis. A short introduction to R is given, with special emphasis on some of the features that make R…