Related papers: The Distribution and Deposition Algorithm for Mult…

A Domain Decomposition Strategy for Alignment of Multiple Biological Sequences on Multiprocessor Platforms

Multiple Sequences Alignment (MSA) of biological sequences is a fundamental problem in computational biology due to its critical significance in wide ranging applications including haplotype reconstruction, sequence homology, phylogenetic…

Distributed, Parallel, and Cluster Computing · Computer Science 2009-05-13 Fahad Saeed, Ashfaq Khokhar

Deposition and Extension Approach to Find Longest Common Subsequence for Multiple Sequences

The problem of finding the longest common subsequence (LCS) for a set of sequences is a very interesting and challenging problem in computer science. This problem is NP-complete, but because of its importance, many heuristic algorithms have…

Data Structures and Algorithms · Computer Science 2009-06-30 Kang Ning

Storage Allocation for Multi-Class Distributed Data Storage Systems

Distributed storage systems (DSSs) provide a scalable solution for reliably storing massive amounts of data coming from various sources. Heterogeneity of these data sources often means different data classes (types) exist in a DSS, each…

Information Theory · Computer Science 2017-01-24 Koosha Pourtahmasi Roshandeh, Moslem Noori, Masoud Ardakani, Chintha Tellambura

A Distributed Chunk Calculation Approach for Self-scheduling of Parallel Applications on Distributed-memory Systems

Loop scheduling techniques aim to achieve load-balanced executions of scientific applications. Dynamic loop self-scheduling (DLS) libraries for distributed-memory systems are typically MPI-based and employ a centralized chunk calculation…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-01-19 Ahmed Eleliemy, Florina M. Ciorba

DSA: Scalable Distributed Sequence Alignment System Using SIMD Instructions

Sequence alignment algorithms are a basic and critical component of many bioinformatics fields. With rapid development of sequencing technology, the fast growing reference database volumes and longer length of query sequence become new…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-01-09 Bo Xu, Changlong Li, Hang Zhuang, Jiali Wang, Qingfeng Wang, Jinhong Zhou, Xuehai Zhou

Evaluation of a Simple, Scalable, Parallel Best-First Search Strategy

Large-scale, parallel clusters composed of commodity processors are increasingly available, enabling the use of vast processing capabilities and distributed RAM to solve hard search problems. We investigate Hash-Distributed A* (HDA*), a…

Artificial Intelligence · Computer Science 2015-03-20 Akihiro Kishimoto, Alex Fukunaga, Adi Botea

Compressing Sets and Multisets of Sequences

This article describes lossless compression algorithms for multisets of sequences, taking advantage of the multiset's unordered structure. Multisets are a generalisation of sets where members are allowed to occur multiple times. A multiset…

Information Theory · Computer Science 2014-01-27 Christian Steinruecken

Distributed Principal Subspace Analysis for Partitioned Big Data: Algorithms, Analysis, and Implementation

Principal Subspace Analysis (PSA) -- and its sibling, Principal Component Analysis (PCA) -- is one of the most popular approaches for dimensionality reduction in signal processing and machine learning. But centralized PSA/PCA solutions are…

Machine Learning · Computer Science 2021-11-25 Arpita Gang, Bingqing Xiang, Waheed U. Bajwa

Multidimensional Scaling for Big Data

We present a set of algorithms implementing multidimensional scaling (MDS) for large data sets. MDS is a family of dimensionality reduction techniques using an $n \times n$ distance matrix as input, where $n$ is the number of individuals,…

Computation · Statistics 2024-02-02 Pedro Delicado, Cristian Pachón-García

Shared-Memory Hierarchical Process Mapping

Modern large-scale scientific applications consist of thousands to millions of individual tasks. These tasks involve not only computation but also communication with one another. Typically, the communication pattern between tasks is sparse…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-04-03 Christian Schulz, Henning Woydt

Mesh-TensorFlow: Deep Learning for Supercomputers

Batch-splitting (data-parallelism) is the dominant distributed Deep Neural Network (DNN) training strategy, due to its universal applicability and its amenability to Single-Program-Multiple-Data (SPMD) programming. However, batch-splitting…

Machine Learning · Computer Science 2018-11-07 Noam Shazeer, Youlong Cheng, Niki Parmar, Dustin Tran, Ashish Vaswani, Penporn Koanantakool, Peter Hawkins, HyoukJoong Lee, Mingsheng Hong, Cliff Young, Ryan Sepassi, Blake Hechtman

On Distributed Larger-Than-Memory Subset Selection With Pairwise Submodular Functions

Modern datasets span billions of samples, making training on all available data infeasible. Selecting a high quality subset helps in reducing training costs and enhancing model quality. Submodularity, a discrete analogue of convexity, is…

Machine Learning · Computer Science 2025-04-04 Maximilian Böther, Abraham Sebastian, Pranjal Awasthi, Ana Klimovic, Srikumar Ramalingam

A New Distributed Evolutionary Computation Technique for Multi-Objective Optimization

Nowadays, it is important to find solutions to Multi-Objective Optimization Problems (MOPs). Evolutionary Strategy helps to solve such real-world problems efficiently and quickly. But sequential Evolutionary Algorithms (EAs) require…

Neural and Evolutionary Computing · Computer Science 2016-11-15 Md. Asadul Islam, G. M. Mashrur-E-Elahi, M. M. A. Hashem

Multi-GPU Distributed Parallel Bayesian Differential Topic Modelling

There is an explosion of data, documents, and other content, and people require tools to analyze and interpret these, tools to turn the content into information and knowledge. Topic modeling has been developed to solve these problems.…

Computation and Language · Computer Science 2015-10-23 Aaron Q Li

Distributed Discrete Morse Sandwich: Efficient Computation of Persistence Diagrams for Massive Scalar Data

The persistence diagram, which describes the topological features of a dataset, is a key descriptor in Topological Data Analysis. The "Discrete Morse Sandwich" (DMS) method has been reported to be the most efficient algorithm for computing…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-10-27 Eve Le Guillou, Pierre Fortin, Julien Tierny

Stochastic Multidimensional Scaling

Multidimensional scaling (MDS) is a popular dimensionality reduction technique that has been widely used for network visualization and cooperative localization. However, the traditional stress minimization formulation of MDS necessitates…

Optimization and Control · Mathematics 2016-12-22 Ketan Rajawat, Sandeep Kumar

Parallel Sort-Based Matching for Data Distribution Management on Shared-Memory Multiprocessors

In this paper we consider the problem of identifying intersections between two sets of d-dimensional axis-parallel rectangles. This is a common problem that arises in many agent-based simulation studies, and is of central importance in the…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-08-08 Moreno Marzolla, Gabriele D'Angelo

Seq-SetNet: Exploring Sequence Sets for Inferring Structures

Sequence set is a widely used type of data source in a large variety of fields. A typical example is protein structure prediction, which takes a multiple sequence alignment (MSA) as input and aims to infer structural information from it.…

Biomolecules · Quantitative Biology 2019-06-27 Fusong Ju, Jianwei Zhu, Guozheng Wei, Qi Zhang, Shiwei Sun, Dongbo Bu

A Rough Sets Partitioning Model for Mining Sequential Patterns with Time Constraint

Nowadays, data mining and knowledge discovery methods are applied to a variety of enterprise and engineering disciplines to uncover interesting patterns from databases. The study of sequential patterns is an important data mining problem…

Databases · Computer Science 2009-06-24 Jigyasa Bisaria, Namita Shrivastava, K. R. Pardasani

A Decision Diagram Approach for the Parallel Machine Scheduling Problem with Chance Constraints

The Chance-Constrained Parallel Machine Scheduling Problem (CC-PMSP) assigns jobs with uncertain processing times to machines, ensuring that each machine's availability constraints are met with a certain probability. We present a…

Optimization and Control · Mathematics 2025-04-30 Nicolás Casassus, Margarita Castro, Gustavo Angulo