Related papers: Distributed Representations for Biological Sequenc…

200 papers

Genomic sequence analysis plays a crucial role in various scientific and medical domains. Traditional machine-learning approaches often struggle to capture the complex relationships and hierarchical structures of sequence data when working…

Machine Learning · Computer Science 2025-10-02 Sarwan Ali, Haris Mansoor, Murray Patterson

Representation learning is an important step in the machine learning pipeline. Given the current biological sequencing data volume, learning an explicit representation is prohibitive due to the dimensionality of the resulting feature…

Machine Learning · Computer Science 2023-04-04 Sarwan Ali, Usama Sardar, Murray Patterson, Imdad Ullah Khan

Inferring the structural properties of a protein from its amino acid sequence is a challenging yet important problem in biology. Structures are not known for the vast majority of protein sequences, but structure is critical for…

Machine Learning · Computer Science 2019-10-17 Tristan Bepler, Bonnie Berger

This paper demonstrates the utility of organized numerical representations of genes in research involving flat string gene formats (i.e., FASTA/FASTQ5). FASTA/FASTQ files have several current limitations, such as their large file sizes,…

Genomics · Quantitative Biology 2023-08-11 Daniel H. Um, David A. Knowles, Gail E. Kaiser

Representation learning provides new and powerful graph analytical approaches and tools for the highly valued data science challenge of mining knowledge graphs. Since previous graph analytical methods have mostly focused on homogeneous…

Information Retrieval · Computer Science 2019-05-29 Zheng Gao, Gang Fu, Chunping Ouyang, Satoshi Tsutsui, Xiaozhong Liu, Jeremy Yang, Christopher Gessner, Brian Foote, David Wild, Qi Yu, Ying Ding

Relationships in scientific data, such as the numerical and spatial distribution relations of features in univariate data, the scalar-value combinations' relations in multivariate data, and the association of volumes in time-varying and…

Machine Learning · Computer Science 2022-07-25 Xiangyang He, Yubo Tao, Shuoliu Yang, Haoran Dai, Hai Lin

Network embeddings have become very popular in learning effective feature representations of networks. Motivated by the recent successes of embeddings in natural language processing, researchers have tried to find network embeddings in…

Social and Information Networks · Computer Science 2017-02-23 Bijaya Adhikari, Yao Zhang, Naren Ramakrishnan, B. Aditya Prakash

Genetic information is encoded in a linear sequence of nucleotides, represented by letters ranging from thousands to billions. Mutations refer to changes in the DNA or RNA nucleotide sequence. Thus, mutation detection is vital in all areas…

Capturing the semantics of related biological concepts, such as genes and mutations, is of significant importance to many research tasks in computational biology such as protein-protein interaction detection, gene-drug association…

Computation and Language · Computer Science 2020-07-01 Qingyu Chen, Kyubum Lee, Shankai Yan, Sun Kim, Chih-Hsuan Wei, Zhiyong Lu

Machine learning for data-driven diagnosis has been actively studied in medicine to provide better healthcare. Supporting analysis of a patient cohort similar to a patient under treatment is a key task for clinicians to make decisions with…

Medical Physics · Physics 2020-03-25 Rongchen Guo, Takanori Fujiwara, Yiran Li, Kelly M. Lima, Soman Sen, Nam K. Tran, Kwan-Liu Ma

Recent works on representation learning for graph structured data predominantly focus on learning distributed representations of graph substructures such as nodes and subgraphs. However, many graph analytics tasks such as graph…

Artificial Intelligence · Computer Science 2017-07-18 Annamalai Narayanan, Mahinthan Chandramohan, Rajasekar Venkatesan, Lihui Chen, Yang Liu, Shantanu Jaiswal

Taxonomic classification in biodiversity research involves organizing biological specimens into structured hierarchies based on evidence, which can come from multiple modalities such as images and genetic information. We investigate whether…

Effective representation of data is crucial in various machine learning tasks, as it captures the underlying structure and context of the data. Embeddings have emerged as a powerful technique for data representation, but evaluating their…

Machine Learning · Computer Science 2023-09-21 Sarwan Ali

Gene annotation has traditionally required direct comparison of DNA sequences between an unknown gene and a database of known ones using string comparison methods. However, these methods do not provide useful information when a gene does…

Machine Learning · Computer Science 2019-09-17 James K. Senter, Taylor M. Royalty, Andrew D. Steen, Amir Sadovnik

The study of neural representations, both in biological and artificial systems, is increasingly revealing the importance of geometric and topological structures. Inspired by this, we introduce Event2Vec, a novel framework for learning…

Machine Learning · Computer Science 2025-12-02 Antonin Sulc

Advances in next-generation metagenome sequencing have the potential to revolutionize the point-of-care diagnosis of novel pathogen infections, which could help prevent potential widespread transmission of diseases. Given the high volume of…

The development of data-dependent heuristics and representations for biological sequences that reflect their evolutionary distance is critical for large-scale biological research. However, popular machine learning approaches, based on…

Quantitative Methods · Quantitative Biology 2021-10-13 Gabriele Corso, Rex Ying, Michal Pándy, Petar Veličković, Jure Leskovec, Pietro Liò

Network Embedding (NE) methods, which map network nodes to low-dimensional feature vectors, have wide applications in network analysis and bioinformatics. Many existing NE methods rely only on network structure, overlooking other…

Artificial Intelligence · Computer Science 2019-06-21 Sotiris Kotitsas, Dimitris Pappas, Ion Androutsopoulos, Ryan McDonald, Marianna Apidianaki

Proteins perform much of the work in living organisms, and consequently the development of efficient computational methods for protein representation is essential for advancing large-scale biological research. Most current approaches…

Quantitative Methods · Quantitative Biology 2023-06-09 Francesco Ceccarelli, Lorenzo Giusti, Sean B. Holden, Pietro Liò

Semantic vector embedding techniques have proven useful in learning semantic representations of data across multiple domains. A key application enabled by such techniques is the ability to measure semantic similarity between given data…

Computation and Language · Computer Science 2020-09-01 Shalisha Witherspoon, Dean Steuer, Graham Bent, Nirmit Desai