Related papers: Unaligned Sequence Similarity Search Using Deep Le…

DNA: Denoised Neighborhood Aggregation for Fine-grained Category Discovery

Discovering fine-grained categories from coarsely labeled data is a practical and challenging task, which can bridge the gap between the demand for fine-grained analysis and the high annotation cost. Previous works mainly focus on…

Machine Learning · Computer Science 2023-10-17 Wenbin An, Feng Tian, Wenkai Shi, Yan Chen, Qinghua Zheng, QianYing Wang, Ping Chen

New Sequence Alignment Algorithm using AI Rules and Dynamic Seeds

DNA sequence alignment is important today as it is usually the first step in finding gene mutation, evolutionary similarities, protein structure, drug development and cancer treatment. Covid-19 is one recent example. There are many…

Genomics · Quantitative Biology 2023-06-01 Suchindra, Preetam Nagaraj

Deep Adversarial Network Alignment

Network alignment, in general, seeks to discover the hidden underlying correspondence between nodes across two (or more) networks when given their network structure. However, most existing network alignment methods have added assumptions of…

Social and Information Networks · Computer Science 2019-02-28 Tyler Derr, Hamid Karimi, Xiaorui Liu, Jiejun Xu, Jiliang Tang

Embed-Search-Align: DNA Sequence Alignment using Transformer Models

DNA sequence alignment involves assigning short DNA reads to the most probable locations on an extensive reference genome. This process is crucial for various genomic analyses, including variant calling, transcriptomics, and epigenomics.…

Genomics · Quantitative Biology 2024-12-06 Pavan Holur, K. C. Enevoldsen, Shreyas Rajesh, Lajoyce Mboning, Thalia Georgiou, Louis-S. Bouchard, Matteo Pellegrini, Vwani Roychowdhury

Unsupervised Learning via Network-Aware Embeddings

Data clustering, the task of grouping observations according to their similarity, is a key component of unsupervised learning -- with real world applications in diverse fields such as biology, medicine, and social science. Often in these…

Machine Learning · Computer Science 2023-09-20 Anne Sophie Riis Damstrup, Sofie Tosti Madsen, Michele Coscia

Vector Embeddings by Sequence Similarity and Context for Improved Compression, Similarity Search, Clustering, Organization, and Manipulation of cDNA Libraries

This paper demonstrates the utility of organized numerical representations of genes in research involving flat string gene formats (i.e., FASTA/FASTQ). FASTA/FASTQ files have several current limitations, such as their large file sizes,…

Genomics · Quantitative Biology 2023-08-11 Daniel H. Um, David A. Knowles, Gail E. Kaiser

Discovering alignment relations with Graph Convolutional Networks: a biomedical case study

Knowledge graphs are freely aggregated, published, and edited in the Web of data, and thus may overlap. Hence, a key task resides in aligning (or matching) their content. This task encompasses the identification, within an aggregated…

Machine Learning · Computer Science 2021-10-22 Pierre Monnin, Chedy Raïssi, Amedeo Napoli, Adrien Coulet

Attributed Sequence Embedding

Mining tasks over sequential data, such as clickstreams and gene sequences, require a careful design of embeddings usable by learning algorithms. Recent research in feature learning has been extended to sequential data, where each instance…

Machine Learning · Computer Science 2020-07-28 Zhongfang Zhuang, Xiangnan Kong, Elke Rundensteiner, Jihane Zouaoui, Aditya Arora

A Space-Efficient Approach towards Distantly Homologous Protein Similarity Searches

Protein similarity searches are a routine job for molecular biologists where a query sequence of amino acids needs to be compared and ranked against an ever-growing database of proteins. All available algorithms in this field can be grouped…

Computational Engineering, Finance, and Science · Computer Science 2015-08-27 Akash Nag, Sunil Karforma

Deep Metric Learning using Similarities from Nonlinear Rank Approximations

In recent years, deep metric learning has achieved promising results in learning high dimensional semantic feature embeddings where the spatial relationships of the feature vectors match the visual similarities of the images. Similarity…

Machine Learning · Computer Science 2019-09-25 Konstantin Schall, Kai Uwe Barthel, Nico Hezel, Klaus Jung

A Strategy for Label Alignment in Deep Neural Networks

A recent study demonstrated successful application of the label alignment property for unsupervised domain adaptation in a linear regression setting. Instead of regularizing representation learning to be domain invariant, the research…

Machine Learning · Computer Science 2025-03-13 Xuanrui Zeng

Network Together: Node Classification via Cross network Deep Network Embedding

Network embedding is a highly effective method to learn low-dimensional node vector representations with original network structures being well preserved. However, existing network embedding algorithms are mostly developed for a single…

Social and Information Networks · Computer Science 2021-05-06 Xiao Shen, Quanyu Dai, Sitong Mao, Fu-lai Chung, Kup-Sze Choi

A Survey on Efficient Processing of Similarity Queries over Neural Embeddings

Similarity query is the family of queries based on some similarity metrics. Unlike the traditional database queries which are mostly based on value equality, similarity queries aim to find targets "similar enough to" the given data objects,…

Databases · Computer Science 2022-04-19 Yifan Wang

Fixed-Length Protein Embeddings using Contextual Lenses

The Basic Local Alignment Search Tool (BLAST) is currently the most popular method for searching databases of biological sequences. BLAST compares sequences via similarity defined by a weighted edit distance, which results in it being…

Biomolecules · Quantitative Biology 2020-10-29 Amir Shanehsazzadeh, David Belanger, David Dohan

Low-Budget Label Query through Domain Alignment Enforcement

The deep learning revolution happened thanks to the availability of a massive amount of labelled data, which has contributed to the development of models with extraordinary inference capabilities. Despite the public availability of a large…

Computer Vision and Pattern Recognition · Computer Science 2020-03-31 Jurandy Almeida, Cristiano Saltori, Paolo Rota, Nicu Sebe

Recognizing Variables from their Data via Deep Embeddings of Distributions

A key obstacle in automated analytics and meta-learning is the inability to recognize when different datasets contain measurements of the same variable. Because provided attribute labels are often uninformative in practice, this task may be…

Machine Learning · Computer Science 2019-09-12 Jonas Mueller, Alex Smola

Convolutional Embedding for Edit Distance

Edit-distance-based string similarity search has many applications such as spell correction, data de-duplication, and sequence alignment. However, computing edit distance is known to have high complexity, which makes string similarity…

Databases · Computer Science 2020-05-25 Xinyan Dai, Xiao Yan, Kaiwen Zhou, Yuxuan Wang, Han Yang, James Cheng

Deep Low-Density Separation for Semi-Supervised Classification

Given a small set of labeled data and a large set of unlabeled data, semi-supervised learning (SSL) attempts to leverage the location of the unlabeled datapoints in order to create a better classifier than could be obtained from supervised…

Machine Learning · Computer Science 2022-05-25 Michael C. Burkhart, Kyle Shan

Semi-Supervised Learning on Graphs Based on Local Label Distributions

Most approaches that tackle the problem of node classification consider nodes to be similar if they have shared neighbors or are close to each other in the graph. Recent methods for attributed graphs additionally take attributes of…

Machine Learning · Computer Science 2018-05-23 Evgeniy Faerman, Felix Borutta, Julian Busch, Matthias Schubert

DeepAffinity: Interpretable Deep Learning of Compound-Protein Affinity through Unified Recurrent and Convolutional Neural Networks

Motivation: Drug discovery demands rapid quantification of compound-protein interaction (CPI). However, there is a lack of methods that can predict compound-protein affinity from sequences alone with high applicability, accuracy, and…

Biomolecules · Quantitative Biology 2020-12-17 Mostafa Karimi, Di Wu, Zhangyang Wang, Yang Shen