Related papers: Protein sequence classification using natural lang…
Composed of amino acid chains that influence how they fold and thus dictating their function and features, proteins are a class of macromolecules that play a central role in major biological processes and are required for the structure,…
Deep neural networks (DNN) have achieved remarkable success in various fields, including computer vision and natural language processing. However, training an effective DNN model still poses challenges. This paper aims to propose a method…
Objective: We investigate whether deep learning techniques for natural language processing (NLP) can be used efficiently for patient phenotyping. Patient phenotyping is a classification task for determining whether a patient has a medical…
We present paired learning and inference algorithms for significantly reducing computation and increasing speed of the vector dot products in the classifiers that are at the heart of many NLP components. This is accomplished by partitioning…
We introduce a protein language model for determining the complete sequence of a peptide based on measurement of a limited set of amino acids. To date, protein sequencing relies on mass spectrometry, with some novel edman degregation based…
Automated document classification is a trending topic in Natural Language Processing (NLP) due to the extensive growth in digital databases. However, a model that fits well for a specific classification task might perform weakly for another…
Natural Language Processing (NLP) is widely used in fields like machine translation and sentiment analysis. However, traditional NLP models struggle with accuracy and efficiency. This paper introduces Deep Convolutional Neural Networks…
In recent era prediction of enzyme class from an unknown protein is one of the challenging tasks in bioinformatics. Day to day the number of proteins is increases as result the prediction of enzyme class gives a new opportunity to…
Computational biology and bioinformatics provide vast data gold-mines from protein sequences, ideal for Language Models taken from NLP. These LMs reach for new prediction frontiers at low inference costs. Here, we trained two…
Data-efficient image classification is a challenging task that aims to solve image classification using small training data. Neural network-based deep learning methods are effective for image classification, but they typically require…
For protein sequence datasets, unlabeled data has greatly outpaced labeled data due to the high cost of wet-lab characterization. Recent deep-learning approaches to protein prediction have shown that pre-training on unlabeled data can yield…
Representation learning and \emph{de novo} generation of proteins are pivotal computational biology tasks. Whilst natural language processing (NLP) techniques have proven highly effective for protein sequence modelling, structure modelling…
State-of-the-art natural language processing (NLP) models are trained on massive training corpora, and report a superlative performance on evaluation datasets. This survey delves into an important attribute of these datasets: the dialect of…
Natural Language Processing (NLP) is an important branch of artificial intelligence that studies how to enable computers to understand, process, and generate human language. Text classification is a fundamental task in NLP, which aims to…
This article reviews recent advances in applying natural language processing (NLP) to Electronic Health Records (EHRs) for computational phenotyping. NLP-based computational phenotyping has numerous applications including diagnosis…
Protein language models (pLMs) have recently gained significant attention for their ability to uncover relationships between sequence, structure, and function from evolutionary statistics, thereby accelerating therapeutic drug discovery.…
Accurate localization of proteins from fluorescence microscopy images is challenging due to the inter-class similarities and intra-class disparities introducing grave concerns in addressing multi-class classification problems. Conventional…
Deep learning is playing a vital role in every field which involves data. It has emerged as a strong and efficient framework that can be applied to a broad spectrum of complex learning problems which were difficult to solve using…
Advanced automated AI techniques allow us to classify protein sequences and discern their biological families and functions. Conventional approaches for classifying these protein families often focus on extracting N-Gram features from the…
The classification of DNA sequences is a key research area in bioinformatics as it enables researchers to conduct genomic analysis and detect possible diseases. In this paper, three state-of-the-art algorithms, namely Convolutional Neural…