Related papers: Gene Set Summarization using Large Language Models

GeneSUM: Large Language Model-based Gene Summary Extraction

Emerging topics in biomedical research are continuously expanding, providing a wealth of information about genes and their function. This rapid proliferation of knowledge presents unprecedented opportunities for scientific discovery and…

Genomics · Quantitative Biology 2024-12-25 Zhijian Chen, Chuan Hu, Min Wu, Qingqing Long, Xuezhi Wang, Yuanchun Zhou, Meng Xiao

Evaluation of large language models for discovery of gene set function

Gene set analysis is a mainstay of functional genomics, but it relies on curated databases of gene functions that are incomplete. Here we evaluate five Large Language Models (LLMs) for their ability to discover the common biological…

Genomics · Quantitative Biology 2024-12-03 Mengzhou Hu, Sahar Alkhairy, Ingoo Lee, Rudolf T. Pillich, Dylan Fong, Kevin Smith, Robin Bachelder, Trey Ideker, Dexter Pratt

DrugImproverGPT: A Large Language Model for Drug Optimization with Fine-Tuning via Structured Policy Optimization

Finetuning a Large Language Model (LLM) is crucial for generating results towards specific objectives. This research delves into the realm of drug optimization and introduces a novel reinforcement learning algorithm to finetune a drug…

Machine Learning · Computer Science 2025-02-12 Xuefeng Liu, Songhao Jiang, Siyu Chen, Zhuoran Yang, Yuxin Chen, Ian Foster, Rick Stevens

GenOM: Ontology Matching with Description Generation and Large Language Model

Ontology matching (OM) plays an essential role in enabling semantic interoperability and integration across heterogeneous knowledge sources, particularly in the biomedical domain which contains numerous complex concepts related to diseases…

Artificial Intelligence · Computer Science 2026-04-03 Yiping Song, Jiaoyan Chen, Renate A. Schmidt

Using Large Language Models to Enrich the Documentation of Datasets for Machine Learning

Recent regulatory initiatives like the European AI Act and relevant voices in the Machine Learning (ML) community stress the need to describe datasets along several key dimensions for trustworthy AI, such as the provenance processes and…

Digital Libraries · Computer Science 2024-05-27 Joan Giner-Miguelez, Abel Gómez, Jordi Cabot

Towards Ontology-Enhanced Representation Learning for Large Language Models

Taking advantage of the widespread use of ontologies to organise and harmonize knowledge across several distinct domains, this paper proposes a novel approach to improve an embedding-Large Language Model (embedding-LLM) of interest by…

Computation and Language · Computer Science 2024-06-03 Francesco Ronzano, Jay Nanavati

Evaluating Large Language Models for Structured Science Summarization in the Open Research Knowledge Graph

Structured science summaries or research contributions using properties or dimensions beyond traditional keywords enhance science findability. Current methods, such as those used by the Open Research Knowledge Graph (ORKG), involve…

Artificial Intelligence · Computer Science 2024-05-06 Vladyslav Nechakhin, Jennifer D'Souza, Steffen Eger

GP-GPT: Large Language Model for Gene-Phenotype Mapping

Pre-trained large language models (LLMs) have attracted increasing attention in biomedical domains due to their success in natural language processing. However, the complex traits and heterogeneity of multi-source genomics data pose…

Computation and Language · Computer Science 2025-11-25 Yanjun Lyu, Zihao Wu, Lu Zhang, Jing Zhang, Yiwei Li, Wei Ruan, Zhengliang Liu, Zeyu Zhang, Xiang Li, Rongjie Liu, Chao Huang, Wentao Li, Tianming Liu, Dajiang Zhu

MapperGPT: Large Language Models for Linking and Mapping Entities

Aligning terminological resources, including ontologies, controlled vocabularies, taxonomies, and value sets, is a critical part of data integration in many domains such as healthcare, chemistry, and biomedical research. Entity mapping is…

Computation and Language · Computer Science 2023-10-06 Nicolas Matentzoglu, J. Harry Caufield, Harshad B. Hegde, Justin T. Reese, Sierra Moxon, Hyeongsik Kim, Nomi L. Harris, Melissa A Haendel, Christopher J. Mungall

A Large Language Model Outperforms Other Computational Approaches to the High-Throughput Phenotyping of Physician Notes

High-throughput phenotyping, the automated mapping of patient signs and symptoms to standardized ontology concepts, is essential to gaining value from electronic health records (EHR) in the support of precision medicine. Despite…

Artificial Intelligence · Computer Science 2024-06-24 Syed I. Munzir, Daniel B. Hier, Chelsea Oommen, Michael D. Carrithers

Gene-based and semantic structure of the Gene Ontology as a complex network

The last decade has seen the advent and consolidation of ontology-based tools for the identification and biological interpretation of classes of genes, such as the Gene Ontology. The information accumulated over time and included in the…

Molecular Networks · Quantitative Biology 2021-08-25 Salvatore Miccichè

Using Large Language Models for OntoClean-based Ontology Refinement

This paper explores the integration of Large Language Models (LLMs) such as GPT-3.5 and GPT-4 into the ontology refinement process, specifically focusing on the OntoClean methodology. OntoClean, critical for assessing the metaphysical…

Artificial Intelligence · Computer Science 2024-03-26 Yihang Zhao, Neil Vetter, Kaveh Aryan

Ontology engineering with Large Language Models

We tackle the task of enriching ontologies by automatically translating natural language sentences into Description Logic. Since Large Language Models (LLMs) are the best tools for translations, we fine-tuned a GPT-3 model to convert…

Artificial Intelligence · Computer Science 2023-08-01 Patricia Mateiu, Adrian Groza

A Study of Generative Large Language Model for Medical Research and Healthcare

There are enormous enthusiasm and concerns about using large language models (LLMs) in healthcare, yet current assumptions are all based on general-purpose LLMs such as ChatGPT. This study develops a clinical generative LLM, GatorTronGPT, using…

Computation and Language · Computer Science 2023-11-20 Cheng Peng, Xi Yang, Aokun Chen, Kaleb E Smith, Nima PourNejatian, Anthony B Costa, Cheryl Martin, Mona G Flores, Ying Zhang, Tanja Magoc, Gloria Lipori, Duane A Mitchell, Naykky S Ospina, Mustafa M Ahmed, William R Hogan, Elizabeth A Shenkman, Yi Guo, Jiang Bian, Yonghui Wu

Generative vector search to improve pathology foundation models across multimodal vision-language tasks

Retrieval-augmented generation improves large language models by grounding outputs in external knowledge sources, reducing hallucinations and addressing knowledge cutoffs. However, standard embedding-based retrieval fails to capture the…

Information Retrieval · Computer Science 2025-12-23 Markus Ekvall, Ludvig Bergenstråhle, Patrick Truong, Ben Murrell, Joakim Lundeberg

Can Large Language Models Augment a Biomedical Ontology with missing Concepts and Relations?

Ontologies play a crucial role in organizing and representing knowledge. However, even current ontologies do not encompass all relevant concepts and relationships. Here, we explore the potential of large language models (LLM) to expand an…

Computation and Language · Computer Science 2023-11-14 Antonio Zaitoun, Tomer Sagi, Szymon Wilk, Mor Peleg

A Hybrid GA LLM Framework for Structured Task Optimization

GA LLM is a hybrid framework that combines Genetic Algorithms with Large Language Models to handle structured generation tasks under strict constraints. Each output, such as a plan or report, is treated as a gene, and evolutionary…

Computation and Language · Computer Science 2025-06-17 William Shum, Rachel Chan, Jonas Lin, Benny Feng, Patrick Lau

LLaMA-Gene: A General-purpose Gene Task Large Language Model Based on Instruction Fine-tuning

Building a general-purpose task model similar to ChatGPT has been an important research direction for gene large language models. Instruction fine-tuning is a key component in building ChatGPT, but existing instructions are primarily based…

Genomics · Quantitative Biology 2024-12-03 Wang Liang

A Simplified Retriever to Improve Accuracy of Phenotype Normalizations by Large Language Models

Large language models (LLMs) have shown improved accuracy in phenotype term normalization tasks when augmented with retrievers that suggest candidate normalizations based on term definitions. In this work, we introduce a simplified…

Computation and Language · Computer Science 2025-03-06 Daniel B. Hier, Thanh Son Do, Tayo Obafemi-Ajayi

Assessing the Utility of Large Language Models for Phenotype-Driven Gene Prioritization in Rare Genetic Disorder Diagnosis

Phenotype-driven gene prioritization is a critical process in the diagnosis of rare genetic disorders for identifying and ranking potential disease-causing genes based on observed physical traits or phenotypes. While traditional approaches…

Quantitative Methods · Quantitative Biology 2024-04-04 Junyoung Kim, Jingye Yang, Kai Wang, Chunhua Weng, Cong Liu