
Related papers: Omics Data Discovery Agents

200 papers

Tools to explore scientific literature are essential for scientists, especially in biomedicine, where about a million new papers are published every year. Many such tools provide users the ability to search for specific entities (e.g.…

Computation and Language · Computer Science 2021-07-05 Sunil Mohan , Rico Angell , Nick Monath , Andrew McCallum

As the availability of omics data has increased in the last few years, more multi-omics data have been generated, that is, high-dimensional molecular data consisting of several types such as genomic, transcriptomic, or proteomic data, all…

Genomics · Quantitative Biology 2023-02-09 Roman Hornung , Frederik Ludwigs , Jonas Hagenberg , Anne-Laure Boulesteix

The potential benefits of applying machine learning methods to -omics data are becoming increasingly apparent, especially in clinical settings. However, the unique characteristics of these data are not always well suited to machine learning…

The development of vision-language models (VLMs) is driven by large-scale and diverse multimodal datasets. However, progress toward generalist biomedical VLMs is limited by the lack of annotated, publicly accessible datasets across biology…

Purpose: This paper introduces the concept of "Agentic Publication," a novel LLM-driven framework designed to complement traditional scientific publishing by transforming papers into interactive knowledge systems that address challenges…

Artificial Intelligence · Computer Science 2026-05-06 Roberto Pugliese , George Kourousias , Francesco Venier , Grazia Garlatti Costa

Developing large-scale foundational datasets is a critical milestone in advancing artificial intelligence (AI)-driven scientific innovation. However, unlike AI-mature fields such as natural language processing, materials science,…

Chemical Physics · Physics 2025-11-18 Ryo Yoshida , Yoshihiro Hayashi , Hidemine Furuya , Ryohei Hosoya , Kazuyoshi Kaneko , Hiroki Sugisawa , Yu Kaneko , Aiko Takahashi , Yoh Noguchi , Shun Nanjo , Keiko Shinoda , Tomu Hamakawa , Mitsuru Ohno , Takuya Kitamura , Misaki Yonekawa , Stephen Wu , Masato Ohnishi , Chang Liu , Teruki Tsurimoto , Arifin , Araki Wakiuchi , Kohei Noda , Junko Morikawa , Teruaki Hayakawa , Junichiro Shiomi , Masanobu Naito , Kazuya Shiratori , Tomoki Nagai , Norio Tomotsu , Hiroto Inoue , Ryuichi Sakashita , Masashi Ishii , Isao Kuwajima , Kenji Furuichi , Norihiko Hiroi , Yuki Takemoto , Takahiro Ohkuma , Keita Yamamoto , Naoya Kowatari , Masato Suzuki , Naoya Matsumoto , Seiryu Umetani , Hisaki Ikebata , Yasuyuki Shudo , Mayu Nagao , Shinya Kamada , Kazunori Kamio , Taichi Shomura , Kensaku Nakamura , Yudai Iwamizu , Atsutoshi Abe , Koki Yoshitomi , Yuki Horie , Katsuhiko Koike , Koichi Iwakabe , Shinya Gima , Kota Usui , Gikyo Usuki , Takuro Tsutsumi , Keitaro Matsuoka , Kazuki Sada , Masahiro Kitabata , Takuma Kikutsuji , Akitaka Kamauchi , Yusuke Iijima , Tsubasa Suzuki , Takenori Goda , Yuki Takabayashi , Kazuko Imai , Yuji Mochizuki , Hideo Doi , Koji Okuwaki , Hiroya Nitta , Taku Ozawa , Hitoshi Kamijima , Toshiaki Shintani , Takuma Mitamura , Massimiliano Zamengo , Yuitsu Sugami , Seiji Akiyama , Yoshinari Murakami , Atsushi Betto , Naoya Matsuo , Satoru Kagao , Tetsuya Kobayashi , Norie Matsubara , Shosei Kubo , Yuki Ishiyama , Yuri Ichioka , Mamoru Usami , Satoru Yoshizaki , Seigo Mizutani , Yosuke Hanawa , Shogo Kunieda , Mitsuru Yambe , Takeru Nakamura , Hiromori Murashima , Kenji Takahashi , Naoki Wada , Masahiro Kawano , Yosuke Harada , Takehiro Fujita , Erina Fujita , Ryoji Himeno , Hiori Kino , Kenji Fukumizu

This work presents an omics-driven modeling pipeline that integrates machine-learning tools to facilitate the dynamic modeling of multiscale biological systems. Random forests and permutation feature importance are proposed to mine omics…

Quantitative Methods · Quantitative Biology 2025-01-17 Sebastián Espinel-Ríos , José Montaño López , José L. Avalos

Large language models (LLMs) have grown in their usage to provide support for question answering across numerous disciplines. The models on their own have already shown promise for answering basic questions, however fail quickly where…

Information Retrieval · Computer Science 2025-04-15 David Brett , Anniek Myatt

Health informatics research is characterized by diverse data modalities, rapid knowledge expansion, and the need to integrate insights across biomedical science, data analytics, and clinical practice. These characteristics make it…

Artificial Intelligence · Computer Science 2025-09-24 Yuxiao Cheng , Jinli Suo

Corpus distillation for biomedical large language models (LLMs) seeks to address the pressing challenge of insufficient quantity and quality in open-source annotated scientific corpora, which remains a bottleneck for effective LLM training…

Computation and Language · Computer Science 2025-12-19 Meng Xiao , Xunxin Cai , Qingqing Long , Chengrui Wang , Yuanchun Zhou , Hengshu Zhu

With the exponential increase in online scientific literature, identifying reliable domain-specific data has become increasingly important but also very challenging. Manual data collection and filtering for domain-specific scientific…

Information Retrieval · Computer Science 2026-03-10 Nikita Gautam , Doina Caragea , Ignacio Ciampitti , Federico Gomez

Reproducibility of computational results remains a challenge in materials science, as simulation workflows and parameters are often reported only in unstructured text and tables. While literature data are valuable for validation and reuse,…

Pre-trained language models (PLMs) have proven to be effective for the document re-ranking task. However, they lack the ability to fully interpret the semantics of biomedical and health-care queries and often rely on simplistic patterns for…

Computation and Language · Computer Science 2023-05-09 Deepak Gupta , Dina Demner-Fushman

Biomedical research results are being published at a high rate, and with existing search engines, the vast amount of published work is usually easily accessible. However, reproducing published results, either experimental data or…

Molecular Networks · Quantitative Biology 2017-06-19 Kai-Wen Liang , Qinsi Wang , Cheryl Telmer , Divyaa Ravichandran , Peter Spirtes , Natasa Miskov-Zivanov

Interpreting transcriptomic data is one of the most common analytical tasks in modern biology. Yet most current models either consume expression profiles without producing natural-language biological explanations, or reason in language…

Materials science workflows rely on structured and unstructured data from the vast body of available scientific literature. However, most of the experimental details remain buried in text, tables, graphs and figures. Thus, constructing…

Computation and Language · Computer Science 2026-05-07 Achuth Chandrasekhar , Omid Barati Farimani , Radheesh Sharma Meda , Amir Barati Farimani

Motivation: The size of available omics datasets is steadily increasing with technological advancement in recent years. While this increase in sample size can be used to improve the performance of relevant prediction tasks in healthcare,…

Quantitative Methods · Quantitative Biology 2023-05-04 Jonas C. Ditz , Bernhard Reuter , Nico Pfeifer

This study applies Large Language Models (LLMs) to two foundational Electronic Health Record (EHR) data science tasks: structured data querying (using programmatic languages, Python/Pandas) and information extraction from unstructured…

Computation and Language · Computer Science 2026-01-29 Juan Jose Rubio Jan , Jack Wu , Julia Ive

The substantial data volumes encountered in modern particle physics and other domains of fundamental physics research allow (and require) the use of increasingly complex data analysis tools and workflows. While the use of machine learning…

High Energy Physics - Phenomenology · Physics 2026-02-18 Sascha Diefenbacher , Anna Hallin , Gregor Kasieczka , Michael Krämer , Anne Lauscher , Tim Lukas

Large Language Models (LLMs) have emerged as powerful tools for accelerating scientific discovery, yet their static knowledge and hallucination issues hinder autonomous research applications. Recent advances integrate LLMs into agentic…

Artificial Intelligence · Computer Science 2025-12-23 Zeyu Xia , Jinzhe Ma , Congjie Zheng , Shufei Zhang , Yuqiang Li , Hang Su , P. Hu , Changshui Zhang , Xingao Gong , Wanli Ouyang , Lei Bai , Dongzhan Zhou , Mao Su