Related papers: Multimodal Medical Code Tokenizer

200 papers

As two important textual modalities in electronic health records (EHR), both structured data (clinical codes) and unstructured data (clinical narratives) have recently been increasingly applied to the healthcare domain. Most existing…

Computation and Language · Computer Science 2022-11-01 Sicen Liu, Xiaolong Wang, Yongshuai Hou, Ge Li, Hui Wang, Hui Xu, Yang Xiang, Buzhou Tang

In electronic health record (EHR) mining, learning high-quality representations of medical concepts (e.g., standardized diagnosis, medication, and procedure codes) is fundamental for downstream clinical prediction. However, robust concept…

Machine Learning · Computer Science 2026-05-05 Mohsen Nayebi Kerdabadi, Arya Hadizadeh Moghaddam, Chen Chen, Dongjie Wang, Zijun Yao

Foundation models have emerged as a powerful approach for processing electronic health records (EHRs), offering flexibility to handle diverse medical data modalities. In this study, we present a comprehensive benchmark that evaluates the…

Machine Learning · Computer Science 2025-07-22 Kunyu Yu, Rui Yang, Jingchi Liao, Siqi Li, Huitao Li, Irene Li, Yifan Peng, Rishikesan Kamaleswaran, Nan Liu

Autoregressive modeling has driven major advances in multimodal AI, yet its application to medical imaging remains constrained by the absence of a unified image tokenizer that simultaneously preserves fine-grained anatomical structures and…

Image and Video Processing · Electrical Eng. & Systems 2026-04-02 Chenglong Ma, Yuanfeng Ji, Jin Ye, Zilong Li, Chenhui Wang, Junzhi Ning, Wei Li, Lihao Liu, Qiushan Guo, Tianbin Li, Junjun He, Hongming Shan

As the volume of Electronic Health Records (EHR) sharply grows, there has been emerging interest in learning the representation of EHR for healthcare applications. Representation learning of EHR requires appropriate modeling of the two…

Computation and Language · Computer Science 2022-03-21 Sungjin Park, Seongsu Bae, Jiho Kim, Tackeun Kim, Edward Choi

Electronic health records (EHRs) are multimodal by nature, consisting of structured tabular features like lab tests and unstructured clinical notes. In real-life clinical practice, doctors use complementary multimodal EHR data sources to…

Computation and Language · Computer Science 2024-07-18 Thao Minh Nguyen Phan, Cong-Tinh Dao, Chenwei Wu, Jian-Zhe Wang, Shun Liu, Jun-En Ding, David Restrepo, Feng Liu, Fang-Ming Hung, Wen-Chih Peng

Large Language Models (LLMs) have demonstrated remarkable performance across various domains, including healthcare. However, their ability to effectively represent structured non-textual data, such as the alphanumeric medical codes used in…

Electronic health record (EHR) foundation models have been an area ripe for exploration with their improved performance in various medical tasks. Despite the rapid advances, there exists a fundamental limitation: Processing unseen medical…

Artificial Intelligence · Computer Science 2025-08-15 Junmo Kim, Namkyeong Lee, Jiwon Kim, Kwangsoo Kim

Most existing medication recommendation models learn representations for medical concepts based on electronic health records (EHRs) and make recommendations with learnt representations. However, most medications appear in the dataset for…

Machine Learning · Computer Science 2024-02-16 Weicong Tan, Weiqing Wang, Xin Zhou, Wray Buntine, Gordon Bingham, Hongzhi Yin

The breadth, scale, and temporal granularity of modern electronic health records (EHR) systems offer great potential for estimating personalized and contextual patient health trajectories using sequential deep learning. However, learning…

Addressing the challenge of multimodal data fusion in high-dimensional biomedical informatics, we propose MMCTOP, a MultiModal Clinical-Trial Outcome Prediction framework that integrates heterogeneous biomedical signals spanning (i)…

Machine Learning · Computer Science 2025-12-29 Carolina Aparício, Qi Shi, Bo Wen, Tesfaye Yadete, Qiwei Han

Multimodal representation learning has demonstrated remarkable potential in enabling models to process and integrate diverse data modalities, such as text and images, for improved understanding and performance. While the medical domain can…

Computer Vision and Pattern Recognition · Computer Science 2025-03-04 Shuvendu Roy, Franklin Ogidi, Ali Etemad, Elham Dolatabadi, Arash Afkanpour

The inherent multimodality and heterogeneous temporal structures of medical data pose significant challenges for modeling. We propose MedM2T, a time-aware multimodal framework designed to address these complexities. MedM2T integrates: (i)…

Machine Learning · Computer Science 2026-03-26 Yu-Chen Kuo, Yi-Ju Tseng

Foundation models for structured electronic health records (EHRs) are pretrained on longitudinal sequences of timestamped clinical events to learn adaptable patient representations. Tokenization -- how these timelines are converted into…

Large-scale EHR prediction across institutions is hindered by substantial heterogeneity in schemas and code systems. Although Common Data Models (CDMs) can standardize records for multi-institutional learning, the manual harmonization and…

Computation and Language · Computer Science 2026-04-02 Kyunghoon Hur, Heeyoung Kwak, Jinsu Jang, Nakhwan Kim, Edward Choi

Medical deep learning models depend heavily on domain-specific knowledge to perform well on knowledge-intensive clinical tasks. Prior work has primarily leveraged unimodal knowledge graphs, such as the Unified Medical Language System…

Artificial Intelligence · Computer Science 2025-05-26 Xiaochen Wang, Yuan Zhong, Lingwei Zhang, Lisong Dai, Ting Wang, Fenglong Ma

The advent of large language models (LLMs) has opened new avenues for analyzing complex, unstructured data, particularly within the medical domain. Electronic Health Records (EHRs) contain a wealth of information in various formats,…

Information Retrieval · Computer Science 2025-06-10 Wu Hao Ran, Xi Xi, Furong Li, Jingyi Lu, Jian Jiang, Hui Huang, Yuzhuan Zhang, Shi Li

One of the cardinal tasks in achieving robust medical question answering systems is textual entailment. The existing approaches make use of an ensemble of pre-trained language models or data augmentation, often to clock higher numbers on…

Computation and Language · Computer Science 2020-11-11 Shweta Yadav, Vishal Pallagani, Amit Sheth

Distributed representations of medical concepts have been used to support downstream clinical tasks recently. Electronic Health Records (EHR) capture different aspects of patients' hospital encounters and serve as a rich source for…

Computation and Language · Computer Science 2020-01-07 Shaika Chowdhury, Chenwei Zhang, Philip S. Yu, Yuan Luo

This study introduces a novel knowledge enhanced tokenisation mechanism, K-Tokeniser, for clinical text processing. Technically, at initialisation stage, K-Tokeniser populates global representations of tokens based on semantic types of…

Computation and Language · Computer Science 2024-06-21 Abul Hasan, Jinge Wu, Quang Ngoc Nguyen, Salomé Andres, Imane Guellil, Huayu Zhang, Arlene Casey, Beatrice Alex, Bruce Guthrie, Honghan Wu