Related papers: LinkML: An Open Data Modeling Framework

200 papers

Linking information across sources is fundamental to a variety of analyses in social science, business, and government. While large language models (LLMs) offer enormous promise for improving record linkage in noisy datasets, in many…

Computation and Language · Computer Science 2024-06-26 Abhishek Arora , Melissa Dell

With the exponential increase in online scientific literature, identifying reliable domain-specific data has become increasingly important but also very challenging. Manual data collection and filtering for domain-specific scientific…

Information Retrieval · Computer Science 2026-03-10 Nikita Gautam , Doina Caragea , Ignacio Ciampitti , Federico Gomez

Achieving semantic interoperability across heterogeneous experimental data systems remains a major barrier to data-driven scientific discovery. The Analytical Information Markup Language (AnIML), a flexible XML-based standard for analytical…

Structured data offers a sophisticated mechanism for the organization of information. Existing methodologies for the text-serialization of structured data in the context of large language models fail to adequately address the heterogeneity…

Computation and Language · Computer Science 2024-02-20 YiQiu Guo , Yuchen Yang , Ya Zhang , Yu Wang , Yanfeng Wang

Multimodal Entity Linking (MEL) is a crucial task that aims at linking ambiguous mentions within multimodal contexts to the referent entities in a multimodal knowledge base, such as Wikipedia. Existing methods focus heavily on using complex…

Artificial Intelligence · Computer Science 2024-08-22 Liu Qi , He Yongyi , Lian Defu , Zheng Zhi , Xu Tong , Liu Che , Chen Enhong

Scientific Large Language Models (Sci-LLMs) are transforming how knowledge is represented, integrated, and applied in scientific research, yet their progress is shaped by the complex nature of scientific data. This survey presents a…

Computation and Language · Computer Science 2025-10-21 Ming Hu , Chenglong Ma , Wei Li , Wanghan Xu , Jiamin Wu , Jucheng Hu , Tianbin Li , Guohang Zhuang , Jiaqi Liu , Yingzhou Lu , Ying Chen , Chaoyang Zhang , Cheng Tan , Jie Ying , Guocheng Wu , Shujian Gao , Pengcheng Chen , Jiashi Lin , Haitao Wu , Lulu Chen , Fengxiang Wang , Yuanyuan Zhang , Xiangyu Zhao , Feilong Tang , Encheng Su , Junzhi Ning , Xinyao Liu , Ye Du , Changkai Ji , Pengfei Jiang , Cheng Tang , Ziyan Huang , Jiyao Liu , Jiaqi Wei , Yuejin Yang , Xiang Zhang , Guangshuai Wang , Yue Yang , Huihui Xu , Ziyang Chen , Yizhou Wang , Chen Tang , Jianyu Wu , Yuchen Ren , Siyuan Yan , Zhonghua Wang , Zhongxing Xu , Shiyan Su , Shangquan Sun , Runkai Zhao , Zhisheng Zhang , Dingkang Yang , Jinjie Wei , Jiaqi Wang , Jiahao Xu , Jiangtao Yan , Wenhao Tang , Hongze Zhu , Yu Liu , Fudi Wang , Yiqing Shen , Yuanfeng Ji , Yanzhou Su , Tong Xie , Hongming Shan , Chun-Mei Feng , Zhi Hou , Diping Song , Lihao Liu , Yanyan Huang , Lequan Yu , Bin Fu , Shujun Wang , Xiaomeng Li , Xiaowei Hu , Yun Gu , Ben Fei , Benyou Wang , Yuewen Cao , Minjie Shen , Jie Xu , Haodong Duan , Fang Yan , Hongxia Hao , Jielan Li , Jiajun Du , Yanbo Wang , Imran Razzak , Zhongying Deng , Chi Zhang , Lijun Wu , Conghui He , Zhaohui Lu , Jinhai Huang , Wenqi Shao , Yihao Liu , Siqi Luo , Yi Xin , Xiaohong Liu , Fenghua Ling , Yuqiang Li , Aoran Wang , Siqi Sun , Qihao Zheng , Nanqing Dong , Tianfan Fu , Dongzhan Zhou , Yan Lu , Wenlong Zhang , Jin Ye , Jianfei Cai , Yirong Chen , Wanli Ouyang , Yu Qiao , Zongyuan Ge , Shixiang Tang , Junjun He , Chunfeng Song , Lei Bai , Bowen Zhou

Many sciences have made significant breakthroughs by adopting online tools that help organize, structure and mine information that is too detailed to be printed in journals. In this paper, we introduce OpenML, a place for machine learning…

Machine Learning · Computer Science 2014-08-04 Joaquin Vanschoren , Jan N. van Rijn , Bernd Bischl , Luis Torgo

Large Language Models (LLMs) have shown remarkable proficiency in natural language understanding (NLU), opening doors for innovative applications. We introduce StreamLink - an LLM-driven distributed data system designed to improve the…

Databases · Computer Science 2025-05-29 Dawei Feng , Di Mei , Huiri Tan , Lei Ren , Xianying Lou , Zhangxi Tan

Background: Qualitative frameworks, especially those based on the logical discrete formalism, are increasingly used to model regulatory and signalling networks. A major advantage of these frameworks is that they do not require precise…

High-quality datasets are typically required for accomplishing data-driven tasks, such as training medical diagnosis models, predicting real-time traffic conditions, or conducting experiments to validate research hypotheses. Consequently,…

Information Retrieval · Computer Science 2025-09-03 Pengyue Li , Sheng Wang , Hua Dai , Zhiyu Chen , Zhifeng Bao , Brian D. Davison

Identifying disease interconnections through manual analysis of large-scale clinical data is labor-intensive, subjective, and prone to expert disagreement. While machine learning (ML) shows promise, three critical challenges remain: (1)…

While the biomedical community has published several "open data" sources in the last decade, most researchers still endure severe logistical and technical challenges to discover, query, and integrate heterogeneous data and knowledge from…

Artificial Intelligence · Computer Science 2020-06-09 Maulik R. Kamdar , Mark A. Musen

Multiple logic-based reconstructions of conceptual data modelling languages such as EER, UML Class Diagrams, and ORM exist. They mainly cover various fragments of the languages and none are formalised such that the logic applies…

Artificial Intelligence · Computer Science 2019-09-20 Pablo Rubén Fillottrani , C. Maria Keet

Data Linkage is an important step that can provide valuable insights for evidence-based decision making, especially for crucial events. Performing sensible queries across heterogeneous databases containing millions of records is a complex…

Databases · Computer Science 2015-10-09 Mohammed Gollapalli

Although Large Language Models (LLMs) demonstrate remarkable ability in processing and generating human-like text, they do have limitations when it comes to comprehending and expressing world knowledge that extends beyond the boundaries of…

Computation and Language · Computer Science 2024-02-20 Fangzhi Xu , Zhiyong Wu , Qiushi Sun , Siyu Ren , Fei Yuan , Shuai Yuan , Qika Lin , Yu Qiao , Jun Liu

The drug development process necessitates that pharmacologists undertake various tasks, such as reviewing literature, formulating hypotheses, designing experiments, and interpreting results. Each stage requires accessing and querying vast…

Computation and Language · Computer Science 2023-08-01 Hong Lu , Chuan Li , Yinheng Li , Jie Zhao

Although diffusion language models (DLMs) are evolving quickly, many recent models converge on a set of shared components. These components, however, are distributed across ad-hoc research codebases or lack transparent implementations,…

Computation and Language · Computer Science 2026-02-27 Zhanhui Zhou , Lingjie Chen , Hanghang Tong , Dawn Song

The relational data model offers unrivaled rigor and precision in defining data structure and querying complex data. Yet the use of relational databases in scientific data pipelines is limited due to their perceived unwieldiness. We propose…

Databases · Computer Science 2018-07-31 Dimitri Yatsenko , Edgar Y. Walker , Andreas S. Tolias

Large language models (LLMs) are becoming attractive as few-shot reasoners to solve Natural Language (NL)-related tasks. However, the understanding of their capability to process structured data like tables remains an under-explored area.…

Computation and Language · Computer Science 2024-07-18 Yuan Sui , Mengyu Zhou , Mingjie Zhou , Shi Han , Dongmei Zhang