Digital Libraries
Scaling laws describe how language model capabilities grow with compute and data, but say nothing about how long a model matters once released. We provide the first large-scale empirical account of how scientists adopt and abandon language…
Grant recommendation systems remain one of the least explored areas within academic recommender systems, and existing proposals are typically tied to specific funding agencies or disciplinary domains. This paper presents an…
Scientific publishing systematically filters out negative results. We argue that this long-standing asymmetry has become an urgent problem in the era of large language models, which inherit the positive bias of the literature they are…
User models for recommender systems (RecSys) typically assume stable preferences, similarity-based relevance, and session-bounded interactions -- assumptions derived from high-volume consumer contexts. This paper investigates these…
Scientific research is a key input into technological innovation, yet not all scientific knowledge is equally mobilized in patents. This paper examines how different scientific publishing models shape both the selection of scientific…
Research on the editorial boards of scholarly journals has predominantly relied on static, cross-sectional data, focusing on their composition or interlocking editorships at single points in time. To address this gap, a formal stock-flow…
The accelerating pace of scientific publishing makes it increasingly difficult for researchers to stay current. We present Paper Espresso, an open-source platform that automatically discovers, summarizes, and analyzes trending arXiv papers.…
Disambiguating scholars with identical names is essential for accurate authorship assignment and robust large-scale scientometric research. Existing methods are often designed for Latin-script metadata and perform poorly on Chinese names.…
The global landscape of art-technology institutions, including festivals, biennials, research labs, conferences, and hybrid organizations, has grown increasingly diverse, yet systematic frameworks for analyzing their multidimensional…
Large language models with web search are increasingly used in scientific publishing agents, yet they still produce BibTeX entries with pervasive field-level errors. Prior evaluations tested base models without search, which does not…
Purpose: This study compares the hierarchical structure of scientific teams across countries and investigates factors associated with the observed cross-national differences. Design/methodology/approach: Drawing on 150,817 publications with…
Academic policy engagement, the structured processes through which researchers contribute evidence and expertise to public decision-making, is shaped not only by research quality but by the accessibility of engagement opportunities. In…
Traditional Online Public Access Catalogues (OPACs) are becoming less effective due to the rapid growth of scholarly literature. Conventional search methods, such as keyword indexing and Boolean queries, often fail to support efficient…
Search engines and information platforms are increasingly scrutinized for their role in spreading misinformation. Traditional responses often focus on detecting falsehoods or verifying the ultimate validity of claims. This paper argues that…
This paper presents a specialized methodology for digitizing and segmenting mathematical documents from zbMATH Open, a comprehensive database of mathematical literature, to enhance machine processing capabilities. Currently, approximately…
The IMU-ICIAM working group's new report on Fraudulent Publishing in the Mathematical Sciences documents how gaming of bibliometrics, predatory outlets and paper-mill activity are eroding trust in research, mathematics included. This short…
The original European ESPRIT ProCoS I and II projects on Provably Correct Systems took place around a quarter of a century ago. Since then the legacy of the initiative has spawned many researchers with careers in formal methods. One of the…
Systematic reviews provide comprehensive syntheses of research fields. As a result, systematic reviews often emphasize synthesizing across large bodies of literature rather than just describing the studies from which the conclusions…
The digitisation of historical documents has traditionally been conceived as a process limited to character-level transcription, producing flat text that lacks the structural and semantic information necessary for substantive computational…
Most existing approaches to AI in pharmacy collapse three epistemologically distinct operations into a single technical layer: document preservation, semantic interpretation, and contextual presentation. This conflation is a root cause of…