Digital Libraries - Scifaro

MathModDB: A Database for Mathematical Models

When researchers need a mathematical model for a research problem, they face a fragmented landscape: relevant formulas, quantities, assumptions, and model variants are scattered across publications and domain-specific conventions. The…

Digital Libraries · Computer Science 2026-06-26 Jochen Fiedler, Christine Biedinger, Marco Reidelbach, Björn Schembera, Burkhard Schmidt, Aurela Shehu, Thomas Koprucki

A&A community survey on the future of scientific publishing: Credibility over speed, fairness over profit, human judgment over automation

(Abridged) Scientific publishing is undergoing major change, driven by a shift toward open access (OA), the rise of artificial intelligence (AI), and growing demands for transparency, reproducibility, and equity. At the same time, rapid…

Digital Libraries · Computer Science 2026-06-25 João Alves, Arūnas Kučinskas, Charlotte Van Rooyen, Marc Audard, Pierre-Alain Duc, David Elbaz, Thierry Forveille, Laszlo L. Kiss, Tiago Pereira, Eva Villaver

Codex Mutabilis: Preserving The Reasons For Changes In Scientific Names

Digital preservation infrastructures often prioritize the stability of content and metadata. In taxonomy, species names are formed according to the Articles listed in the International Code of Zoological Nomenclature. The reasons for these…

Digital Libraries · Computer Science 2026-06-25 Richard Littauer, Jessamyn West

EconSimulacra: A Digital Twin Platform of Socio-Economic Systems Powered by LLM Agents

Real-world social behavior emerges from tightly coupled domains: economic conditions shape mobility and social interactions, while online attention and offline activity feed back into local popularity and consumer behavior. Capturing these…

Digital Libraries · Computer Science 2026-06-25 Ryuji Hashimoto, Masahiro Kaneko, Kentaro Ueda, Takehiro Takayanagi, Kiyoshi Izumi

Lacuna: A Research Map for Machine Learning

Lacuna is a research map for machine learning that uses LLMs to turn papers and scholarly metadata into markdown summaries, concept elements, research directions, and research proposals. Each item keeps links to the primary source records…

Digital Libraries · Computer Science 2026-06-24 Martin Weiss, Miles Q. Li, Alejandro H. Artiles, Yacine Mkhinini, Chris Pal, Hugo Larochelle, Nasim Rahaman

A General Pipeline for Digesting Scientific Literature into a Shared Scientific Knowledge Base

The published scientific literature is a rich, continuously growing record of measurements, correlations, and observations that modern AI tools can now make accessible in new ways. The Materials Explorer Pipeline digests collections of…

Digital Libraries · Computer Science 2026-06-11 Charles T. Black

The Biosecurity Blind Spot: Systematic Dual-use Detection in Open Science Infrastructure

AI is transforming life sciences research at unprecedented speed, accelerating discovery across protein structure prediction, genome modeling, and drug development (Jumper et al., 2021; Mak et al., 2024). Yet this rapid advancement, coupled…

Digital Libraries · Computer Science 2026-05-29 Vasudha Sharma, Chakresh Kumar Singh, Jayesh Choudhari, Dharmit Nakrani

Co-creation of AI technology, empowering curators of cultural heritage information and guarding research commons

The substance of this paper is the description of the use of Retrieval-Augmented Generation (RAG) for specific digital collections of cultural assets. The collections are provided by institutions operating in the cultural sector. The…

Digital Libraries · Computer Science 2026-05-29 Andrea Scharnhorst, Han Yang, Jetze Touber, Kim Ferguson, Philipp Mayr, Vyacheslav Tykhonov

Verified Misguidance: Measuring Structural Citation Failures in Search-Augmented LLMs

Users of search-augmented LLMs rely on citations as evidence that responses are grounded in real sources, and rarely verify the cited pages themselves. Millions of queries per day now pass through these systems, making citation quality a…

Digital Libraries · Computer Science 2026-05-28 Yongsik Seo, Wooseok Jeong, Eunyoung Kim, Hyeonseo Jang, Dongha Lee

CiteCheck: Retrieval-Grounded Detection of LLM Citation Hallucinations in Scientific Text

Large language models (LLMs) are increasingly used to generate scientific reports, but they can produce references that appear plausible while containing corrupted metadata or pointing to papers that do not exist. We introduce CiteCheck, a…

Digital Libraries · Computer Science 2026-05-28 Khashayar Khajavi, Shaghayegh Sadeghi, Rise Adhikari, Alexander Tessier

sciwrite-lint: Verification Infrastructure for the Age of Science Vibe-Writing

Scientific papers make claims about prior work backed by citations. Verifying those citations at scale (that each cited paper exists, says what the citation claims, and is itself reliable) is structurally beyond what human review can…

Digital Libraries · Computer Science 2026-05-26 Sergey V Samsonau

BookReconciler: An Open-Source Tool for Metadata Enrichment and Work-Level Clustering

We present BookReconciler, an open-source tool for enhancing and clustering book data. BookReconciler allows users to take spreadsheets with minimal metadata, such as book title and author, and automatically 1) add authoritative, persistent…

Digital Libraries · Computer Science 2026-05-26 Matt Miller, Dan Sinykin, Melanie Walsh

HALvest-Contrastive: Retrieval-Like Authorship Attribution with Patch-Level Late Interaction

Authorship attribution asks whether two pieces of text share a writer, but topical confound makes the task deceptively easy: two authors covering the same topic may look more alike than one author covering two topics. Scholarly prose offers…

Digital Libraries · Computer Science 2026-05-26 Francis Kulumba, Wissam Antoun, Guillaume Vimont, Laurent Romary, Florian Cafiero

Tracking a Decade of Research at the University of Nigeria, Nsukka: A Scientometric Analysis (2014-2023)

This study employs scientometric methods to assess the research output and performance of the University of Nigeria from 2014 to 2023. By analyzing publication trends, citation patterns, and collaboration networks, the research aims to…

Digital Libraries · Computer Science 2026-05-25 Muneer Ahmad, Joseph U Igligli

Thinking like a business: Reconfiguring relationships to sustain open data infrastructures

Sustaining open data infrastructures over time is a complex puzzle, involving dynamic funding models and relationships with customers, collaborators, and competitors. Despite their importance, these mechanisms are often hidden from view,…

Digital Libraries · Computer Science 2026-05-25 Kathleen Gregory, Dorothea Strecker

The Ephemeral Web and the Case for Proactive Archiving

The web is often treated as a durable record of institutional and social life, yet in practice it is fragile, revisable, and frequently ephemeral. Domains change, redesigns erase earlier material, institutions relocate, maintainers…

Digital Libraries · Computer Science 2026-05-22 Meliksah Yorulmazlar

The Curious Case of Max Planck retracted papers. When past scientific practices meet contemporary publishing norms

This article examines the case of two papers published in Naturwissenschaften by the physicist Max Planck that were retrospectively marked as retracted on the Springer digital platform. Rather than originating in scientific fraud, these…

Digital Libraries · Computer Science 2026-05-22 Yves Gingras, Mahdi Khelfaoui

General Science Ranking (GSR): An Open-Source, Citation-Normalized Journal and Conference Classification System for Computer Science and Medicine

The academic journal zoning system is central to evaluating research talent, funding, and institutions. The CAS journal partition system, one of East Asia's most widely used tools, will cease operation in March 2026, creating a policy gap.…

Digital Libraries · Computer Science 2026-05-21 Zhikai Yu

Transcription and Recognition of Italian Parliamentary Speeches Using Vision-Language Models

Parliamentary proceedings represent a rich yet challenging resource for computational analysis, particularly when preserved only as scanned historical documents. Existing efforts to transcribe Italian parliamentary speeches have relied on…

Digital Libraries · Computer Science 2026-05-21 Luigi Curini, Alfio Ferrara, Giovanni Pagano, Sergio Picascia

One in Eight OpenAlex Abstracts Has Integrity Issues

Scientific abstracts are increasingly used as primary data in computational metascience research, yet the quality of these abstracts in widely used bibliographic databases has not been systematically examined. We assess the integrity of…

Digital Libraries · Computer Science 2026-05-20 Seorin Kim, Vincent Holst, Vincent Ginis