Digital Libraries

The OAI Data-Provider Registration and Validation Service

I present a summary of recent use of the Open Archives Initiative (OAI) registration and validation services for data-providers. The registration service has seen a steady stream of registrations since its launch in 2002, and there are now…

Digital Libraries · Computer Science 2007-05-23 Simeon Warner

The Convergence of Digital-Libraries and the Peer-Review Process

Pre-print repositories have seen a significant increase in use over the past fifteen years across multiple research domains. Researchers are beginning to develop applications capable of using these repositories to assist the scientific…

Digital Libraries · Computer Science 2007-05-23 Marko A. Rodriguez, Johan Bollen, Herbert Van de Sompel

mod_oai: An Apache Module for Metadata Harvesting

We describe mod_oai, an Apache 2.0 module that implements the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). OAI-PMH is the de facto standard for metadata exchange in digital libraries and allows repositories to expose…

Digital Libraries · Computer Science 2007-05-23 Michael L. Nelson, Herbert Van de Sompel, Xiaoming Liu, Terry L. Harrison, Nathan McFarland

File-based storage of Digital Objects and constituent datastreams: XMLtapes and Internet Archive ARC files

This paper introduces the write-once/read-many XMLtape/ARC storage approach for Digital Objects and their constituent datastreams. The approach combines two interconnected file-based storage mechanisms that are made accessible in a…

Digital Libraries · Computer Science 2007-05-23 Xiaoming Liu, Lyudmila Balakireva, Patrick Hochstenbach, Herbert Van de Sompel

Toward alternative metrics of journal impact: A comparison of download and citation data

We generated networks of journal relationships from citation and download data, and determined journal impact rankings from these networks using a set of social network centrality metrics. The resulting journal impact rankings were compared…

Digital Libraries · Computer Science 2007-05-23 Johan Bollen, Herbert Van de Sompel, Joan Smith, Rick Luce

aDORe: a modular, standards-based Digital Object Repository

This paper describes the aDORe repository architecture, designed and implemented for ingesting, storing, and accessing a vast collection of Digital Objects at the Research Library of the Los Alamos National Laboratory. The aDORe…

Digital Libraries · Computer Science 2007-05-23 Herbert Van de Sompel, Jeroen Bekaert, Xiaoming Liu, Luda Balakireva, Thorsten Schwander

Orchestrating Metadata Enhancement Services: Introducing Lenny

Harvested metadata often suffers from uneven quality to the point that utility is compromised. Although some aggregators have developed methods for evaluating and repairing specific metadata problems, it has been unclear how these methods…

Digital Libraries · Computer Science 2007-05-23 Jon Phipps, Diane I. Hillmann, Gordon Paynter

An Information Network Overlay Architecture for the NSDL

We describe the underlying data model and implementation of a new architecture for the National Science Digital Library (NSDL) by the Core Integration Team (CI). The architecture is based on the notion of an information network overlay.…

Digital Libraries · Computer Science 2007-05-23 Carl Lagoze, Dean B. Krafft, Susan Jesuroga, Tim Cornwell, Ellen J. Cramer, Eddie Shin

Clustering SPIRES with EqRank

SPIRES is the largest database of scientific papers in the subject field of high energy and nuclear physics. It contains information on the citation graph of more than half a million papers (vertices of the citation graph). We outline…

Digital Libraries · Computer Science 2007-05-23 G. B. Pivovarov, S. E. Trunov

Fedora: An Architecture for Complex Objects and their Relationships

The Fedora architecture is an extensible framework for the storage, management, and dissemination of complex objects and the relationships among them. Fedora accommodates the aggregation of local and distributed content into digital objects…

Digital Libraries · Computer Science 2007-05-23 Carl Lagoze, Sandy Payette, Edwin Shin, Chris Wilper

A Link Clustering Based Approach for Clustering Categorical Data

Categorical data clustering (CDC) and link clustering (LC) have been considered as separate research and application areas. The main focus of this paper is to investigate the commonalities between these two problems and the uses of these…

Digital Libraries · Computer Science 2007-05-23 Zengyou He, Xiaofei Xu, Shengchun Deng

EURYDICE: A platform for unified access to documents

In this paper we present Eurydice, a platform dedicated to providing a unified gateway to documents. Its basic document-collection functionality has been designed based on long experience in the management of scientific…

Digital Libraries · Computer Science 2007-05-23 Serge Rouveyrol, Yves Chiaramella, Francesca Leinardi, Joanna Janik, Bruno Marmol, Carole Silvy, Catherine Allauzun

Trustworthy 100-Year Digital Objects: Durable Encoding for When It's Too Late to Ask

How can an author store digital information so that it will be reliably useful, even years later when he is no longer available to answer questions? Methods that might work are not good enough; what is preserved today should be reliably…

Digital Libraries · Computer Science 2007-05-23 H. M. Gladney, R. A. Lorie

Principles for Digital Preservation

The immense investments in creating and disseminating digitally represented information have not been accompanied by commensurate effort to ensure the longevity of information of permanent interest. Asserted difficulties with long-term…

Digital Libraries · Computer Science 2007-05-23 H. M. Gladney

Notes On The Design Of An Internet Adversary

The design of the defenses that Internet systems can deploy against attack, especially adaptive and resilient defenses, must start from a realistic model of the threat. This requires an assessment of the capabilities of the adversary. The design…

Digital Libraries · Computer Science 2007-05-23 David S. H. Rosenthal, Petros Maniatis, Mema Roussopoulos, T. J. Giuli, Mary Baker

Transparent Format Migration of Preserved Web Content

The LOCKSS digital preservation system collects content by crawling the web and preserves it in the format supplied by the publisher. Eventually, browsers will no longer understand that format. A process called format migration converts it…

Digital Libraries · Computer Science 2007-05-23 David S. H. Rosenthal, Thomas Lipkis, Thomas Robertson, Seth Morabito

A knowledge-based approach to semi-automatic annotation of multimedia documents via user adaptation

Current approaches to the annotation process focus on annotation schemas or languages for annotation, or are very application-driven. This paper proposes that a more flexible architecture for annotation requires a knowledge component…

Digital Libraries · Computer Science 2007-05-23 Afzal Ballim, Nastaran Fatemi, Hatem Ghorbel, Vincenzo Pallotta

Providing Authentic Long-term Archival Access to Complex Relational Data

We discuss long-term preservation of and access to relational databases. The focus is on national archives and science data archives which have to ingest and integrate data from a broad spectrum of vendor-specific relational database…

Digital Libraries · Computer Science 2007-05-23 Stephan Heuscher, Stephan Jaermann, Peter Keller-Marxer, Frank Moehle

Automatically Generating Interfaces for Personalized Interaction with Digital Libraries

We present an approach to automatically generate interfaces supporting personalized interaction with digital libraries; these interfaces augment the user-DL dialog by empowering the user to (optionally) supply out-of-turn information during…

Digital Libraries · Computer Science 2007-05-23 Saverio Perugini, Naren Ramakrishnan, Edward A. Fox

Dynamic Linking of Smart Digital Objects Based on User Navigation Patterns

We discuss a methodology to dynamically generate links among digital objects by means of an unsupervised learning mechanism which analyzes user link traversal patterns. We performed an experiment with a test bed of 150 complex data objects,…

Digital Libraries · Computer Science 2007-05-23 Aravind Elango, Johan Bollen, Michael L. Nelson