Related papers: Policy Aware Geospatial Data

Assigning Creative Commons Licenses to Research Metadata: Issues and Cases

This paper discusses the problem of lack of clear licensing and transparency of usage terms and conditions for research metadata. Making research data connected, discoverable and reusable are the key enablers of the new data revolution in…

Computers and Society · Computer Science 2016-09-20 Marta Poblet, Amir Aryani, Paolo Manghi, Kathryn Unsworth, Jingbo Wang, Brigitte Hausstein, Sunje Dallmeier-Tiessen, Claus-Peter Klas, Pompeu Casanovas, Victor Rodriguez Doncel

Towards Geo-Distributed Machine Learning

Latency to end-users and regulatory requirements push large companies to build data centers all around the world. The resulting data is "born" geographically distributed. On the other hand, many machine learning applications require a…

Machine Learning · Computer Science 2016-03-31 Ignacio Cano, Markus Weimer, Dhruv Mahajan, Carlo Curino, Giovanni Matteo Fumarola

Towards Standardization of Data Licenses: The Montreal Data License

This paper provides a taxonomy for the licensing of data in the fields of artificial intelligence and machine learning. The paper's goal is to build towards a common framework for data licensing akin to the licensing of open source…

Computers and Society · Computer Science 2019-04-01 Misha Benjamin, Paul Gagnon, Negar Rostamzadeh, Chris Pal, Yoshua Bengio, Alex Shee

ERMrest: an entity-relationship data storage service for web-based, data-oriented collaboration

Scientific discovery is increasingly dependent on a scientist's ability to acquire, curate, integrate, analyze, and share large and diverse collections of data. While the details vary from domain to domain, these data often consist of…

Databases · Computer Science 2016-10-20 Karl Czajkowski, Carl Kesselman, Robert Schuler, Hongsuda Tangmunarunkit

DRMS Co-design by F4MS

In this paper, we present Digital Rights Management systems (DRMS), which are becoming more and more complex due to the technology revolution in telecommunication networks, multimedia applications and reading equipment (Mobile…

Software Engineering · Computer Science 2010-04-20 Aissam Berrahou, Mourad Rafi, Mohsine Eleuldj

A Formal Foundation for XrML

XrML is becoming a popular language in industry for writing software licenses. The semantics for XrML is implicitly given by an algorithm that determines if a permission follows from a set of licenses. We focus on a fragment of the language…

Cryptography and Security · Computer Science 2008-08-11 Joseph Y. Halpern, Vicky Weissman

XML for Domain Viewpoints

Within research institutions like CERN (European Organization for Nuclear Research) there are often disparate databases (different in format, type and structure) that users need to access in a domain-specific manner. Users may want to…

Instrumentation and Detectors · Physics 2007-05-23 F. van Lingen, R. McClatchey, P. v/d Stok, I. Willers

Formalizing Privacy Laws for License Generation and Data Repository Decision Automation

In this paper, we summarize work-in-progress on expert system support to automate some data deposit and release decisions within a data repository, and to generate custom license agreements for those data transfers. Our approach formalizes…

Cryptography and Security · Computer Science 2019-10-23 Micah Altman, Stephen Chong, Alexandra Wood

Explainable Mixed Data Representation and Lossless Visualization Toolkit for Knowledge Discovery

Developing Machine Learning (ML) algorithms for heterogeneous/mixed data is a longstanding problem. Many ML algorithms are not applicable to mixed data, which include numeric and non-numeric data, text, graphs and so on to generate…

Machine Learning · Computer Science 2022-06-15 Boris Kovalerchuk, Elijah McCoy

Towards Operationalizing Right to Data Protection

The widespread practice of indiscriminate data scraping to fine-tune language models (LMs) raises significant legal and ethical concerns, particularly regarding compliance with data protection laws such as the General Data Protection…

Machine Learning · Computer Science 2024-11-19 Abhinav Java, Simra Shahid, Chirag Agarwal

Contract-Inspired Contest Theory for Controllable Image Generation in Mobile Edge Metaverse

The rapid advancement of immersive technologies has propelled the development of the Metaverse, where the convergence of virtual and physical realities necessitates the generation of high-quality, photorealistic images to enhance user…

Networking and Internet Architecture · Computer Science 2025-01-17 Guangyuan Liu, Hongyang Du, Jiacheng Wang, Dusit Niyato, Dong In Kim

Geospatial Representation Learning: A Survey from Deep Learning to The LLM Era

The ability to transform location-centric geospatial data into meaningful computational representations has become fundamental to modern spatial analysis and decision-making. Geospatial Representation Learning (GRL), the process of…

Computer Vision and Pattern Recognition · Computer Science 2026-02-12 Xixuan Hao, Yutian Jiang, Xingchen Zou, Jiabo Liu, Yifang Yin, Song Gao, Flora Salim, Tianrui Li, Yuxuan Liang

Mining Feature Relationships in Data

When faced with a new dataset, most practitioners begin by performing exploratory data analysis to discover interesting patterns and characteristics within data. Techniques such as association rule mining are commonly applied to uncover…

Machine Learning · Computer Science 2021-02-03 Andrew Lensen

Geospatial Machine Learning Libraries

Recent advances in machine learning have been supported by the emergence of domain-specific software libraries, enabling streamlined workflows and increased reproducibility. For geospatial machine learning (GeoML), the availability of Earth…

Machine Learning · Computer Science 2025-11-19 Adam J. Stewart, Caleb Robinson, Arindam Banerjee

Substra: a framework for privacy-preserving, traceable and collaborative Machine Learning

Machine learning is promising, but it often needs to process vast amounts of sensitive data which raises concerns about privacy. In this white-paper, we introduce Substra, a distributed framework for privacy-preserving, traceable and…

Cryptography and Security · Computer Science 2019-10-28 Mathieu N Galtier, Camille Marini

An investigation of licensing of datasets for machine learning based on the GQM model

Dataset licensing is currently an issue in the development of machine learning systems, which most often rely on publicly available datasets. However, since the images in the publicly…

Software Engineering · Computer Science 2023-03-27 Junyu Chen, Norihiro Yoshida, Hiroaki Takada

Can I use this publicly available dataset to build commercial AI software? -- A Case Study on Publicly Available Image Datasets

Publicly available datasets are one of the key drivers for commercial AI software. The use of publicly available datasets is governed by dataset licenses. These dataset licenses outline the rights one is entitled to on a given dataset and…

Machine Learning · Computer Science 2022-04-12 Gopi Krishnan Rajbahadur, Erika Tuck, Li Zi, Dayi Lin, Boyuan Chen, Zhen Ming Jiang, Daniel M. German

Geometric Knowledge-Guided Localized Global Distribution Alignment for Federated Learning

Data heterogeneity in federated learning, characterized by a significant misalignment between local and global distributions, leads to divergent local optimization directions and hinders global model training. Existing studies mainly focus…

Computer Vision and Pattern Recognition · Computer Science 2025-05-06 Yanbiao Ma, Wei Dai, Wenke Huang, Jiayi Chen

Transforming Unstructured Text into Data with Context Rule Assisted Machine Learning (CRAML)

We describe a method and new no-code software tools enabling domain experts to build custom structured, labeled datasets from the unstructured text of documents and build niche machine learning text classification models traceable to…

Computation and Language · Computer Science 2023-01-23 Stephen Meisenbacher, Peter Norlander

Using Object-Relational Mapping to Create the Distributed Databases in a Hybrid Cloud Infrastructure

One of the current challenges in the use of cloud services is designing specialized data management systems. This is especially important for hybrid systems in which the data are located in public and private…

Databases · Computer Science 2015-01-06 Oleg Lukyanchikov, Evgeniy Pluzhnik, Simon Payain, Evgeny Nikulchev