
Related papers: Pseudonymization at Scale: OLCF's Summit Usage Dat…

200 papers

We observe and analyze usage of the login nodes of the leadership class Summit supercomputer from the perspective of an ordinary user -- not a system administrator -- by periodically sampling user activities (job queues, running processes,…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-09-22 Sean R. Wilkinson, Ketan Maheshwari, Rafael Ferreira da Silva

HPC systems used for research run a wide variety of software and workflows. This software is often written or modified by users to meet the needs of their research projects, and rarely is built with security in mind. In this paper we…

High Performance Computing (HPC) centers provide advanced infrastructure that enables scientific research at extreme scale. These centers operate with hardware configurations, software environments, and security requirements that differ…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-12-03 Sean R. Wilkinson, Patrick Widener, Sarp Oral, Rafael Ferreira da Silva

This work investigates the effectiveness of different pseudonymization techniques, ranging from rule-based substitutions to using pre-trained Large Language Models (LLMs), on a variety of datasets and models used for two widely used NLP…

Computation and Language · Computer Science 2023-06-12 Oleksandr Yermilov, Vipul Raheja, Artem Chernodub

System logs constitute valuable information for analysis and diagnosis of system behavior. The size of parallel computing systems and the number of their components steadily increase. The volume of generated logs by the system is in…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-01-23 Siavash Ghiasvand, Florina M. Ciorba

Today's high-performance computing (HPC) systems are heavily instrumented, generating logs containing information about abnormal events, such as critical conditions, faults, errors and failures, system resource utilization, and about the…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-08-24 Byung H. Park, Saurabh Hukerikar, Ryan Adamson, Christian Engelmann

We show that large language models can be used to perform at-scale deanonymization. With full Internet access, our agent can re-identify Hacker News users and Anthropic Interviewer participants at high precision, given pseudonymous online…

Cryptography and Security · Computer Science 2026-02-27 Simon Lermen, Daniel Paleka, Joshua Swanson, Michael Aerni, Nicholas Carlini, Florian Tramèr

An increasing number of companies have begun providing services that leverage cloud-based large language models (LLMs), such as ChatGPT. However, this development raises substantial privacy concerns, as users' prompts are transmitted to and…

Cryptography and Security · Computer Science 2025-02-24 Shilong Hou, Ruilin Shang, Zi Long, Xianghua Fu, Yin Chen

Computer-based scientific experiments are becoming increasingly data-intensive, necessitating the use of High-Performance Computing (HPC) clusters to handle large scientific workflows. These workflows result in complex data and control…

Databases · Computer Science 2025-02-17 Zahra Sadeghibogar, Alessandro Berti, Marco Pegoraro, Wil M. P. van der Aalst

High Performance Computing (HPC) centers provide resources to users who require greater scale to "get science done". They deploy infrastructure with singular hardware architectures, cutting-edge software environments, and stricter security…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-07-22 Sean R. Wilkinson, Patrick Widener

With the increasing use of conversational AI systems, there is growing concern over privacy leaks, especially when users share sensitive personal data in interactions with Large Language Models (LLMs). Conversations shared with these models…

Computation and Language · Computer Science 2025-11-03 Jayden Serenari, Stephen Lee

The Data Science domain has expanded monumentally in both research and industry communities during the past decade, predominantly owing to the Big Data revolution. Artificial Intelligence (AI) and Machine Learning (ML) are bringing more…

Runtime scheduling and workflow systems are an increasingly popular algorithmic component in HPC because they allow full system utilization with relaxed synchronization requirements. There are so many special-purpose tools for task…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-11-03 David M. Rogers

Serverless computing is transforming cloud application development, but the performance-cost trade-offs of control plane designs remain poorly understood due to a lack of open, cross-platform benchmarks and detailed system analyses. In this…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-09-04 Leonid Kondrashov, Boxi Zhou, Hancheng Wang, Dmitrii Ustiugov

Users increasingly rely on large language models (LLMs) for personal, emotionally charged, and socially sensitive conversations. However, prompts sent to cloud-hosted models can contain personally identifiable information (PII) that users…

Cryptography and Security · Computer Science 2025-11-18 Chelsea McMurray, Hayder Tirmazi

The rapid deployment of large language models (LLMs) in consumer applications has led to frequent exchanges of personal information. To obtain useful responses, users often share more than necessary, increasing privacy risks via…

Machine Learning · Computer Science 2025-10-07 Jijie Zhou, Niloofar Mireshghallah, Tianshi Li

Data privacy and anonymisation are critical concerns in today's data-driven society, particularly when handling personal and sensitive user data. Regulatory frameworks worldwide recommend privacy-preserving protocols such as k-anonymisation…

Information Theory · Computer Science 2025-07-02 Kailash Reddy, Novoneel Chakraborty, Amogh Dharmavaram, Anshoo Tandon

Understanding HPC facilities users' behaviors and how computational resources are requested and utilized is not only crucial for the cluster productivity but also essential for designing and constructing future exascale HPC systems. This…

Computational Engineering, Finance, and Science · Computer Science 2026-05-04 Sergio Iserte

This work presents AnonLFI 2.0, a modular pseudonymization framework for CSIRTs that uses HMAC SHA256 to generate strong and reversible pseudonyms, preserves XML and JSON structures, and integrates OCR and technical recognizers for PII and…

Cryptography and Security · Computer Science 2025-11-21 Cristhian Kapelinski, Douglas Lautert, Beatriz Machado, Diego Kreutz

High-Performance Computing (HPC) centers and cloud providers support an increasingly diverse set of applications on heterogeneous hardware. As Artificial Intelligence (AI) and Machine Learning (ML) workloads have become an increasingly…
