Related papers: Helix: Holistic Optimization for Accelerating Iter…

Helix: Accelerating Human-in-the-loop Machine Learning

Data application developers and data scientists spend an inordinate amount of time iterating on machine learning (ML) workflows -- by modifying the data pre-processing, model training, and post-processing steps -- via trial-and-error to…

Machine Learning · Computer Science 2018-08-06 Doris Xin, Litian Ma, Jialin Liu, Stephen Macke, Shuchen Song, Aditya Parameswaran

Accelerating Human-in-the-loop Machine Learning: Challenges and Opportunities

Development of machine learning (ML) workflows is a tedious process of iterative experimentation: developers repeatedly make changes to workflows until the desired accuracy is attained. We describe our vision for a "human-in-the-loop" ML…

Databases · Computer Science 2018-04-18 Doris Xin, Litian Ma, Jialin Liu, Stephen Macke, Shuchen Song, Aditya Parameswaran

Helix: Serving Large Language Models over Heterogeneous GPUs and Network via Max-Flow

This paper introduces Helix, a distributed system for high-throughput, low-latency large language model (LLM) serving in heterogeneous GPU clusters. The key idea behind Helix is to formulate inference computation of LLMs over heterogeneous…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-03-07 Yixuan Mei, Yonghao Zhuang, Xupeng Miao, Juncheng Yang, Zhihao Jia, Rashmi Vinayak

Helix: Evolutionary Reinforcement Learning for Open-Ended Scientific Problem Solving

Large language models (LLMs) with reasoning abilities have demonstrated growing promise for tackling complex scientific problems. Yet such tasks are inherently domain-specific, unbounded and open-ended, demanding exploration across vast and…

Machine Learning · Computer Science 2026-03-10 Chang Su, Zhongkai Hao, Zhizhou Zhang, Zeyu Xia, Youjia Wu, Hang Su, Jun Zhu

KML: Using Machine Learning to Improve Storage Systems

Operating systems include many heuristic algorithms designed to improve overall storage performance and throughput. Because such heuristics cannot work well for all conditions and workloads, system designers resorted to exposing numerous…

Operating Systems · Computer Science 2022-01-27 Ibrahim Umit Akgun, Ali Selman Aydin, Andrew Burford, Michael McNeill, Michael Arkhangelskiy, Aadil Shaikh, Lukas Velikov, Erez Zadok

An Efficient Fault Tolerant Workflow Scheduling Approach using Replication Heuristics and Checkpointing in the Cloud

Scientific workflows have been predominantly used for complex and large scale data analysis and scientific computation/automation and the need for robust workflow scheduling techniques has grown considerably. But, most of the existing…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-11-04 S. Jaya Nirmala, Amrith Rajagopal Setlur, Har Simrat Singh, Sudhanshu Khoriya

Helix 1.0: An Open-Source Framework for Reproducible and Interpretable Machine Learning on Tabular Scientific Data

Helix is an open-source, extensible, Python-based software framework to facilitate reproducible and interpretable machine learning workflows for tabular data. It addresses the growing need for transparent experimental data analytics…

Machine Learning · Computer Science 2025-07-25 Eduardo Aguilar-Bejarano, Daniel Lea, Karthikeyan Sivakumar, Jimiama M. Mase, Reza Omidvar, Ruizhe Li, Troy Kettle, James Mitchell-White, Morgan R Alexander, David A Winkler, Grazziela Figueredo

Hierarchical Meta Learning

Meta learning is a promising solution to few-shot learning problems. However, existing meta learning methods are restricted to the scenarios where training and application tasks share the same output structure. To obtain a meta model…

Machine Learning · Computer Science 2019-04-22 Yingtian Zou, Jiashi Feng

HEX: Human-in-the-loop Explainability via Deep Reinforcement Learning

The use of machine learning (ML) models in decision-making contexts, particularly those used in high-stakes decision-making, is fraught with issue and peril since a person - not a machine - must ultimately be held accountable for the…

Machine Learning · Computer Science 2022-06-06 Michael T. Lash

HDFlow: Enhancing LLM Complex Problem-Solving with Hybrid Thinking and Dynamic Workflows

Despite recent advancements in large language models (LLMs), their performance on complex reasoning problems requiring multi-step thinking and combining various skills is still limited. To address this, we propose a novel framework HDFlow…

Computation and Language · Computer Science 2024-09-27 Wenlin Yao, Haitao Mi, Dong Yu

How Developers Iterate on Machine Learning Workflows -- A Survey of the Applied Machine Learning Literature

Machine learning workflow development is anecdotally regarded to be an iterative process of trial-and-error with humans-in-the-loop. However, we are not aware of quantitative evidence corroborating this popular belief. A quantitative…

Machine Learning · Computer Science 2018-05-21 Doris Xin, Litian Ma, Shuchen Song, Aditya Parameswaran

Continuous Deep Learning: A Workflow to Bring Models into Production

Researchers have been highly active to investigate the classical machine learning workflow and integrate best practices from the software engineering lifecycle. However, deep learning exhibits deviations that are not yet covered in this…

Software Engineering · Computer Science 2022-08-30 Janosch Baltensperger, Pasquale Salza, Harald C. Gall

REX: Recursive, Delta-Based Data-Centric Computation

In today's Web and social network environments, query workloads include ad hoc and OLAP queries, as well as iterative algorithms that analyze data relationships (e.g., link analysis, clustering, learning). Modern DBMSs support ad hoc and…

Databases · Computer Science 2012-08-02 Svilen R. Mihaylov, Zachary G. Ives, Sudipto Guha

Efficient LLM Serving for Agentic Workflows: A Data Systems Perspective

Agentic workflows are composed of sequences of interdependent Large Language Model (LLM) calls, and they have become a dominant workload in modern AI systems. These workflows exhibit extensive redundancy from overlapping prompts and…

Multiagent Systems · Computer Science 2026-03-18 Noppanat Wadlom, Junyi Shen, Yao Lu

A Framework for Model Search Across Multiple Machine Learning Implementations

Several recently devised machine learning (ML) algorithms have shown improved accuracy for various predictive problems. Model searches, which explore to find an optimal ML algorithm and hyperparameter values for the target problem, play a…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-08-28 Yoshiki Takahashi, Masato Asahara, Kazuyuki Shudo

Spinning Fast Iterative Data Flows

Parallel dataflow systems are a central part of most analytic pipelines for big data. The iterative nature of many analysis and machine learning algorithms, however, is still a challenge for current systems. While certain types of bulk…

Databases · Computer Science 2012-08-02 Stephan Ewen, Kostas Tzoumas, Moritz Kaufmann, Volker Markl

An Application Driven Analysis of the ParalleX Execution Model

Exascale systems, expected to emerge by the end of the next decade, will require the exploitation of billion-way parallelism at multiple hierarchical levels in order to achieve the desired sustained performance. The task of assessing future…

Distributed, Parallel, and Cluster Computing · Computer Science 2011-09-27 Matthew Anderson, Maciej Brodowicz, Hartmut Kaiser, Thomas Sterling

Hybrid Learning for Orchestrating Deep Learning Inference in Multi-user Edge-cloud Networks

Deep-learning-based intelligent services have become prevalent in cyber-physical applications including smart cities and health-care. Collaborative end-edge-cloud computing for deep learning provides a range of performance and efficiency…

Machine Learning · Computer Science 2022-02-24 Sina Shahhosseini, Tianyi Hu, Dongjoo Seo, Anil Kanduri, Bryan Donyanavard, Amir M. Rahmani, Nikil Dutt

FELARE: Fair Scheduling of Machine Learning Tasks on Heterogeneous Edge Systems

Edge computing enables smart IoT-based systems via concurrent and continuous execution of latency-sensitive machine learning (ML) applications. These edge-based machine learning systems are often battery-powered (i.e., energy-limited). They…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-07-22 Ali Mokhtari, Md Abir Hossen, Pooyan Jamshidi, Mohsen Amini Salehi

Heterogeneous Continual Learning

We propose a novel framework and a solution to tackle the continual learning (CL) problem with changing network architectures. Most CL methods focus on adapting a single architecture to a new task/class by modifying its weights. However,…

Computer Vision and Pattern Recognition · Computer Science 2023-06-16 Divyam Madaan, Hongxu Yin, Wonmin Byeon, Jan Kautz, Pavlo Molchanov