Related papers

Related papers: Causify DataFlow: A Framework For High-performance…

200 papers

The rapidly growing demand for high-quality data in Large Language Models (LLMs) has intensified the need for scalable, reliable, and semantically rich data preparation pipelines. However, current practices remain dominated by ad-hoc…

TensorFlow is a machine learning system that operates at large scale and in heterogeneous environments. TensorFlow uses dataflow graphs to represent computation, shared state, and the operations that mutate that state. It maps the nodes of…

Many recent machine learning models rely on fine-grained dynamic control flow for training and inference. In particular, models based on recurrent neural networks and on reinforcement learning depend on recurrence relations, data-dependent…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-05-09 Yuan Yu, Martín Abadi, Paul Barham, Eugene Brevdo, Mike Burrows, Andy Davis, Jeff Dean, Sanjay Ghemawat, Tim Harley, Peter Hawkins, Michael Isard, Manjunath Kudlur, Rajat Monga, Derek Murray, Xiaoqiang Zheng

Numerical algorithms and computational tools are instrumental in navigating and addressing complex simulation and data processing tasks. The exponential growth of metadata and parameter-driven simulations has led to an increasing demand for…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-05-02 Pavan L. Veluvali, Jan Heiland, Peter Benner

Dataflow computing was shown to bring significant benefits to multiple niches of systems engineering and has the potential to become a general-purpose paradigm of choice for data-driven application development. One of the characteristic…

Software Engineering · Computer Science 2023-04-25 Andrei Paleyes, Neil D. Lawrence

In the world of Big Data analytics, there is a series of tools aiming at simplifying programming applications to be executed on clusters. Although each tool claims to provide better programming, data and execution models, for which only…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-06-17 Claudia Misale, Maurizio Drocco, Marco Aldinucci, Guy Tremblay

Dataflow programming is a popular and convenient programming paradigm in systems modelling, optimisation, and machine learning. It has a number of advantages, for instance the lack of control flow allows computation to be carried out in…

Programming Languages · Computer Science 2021-03-03 Steven W. T. Cheung, Dan R. Ghica, Koko Muroya

Serverless computing that runs functions with auto-scaling is a popular task execution pattern in the cloud-native era. By connecting serverless functions into workflows, tenants can achieve complex functionality. Prior research adopts the…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-05-01 Zijun Li, Chuhao Xu, Quan Chen, Jieru Zhao, Chen Chen, Minyi Guo

Time-series forecasting increasingly demands not only accurate observational predictions but also causal forecasting under interventional and counterfactual queries in multivariate systems. We present DoFlow, a flow-based generative model…

Machine Learning · Statistics 2026-03-03 Dongze Wu, Feng Qiu, Yao Xie

Generating high-quality time series data has emerged as a critical research topic due to its broad utility in supporting downstream time series mining tasks. A major challenge lies in modeling the intrinsic stochasticity of temporal…

Artificial Intelligence · Computer Science 2025-11-20 He Panjing, Cheng Mingyue, Li Li, Zhang XiaoHan

GraphFlow is a visual workflow system designed to improve the reliability of agentic AI automation in multi-step, mission-critical processes. In these workflows, small errors compound rapidly: under an idealized model of independent steps,…

Artificial Intelligence · Computer Science 2026-05-15 Drewry H. Morris, Luis Valles, Reza Hosseini Ghomi

We present Pathway, a new unified data processing framework that can run workloads on both bounded and unbounded data streams. The framework was created with the original motivation of resolving challenges faced when analyzing and…

Agentic workflows in large language model systems integrate retrieval, reasoning, and memory, but existing frameworks suffer from scalability and reproducibility limitations due to fragmented data orchestration, serialization overhead, and…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-05-05 Arup Kumar Sarker, Mills Staylor, Aymen Alsaadi, Gregor von Laszewski, Shantenu Jha, Geoffrey Fox

Predicting program behavior without execution is a critical task in software engineering. Existing models often fall short in capturing the dynamic dependencies among program elements. To address this, we present CodeFlow, a novel machine…

Software Engineering · Computer Science 2025-02-11 Cuong Chi Le, Hoang Nhat Phan, Huy Nhat Phan, Tien N. Nguyen, Nghi D. Q. Bui

Parallel dataflow systems are a central part of most analytic pipelines for big data. The iterative nature of many analysis and machine learning algorithms, however, is still a challenge for current systems. While certain types of bulk…

Databases · Computer Science 2012-08-02 Stephan Ewen, Kostas Tzoumas, Moritz Kaufmann, Volker Markl

Proof autoformalization, the task of translating natural language theorems and proofs into machine-verifiable code, is a critical step for integrating large language models into rigorous mathematical workflows. Current approaches focus on…

Artificial Intelligence · Computer Science 2025-10-21 Rafael Cabral, Tuan Manh Do, Xuejun Yu, Wai Ming Tai, Zijin Feng, Xin Shen

Reusable data/code and reproducible analyses are foundational to quality research. This aspect, however, is often overlooked when designing interactive stream analysis workflows for time-series data (e.g., eye-tracking data). A mechanism to…

Databases · Computer Science 2022-06-20 Yasith Jayawardana, Vikas G. Ashok, Sampath Jayarathna

We consider the topic of data imputation, a foundational task in machine learning that addresses issues with missing data. To that end, we propose MCFlow, a deep framework for imputation that leverages normalizing flow generative models and…

Machine Learning · Computer Science 2020-03-31 Trevor W. Richardson, Wencheng Wu, Lei Lin, Beilei Xu, Edgar A. Bernal

TensorFlow is a popular cloud computing framework that targets machine learning applications. It separates the specification of application logic (in a dataflow graph) from the execution of the logic. TensorFlow's native runtime executes…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-08-27 Sam Whitlock, James Larus, Edouard Bugnion

Dataflow applications, such as machine learning algorithms, can run for days, making it desirable to have assurances that they will work correctly. Current tools are not good enough: too often the interactions between tasks are not…

Programming Languages · Computer Science 2021-11-25 Riley Evans, Samantha Frohlich, Meng Wang