
Related papers: REX: Recursive, Delta-Based Data-Centric Computati…

200 papers

Machine learning workflow development is a process of trial-and-error: developers iterate on workflows by testing out small modifications until the desired accuracy is achieved. Unfortunately, existing machine learning systems focus…

Databases · Computer Science · 2018-12-17 · Doris Xin, Stephen Macke, Litian Ma, Jialin Liu, Shuchen Song, Aditya Parameswaran

Recursive query processing has experienced a recent resurgence, as a result of its use in many modern application domains, including data integration, graph analytics, security, program analysis, networking and decision making. Due to the…

Databases · Computer Science · 2018-12-11 · Zhiwei Fan, Jianqiao Zhu, Zuyu Zhang, Aws Albarghouthi, Paraschos Koutris, Jignesh Patel

A number of popular systems, most notably Google's TensorFlow, have been implemented from the ground up to support machine learning tasks. We consider how to make a very small set of changes to a modern relational database management system…

Databases · Computer Science · 2019-04-26 · Dimitrije Jankov, Shangyu Luo, Binhang Yuan, Zhuhua Cai, Jia Zou, Chris Jermaine, Zekai J. Gao

Large datasets ("Big Data") are becoming ubiquitous because the potential value in deriving insights from data, across a wide range of business and scientific applications, is increasingly recognized. In particular, machine learning - one…

Distributed, Parallel, and Cluster Computing · Computer Science · 2013-03-15 · Joshua Rosen, Neoklis Polyzotis, Vinayak Borkar, Yingyi Bu, Michael J. Carey, Markus Weimer, Tyson Condie, Raghu Ramakrishnan

Parallel dataflow systems are a central part of most analytic pipelines for big data. The iterative nature of many analysis and machine learning algorithms, however, is still a challenge for current systems. While certain types of bulk…

Databases · Computer Science · 2012-08-02 · Stephan Ewen, Kostas Tzoumas, Moritz Kaufmann, Volker Markl

Most cloud services and distributed applications rely on hashing algorithms that allow dynamic scaling of a robust and efficient hash table. Examples include AWS, Google Cloud and BitTorrent. Consistent and rendezvous hashing are algorithms…

Data Structures and Algorithms · Computer Science · 2022-05-17 · Mike Heddes, Igor Nunes, Tony Givargis, Alexandru Nicolau, Alex Veidenbaum

The enormous quantity of data produced every day together with advances in data analytics has led to a proliferation of data management and analysis systems. Typically, these systems are built around highly specialized monolithic operators…

Databases · Computer Science · 2021-09-30 · Dimitrios Koutsoukos, Ingo Müller, Renato Marroquín, Ana Klimovic, Gustavo Alonso

Applications such as web search and social networking have been moving from centralized to decentralized cloud architectures to improve their scalability. MapReduce, a programming framework for processing large amounts of data using…

Distributed, Parallel, and Cluster Computing · Computer Science · 2015-11-24 · Pedro A. R. S. Costa, Xiao Bai, Fernando M. V. Ramos, Miguel Correia

As data volumes grow across applications, analytics of large amounts of data is becoming increasingly important. Big data processing frameworks such as Apache Hadoop, Apache AsterixDB, and Apache Spark have been built to meet this demand. A…

Distributed, Parallel, and Cluster Computing · Computer Science · 2022-12-15 · Avinash Kumar

Over the past 40 years, database management systems (DBMSs) have evolved to provide a sophisticated variety of data management capabilities. At the same time, tools for managing queries over the data have remained relatively primitive. One…

Databases · Computer Science · 2009-09-15 · Nodira Khoussainova, Magda Balazinska, Wolfgang Gatterbauer, YongChul Kwon, Dan Suciu

Distributed Hash Tables offer a resilient lookup service for unstable distributed environments. Resilient data storage, however, requires additional data replication and maintenance algorithms. These algorithms can have an impact on both…

Distributed, Parallel, and Cluster Computing · Computer Science · 2007-05-23 · Matthew Leslie

Data application developers and data scientists spend an inordinate amount of time iterating on machine learning (ML) workflows -- by modifying the data pre-processing, model training, and post-processing steps -- via trial-and-error to…

Machine Learning · Computer Science · 2018-08-06 · Doris Xin, Litian Ma, Jialin Liu, Stephen Macke, Shuchen Song, Aditya Parameswaran

How can we expand the tensor decomposition to reveal a hierarchical structure of the multi-modal data in a self-adaptive way? Current tensor decomposition provides only a single layer of clusters. We argue that with the abundance of…

Information Retrieval · Computer Science · 2020-11-17 · Risul Islam, Md Omar Faruk Rokon, Evangelos E. Papalexakis, Michalis Faloutsos

In the past few years, we have seen an increasing number of businesses driven by big data analytics, such as Amazon recommendations and Google Advertisements. On the back end, these businesses are powered by big data…

Performance · Computer Science · 2021-10-26 · Ying Mao, Victoria Green, Jiayin Wang, Haoyi Xiong, Zhishan Guo

Modern scientific repositories are growing rapidly in size. Scientists are increasingly interested in viewing the latest data as part of query results. Current scientific middleware cache systems, however, assume repositories are static.…

Distributed, Parallel, and Cluster Computing · Computer Science · 2010-09-21 · Tanu Malik, Xiaodan Wang, Philip Little, Amitabh Chaudhary, Ani Thakar

The exponential growth of data necessitates distributed storage models, such as peer-to-peer systems and data federations. While distributed storage can reduce costs and increase reliability, the heterogeneity in storage capacity, I/O…

Multimodal recommender systems leverage diverse data sources, such as user interactions, content features, and contextual information, to address challenges like cold-start and data sparsity. However, existing methods often suffer from one…

Information Retrieval · Computer Science · 2026-02-24 · Adamya Shyam, Venkateswara Rao Kagita, Bharti Rana, Vikas Kumar

The exponential growth of data in current times and the demand to gain information and knowledge from the data present new challenges for database researchers. Known database systems and algorithms are no longer capable of effectively…

Databases · Computer Science · 2017-12-06 · Yaron Gonen

Parallel shared-nothing data management systems have been widely used to exploit a cluster of machines for efficient and scalable data processing. When a cluster needs to be dynamically scaled in or out, data must be efficiently rebalanced.…

Databases · Computer Science · 2021-05-25 · Chen Luo, Michael J. Carey

Compute Express Link (CXL) 3.0 and beyond allows the compute nodes of a cluster to share data with hardware cache coherence and at the granularity of a cache line. This enables shared-memory semantics for distributed computing, but…

Distributed, Parallel, and Cluster Computing · Computer Science · 2026-02-10 · Antonis Psistakis, Burak Ocalan, Chloe Alverti, Fabien Chaix, Ramnatthan Alagappan, Josep Torrellas