Related papers: Tempura: A General Cost Based Optimizer Framework …
As declarative query processing techniques expand in scope (to the Web, data streams, network routers, and cloud platforms), there is an increasing need for adaptive query processing techniques that can re-plan in the presence of…
Incremental computation aims to compute more efficiently on changed input by reusing previously computed results. We give a high-level overview of works on incremental computation, and highlight the essence underlying all of them, which we…
Stream processing is usually done either on a tuple-by-tuple basis or in micro-batches. There are many applications where tuples over a predefined duration/window must be processed within certain deadlines. Processing such queries using…
Data augmentation in time series forecasting plays a crucial role in enhancing model performance by introducing variability while maintaining the underlying temporal patterns. However, time series data offers fewer augmentation strategies…
The query optimizer is at the heart of database systems. The cost-based optimizer studied in this paper is adopted in almost all current database systems. A cost-based optimizer introduces a plan enumeration algorithm to find a (sub)plan, and…
Data pipeline frameworks provide abstractions for implementing sequences of data-intensive transformation operators, automating the deployment and execution of such transformations in a cluster. Deploying a data pipeline, however, requires…
This paper presents an improved mixed-integer model for the Thermal Unit Commitment Problem. By introducing new variables for the temperature of each thermal unit, the off-time-dependent start-up costs are modeled accurately and with a…
In pursuit of efficient and scalable data analytics, the insight that "one size does not fit all" has given rise to a plethora of specialized data processing platforms and today's complex data analytics are moving beyond the limits of a…
We study the problem of efficiently scheduling a computational DAG on multiple processors. The majority of previous works have developed and compared algorithms for this problem in relatively simple models; in contrast to this, we analyze…
Analyzing big data in a highly dynamic environment is becoming more and more critical because of the increasing need for end-to-end processing of this data. Modern data flows are quite complex and there are no efficient, cost-based,…
We study the problem of optimizing data storage and access costs on the cloud while ensuring that the desired performance or latency is unaffected. We first propose an optimizer that optimizes the data placement tier (on the cloud) and the…
Scheduling precedence-constrained tasks under shared renewable resources is central to modern computing platforms. The Resource Investment Problem (RIP) models this setting by minimizing the cost of provisioned renewable resources under…
The principal component of conventional database query optimizers is a cost model that is used to estimate the expected performance of query plans. The accuracy of the cost model has a direct impact on the optimality of execution plans selected…
With the growing demand for semantic Web services over large databases, the efficient evaluation of Datalog queries is attracting renewed interest among researchers and industry experts. In this scenario, to reduce memory…
Multivariate time series forecasting plays a pivotal role in numerous real-world applications, including financial analysis, energy management, and traffic planning. While Transformer-based architectures have gained popularity for this…
Optimization is an important module of modern machine learning applications. Tremendous efforts have been made to accelerate optimization algorithms. A common formulation is achieving a lower loss at a given time. This enables a…
This paper addresses the generation expansion planning (GEP) problem, formulated as a mixed-integer linear programming model with intertemporal storage constraints. Being generally NP-hard, the problem's computational complexity grows…
Integer Linear Programming (ILP) serves as a versatile framework for modeling a wide range of combinatorial optimization problems, typically addressed by sophisticated exact solvers or heuristics. While learning-based approaches have…
With the increasing adoption of AI applications such as large language models and computer vision AI, the computational demands on AI inference systems are continuously rising, making the enhancement of task processing capacity using…
Useful knowledge, embedded in a database, is likely to change over time. Identifying recent changes in temporal databases can provide valuable up-to-date information to decision-makers. Nevertheless, techniques for mining high-utility…