Related papers: Extensible Database Simulator for Fast Prototyping…
Scientific applications produce a huge amount of data, which imposes serious management and analysis challenges. In particular, limitations in current database management systems prevent their adoption in simulation applications, in which…
In today's world data is being generated at a high rate due to which it has become inevitable to analyze and quickly get results from this data. Most of the relational databases primarily support SQL querying with a limited support for…
Database Management System (DBMS) is designed to help store and process large collections of data, and is incredibly flexible to perform various kinds of optimizations as long as it achieves serializability with a high-level interface…
We present FabSim, a toolkit developed to simplify a range of computational tasks for researchers in diverse disciplines. FabSim is flexible, adaptable, and allows users to perform a wide range of tasks with ease. It also provides a…
Emerging paradigms of big data and Software-Defined Networking (SDN) in cloud data centers have gained significant attention from industry and academia. The integration and coordination of big data and SDN are required to improve the…
Data science agents promise to accelerate discovery and insight-generation by turning data into executable analyses and findings. Yet existing data science benchmarks fall short due to fragmented evaluation interfaces that make…
Database system is an indispensable part of software projects. It plays an important role in data organization and storage. Its performance and efficiency are directly related to the performance of software. Nowadays, we have many general…
There are significant benefits to serve deep learning models from relational databases. First, features extracted from databases do not need to be transferred to any decoupled deep learning systems for inferences, and thus the system…
In the rapidly evolving AI era with large language models (LLMs) at the core, making LLMs more trustworthy and efficient, especially in output generation (inference), has gained significant attention. This is to reduce plausible but faulty…
Simulation tools are commonly used in the development and testing of new protocols or new networks. However, as satellite networks start to grow to encompass thousands of nodes, and as companies and space agencies begin to realize the…
Large language models (LLMs) excel in many natural language processing (NLP) tasks. However, since LLMs can only incorporate new knowledge through training or supervised fine-tuning processes, they are unsuitable for applications that…
The ability to collect and analyze large amounts of data is a growing problem within the scientific community. The growing gap between data and users calls for innovative tools that address the challenges faced by big data volume, velocity…
Many fields of science rely on relational database management systems to analyze, publish and share data. Since RDBMS are originally designed for, and their development directions are primarily driven by, business use cases they often lack…
The study of complex many-body systems via analysis of the trajectories of the units that dynamically move and interact within them is a non-trivial task. The workflow for extracting meaningful information from the raw trajectory data is…
Bulk-bitwise processing-in-memory (PIM), an emerging computational paradigm utilizing memory arrays as computational units, has been shown to benefit database applications. This paper demonstrates how GROUP-BY and JOIN, database operations…
Large-scale distributed computing infrastructures such as the Worldwide LHC Computing Grid (WLCG) require comprehensive simulation tools for evaluating performance, testing new algorithms, and optimizing resource allocation strategies.…
Relational databases (RDBs) underpin the majority of global data management systems, where information is structured into multiple interdependent tables. To effectively use the knowledge within RDBs for predictive tasks, recent advances…
Data analysis often involves comparing subsets of data across many dimensions for finding unusual trends and patterns. While the comparison between subsets of data can be expressed using SQL, they tend to be complex to write, and suffer…
Recent studies from several hyperscalars pinpoint to embedding layers as the most memory-intensive deep learning (DL) algorithm being deployed in today's datacenters. This paper addresses the memory capacity and bandwidth challenges of…
The recent breakthroughs in large language models (LLMs) are positioned to transition many areas of software. Database technologies particularly have an important entanglement with LLMs as efficient and intuitive database interactions are…