Related papers: FLASH: Fast Bayesian Optimization for Data Analyti…

200 papers

Most problems in search-based software engineering involve balancing conflicting objectives. Prior approaches to this task have required a large number of evaluations, making them very slow to execute and very hard to comprehend. To solve…

Software Engineering · Computer Science 2017-05-19 Vivek Nair, Zhe Yu, Tim Menzies

Finding good configurations for a software system is often challenging since the number of configuration options can be large. Software engineers often make poor choices about configuration or, even worse, they usually use a sub-optimal…

Software Engineering · Computer Science 2018-09-05 Vivek Nair, Zhe Yu, Tim Menzies, Norbert Siegmund, Sven Apel

We present FLASH (Fast LSH Algorithm for Similarity search accelerated with HPC), a similarity search system for ultra-high dimensional datasets on a single machine, that does not require…

Data Structures and Algorithms · Computer Science 2018-07-04 Yiqiu Wang, Anshumali Shrivastava, Jonathan Wang, Junghee Ryu

Neural architecture search (NAS) is a promising technique to design efficient and high-performance deep neural networks (DNNs). As the performance requirements of ML applications grow continuously, the hardware accelerators start playing a…

Computer Vision and Pattern Recognition · Computer Science 2021-08-03 Guihong Li, Sumit K. Mandal, Umit Y. Ogras, Radu Marculescu

In order to achieve state-of-the-art performance, modern machine learning techniques require careful data pre-processing and hyperparameter tuning. Moreover, given the ever increasing number of machine learning models being developed, model…

Machine Learning · Statistics 2018-05-03 Nicolo Fusi, Rishit Sheth, Huseyn Melih Elibol

The key premise of federated learning (FL) is to train ML models across a diverse set of data-owners (clients), without exchanging local data. An overarching challenge to this date is client heterogeneity, which may arise not only from…

Deep learning-based segmentation and classification are crucial to large-scale biomedical imaging, particularly for 3D data, where manual analysis is impractical. Although many methods exist, selecting suitable models and tuning parameters…

Computer Vision and Pattern Recognition · Computer Science 2026-02-18 David Exler, Joaquin Eduardo Urrutia Gómez, Martin Krüger, Maike Schliephake, John Jbeily, Mario Vitacolonna, Rüdiger Rudolf, Markus Reischl

Flow cytometry (FC) is a single-cell profiling platform for measuring the phenotypes of individual cells from millions of cells in biological samples. FC employs high-throughput technologies and generates high-dimensional data, and hence…

Quantitative Methods · Quantitative Biology 2015-01-16 Ariful Azad

Data pre-processing pipelines are the bread and butter of any successful AI project. We introduce a novel programming model for pipelines in a data lakehouse, allowing users to interact declaratively with assets in object storage. Motivated…

Databases · Computer Science 2024-11-14 Jacopo Tagliabue, Ryan Curtin, Ciro Greco

Interactive response time is important in analytical pipelines for users to explore a sufficient number of possibilities and make informed business decisions. We consider a forecasting pipeline with large volumes of high-dimensional time…

Databases · Computer Science 2021-01-19 Shuyuan Yan, Bolin Ding, Wei Guo, Jingren Zhou, Zhewei Wei, Xiaowei Jiang, Sheng Xu

The most common approach to implementing data analysis pipelines involves obtaining point estimates from the upstream modules and then treating these as known quantities when working with the downstream ones. This approach is…

Methodology · Statistics 2024-02-19 Erin Lipman, Abel Rodriguez

LiDAR super-resolution addresses the challenge of achieving high-quality 3D perception from cost-effective, low-resolution sensors. While recent transformer-based approaches like TULIP show promise, they remain limited to spatial-domain…

Computer Vision and Pattern Recognition · Computer Science 2025-11-11 June Moh Goo, Zichao Zeng, Jan Boehm

Pipeline parallelism enables efficient training of Large Language Models (LLMs) on large-scale distributed accelerator clusters. Yet, pipeline bubbles during startup and tear-down reduce the utilization of accelerators. Although efficient…

Machine Learning · Computer Science 2023-05-16 Kazuki Osawa, Shigang Li, Torsten Hoefler

The analyst effort in data cleaning is gradually shifting away from the design of hand-written scripts to building and tuning complex pipelines of automated data cleaning libraries. Hyper-parameter tuning for data cleaning is very different…

Databases · Computer Science 2019-05-08 Sanjay Krishnan, Eugene Wu

To solve a machine learning problem, one typically needs to perform data preprocessing, modeling, and hyperparameter tuning, which is known as model selection and hyperparameter optimization. The goal of automated machine learning (AutoML)…

Machine Learning · Computer Science 2019-04-19 Weilin Zhou, Frederic Precioso

A machine learning pipeline potentially consists of several stages of operations, such as data preprocessing, feature engineering, and machine learning model training. Each operation has a set of hyper-parameters, which can become irrelevant for…

Machine Learning · Computer Science 2021-05-04 Xudong Sun, Jiali Lin, Bernd Bischl

Data and pipeline parallelism are key strategies for scaling neural network training across distributed devices, but their high communication cost necessitates co-located computing clusters with fast interconnects, limiting their…

Artificial intelligence (AI) is widely used in various fields including healthcare, autonomous vehicles, robotics, traffic monitoring, and agriculture. Many modern AI applications in these fields are multi-tasking in nature (i.e. perform…

Machine Learning · Computer Science 2024-12-04 Md Hafizur Rahman, Md Mashfiq Rizvee, Sumaiya Shomaji, Prabuddha Chakraborty

Unconstrained optimization problems are typically solved using iterative methods, which often depend on line search techniques to determine optimal step lengths in each iteration. This paper introduces a novel line search approach.…

Optimization and Control · Mathematics 2024-05-20 Sören Laue, Tomislav Prusina

Extending Bayesian optimization to batch evaluation can enable the designer to make the most use of parallel computing technology. However, most current batch approaches do not scale well with the batch size. That is, their performances…

Machine Learning · Computer Science 2025-04-25 Dawei Zhan, Zhaoxi Zeng, Shuoxiao Wei, Ping Wu