Related papers: Connecting and Comparing Language Model Interpolat…

200 papers

We focus on an interpolation method referred to as Bayesian reconstruction in this paper. Whereas in standard interpolation methods missing data are interpolated deterministically, in Bayesian reconstruction, missing data are interpolated…

Machine Learning · Statistics 2015-03-27 Shun Kataoka , Muneki Yasuda , Kazuyuki Tanaka

The Bag-of-Words (BoW) representation is well applied to recent state-of-the-art image retrieval works. Typically, multiple vocabularies are generated to correct quantization artifacts and improve recall. However, this routine is corrupted…

Computer Vision and Pattern Recognition · Computer Science 2014-04-15 Liang Zheng , Shengjin Wang , Wengang Zhou , Qi Tian

With the development of speech synthesis, recent research has focused on challenging tasks, such as speaker generation and emotion intensity control. Attribute interpolation is a common approach to these tasks. However, most previous…

Sound · Computer Science 2024-07-02 Masato Murata , Koichi Miyazaki , Tomoki Koriyama

Model merging, typically on Instruct and Thinking models, has shown remarkable performance for efficient reasoning. In this paper, we systematically revisit the simplest merging method that interpolates two weights directly. Particularly,…

Artificial Intelligence · Computer Science 2026-01-27 Taiqiang Wu , Runming Yang , Tao Liu , Jiahao Wang , Ngai Wong

This paper investigates model merging, a technique for deriving Markov models from text or speech corpora. Models are derived by starting with a large and specific model and by successively combining states to build smaller and more general…

cmp-lg · Computer Science 2008-02-03 Thorsten Brants

Fine-tuning pre-trained models for downstream tasks is a widely adopted technique known for its adaptability and reliability across various domains. Despite its conceptual simplicity, fine-tuning entails several troublesome engineering…

Artificial Intelligence · Computer Science 2024-12-30 Chaeyun Jang , Hyungi Lee , Jungtaek Kim , Juho Lee

Model merging has emerged as a promising technique for enhancing large language models, though its application in large-scale pre-training remains relatively unexplored. In this paper, we present a comprehensive investigation of model…

Recent work raises concerns about the use of standard splits to compare natural language processing models. We propose a Bayesian statistical model comparison technique which uses k-fold cross-validation across multiple data sets to…

Computation and Language · Computer Science 2020-10-08 Piotr Szymański , Kyle Gorman

Bayesian inference is attractive for its coherence and good frequentist properties. However, it is a common experience that eliciting an honest prior may be difficult and, in practice, people often take an empirical Bayes approach,…

Statistics Theory · Mathematics 2012-04-09 Sonia Petrone , Judith Rousseau , Catia Scricciolo

Model merging aims to combine multiple task-specific expert models into a single model without joint retraining, offering a practical alternative to multi-task learning when data access or computational budget is limited. Existing methods,…

Machine Learning · Computer Science 2026-05-14 Kaiyang Li , Shaobo Han , Qing Su , Shihao Ji

Mixed effects regression models are widely used by language researchers. However, these regressions are implemented with an algorithm which may not converge on a solution. While convergence issues in linear mixed effects models can often be…

Applications · Statistics 2018-09-10 Amelia Kimball , Kailen Shantz , Christopher Eager , Joseph Roy

Large Language Models (LLMs) require instruction fine-tuning to perform different downstream tasks. However, the instruction fine-tuning phase still demands significant computational resources and labeled data, lacking a paradigm that can…

Computation and Language · Computer Science 2025-03-10 Yiguan Lin , Bin Xu , Yinghao Li , Yang Gao

In recent years important progress has been achieved towards proving the validity of the replica predictions for the (asymptotic) mutual information (or "free energy") in Bayesian inference problems. The proof techniques that have emerged…

Information Theory · Computer Science 2018-10-30 Jean Barbier , Nicolas Macris

In the past several years, a number of different language modeling improvements over simple trigram models have been found, including caching, higher-order n-grams, skipping, interpolated Kneser-Ney smoothing, and clustering. We present…

Computation and Language · Computer Science 2007-05-23 Joshua Goodman

Multiple kernel learning algorithms are proposed to combine kernels in order to obtain a better similarity measure or to integrate feature representations coming from different data sources. Most of the previous research on such methods is…

Machine Learning · Computer Science 2012-07-03 Mehmet Gonen

Despite interest in using cross-lingual knowledge to learn word embeddings for various tasks, a systematic comparison of the possible approaches is lacking in the literature. We perform an extensive evaluation of four popular approaches of…

Computation and Language · Computer Science 2016-06-09 Shyam Upadhyay , Manaal Faruqui , Chris Dyer , Dan Roth

Effective verification and validation techniques for modern scientific machine learning workflows are challenging to devise. Statistical methods are abundant and easily deployed, but often rely on speculative assumptions about the data and…

Machine Learning · Computer Science 2025-02-11 Tyler Chang , Andrew Gillette , Romit Maulik

In this manuscript, a general method for deriving filtering algorithms that involve a network of interconnected Bayesian filters is proposed. This method is based on the idea that the processing accomplished inside each of the Bayesian…

Statistics Theory · Mathematics 2020-04-22 Giorgio M. Vitetta , Pasquale Di Viesti , Emilio Sirignano , Francesco Montorsi

Integrative modeling of macromolecular assemblies allows for structural characterization of large assemblies that are recalcitrant to direct experimental observation. A Bayesian inference approach facilitates combining data from…

Biomolecules · Quantitative Biology 2026-01-13 Shreyas Arvindekar , Kartik Majila , Shruthi Viswanath

Methods for combining predictions from different models in a supervised learning setting must somehow estimate/predict the quality of a model's predictions at unknown future inputs. Many of these methods (often implicitly) make the…

Methodology · Statistics 2014-06-25 Thijs van Ommen