Related papers: FINE: Factorizing Knowledge for Initialization of …

WAVE: Weight Templates for Adaptive Initialization of Variable-sized Models

The growing complexity of model parameters underscores the significance of pre-trained models. However, deployment constraints often necessitate models of varying sizes, exposing limitations in the conventional pre-training and fine-tuning…

Machine Learning · Computer Science 2025-03-18 Fu Feng, Yucheng Xie, Jing Wang, Xin Geng

Exploring Learngene via Stage-wise Weight Sharing for Initializing Variable-sized Models

In practice, we usually need to build variable-sized models adapting for diverse resource constraints in different application scenarios, where weight initialization is an important step prior to training. The Learngene framework,…

Machine Learning · Computer Science 2024-04-29 Shi-Yu Xia, Wenxuan Zhu, Xu Yang, Xin Geng

One-for-All Model Initialization with Frequency-Domain Knowledge

Transferring knowledge by fine-tuning large-scale pre-trained networks has become a standard paradigm for downstream tasks, yet the knowledge of a pre-trained model is tightly coupled with monolithic architecture, which restricts flexible…

Machine Learning · Computer Science 2026-05-26 Jianlu Shen, Fu Feng, Yucheng Xie, Jiaqi Lv, Xin Geng

DiffusionPipe: Training Large Diffusion Models with Efficient Pipelines

Diffusion models have emerged as dominant performers for image generation. To support training large diffusion models, this paper studies pipeline parallel training of diffusion models and proposes DiffusionPipe, a synchronous pipeline…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-05-03 Ye Tian, Zhen Jia, Ziyue Luo, Yida Wang, Chuan Wu

DiffFit: Unlocking Transferability of Large Diffusion Models via Simple Parameter-Efficient Fine-Tuning

Diffusion models have proven to be highly effective in generating high-quality images. However, adapting large pre-trained diffusion models to new domains remains an open challenge, which is critical for real-world applications. This paper…

Computer Vision and Pattern Recognition · Computer Science 2023-07-28 Enze Xie, Lewei Yao, Han Shi, Zhili Liu, Daquan Zhou, Zhaoqiang Liu, Jiawei Li, Zhenguo Li

Masked Diffusion Models Are Fast Distribution Learners

Diffusion model has emerged as the de-facto model for image generation, yet the heavy training overhead hinders its broader adoption in the research community. We observe that diffusion models are commonly trained to learn all…

Computer Vision and Pattern Recognition · Computer Science 2023-11-29 Jiachen Lei, Qinglong Wang, Peng Cheng, Zhongjie Ba, Zhan Qin, Zhibo Wang, Zhenguang Liu, Kui Ren

SPIRE: Conditional Personalization for Federated Diffusion Generative Models

Recent advances in diffusion models have revolutionized generative AI, but their sheer size makes on-device personalization, and thus effective federated learning (FL), infeasible. We propose Shared Backbone Personal Identity Representation…

Machine Learning · Computer Science 2025-06-17 Kaan Ozkara, Ruida Zhou, Suhas Diggavi

FIND: Fine-tuning Initial Noise Distribution with Policy Optimization for Diffusion Models

In recent years, large-scale pre-trained diffusion models have demonstrated their outstanding capabilities in image and video generation tasks. However, existing models tend to produce visual objects commonly found in the training dataset,…

Computer Vision and Pattern Recognition · Computer Science 2024-07-30 Changgu Chen, Libing Yang, Xiaoyan Yang, Lianggangxu Chen, Gaoqi He, Changbo Wang, Yang Li

DMin: Scalable Training Data Influence Estimation for Diffusion Models

Identifying the training data samples that most influence a generated image is a critical task in understanding diffusion models (DMs), yet existing influence estimation methods are constrained to small-scale or LoRA-tuned models due to…

Computer Vision and Pattern Recognition · Computer Science 2026-04-10 Huawei Lin, Yingjie Lao, Weijie Zhao

DiffPINN: Generative diffusion-initialized physics-informed neural networks for accelerating seismic wavefield representation

Physics-informed neural networks (PINNs) offer a powerful framework for seismic wavefield modeling, yet they typically require time-consuming retraining when applied to different velocity models. Moreover, their training can suffer from…

Geophysics · Physics 2025-06-03 Shijun Cheng, Tariq Alkhalifah

Fourier-Invertible Neural Encoder (FINE) for Homogeneous Flows

We present the Fourier-Invertible Neural Encoder (FINE), a compact and interpretable architecture for dimension reduction in translation-equivariant datasets. FINE integrates reversible filters and monotonic activation functions with a…

Machine Learning · Computer Science 2025-12-02 Anqiao Ouyang, Hongyi Ke, Qi Wang

End-to-End Training for Unified Tokenization and Latent Denoising

Latent diffusion models (LDMs) enable high-fidelity synthesis by operating in learned latent spaces. However, training state-of-the-art LDMs requires complex staging: a tokenizer must be trained first, before the diffusion model can be…

Computer Vision and Pattern Recognition · Computer Science 2026-03-24 Shivam Duggal, Xingjian Bai, Zongze Wu, Richard Zhang, Eli Shechtman, Antonio Torralba, Phillip Isola, William T. Freeman

Reinforced Fast Weights with Next-Sequence Prediction

Fast weight architectures offer a promising alternative to attention-based transformers for long-context modeling by maintaining constant memory overhead regardless of context length. However, their potential is limited by the next-token…

Computation and Language · Computer Science 2026-02-19 Hee Seung Hwang, Xindi Wu, Sanghyuk Chun, Olga Russakovsky

On Distillation of Guided Diffusion Models

Classifier-free guided diffusion models have recently been shown to be highly effective at high-resolution image generation, and they have been widely used in large-scale diffusion frameworks including DALLE-2, Stable Diffusion and Imagen.…

Computer Vision and Pattern Recognition · Computer Science 2023-04-14 Chenlin Meng, Robin Rombach, Ruiqi Gao, Diederik P. Kingma, Stefano Ermon, Jonathan Ho, Tim Salimans

DIVINE: Diverse Influential Training Points for Data Visualization and Model Refinement

As the complexity of machine learning (ML) models increases, resulting in a lack of prediction explainability, several methods have been developed to explain a model's behavior in terms of the training data points that most influence the…

Machine Learning · Computer Science 2021-07-14 Umang Bhatt, Isabel Chien, Muhammad Bilal Zafar, Adrian Weller

Fine-Grained Model Merging via Modular Expert Recombination

Model merging constructs versatile models by integrating task-specific models without requiring labeled data or expensive joint retraining. Although recent methods improve adaptability to heterogeneous tasks by generating customized merged…

Machine Learning · Computer Science 2026-02-09 Haiyun Qiu, Xingyu Wu, Liang Feng, Kay Chen Tan

BEND: Bagging Deep Learning Training Based on Efficient Neural Network Diffusion

Bagging has achieved great success in the field of machine learning by integrating multiple base classifiers to build a single strong classifier to reduce model variance. The performance improvement of bagging mainly relies on the number…

Machine Learning · Computer Science 2024-03-26 Jia Wei, Xingjun Zhang, Witold Pedrycz

KIND: Knowledge Integration and Diversion for Training Decomposable Models

Pre-trained models have become the preferred backbone due to the increasing complexity of model parameters. However, traditional pre-trained models often face deployment challenges due to their fixed sizes, and are prone to negative…

Computer Vision and Pattern Recognition · Computer Science 2025-05-21 Yucheng Xie, Fu Feng, Ruixiao Shi, Jing Wang, Yong Rui, Xin Geng

Composing Partial Differential Equations with Physics-Aware Neural Networks

We introduce a compositional physics-aware FInite volume Neural Network (FINN) for learning spatiotemporal advection-diffusion processes. FINN implements a new way of combining the learning abilities of artificial neural networks with…

Machine Learning · Computer Science 2022-05-30 Matthias Karlbauer, Timothy Praditia, Sebastian Otte, Sergey Oladyshkin, Wolfgang Nowak, Martin V. Butz

Initialization and Regularization of Factorized Neural Layers

Factorized layers--operations parameterized by products of two or more matrices--occur in a variety of deep learning contexts, including compressed model training, certain types of knowledge distillation, and multi-head self-attention…

Machine Learning · Statistics 2022-10-07 Mikhail Khodak, Neil Tenenholtz, Lester Mackey, Nicolò Fusi