Related papers: ALPINE: An adaptive language-agnostic pruning meth…

Integrating Newton's Laws with deep learning for enhanced physics-informed compound flood modelling

Coastal communities increasingly face compound floods, where multiple drivers like storm surge, high tide, heavy rainfall, and river discharge occur together or in sequence to produce impacts far greater than any single driver alone.…

Geophysics · Physics 2025-07-22 Soheil Radfar, Faezeh Maghsoodifar, Hamed Moftakhari, Hamid Moradkhani

ALPINE: Analog In-Memory Acceleration with Tight Processor Integration for Deep Learning

Analog in-memory computing (AIMC) cores offer significant performance and energy benefits for neural network inference with respect to digital logic (e.g., CPUs). AIMCs accelerate matrix-vector multiplications, which dominate these…

Hardware Architecture · Computer Science 2022-12-19 Joshua Klein, Irem Boybat, Yasir Qureshi, Martino Dazzi, Alexandre Levisse, Giovanni Ansaloni, Marina Zapater, Abu Sebastian, David Atienza

The Structural Scalpel: Automated Contiguous Layer Pruning for Large Language Models

Although large language models (LLMs) have achieved revolutionary breakthroughs in many fields, their large model size and high computational cost pose significant challenges for practical deployment on resource-constrained edge devices. To…

Machine Learning · Computer Science 2025-10-29 Yao Lu, Yuqi Li, Wenbin Xie, Shanqing Yu, Qi Xuan, Zhaowei Zhu, Shiping Wen

Adaptive Pruning for Large Language Models with Structural Importance Awareness

The recent advancements in large language models (LLMs) have significantly improved language understanding and generation capabilities. However, it is difficult to deploy LLMs on resource-constrained edge devices due to their high…

Computation and Language · Computer Science 2024-12-20 Haotian Zheng, Jinke Ren, Yushan Sun, Ruichen Zhang, Wenbo Zhang, Zhen Li, Dusit Niyato, Shuguang Cui, Yatong Han

Greening Large Language Models of Code

Large language models of code have shown remarkable effectiveness across various software engineering tasks. Despite the availability of many cloud services built upon these powerful models, there remain several scenarios where developers…

Software Engineering · Computer Science 2024-01-15 Jieke Shi, Zhou Yang, Hong Jin Kang, Bowen Xu, Junda He, David Lo

Scalene: Scripting-Language Aware Profiling for Python

Existing profilers for scripting languages (a.k.a. "glue" languages) like Python suffer from numerous problems that drastically limit their usefulness. They impose order-of-magnitude overheads, report information at too coarse a…

Programming Languages · Computer Science 2020-07-28 Emery D. Berger

ACE: Exploring Activation Cosine Similarity and Variance for Accurate and Calibration-Efficient LLM Pruning

With the rapid expansion of large language models (LLMs), the demand for memory and computational resources has grown significantly. Recent advances in LLM pruning aim to reduce the size and computational cost of these models. However,…

Machine Learning · Computer Science 2025-05-29 Zhendong Mi, Zhenglun Kong, Geng Yuan, Shaoyi Huang

On the Compression of Language Models for Code: An Empirical Study on CodeBERT

Language models have proven successful across a wide range of software engineering tasks, but their significant computational costs often hinder their practical adoption. To address this challenge, researchers have begun applying various…

Software Engineering · Computer Science 2024-12-19 Giordano d'Aloisio, Luca Traini, Federica Sarro, Antinisca Di Marco

APE: Selective Fine-tuning with Acceptance Criteria for Language Model Adaptation

We present Adjacent Possible Exploration (APE), a selective fine-tuning method for adapting large language models that systematically explores parameter modifications while maintaining model stability. Inspired by evolutionary optimization…

Computation and Language · Computer Science 2025-06-10 Javier Marín

AdamFLIP: Adaptive Momentum Feedback Linearization Optimization for Hard Constrained PINN Training

Physics-informed neural networks (PINNs) provide a flexible framework for solving forward and inverse problems governed by partial differential equations (PDEs), but standard PINN training typically relies on soft penalty formulations that…

Machine Learning · Computer Science 2026-05-12 Binghang Lu, Runyu Zhang, Changhong Mou, Na Li, Guang Lin

Less is More: Towards Green Code Large Language Models via Unified Structural Pruning

The extensive application of Large Language Models (LLMs) in generative coding tasks has raised concerns due to their high computational demands and energy consumption. Unlike previous structural pruning methods designed for classification…

Software Engineering · Computer Science 2025-04-25 Guang Yang, Yu Zhou, Xiangyu Zhang, Wei Cheng, Ke Liu, Xiang Chen, Terry Yue Zhuo, Taolue Chen

Choose Your Model Size: Any Compression of Large Language Models Without Re-Computation

The adoption of Foundation Models in resource-constrained environments remains challenging due to their large size and inference costs. A promising way to overcome these limitations is post-training compression, which aims to balance…

Machine Learning · Computer Science 2025-11-11 Martin Genzel, Patrick Putzky, Pengfei Zhao, Sebastian Schulze, Mattes Mollenhauer, Robert Seidel, Stefan Dietzel, Thomas Wollmann

REFINE: Enhancing Program Repair Agents through Context-Aware Patch Refinement

Large Language Models (LLMs) have recently shown strong potential in automatic program repair (APR), especially in repository-level settings where the goal is to generate patches based on natural language issue descriptions, large…

Software Engineering · Computer Science 2025-10-07 Anvith Pabba, Simin Chen, Alex Mathai, Anindya Chakraborty, Baishakhi Ray

Efficient Vision-Language Reasoning via Adaptive Token Pruning

Real-world deployment of Vision-Language Models (VLMs) is hindered by high computational demands, as existing architectures inefficiently process all tokens uniformly. We introduce Adaptive Token Pruning (ATP), a dynamic inference mechanism…

Computer Vision and Pattern Recognition · Computer Science 2025-12-16 Xue Li, Xiaonan Song, Henry Hu

Streamlining Redundant Layers to Compress Large Language Models

This paper introduces LLM-Streamline, a pioneering work on layer pruning for large language models (LLMs). It is based on the observation that different layers have varying impacts on hidden states, enabling the identification of less…

Computation and Language · Computer Science 2025-01-28 Xiaodong Chen, Yuxuan Hu, Jing Zhang, Yanling Wang, Cuiping Li, Hong Chen

SPAP: Structured Pruning via Alternating Optimization and Penalty Methods

The deployment of large language models (LLMs) is often constrained by their substantial computational and memory demands. While structured pruning presents a viable approach by eliminating entire network components, existing methods suffer…

Machine Learning · Computer Science 2025-05-07 Hanyu Hu, Xiaoming Yuan

PINE: Pruning Boosted Tree Ensembles with Conformal In-Distribution Prediction Equivalence

Tree ensembles are machine learning models with strong predictive performance and interpretability, and remain widely used for tabular data. Standard pruning methods for tree ensembles typically optimize an accuracy-compression trade-off…

Machine Learning · Computer Science 2026-05-28 Haruki Yajima, Yusuke Matsui

Deriving Coding-Specific Sub-Models from LLMs using Resource-Efficient Pruning

Large Language Models (LLMs) have demonstrated their exceptional performance in various complex code generation tasks. However, their broader adoption is limited by significant computational demands and high resource requirements,…

Machine Learning · Computer Science 2025-01-10 Laura Puccioni, Alireza Farshin, Mariano Scazzariello, Changjie Wang, Marco Chiesa, Dejan Kostic

AL-PINN: Active Learning-Driven Physics-Informed Neural Networks for Efficient Sample Selection in Solving Partial Differential Equations

Physics-Informed Neural Networks (PINNs) have emerged as a promising approach for solving Partial Differential Equations (PDEs) by incorporating physical constraints into deep learning models. However, standard PINNs often require a large…

Machine Learning · Computer Science 2025-05-05 Keon Vin Park

RAP: Runtime Adaptive Pruning for LLM Inference

Large language models (LLMs) excel at language understanding and generation, but their enormous computational and memory requirements hinder deployment. Compression offers a potential solution to mitigate these constraints. However, most…

Machine Learning · Computer Science 2026-05-19 Huanrong Liu, Chunlin Tian, Xuyang Wei, Qingbiao Li, Li Li