Related papers: Fuzzy Knowledge Distillation from High-Order TSK t…

Distilling a Deep Neural Network into a Takagi-Sugeno-Kang Fuzzy Inference System

Deep neural networks (DNNs) demonstrate great success in classification tasks. However, they act as black boxes and we don't know how they make decisions in a particular classification task. To this end, we propose to distill the knowledge…

Artificial Intelligence · Computer Science 2020-10-13 Xiangming Gu, Xiang Cheng

Fuzzy Rule-based Differentiable Representation Learning

Representation learning has emerged as a crucial focus in machine and deep learning, involving the extraction of meaningful and useful features and patterns from the input data, thereby enhancing the performance of various downstream tasks…

Machine Learning · Computer Science 2025-03-19 Wei Zhang, Zhaohong Deng, Guanjin Wang, Kup-Sze Choi

Optimize TSK Fuzzy Systems for Classification Problems: Mini-Batch Gradient Descent with Uniform Regularization and Batch Normalization

Takagi-Sugeno-Kang (TSK) fuzzy systems are flexible and interpretable machine learning models; however, they may not be easily optimized when the data size is large, and/or the data dimensionality is high. This paper proposes a mini-batch…

Machine Learning · Computer Science 2020-12-04 Yuqi Cui, Jian Huang, Dongrui Wu

Multi-Label Takagi-Sugeno-Kang Fuzzy System

Multi-label classification can effectively identify the relevant labels of an instance from a given set of labels. However, the modeling of the relationship between the features and the labels is critical to the classification performance.…

Artificial Intelligence · Computer Science 2023-09-21 Qiongdan Lou, Zhaohong Deng, Zhiyong Xiao, Kup-Sze Choi, Shitong Wang

Concise Fuzzy System Modeling Integrating Soft Subspace Clustering and Sparse Learning

The superior interpretability and uncertainty modeling ability of Takagi-Sugeno-Kang fuzzy system (TSK FS) make it possible to describe complex nonlinear systems intuitively and efficiently. However, classical TSK FS usually adopts the…

Machine Learning · Computer Science 2019-04-25 Peng Xu, Zhaohong Deng, Chen Cui, Te Zhang, Kup-Sze Choi, Gu Suhang, Jun Wang, ShiTong Wang

ToDi: Token-wise Distillation via Fine-Grained Divergence Control

Large language models (LLMs) offer impressive performance but are impractical for resource-constrained deployment due to high latency and energy consumption. Knowledge distillation (KD) addresses this by transferring knowledge from a large…

Computation and Language · Computer Science 2025-09-30 Seongryong Jung, Suwan Yoon, DongGeon Kim, Hwanhee Lee

Revisiting Knowledge Distillation via Label Smoothing Regularization

Knowledge Distillation (KD) aims to distill the knowledge of a cumbersome teacher model into a lightweight student model. Its success is generally attributed to the privileged information on similarities among categories provided by the…

Computer Vision and Pattern Recognition · Computer Science 2021-03-05 Li Yuan, Francis E. H. Tay, Guilin Li, Tao Wang, Jiashi Feng

FiGKD: Fine-Grained Knowledge Distillation via High-Frequency Detail Transfer

Knowledge distillation (KD) is a widely adopted technique for transferring knowledge from a high-capacity teacher model to a smaller student model by aligning their output distributions. However, existing methods often underperform in…

Computer Vision and Pattern Recognition · Computer Science 2026-03-25 Seonghak Kim

Optimize TSK Fuzzy Systems for Regression Problems: Mini-Batch Gradient Descent with Regularization, DropRule and AdaBound (MBGD-RDA)

Takagi-Sugeno-Kang (TSK) fuzzy systems are very useful machine learning models for regression problems. However, to our knowledge, there has not existed an efficient and effective training algorithm that ensures their generalization…

Machine Learning · Computer Science 2019-12-03 Dongrui Wu, Ye Yuan, Yihua Tan

Interpretable Style Takagi-Sugeno-Kang Fuzzy Clustering

Clustering is an efficient and essential technique for exploring latent knowledge of data. However, limited attention has been given to the interpretability of the clusters detected by most clustering algorithms. In addition, due to the…

Machine Learning · Computer Science 2025-04-08 Suhang Gu, Ye Wang, Yongxin Chou, Jinliang Cong, Mingli Lu, Zhuqing Jiao

A Robust Multilabel Method Integrating Rule-based Transparent Model, Soft Label Correlation Learning and Label Noise Resistance

Model transparency, label correlation learning and the robustness to label noise are crucial for multilabel learning. However, few existing methods study these three characteristics simultaneously. To address this challenge, we propose the…

Artificial Intelligence · Computer Science 2023-09-26 Qiongdan Lou, Zhaohong Deng, Kup-Sze Choi, Shitong Wang

Hybrid Interval Type-2 Mamdani-TSK Fuzzy System for Regression Analysis

Regression analysis is employed to examine and quantify the relationships between input variables and a dependent and continuous output variable. It is widely used for predictive modelling in fields such as finance, healthcare, and…

Machine Learning · Computer Science 2025-10-16 Ashish Bhatia, Renato Cordeiro de Amorim, Vito De Feo

Knowledge Distillation with Deep Supervision

Knowledge distillation aims to enhance the performance of a lightweight student model by exploiting the knowledge from a pre-trained cumbersome teacher model. However, in the traditional knowledge distillation, teacher predictions are only…

Machine Learning · Computer Science 2023-05-26 Shiya Luo, Defang Chen, Can Wang

Subclass Knowledge Distillation with Known Subclass Labels

This work introduces a novel knowledge distillation framework for classification tasks where information on existing subclasses is available and taken into consideration. In classification tasks with a small number of classes or binary…

Machine Learning · Computer Science 2022-07-19 Ahmad Sajedi, Yuri A. Lawryshyn, Konstantinos N. Plataniotis

DDK: Distilling Domain Knowledge for Efficient Large Language Models

Despite the advanced intelligence abilities of large language models (LLMs) in various applications, they still face significant computational and storage demands. Knowledge Distillation (KD) has emerged as an effective strategy to improve…

Computation and Language · Computer Science 2024-07-24 Jiaheng Liu, Chenchen Zhang, Jinyang Guo, Yuanxing Zhang, Haoran Que, Ken Deng, Zhiqi Bai, Jie Liu, Ge Zhang, Jiakai Wang, Yanan Wu, Congnan Liu, Wenbo Su, Jiamang Wang, Lin Qu, Bo Zheng

From Knowledge Distillation to Self-Knowledge Distillation: A Unified Approach with Normalized Loss and Customized Soft Labels

Knowledge Distillation (KD) uses the teacher's prediction logits as soft labels to guide the student, while self-KD does not need a real teacher to require the soft labels. This work unifies the formulations of the two tasks by decomposing…

Computer Vision and Pattern Recognition · Computer Science 2023-07-18 Zhendong Yang, Ailing Zeng, Zhe Li, Tianke Zhang, Chun Yuan, Yu Li

UHKD: A Unified Framework for Heterogeneous Knowledge Distillation via Frequency-Domain Representations

Knowledge distillation (KD) is an effective model compression technique that transfers knowledge from a high-performance teacher to a lightweight student, reducing computational and storage costs while maintaining competitive accuracy.…

Computer Vision and Pattern Recognition · Computer Science 2025-11-17 Fengming Yu, Haiwei Pan, Kejia Zhang, Jian Guan, Haiying Jiang

PLD: A Choice-Theoretic List-Wise Knowledge Distillation

Knowledge distillation is a model compression technique in which a compact "student" network is trained to replicate the predictive behavior of a larger "teacher" network. In logit-based knowledge distillation, it has become the de facto…

Machine Learning · Computer Science 2026-05-12 Ejafa Bassam, Dawei Zhu, Kaigui Bian

Multi-level Knowledge Distillation via Knowledge Alignment and Correlation

Knowledge distillation (KD) has become an important technique for model compression and knowledge transfer. In this work, we first perform a comprehensive analysis of the knowledge transferred by different KD methods. We demonstrate that…

Computer Vision and Pattern Recognition · Computer Science 2021-06-07 Fei Ding, Yin Yang, Hongxin Hu, Venkat Krovi, Feng Luo

Student-friendly Knowledge Distillation

In knowledge distillation, the knowledge from the teacher model is often too complex for the student model to thoroughly process. However, good teachers in real life always simplify complex material before teaching it to students. Inspired…

Computer Vision and Pattern Recognition · Computer Science 2023-05-19 Mengyang Yuan, Bo Lang, Fengnan Quan