Zebin Yang — Scifaro

KEEP: A KV-Cache-Centric Memory Management System for Efficient Embodied Planning

Memory-augmented Large Language Models (LLMs) have demonstrated remarkable capability for complex and long-horizon embodied planning. By keeping track of past experiences and environmental states, memory enables LLMs to maintain a global…

Robotics · Computer Science 2026-03-18 Zebin Yang, Tong Xie, Baotong Lu, Shaoshan Liu, Bo Yu, Meng Li

DySL-VLA: Efficient Vision-Language-Action Model Inference via Dynamic-Static Layer-Skipping for Robot Manipulation

Vision-Language-Action (VLA) models have shown remarkable success in robotic tasks like manipulation by fusing a language model's reasoning with a vision model's 3D understanding. However, their high computational cost remains a major…

Robotics · Computer Science 2026-03-18 Zebin Yang, Yijiahao Qi, Tong Xie, Bo Yu, Shaoshan Liu, Meng Li

EfficientNav: Towards On-Device Object-Goal Navigation with Navigation Map Caching and Retrieval

Object-goal navigation (ObjNav) tasks an agent with navigating to the location of a specific object in an unseen environment. Embodied agents equipped with large language models (LLMs) and online constructed navigation maps can perform…

Robotics · Computer Science 2026-03-18 Zebin Yang, Sunjian Zheng, Tong Xie, Tianshi Xu, Bo Yu, Fan Wang, Jie Tang, Shaoshan Liu, Meng Li

LightMamba: Efficient Mamba Acceleration on FPGA with Quantization and Hardware Co-design

State space models (SSMs) like Mamba have recently attracted much attention. Compared to Transformer-based large language models (LLMs), Mamba achieves linear computation complexity with the sequence length and demonstrates superior…

Computation and Language · Computer Science 2025-10-13 Renjie Wei, Songqiang Xu, Linfeng Zhong, Zebin Yang, Qingyu Guo, Yuan Wang, Runsheng Wang, Meng Li

Inherently Interpretable Tree Ensemble Learning

Tree ensemble models like random forests and gradient boosting machines are widely used in machine learning due to their excellent predictive performance. However, a high-performance ensemble consisting of a large number of decision trees…

Machine Learning · Statistics 2024-10-28 Zebin Yang, Agus Sudjianto, Xiaoming Li, Aijun Zhang

MCUBERT: Memory-Efficient BERT Inference on Commodity Microcontrollers

In this paper, we propose MCUBERT to enable language models like BERT on tiny microcontroller units (MCUs) through network and scheduling co-optimization. We observe the embedding table contributes to the major storage bottleneck for tiny…

Machine Learning · Computer Science 2024-10-24 Zebin Yang, Renze Chen, Taiqiang Wu, Ngai Wong, Yun Liang, Runsheng Wang, Ru Huang, Meng Li

FastQuery: Communication-efficient Embedding Table Query for Private LLM Inference

With the fast evolution of large language models (LLMs), privacy concerns with user queries arise as they may contain sensitive information. Private inference based on homomorphic encryption (HE) has been proposed to protect user query…

Cryptography and Security · Computer Science 2024-05-28 Chenqi Lin, Tianshi Xu, Zebin Yang, Runsheng Wang, Ru Huang, Meng Li

ProPD: Dynamic Token Tree Pruning and Generation for LLM Parallel Decoding

Recent advancements in generative large language models (LLMs) have significantly boosted the performance in natural language processing tasks. However, their efficiency is hampered by the inherent limitations in autoregressive token…

Machine Learning · Computer Science 2024-02-22 Shuzhang Zhong, Zebin Yang, Meng Li, Ruihao Gong, Runsheng Wang, Ru Huang

AttentionLego: An Open-Source Building Block For Spatially-Scalable Large Language Model Accelerator With Processing-In-Memory Technology

Large language models (LLMs) with Transformer architectures have become phenomenal in natural language processing, multimodal generative artificial intelligence, and agent-oriented artificial intelligence. The self-attention module is the…

Hardware Architecture · Computer Science 2024-01-23 Rongqing Cong, Wenyang He, Mingxuan Li, Bangning Luo, Zebin Yang, Yuchao Yang, Ru Huang, Bonan Yan

PiML Toolbox for Interpretable Machine Learning Model Development and Diagnostics

PiML (read $\pi$-ML, /`pai`em`el/) is an integrated and open-access Python toolbox for interpretable machine learning model development and model diagnostics. It is designed with machine learning workflows in both low-code and high-code…

Machine Learning · Computer Science 2023-12-21 Agus Sudjianto, Aijun Zhang, Zebin Yang, Yu Su, Ningzhou Zeng

Hyperparameter Optimization via Sequential Uniform Designs

Hyperparameter optimization (HPO) plays a central role in automated machine learning (AutoML). It is a challenging task because the response surfaces of hyperparameters are generally unknown, which makes it essentially a global optimization problem.…

Machine Learning · Computer Science 2021-06-18 Zebin Yang, Aijun Zhang

GAMI-Net: An Explainable Neural Network based on Generalized Additive Models with Structured Interactions

The lack of interpretability is an inevitable problem when using neural network models in real applications. In this paper, an explainable neural network based on generalized additive models with structured interactions (GAMI-Net) is…

Machine Learning · Statistics 2021-06-03 Zebin Yang, Aijun Zhang, Agus Sudjianto

Explainable Recommendation Systems by Generalized Additive Models with Manifest and Latent Interactions

In recent years, the field of recommendation systems has attracted increasing attention to developing predictive models that provide explanations of why an item is recommended to a user. The explanations can be either obtained by post-hoc…

Machine Learning · Computer Science 2020-12-16 Yifeng Guo, Yu Su, Zebin Yang, Aijun Zhang

Unwrapping The Black Box of Deep ReLU Networks: Interpretability, Diagnostics, and Simplification

Deep neural networks (DNNs) have achieved great success in learning complex patterns with strong predictive power, but they are often thought of as "black box" models without a sufficient level of transparency and interpretability. It…

Machine Learning · Computer Science 2020-11-10 Agus Sudjianto, William Knauth, Rahul Singh, Zebin Yang, Aijun Zhang

An Effective and Efficient Initialization Scheme for Training Multi-layer Feedforward Neural Networks

Network initialization is the first and critical step for training neural networks. In this paper, we propose a novel network initialization scheme based on the celebrated Stein's identity. By viewing multi-layer feedforward neural networks…

Machine Learning · Computer Science 2020-06-26 Zebin Yang, Hengtao Zhang, Agus Sudjianto, Aijun Zhang

Enhancing Explainability of Neural Networks through Architecture Constraints

Prediction accuracy and model explainability are the two most important objectives when developing machine learning algorithms to solve real-world problems. Neural networks are known to possess good prediction performance, but lack of…

Machine Learning · Statistics 2019-09-04 Zebin Yang, Aijun Zhang, Agus Sudjianto

Interval-valued Data Prediction via Regularized Artificial Neural Network

A regularized artificial neural network (RANN) is proposed for interval-valued data prediction. The ANN model is selected due to its powerful capability in fitting linear and nonlinear functions. To meet mathematical coherence requirement…

Computation · Statistics 2018-08-22 Zebin Yang, Dennis K. J. Lin, Aijun Zhang