Distributed, Parallel, and Cluster Computing · Computer Science
DLRover-RM: Resource Optimization for Deep Recommendation Models Training in the Cloud
Qinlong Wang, Tingfeng Lan, Yinghao Tang, Ziling Huang +7
2024-07-01
Distributed, Parallel, and Cluster Computing · Computer Science
Understanding Capacity-Driven Scale-Out Neural Recommendation Inference
Michael Lui, Yavuz Yetim, Özgür Özkan, Zhuoran Zhao +3
2020-11-13
Information Retrieval · Computer Science
UpDLRM: Accelerating Personalized Recommendation using Real-World PIM Architecture
Sitian Chen, Haobin Tan, Amelie Chi Zhou, Yusen Li +1
2024-10-10
Hardware Architecture · Computer Science
Pushing the Performance Envelope of DNN-based Recommendation Systems Inference on GPUs
Rishabh Jain, Vivek M. Bhasi, Adwait Jog, Anand Sivasubramaniam +2
2024-10-30
Information Retrieval · Computer Science
Mem-Rec: Memory Efficient Recommendation System using Alternative Representation
Gopi Krishna Jha, Anthony Thomas, Nilesh Jain, Sameh Gobriel +2
2026-01-06
Distributed, Parallel, and Cluster Computing · Computer Science
Two-dimensional Sparse Parallelism for Large Scale Deep Learning Recommendation Model Training
Xin Zhang, Quanyu Zhu, Liangbei Xu, Zain Huda +7
2025-08-07
Distributed, Parallel, and Cluster Computing · Computer Science
Optimizing Deep Learning Recommender Systems' Training On CPU Cluster Architectures
Dhiraj Kalamkar, Evangelos Georganas, Sudarshan Srinivasan, Jianping Chen +2
2020-05-12
Information Retrieval · Computer Science
A Frequency-aware Software Cache for Large Recommendation System Embeddings
Jiarui Fang, Geng Zhang, Jiatong Han, Shenggui Li +4
2022-08-11
Hardware Architecture · Computer Science
MicroRec: Efficient Recommendation Inference by Hardware and Data Structure Solutions
Wenqi Jiang, Zhenhao He, Shuai Zhang, Thomas B. Preußer +8
2021-02-22
Distributed, Parallel, and Cluster Computing · Computer Science
Automated Deep Neural Network Inference Partitioning for Distributed Embedded Systems
Fabian Kreß, El Mahdi El Annabi, Tim Hotfilter, Julian Hoefer +2
2024-10-14
Distributed, Parallel, and Cluster Computing · Computer Science
RecNMP: Accelerating Personalized Recommendation with Near-Memory Processing
Liu Ke, Udit Gupta, Carole-Jean Wu, Benjamin Youngjae Cho +17
2020-01-01
Machine Learning · Computer Science
Meta-Learning for Speeding Up Large Model Inference in Decentralized Environments
Yipeng Du, Zihao Wang, Ahmad Farhan, Claudio Angione +6
2025-08-14
Information Retrieval · Computer Science
Deep Learning Model Acceleration and Optimization Strategies for Real-Time Recommendation Systems
Junli Shao, Jing Dong, Dingzhou Wang, Kowei Shih +2
2025-08-14
Cryptography and Security · Computer Science
HE-LRM: Efficient Private Embedding Lookups for Neural Inference Using Fully Homomorphic Encryption
Karthik Garimella, Austin Ebel, Gabrielle De Micheli, Brandon Reagen
2026-02-23
Machine Learning · Computer Science
Meta-Learning for Speeding Up Large Model Inference in Decentralized Environments
Yuzhe Yang, Yipeng Du, Ahmad Farhan, Claudio Angione +5
2024-10-30
Information Retrieval · Computer Science
Random Offset Block Embedding Array (ROBE) for CriteoTB Benchmark MLPerf DLRM Model: 1000× Compression and 3.1× Faster Inference
Aditya Desai, Li Chou, Anshumali Shrivastava
2022-01-25
Hardware Architecture · Computer Science
Supporting Massive DLRM Inference Through Software Defined Memory
Ehsan K. Ardestani, Changkyu Kim, Seung Jae Lee, Luoshang Pan +16
2021-11-10
Machine Learning · Computer Science
Mixed-Precision Embedding Using a Cache
Jie Amy Yang, Jianyu Huang, Jongsoo Park, Ping Tak Peter Tang +1
2020-10-26
Distributed, Parallel, and Cluster Computing · Computer Science
An efficient and flexible inference system for serving heterogeneous ensembles of deep neural networks
Pierrick Pochelu, Serge G. Petiton, Bruno Conche
2022-08-31
Distributed, Parallel, and Cluster Computing · Computer Science
Near-Zero-Overhead Freshness for Recommendation Systems via Inference-Side Model Updates
Wenjun Yu, Sitian Chen, Cheng Chen, Amelie Chi Zhou
2025-12-18
Distributed, Parallel, and Cluster Computing · Computer Science
AcceLLM: Accelerating LLM Inference using Redundancy for Load Balancing and Data Locality
Ilias Bournias, Lukas Cavigelli, Georgios Zacharopoulos
2024-11-11
Hardware Architecture · Computer Science
Performance Modeling and Workload Analysis of Distributed Large Language Model Training and Inference
Joyjit Kundu, Wenzhe Guo, Ali BanaGozar, Udari De Alwis +3
2024-07-23