Mojan Javaheripi

Phi-4-reasoning Technical Report

We introduce Phi-4-reasoning, a 14-billion parameter reasoning model that achieves strong performance on complex reasoning tasks. Trained via supervised fine-tuning of Phi-4 on a carefully curated set of "teachable" prompts, selected for the…

Artificial Intelligence · Computer Science 2025-05-01 Marah Abdin , Sahaj Agarwal , Ahmed Awadallah , Vidhisha Balachandran , Harkirat Behl , Lingjiao Chen , Gustavo de Rosa , Suriya Gunasekar , Mojan Javaheripi , Neel Joshi , Piero Kauffmann , Yash Lara , Caio César Teodoro Mendes , Arindam Mitra , Besmira Nushi , Dimitris Papailiopoulos , Olli Saarikivi , Shital Shah , Vaishnavi Shrivastava , Vibhav Vineet , Yue Wu , Safoora Yousefi , Guoqing Zheng

Phi-4 Technical Report

We present phi-4, a 14-billion parameter language model developed with a training recipe that is centrally focused on data quality. Unlike most language models, where pre-training is based primarily on organic data sources such as web…

Computation and Language · Computer Science 2024-12-13 Marah Abdin , Jyoti Aneja , Harkirat Behl , Sébastien Bubeck , Ronen Eldan , Suriya Gunasekar , Michael Harrison , Russell J. Hewett , Mojan Javaheripi , Piero Kauffmann , James R. Lee , Yin Tat Lee , Yuanzhi Li , Weishung Liu , Caio C. T. Mendes , Anh Nguyen , Eric Price , Gustavo de Rosa , Olli Saarikivi , Adil Salim , Shital Shah , Xin Wang , Rachel Ward , Yue Wu , Dingli Yu , Cyril Zhang , Yi Zhang

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5…

Computation and Language · Computer Science 2024-09-04 Marah Abdin , Jyoti Aneja , Hany Awadalla , Ahmed Awadallah , Ammar Ahmad Awan , Nguyen Bach , Amit Bahree , Arash Bakhtiari , Jianmin Bao , Harkirat Behl , Alon Benhaim , Misha Bilenko , Johan Bjorck , Sébastien Bubeck , Martin Cai , Qin Cai , Vishrav Chaudhary , Dong Chen , Dongdong Chen , Weizhu Chen , Yen-Chun Chen , Yi-Ling Chen , Hao Cheng , Parul Chopra , Xiyang Dai , Matthew Dixon , Ronen Eldan , Victor Fragoso , Jianfeng Gao , Mei Gao , Min Gao , Amit Garg , Allie Del Giorno , Abhishek Goswami , Suriya Gunasekar , Emman Haider , Junheng Hao , Russell J. Hewett , Wenxiang Hu , Jamie Huynh , Dan Iter , Sam Ade Jacobs , Mojan Javaheripi , Xin Jin , Nikos Karampatziakis , Piero Kauffmann , Mahoud Khademi , Dongwoo Kim , Young Jin Kim , Lev Kurilenko , James R. Lee , Yin Tat Lee , Yuanzhi Li , Yunsheng Li , Chen Liang , Lars Liden , Xihui Lin , Zeqi Lin , Ce Liu , Liyuan Liu , Mengchen Liu , Weishung Liu , Xiaodong Liu , Chong Luo , Piyush Madan , Ali Mahmoudzadeh , David Majercak , Matt Mazzola , Caio César Teodoro Mendes , Arindam Mitra , Hardik Modi , Anh Nguyen , Brandon Norick , Barun Patra , Daniel Perez-Becker , Thomas Portet , Reid Pryzant , Heyang Qin , Marko Radmilac , Liliang Ren , Gustavo de Rosa , Corby Rosset , Sambudha Roy , Olatunji Ruwase , Olli Saarikivi , Amin Saied , Adil Salim , Michael Santacroce , Shital Shah , Ning Shang , Hiteshi Sharma , Yelong Shen , Swadheen Shukla , Xia Song , Masahiro Tanaka , Andrea Tupini , Praneetha Vaddamanu , Chunyu Wang , Guanhua Wang , Lijuan Wang , Shuohang Wang , Xin Wang , Yu Wang , Rachel Ward , Wen Wen , Philipp Witte , Haiping Wu , Xiaoxia Wu , Michael Wyatt , Bin Xiao , Can Xu , Jiahang Xu , Weijian Xu , Jilong Xue , Sonali Yadav , Fan Yang , Jianwei Yang , Yifan Yang , Ziyi Yang , Donghan Yu , Lu Yuan , Chenruidong Zhang , Cyril Zhang , Jianwen Zhang , Li Lyna Zhang , Yi Zhang , Yue Zhang , Yunan Zhang , Xiren Zhou

Textbooks Are All You Need

We introduce phi-1, a new large language model for code, with significantly smaller size than competing models: phi-1 is a Transformer-based model with 1.3B parameters, trained for 4 days on 8 A100s, using a selection of "textbook quality"…

Computation and Language · Computer Science 2023-10-03 Suriya Gunasekar , Yi Zhang , Jyoti Aneja , Caio César Teodoro Mendes , Allie Del Giorno , Sivakanth Gopi , Mojan Javaheripi , Piero Kauffmann , Gustavo de Rosa , Olli Saarikivi , Adil Salim , Shital Shah , Harkirat Singh Behl , Xin Wang , Sébastien Bubeck , Ronen Eldan , Adam Tauman Kalai , Yin Tat Lee , Yuanzhi Li

zPROBE: Zero Peek Robustness Checks for Federated Learning

Privacy-preserving federated learning allows multiple users to jointly train a model with coordination of a central server. The server only learns the final aggregation result, thus the users' (private) training data is not leaked from the…

Machine Learning · Computer Science 2023-09-06 Zahra Ghodsi , Mojan Javaheripi , Nojan Sheybani , Xinqiao Zhang , Ke Huang , Farinaz Koushanfar

NetFlick: Adversarial Flickering Attacks on Deep Learning Based Video Compression

Video compression plays a significant role in IoT devices for the efficient transport of visual data while satisfying all underlying bandwidth constraints. Deep learning-based video compression methods are rapidly replacing traditional…

Image and Video Processing · Electrical Eng. & Systems 2023-04-05 Jung-Woo Chang , Nojan Sheybani , Shehzeen Samarah Hussain , Mojan Javaheripi , Seira Hidano , Farinaz Koushanfar

RoVISQ: Reduction of Video Service Quality via Adversarial Attacks on Deep Learning-based Video Compression

Video compression plays a crucial role in video streaming and classification systems by maximizing the end-user quality of experience (QoE) at a given bandwidth budget. In this paper, we conduct the first systematic study for adversarial…

Computer Vision and Pattern Recognition · Computer Science 2022-12-09 Jung-Woo Chang , Mojan Javaheripi , Seira Hidano , Farinaz Koushanfar

LiteTransformerSearch: Training-free Neural Architecture Search for Efficient Language Models

The Transformer architecture is ubiquitously used as the building block of large-scale autoregressive language models. However, finding architectures with the optimal trade-off between task performance (perplexity) and hardware constraints…

Machine Learning · Computer Science 2022-10-19 Mojan Javaheripi , Gustavo H. de Rosa , Subhabrata Mukherjee , Shital Shah , Tomasz L. Religa , Caio C. T. Mendes , Sebastien Bubeck , Farinaz Koushanfar , Debadeepta Dey

Machine Learning-Assisted E-jet Printing of Organic Flexible Biosensors

The electrohydrodynamic-jet (e-jet) printing technique enables the high-resolution printing of complex soft electronic devices. As such, it has an unmatched potential for becoming the conventional technique for printing soft electronic devices.…

Machine Learning · Computer Science 2021-11-09 Mehran Abbasi Shirsavar , Mehrnoosh Taghavimehr , Lionel J. Ouedraogo , Mojan Javaheripi , Nicole N. Hashemi , Farinaz Koushanfar , Reza Montazami

HASHTAG: Hash Signatures for Online Detection of Fault-Injection Attacks on Deep Neural Networks

We propose HASHTAG, the first framework that enables high-accuracy detection of fault-injection attacks on Deep Neural Networks (DNNs) with provable bounds on detection performance. Recent literature in fault-injection attacks shows the…

Cryptography and Security · Computer Science 2021-11-04 Mojan Javaheripi , Farinaz Koushanfar

Trojan Signatures in DNN Weights

Deep neural networks have been shown to be vulnerable to backdoor, or trojan, attacks where an adversary has embedded a trigger in the network at training time such that the model correctly classifies all standard inputs, but generates a…

Machine Learning · Computer Science 2021-09-08 Greg Fields , Mohammad Samragh , Mojan Javaheripi , Farinaz Koushanfar , Tara Javidi

Extracurricular Learning: Knowledge Transfer Beyond Empirical Distribution

Knowledge distillation has been used to transfer knowledge learned by a sophisticated model (teacher) to a simpler model (student). This technique is widely used to compress model complexity. However, in most applications the compressed…

Machine Learning · Computer Science 2020-11-24 Hadi Pouransari , Mojan Javaheripi , Vinay Sharma , Oncel Tuzel

CLEANN: Accelerated Trojan Shield for Embedded Neural Networks

We propose CLEANN, the first end-to-end framework that enables online mitigation of Trojans for embedded Deep Neural Network (DNN) applications. A Trojan attack works by injecting a backdoor in the DNN while training; during inference, the…

Machine Learning · Computer Science 2020-09-08 Mojan Javaheripi , Mohammad Samragh , Gregory Fields , Tara Javidi , Farinaz Koushanfar

GeneCAI: Genetic Evolution for Acquiring Compact AI

In the contemporary big data realm, Deep Neural Networks (DNNs) are evolving towards more complex architectures to achieve higher inference accuracy. Model compression techniques can be leveraged to efficiently deploy such compute-intensive…

Machine Learning · Computer Science 2020-04-15 Mojan Javaheripi , Mohammad Samragh , Tara Javidi , Farinaz Koushanfar

FastWave: Accelerating Autoregressive Convolutional Neural Networks on FPGA

Autoregressive convolutional neural networks (CNNs) have been widely exploited for sequence generation tasks such as audio synthesis, language modeling and neural machine translation. WaveNet is a deep autoregressive CNN composed of several…

Audio and Speech Processing · Electrical Eng. & Systems 2020-02-13 Shehzeen Hussain , Mojan Javaheripi , Paarth Neekhara , Ryan Kastner , Farinaz Koushanfar

ASCAI: Adaptive Sampling for acquiring Compact AI

This paper introduces ASCAI, a novel adaptive sampling methodology that can learn how to effectively compress Deep Neural Networks (DNNs) for accelerated inference on resource-constrained platforms. Modern DNN compression techniques…

Machine Learning · Computer Science 2019-11-18 Mojan Javaheripi , Mohammad Samragh , Tara Javidi , Farinaz Koushanfar

SWNet: Small-World Neural Networks and Rapid Convergence

Training large and highly accurate deep learning (DL) models is computationally costly. This cost is in great part due to the excessive number of trained parameters, which are well-known to be redundant and compressible for the execution…

Machine Learning · Computer Science 2019-04-11 Mojan Javaheripi , Bita Darvish Rouhani , Farinaz Koushanfar

CodeX: Bit-Flexible Encoding for Streaming-based FPGA Acceleration of DNNs

This paper proposes CodeX, an end-to-end framework that facilitates encoding, bitwidth customization, fine-tuning, and implementation of neural networks on FPGA platforms. CodeX incorporates nonlinear encoding to the computation flow of…

Machine Learning · Computer Science 2019-01-18 Mohammad Samragh , Mojan Javaheripi , Farinaz Koushanfar

DeepFense: Online Accelerated Defense Against Adversarial Deep Learning

Recent advances in adversarial Deep Learning (DL) have opened up a largely unexplored surface for malicious attacks jeopardizing the integrity of autonomous DL systems. With the wide-spread usage of DL in critical and time-sensitive…

Cryptography and Security · Computer Science 2018-08-22 Bita Darvish Rouhani , Mohammad Samragh , Mojan Javaheripi , Tara Javidi , Farinaz Koushanfar