Vijayan N. Nair — Scifaro

Using Markov Boundary Approach for Interpretable and Generalizable Feature Selection

The perceived advantage of machine learning (ML) models is that they are flexible and can incorporate a large number of features. However, many of these are typically correlated or dependent, and incorporating all of them can hinder model…

Applications · Statistics 2025-03-11 Anwesha Bhattacharyya, Yaqun Wang, Joel Vaughan, Vijayan N. Nair

Cross Spline Net and a Unified World

In today's machine learning world for tabular data, XGBoost and fully connected neural networks (FCNN) are the two most popular methods due to their good model performance and ease of use. However, they are highly complicated, hard to…

Methodology · Statistics 2024-10-28 Linwei Hu, Ye Jin Choi, Vijayan N. Nair

Assessing Robustness of Machine Learning Models using Covariate Perturbations

As machine learning models become increasingly prevalent in critical decision-making models and systems in fields like finance, healthcare, etc., ensuring their robustness against adversarial attacks and changes in the input data is…

Machine Learning · Statistics 2024-08-05 Arun Prakash R, Anwesha Bhattacharyya, Joel Vaughan, Vijayan N. Nair

Using Model-Based Trees with Boosting to Fit Low-Order Functional ANOVA Models

Low-order functional ANOVA (fANOVA) models have been rediscovered in the machine learning (ML) community under the guise of inherently interpretable machine learning. Explainable Boosting Machines or EBM (Lou et al. 2013) and GAMI-Net (Yang…

Machine Learning · Statistics 2023-12-19 Linwei Hu, Jie Chen, Vijayan N. Nair

Monotone Tree-Based GAMI Models by Adapting XGBoost

Recent papers have used machine learning architecture to fit low-order functional ANOVA models with main effects and second-order interactions. These GAMI (GAM + Interaction) models are directly interpretable as the functional main effects…

Machine Learning · Statistics 2023-09-06 Linwei Hu, Soroush Aramideh, Jie Chen, Vijayan N. Nair

Document Automation Architectures: Updated Survey in Light of Large Language Models

This paper surveys the current state of the art in document automation (DA). The objective of DA is to reduce the manual effort during the generation of documents by automatically creating and integrating input from different sources and…

Computation and Language · Computer Science 2023-08-21 Mohammad Ahmadi Achachlouei, Omkar Patil, Tarun Joshi, Vijayan N. Nair

Interpretable Machine Learning based on Functional ANOVA Framework: Algorithms and Comparisons

In the early days of machine learning (ML), the emphasis was on developing complex algorithms to achieve the best predictive performance. To understand and explain the model results, one had to rely on post hoc explainability techniques, which…

Machine Learning · Statistics 2023-05-26 Linwei Hu, Vijayan N. Nair, Agus Sudjianto, Aijun Zhang, Jie Chen

Behavior of Hyper-Parameters for Selected Machine Learning Algorithms: An Empirical Investigation

Hyper-parameters (HPs) are an important part of machine learning (ML) model development and can greatly influence performance. This paper studies their behavior for three algorithms: Extreme Gradient Boosting (XGB), Random Forest (RF), and…

Machine Learning · Computer Science 2022-11-17 Anwesha Bhattacharyya, Joel Vaughan, Vijayan N. Nair

Comparing Baseline Shapley and Integrated Gradients for Local Explanation: Some Additional Insights

There are many different methods in the literature for local explanation of machine learning results. However, the methods differ in their approaches and often do not provide the same explanations. In this paper, we consider two recent methods:…

Machine Learning · Computer Science 2022-08-15 Tianshu Feng, Zhipu Zhou, Joshi Tarun, Vijayan N. Nair

Quantifying Inherent Randomness in Machine Learning Algorithms

Most machine learning (ML) algorithms have several stochastic elements, and their performances are affected by these sources of randomness. This paper uses an empirical study to systematically examine the effects of two sources: randomness…

Machine Learning · Statistics 2022-06-27 Soham Raste, Rahul Singh, Joel Vaughan, Vijayan N. Nair

Interpretable Feature Engineering for Time Series Predictors using Attention Networks

Regression problems with time-series predictors are common in banking and many other areas of application. In this paper, we use multi-head attention networks to develop interpretable features and use them to achieve good predictive…

Machine Learning · Computer Science 2022-05-26 Tianjie Wang, Jie Chen, Joel Vaughan, Vijayan N. Nair

Performance and Interpretability Comparisons of Supervised Machine Learning Algorithms: An Empirical Study

This paper compares the performances of three supervised machine learning algorithms in terms of predictive ability and model interpretation on structured or tabular data. The algorithms considered were scikit-learn implementations of…

Machine Learning · Statistics 2022-05-06 Alice J. Liu, Arpita Mukherjee, Linwei Hu, Jie Chen, Vijayan N. Nair

Explaining Adverse Actions in Credit Decisions Using Shapley Decomposition

When a financial institution declines an application for credit, an adverse action (AA) is said to occur. The applicant is then entitled to an explanation for the negative decision. This paper focuses on credit decisions based on a…

Machine Learning · Statistics 2022-04-27 Vijayan N. Nair, Tianshu Feng, Linwei Hu, Zach Zhang, Jie Chen, Agus Sudjianto

Document Automation Architectures and Technologies: A Survey

This paper surveys the current state of the art in document automation (DA). The objective of DA is to reduce the manual effort during the generation of documents by automatically integrating input from different sources and assembling…

Computation and Language · Computer Science 2021-09-27 Mohammad Ahmadi Achachlouei, Omkar Patil, Tarun Joshi, Vijayan N. Nair

Self-interpretable Convolutional Neural Networks for Text Classification

Deep learning models for natural language processing (NLP) are inherently complex and often viewed as black boxes. This paper develops an approach for interpreting convolutional neural networks for text classification problems by…

Computation and Language · Computer Science 2021-07-12 Wei Zhao, Rahul Singh, Tarun Joshi, Agus Sudjianto, Vijayan N. Nair

SHAP values for Explaining CNN-based Text Classification Models

Deep neural networks are increasingly used in natural language processing (NLP) models. However, the need to interpret and explain the results from complex algorithms is limiting their widespread adoption in regulated industries such as…

Computation and Language · Computer Science 2021-07-12 Wei Zhao, Tarun Joshi, Vijayan N. Nair, Agus Sudjianto

Bias, Fairness, and Accountability with AI and ML Algorithms

The advent of AI and ML algorithms has led to opportunities as well as challenges. In this paper, we provide an overview of bias and fairness issues that arise with the use of ML algorithms. We describe the types and sources of data bias,…

Machine Learning · Statistics 2021-05-17 Nengfeng Zhou, Zach Zhang, Vijayan N. Nair, Harsh Singhal, Jie Chen, Agus Sudjianto

Recent Trends in the Use of Deep Learning Models for Grammar Error Handling

Grammar error handling (GEH) is an important topic in natural language processing (NLP). GEH includes both grammar error detection and grammar error correction. Recent advances in computation systems have promoted the use of deep learning…

Computation and Language · Computer Science 2020-09-08 Mina Naghshnejad, Tarun Joshi, Vijayan N. Nair

Model Robustness with Text Classification: Semantic-preserving adversarial attacks

We propose algorithms to create adversarial attacks to assess model robustness in text classification problems. They can be used to create white box attacks and black box attacks while at the same time preserving the semantics and syntax of…

Computation and Language · Computer Science 2020-08-17 Rahul Singh, Tarun Joshi, Vijayan N. Nair, Agus Sudjianto

Supervised Machine Learning Techniques: An Overview with Applications to Banking

This article provides an overview of Supervised Machine Learning (SML) with a focus on applications to banking. The SML techniques covered include Bagging (Random Forest or RF), Boosting (Gradient Boosting Machine or GBM) and Neural…

General Finance · Quantitative Finance 2020-08-11 Linwei Hu, Jie Chen, Joel Vaughan, Hanyu Yang, Kelly Wang, Agus Sudjianto, Vijayan N. Nair