Lin Pan — Scifaro

PRACTIQ: A Practical Conversational Text-to-SQL dataset with Ambiguous and Unanswerable Queries

Previous text-to-SQL datasets and systems have primarily focused on user questions with clear intentions that can be answered. However, real user questions can often be ambiguous with multiple interpretations or unanswerable due to a lack…

Computation and Language · Computer Science 2026-01-26 Mingwen Dong, Nischal Ashok Kumar, Yiqun Hu, Anuj Chauhan, Chung-Wei Hang, Shuaichen Chang, Lin Pan, Wuwei Lan, Henghui Zhu, Jiarong Jiang, Patrick Ng, Zhiguo Wang

Towards a Holistic Evaluation of LLMs on Factual Knowledge Recall

Large language models (LLMs) have shown remarkable performance on a variety of NLP tasks, and are being rapidly adopted in a wide range of use cases. It is therefore of vital importance to holistically evaluate the factuality of their…

Computation and Language · Computer Science 2024-04-26 Jiaqing Yuan, Lin Pan, Chung-Wei Hang, Jiang Guo, Jiarong Jiang, Bonan Min, Patrick Ng, Zhiguo Wang

Curvilinear object segmentation in medical images based on ODoS filter and deep learning network

Automatic segmentation of curvilinear objects in medical images plays an important role in the diagnosis and evaluation of human diseases, yet it is a challenging uncertainty in the complex segmentation tasks due to different issues such as…

Image and Video Processing · Electrical Eng. & Systems 2023-12-05 Yuanyuan Peng, Lin Pan, Pengpeng Luan, Hongbin Tu, Xiong Li

UNITE: A Unified Benchmark for Text-to-SQL Evaluation

A practical text-to-SQL system should generalize well on a wide variety of natural language questions, unseen database schemas, and novel SQL query structures. To comprehensively evaluate text-to-SQL systems, we introduce a UNIfied…

Computation and Language · Computer Science 2023-07-17 Wuwei Lan, Zhiguo Wang, Anuj Chauhan, Henghui Zhu, Alexander Li, Jiang Guo, Sheng Zhang, Chung-Wei Hang, Joseph Lilien, Yiqun Hu, Lin Pan, Mingwen Dong, Jun Wang, Jiarong Jiang, Stephen Ash, Vittorio Castelli, Patrick Ng, Bing Xiang

Dr.Spider: A Diagnostic Evaluation Benchmark towards Text-to-SQL Robustness

Neural text-to-SQL models have achieved remarkable performance in translating natural language questions into SQL queries. However, recent studies reveal that text-to-SQL models are vulnerable to task-specific perturbations. Previous…

Computation and Language · Computer Science 2023-01-31 Shuaichen Chang, Jun Wang, Mingwen Dong, Lin Pan, Henghui Zhu, Alexander Hanbo Li, Wuwei Lan, Sheng Zhang, Jiarong Jiang, Joseph Lilien, Steve Ash, William Yang Wang, Zhiguo Wang, Vittorio Castelli, Patrick Ng, Bing Xiang

Importance of Synthesizing High-quality Data for Text-to-SQL Parsing

Recently, there has been increasing interest in synthesizing data to improve downstream text-to-SQL tasks. In this paper, we first examined the existing synthesized datasets and discovered that state-of-the-art text-to-SQL algorithms did…

Computation and Language · Computer Science 2022-12-20 Yiyun Zhao, Jiarong Jiang, Yiqun Hu, Wuwei Lan, Henry Zhu, Anuj Chauhan, Alexander Li, Lin Pan, Jun Wang, Chung-Wei Hang, Sheng Zhang, Marvin Dong, Joe Lilien, Patrick Ng, Zhiguo Wang, Vittorio Castelli, Bing Xiang

Composite learning control with application to inverted pendulums

Composite adaptive control (CAC) that integrates direct and indirect adaptive control techniques can achieve smaller tracking errors and faster parameter convergence compared with direct and indirect adaptive control techniques. However,…

Systems and Control · Computer Science 2022-07-08 Yongping Pan, Lin Pan, Haoyong Yu

Improved Text Classification via Contrastive Adversarial Training

We propose a simple and general method to regularize the fine-tuning of Transformer-based encoders for text classification tasks. Specifically, during fine-tuning we generate adversarial examples by perturbing the word embeddings of the…

Computation and Language · Computer Science 2022-02-21 Lin Pan, Chung-Wei Hang, Avirup Sil, Saloni Potdar

Interpretative Computer-aided Lung Cancer Diagnosis: from Radiology Analysis to Malignancy Evaluation

Background and Objective: Computer-aided diagnosis (CAD) systems promote diagnosis effectiveness and alleviate pressure of radiologists. A CAD system for lung cancer diagnosis includes nodule candidate detection and nodule malignancy…

Image and Video Processing · Electrical Eng. & Systems 2022-01-17 Shaohua Zheng, Zhiqiang Shen, Chenhao Peia, Wangbin Ding, Haojin Lin, Jiepeng Zheng, Lin Pan, Bin Zheng, Liqin Huang

Benchmarking Commercial Intent Detection Services with Practice-Driven Evaluations

Intent detection is a key component of modern goal-oriented dialog systems that accomplish a user task by predicting the intent of users' text input. There are three primary challenges in designing robust and accurate intent detection…

Computation and Language · Computer Science 2021-06-04 Haode Qi, Lin Pan, Atin Sood, Abhishek Shah, Ladislav Kunc, Mo Yu, Saloni Potdar

Automatic Pulmonary Artery-Vein Separation in CT Images using Twin-Pipe Network and Topology Reconstruction

With the development of medical computer-aided diagnostic systems, pulmonary artery-vein (A/V) separation plays a crucial role in assisting doctors in preoperative planning for lung cancer surgery. However, distinguishing arterial from…

Image and Video Processing · Electrical Eng. & Systems 2021-05-31 Lin Pan, Yaoyong Zheng, Liqin Huang, Liuqing Chen, Zhen Zhang, Rongda Fu, Bin Zheng, Shaohua Zheng

Multilingual BERT Post-Pretraining Alignment

We propose a simple method to align multilingual contextual embeddings as a post-pretraining step for improved zero-shot cross-lingual transferability of the pretrained models. Using parallel data, our method aligns embeddings on the word…

Computation and Language · Computer Science 2021-04-13 Lin Pan, Chung-Wei Hang, Haode Qi, Abhishek Shah, Saloni Potdar, Mo Yu

Coarse-to-fine Airway Segmentation Using Multi information Fusion Network and CNN-based Region Growing

Automatic airway segmentation from chest computed tomography (CT) scans plays an important role in pulmonary disease diagnosis and computer-assisted therapy. However, low contrast at peripheral branches and complex tree-like structures…

Image and Video Processing · Electrical Eng. & Systems 2021-02-26 Jinquan Guo, Rongda Fu, Lin Pan, Shaohua Zheng, Liqin Huang, Bin Zheng, Bingwei He

Multilingual Transfer Learning for QA Using Translation as Data Augmentation

Prior work on multilingual question answering has mostly focused on using large multilingual pre-trained language models (LM) to perform zero-shot language-wise learning: train a QA model on English and test on other languages. In this…

Computation and Language · Computer Science 2020-12-14 Mihaela Bornea, Lin Pan, Sara Rosenthal, Radu Florian, Avirup Sil

CFO: A Framework for Building Production NLP Systems

This paper introduces a novel orchestration framework, called CFO (COMPUTATION FLOW ORCHESTRATOR), for building, experimenting with, and deploying interactive NLP (Natural Language Processing) and IR (Information Retrieval) systems to…

Computation and Language · Computer Science 2020-06-23 Rishav Chakravarti, Cezar Pendus, Andrzej Sakrajda, Anthony Ferritto, Lin Pan, Michael Glass, Vittorio Castelli, J. William Murdock, Radu Florian, Salim Roukos, Avirup Sil

Span Selection Pre-training for Question Answering

BERT (Bidirectional Encoder Representations from Transformers) and related pre-trained Transformers have provided large gains across many language understanding tasks, achieving a new state-of-the-art (SOTA). BERT is pre-trained on two…

Computation and Language · Computer Science 2020-06-22 Michael Glass, Alfio Gliozzo, Rishav Chakravarti, Anthony Ferritto, Lin Pan, G P Shrivatsa Bhargav, Dinesh Garg, Avirup Sil

The TechQA Dataset

We introduce TechQA, a domain-adaptation question answering dataset for the technical support domain. The TechQA corpus highlights two real-world issues from the automated customer support domain. First, it contains actual questions posed…

Computation and Language · Computer Science 2019-11-11 Vittorio Castelli, Rishav Chakravarti, Saswati Dana, Anthony Ferritto, Radu Florian, Martin Franz, Dinesh Garg, Dinesh Khandelwal, Scott McCarley, Mike McCawley, Mohamed Nasr, Lin Pan, Cezar Pendus, John Pitrelli, Saurabh Pujar, Salim Roukos, Andrzej Sakrajda, Avirup Sil, Rosario Uceda-Sosa, Todd Ward, Rong Zhang

Ensembling Strategies for Answering Natural Questions

Many of the top question answering systems today utilize ensembling to improve their performance on tasks such as the Stanford Question Answering Dataset (SQuAD) and Natural Questions (NQ) challenges. Unfortunately most of these systems do…

Computation and Language · Computer Science 2019-11-07 Anthony Ferritto, Lin Pan, Rishav Chakravarti, Salim Roukos, Radu Florian, J. William Murdock, Avirup Sil

Frustratingly Easy Natural Question Answering

Existing literature on Question Answering (QA) mostly focuses on algorithmic novelty, data augmentation, or increasingly large pre-trained language models like XLNet and RoBERTa. Additionally, a lot of systems on the QA leaderboards do not…

Computation and Language · Computer Science 2019-09-13 Lin Pan, Rishav Chakravarti, Anthony Ferritto, Michael Glass, Alfio Gliozzo, Salim Roukos, Radu Florian, Avirup Sil

Multi-Granular Text Encoding for Self-Explaining Categorization

Self-explaining text categorization requires a classifier to make a prediction along with supporting evidence. A popular type of evidence is sub-sequences extracted from the input text which are sufficient for the classifier to make the…

Computation and Language · Computer Science 2019-07-22 Zhiguo Wang, Yue Zhang, Mo Yu, Wei Zhang, Lin Pan, Linfeng Song, Kun Xu, Yousef El-Kurdi