Related papers: Active Inductive Logic Programming for Code Search

ALICE: Active Learning with Contrastive Natural Language Explanations

Training a supervised neural network classifier typically requires many annotated training samples. Collecting and annotating a large number of data points is costly and sometimes even infeasible. The traditional annotation process uses a…

Computation and Language · Computer Science 2020-10-02 Weixin Liang, James Zou, Zhou Yu

Enhancing Retrieval Systems with Inference-Time Logical Reasoning

Traditional retrieval methods rely on transforming user queries into vector representations and retrieving documents based on cosine similarity within an embedding space. While efficient and scalable, this approach often fails to handle…

Computation and Language · Computer Science 2025-03-25 Felix Faltings, Wei Wei, Yujia Bao

You Don't Know Search: Helping Users Find Code by Automatically Evaluating Alternative Queries

Tens of thousands of engineers use Sourcegraph day-to-day to search for code and rely on it to make progress on software development tasks. We face a key challenge in designing a query language that accommodates the needs of a broad…

Software Engineering · Computer Science 2022-12-08 Rijnard van Tonder

ALICE: Combining Feature Selection and Inter-Rater Agreeability for Machine Learning Insights

This paper presents a new Python library called Automated Learning for Insightful Comparison and Evaluation (ALICE), which merges conventional feature selection and the concept of inter-rater agreeability in a simple, user-friendly manner…

Machine Learning · Computer Science 2024-04-16 Bachana Anasashvili, Vahidin Jeleskovic

AIDE: An Automated Sample-based Approach for Interactive Data Exploration

In this paper, we argue that database systems be augmented with an automated data exploration service that methodically steers users through the data in a meaningful way. Such an automated system is crucial for deriving insights from…

Databases · Computer Science 2015-11-02 Kyriaki Dimitriadou, Olga Papaemmanouil, Yanlei Diao

ALE: A Simulation-Based Active Learning Evaluation Framework for the Parameter-Driven Comparison of Query Strategies for NLP

Supervised machine learning and deep learning require a large amount of labeled data, which data scientists obtain in a manual and time-consuming annotation process. To mitigate this challenge, Active Learning (AL) proposes promising data…

Computation and Language · Computer Science 2023-08-08 Philipp Kohl, Nils Freyer, Yoka Krämer, Henri Werth, Steffen Wolf, Bodo Kraft, Matthias Meinecke, Albert Zündorf

Reflect then Learn: Active Prompting for Information Extraction Guided by Introspective Confusion

Large Language Models (LLMs) show remarkable potential for few-shot information extraction (IE), yet their performance is highly sensitive to the choice of in-context examples. Conventional selection strategies often fail to provide…

Computation and Language · Computer Science 2026-05-13 Dong Zhao, Yadong Wang, Xiang Chen, Chenxi Wang, Hongliang Dai, Chuanxing Geng, Shengzhong Zhang, Shaoyuan Li, Sheng-Jun Huang

ICE-Score: Instructing Large Language Models to Evaluate Code

Recent advancements in the field of natural language generation have facilitated the use of large language models to assess the quality of generated text. Although these models have shown promising results in tasks such as machine…

Artificial Intelligence · Computer Science 2024-01-23 Terry Yue Zhuo

Enabling Large Language Models to Generate Text with Citations

Large language models (LLMs) have emerged as a widely-used tool for information seeking, but their generated outputs are prone to hallucination. In this work, our aim is to allow LLMs to generate text with citations, improving their factual…

Computation and Language · Computer Science 2023-11-01 Tianyu Gao, Howard Yen, Jiatong Yu, Danqi Chen

Towards Automated Augmentation and Instrumentation of Legacy Cryptographic Executables: Extended Version

Implementation flaws in cryptographic libraries, design flaws in underlying cryptographic primitives, and weaknesses in protocols using both, can all lead to exploitable vulnerabilities in software. Manually fixing such issues is…

Cryptography and Security · Computer Science 2020-04-23 Karim Eldefrawy, Michael Locasto, Norrathep Rattanavipanon, Hassen Saidi

Active Preference Inference using Language Models and Probabilistic Reasoning

Actively inferring user preferences, for example by asking good questions, is important for any human-facing decision-making system. Active inference allows such systems to adapt and personalize themselves to nuanced individual preferences.…

Computation and Language · Computer Science 2024-06-27 Wasu Top Piriyakulkij, Volodymyr Kuleshov, Kevin Ellis

Accelerating Code Search with Deep Hashing and Code Classification

Code search retrieves reusable code snippets from a source code corpus based on natural language queries. Deep learning-based methods of code search have shown promising results. However, previous methods focus on retrieval accuracy but…

Software Engineering · Computer Science 2022-04-01 Wenchao Gu, Yanlin Wang, Lun Du, Hongyu Zhang, Shi Han, Dongmei Zhang, Michael R. Lyu

ANDRE: An Attention-based Neuro-symbolic Differentiable Rule Extractor

Inductive Logic Programming (ILP) aims to learn interpretable first-order rules from data, but existing symbolic and neuro-symbolic approaches struggle to scale to noisy and probabilistic settings. Classical ILP relies on discrete…

Artificial Intelligence · Computer Science 2026-05-07 Iman Sharifi, Peng Wei, Saber Fallah

OASIS: Order-Augmented Strategy for Improved Code Search

Code embeddings capture the semantic representations of code and are crucial for various code-related large language model (LLM) applications, such as code search. Previous training primarily relies on optimizing the InfoNCE loss by…

Computation and Language · Computer Science 2025-07-18 Zuchen Gao, Zizheng Zhan, Xianming Li, Erxin Yu, Ziqi Zhan, Haotian Zhang, Bin Chen, Yuqun Zhang, Jing Li

Mining Implicit Relevance Feedback from User Behavior for Web Question Answering

Training and refreshing a web-scale Question Answering (QA) system for a multi-lingual commercial search engine often requires a huge amount of training examples. One principled idea is to mine implicit relevance feedback from user behavior…

Information Retrieval · Computer Science 2020-06-17 Linjun Shou, Shining Bo, Feixiang Cheng, Ming Gong, Jian Pei, Daxin Jiang

Wizard of Search Engine: Access to Information Through Conversations with Search Engines

Conversational information seeking (CIS) is playing an increasingly important role in connecting people to information. Due to the lack of suitable resources, previous studies on CIS are limited to the study of theoretical/conceptual…

Information Retrieval · Computer Science 2021-05-19 Pengjie Ren, Zhongkun Liu, Xiaomeng Song, Hongtao Tian, Zhumin Chen, Zhaochun Ren, Maarten de Rijke

AutoICE: Automatically Synthesizing Verifiable C Code via LLM-driven Evolution

Automatically synthesizing verifiable code from natural language requirements ensures software correctness and reliability while significantly lowering the barrier to adopting the techniques of formal methods. With the rise of large…

Software Engineering · Computer Science 2025-12-09 Weilin Luo, Xueyi Liang, Haotian Deng, Yanan Liu, Hai Wan

RACK: Code Search in the IDE using Crowdsourced Knowledge

Traditional code search engines often do not perform well with natural language queries since they mostly apply keyword matching. These engines thus require carefully designed queries containing information about programming APIs for code…

Software Engineering · Computer Science 2018-07-13 Mohammad Masudur Rahman, Chanchal K. Roy, David Lo

Improving Legal Case Retrieval with Brain Signals

The tasks of legal case retrieval have received growing attention from the IR community in the last decade. Relevance feedback techniques with implicit user feedback (e.g., clicks) have been demonstrated to be effective in traditional…

Information Retrieval · Computer Science 2024-03-21 Ruizhe Zhang, Qingyao Ai, Ziyi Ye, Yueyue Wu, Xiaohui Xie, Yiqun Liu

REINFOREST: Reinforcing Semantic Code Similarity for Cross-Lingual Code Search Models

This paper introduces a novel code-to-code search technique that enhances the performance of Large Language Models (LLMs) by including both static and dynamic features as well as utilizing both similar and dissimilar examples during…

Software Engineering · Computer Science 2024-04-17 Anthony Saieva, Saikat Chakraborty, Gail Kaiser