Related papers: LLMs for Domain Generation Algorithm Detection

Fine-tuning Large Language Models for DGA and DNS Exfiltration Detection

Domain Generation Algorithms (DGAs) are malicious techniques used by malware to dynamically generate seemingly random domain names for communication with Command & Control (C&C) servers. Due to the fast and simple generation of DGA domains,…

Cryptography and Security · Computer Science 2024-11-08 Md Abu Sayed, Asif Rahman, Christopher Kiekintveld, Sebastian Garcia

An Empirical Evaluation of LLM-Based Approaches for Code Vulnerability Detection: RAG, SFT, and Dual-Agent Systems

The rapid advancement of Large Language Models (LLMs) presents new opportunities for automated software vulnerability detection, a crucial task in securing modern codebases. This paper presents a comparative study on the effectiveness of…

Software Engineering · Computer Science 2026-01-05 Md Hasan Saju, Maher Muhtadi, Akramul Azim

Large Language Models for Anomaly Detection in Computational Workflows: from Supervised Fine-Tuning to In-Context Learning

Anomaly detection in computational workflows is critical for ensuring system reliability and security. However, traditional rule-based methods struggle to detect novel anomalies. This paper leverages large language models (LLMs) for…

Software Engineering · Computer Science 2024-07-26 Hongwei Jin, George Papadimitriou, Krishnan Raghavan, Pawel Zuk, Prasanna Balaprakash, Cong Wang, Anirban Mandal, Ewa Deelman

Building Domain-Specific Small Language Models via Guided Data Generation

Large Language Models (LLMs) have shown remarkable success in supporting a wide range of knowledge-intensive tasks. In specialized domains, there is growing interest in leveraging LLMs to assist subject matter experts with domain-specific…

Computation and Language · Computer Science 2025-12-01 Aman Kumar, Ekant Muljibhai Amin, Xian Yeow Lee, Lasitha Vidyaratne, Ahmed K. Farahat, Dipanjan D. Ghosh, Yuta Koreeda, Chetan Gupta

Inline Detection of Domain Generation Algorithms with Context-Sensitive Word Embeddings

Domain generation algorithms (DGAs) are frequently employed by malware to generate domains used for connecting to command-and-control (C2) servers. Recent work in DGA detection leveraged deep learning architectures like convolutional neural…

Cryptography and Security · Computer Science 2019-01-29 Joewie J. Koh, Barton Rhodes

Predicting Domain Generation Algorithms with Long Short-Term Memory Networks

Various families of malware use domain generation algorithms (DGAs) to generate a large number of pseudo-random domain names to connect to a command and control (C&C) server. In order to block DGA C&C traffic, security organizations must…

Cryptography and Security · Computer Science 2016-11-04 Jonathan Woodbridge, Hyrum S. Anderson, Anjum Ahuja, Daniel Grant

CyberLLM-FINDS 2025: Instruction-Tuned Fine-tuning of Domain-Specific LLMs with Retrieval-Augmented Generation and Graph Integration for MITRE Evaluation

Large Language Models (LLMs) such as Gemma-2B have shown strong performance in various natural language processing tasks. However, general-purpose models often lack the domain expertise required for cybersecurity applications. This work…

Cryptography and Security · Computer Science 2026-01-13 Vasanth Iyer, Leonardo Bobadilla, S. S. Iyengar

Command & Control (C2) Traffic Detection Via Algorithm Generated Domain (DGA) Classification Using Deep Learning And Natural Language Processing

The sophistication of modern malware, specifically regarding communication with Command and Control (C2) servers, has rendered static blacklist-based defenses obsolete. The use of Domain Generation Algorithms (DGA) allows attackers to…

Machine Learning · Computer Science 2025-12-10 Maria Milena Araujo Felix

SLearnLLM: A Self-Learning Framework for Efficient Domain-Specific Adaptation of Large Language Models

When using supervised fine-tuning (SFT) to adapt large language models (LLMs) to specific domains, a significant challenge arises: should we use the entire SFT dataset for fine-tuning? Common practice often involves fine-tuning directly on…

Computation and Language · Computer Science 2025-05-26 Xiang Liu, Zhaoxiang Liu, Peng Wang, Kohou Wang, Huan Hu, Kai Wang, Shiguo Lian

Learning to Poison Large Language Models for Downstream Manipulation

The advent of Large Language Models (LLMs) has marked significant achievements in language processing and reasoning capabilities. Despite their advancements, LLMs face vulnerabilities to data poisoning attacks, where the adversary inserts…

Machine Learning · Computer Science 2025-05-30 Xiangyu Zhou, Yao Qiang, Saleh Zare Zade, Mohammad Amin Roshani, Prashant Khanduri, Douglas Zytko, Dongxiao Zhu

Large Language Model-Aware In-Context Learning for Code Generation

Large language models (LLMs) have shown impressive in-context learning (ICL) ability in code generation. LLMs take a prompt consisting of requirement-code examples and a new requirement as input, and output new programs. Existing studies…

Software Engineering · Computer Science 2023-10-17 Jia Li, Ge Li, Chongyang Tao, Jia Li, Huangzhao Zhang, Fang Liu, Zhi Jin

DLAP: A Deep Learning Augmented Large Language Model Prompting Framework for Software Vulnerability Detection

Software vulnerability detection is generally supported by automated static analysis tools, which have recently been reinforced by deep learning (DL) models. However, despite the superior performance of DL-based approaches over rule-based…

Software Engineering · Computer Science 2024-05-03 Yanjing Yang, Xin Zhou, Runfeng Mao, Jinwei Xu, Lanxin Yang, Yu Zhangm, Haifeng Shen, He Zhang

Differentiation-Based Extraction of Proprietary Data from Fine-Tuned LLMs

The increasing demand for domain-specific and human-aligned Large Language Models (LLMs) has led to the widespread adoption of Supervised Fine-Tuning (SFT) techniques. SFT datasets often comprise valuable instruction-response pairs, making…

Cryptography and Security · Computer Science 2025-06-24 Zongjie Li, Daoyuan Wu, Shuai Wang, Zhendong Su

Fine-tuning Large Language Models for Domain-specific Machine Translation

Large language models (LLMs) have shown great potential in domain-specific machine translation (MT). However, one major issue is that LLMs pre-trained on general domain corpus might not generalize well to specific domains due to the lack of…

Computation and Language · Computer Science 2024-12-18 Jiawei Zheng, Hanghai Hong, Feiyan Liu, Xiaoli Wang, Jingsong Su, Yonggui Liang, Shikai Wu

Dial-insight: Fine-tuning Large Language Models with High-Quality Domain-Specific Data Preventing Capability Collapse

The efficacy of large language models (LLMs) is heavily dependent on the quality of the underlying data, particularly within specialized domains. A common challenge when fine-tuning LLMs for domain-specific applications is the potential…

Computation and Language · Computer Science 2024-03-15 Jianwei Sun, Chaoyang Mei, Linlin Wei, Kaiyu Zheng, Na Liu, Ming Cui, Tianyi Li

A Comparative Study of DSL Code Generation: Fine-Tuning vs. Optimized Retrieval Augmentation

Natural Language to Code Generation has made significant progress in recent years with the advent of Large Language Models (LLMs). While generation for general-purpose languages like C, C++, and Python has improved significantly, LLMs…

Software Engineering · Computer Science 2024-07-04 Nastaran Bassamzadeh, Chhaya Methani

Generating consistent PDDL domains with Large Language Models

Large Language Models (LLMs) are capable of transforming natural language domain descriptions into plausibly looking PDDL markup. However, ensuring that actions are consistent within domains still remains a challenging task. In this paper…

Robotics · Computer Science 2024-04-12 Pavel Smirnov, Frank Joublin, Antonello Ceravola, Michael Gienger

Large Language Model-Based Framework for Explainable Cyberattack Detection in Automatic Generation Control Systems

The increasing digitization of smart grids has improved operational efficiency but also introduced new cybersecurity vulnerabilities, such as False Data Injection Attacks (FDIAs) targeting Automatic Generation Control (AGC) systems. While…

Cryptography and Security · Computer Science 2025-08-27 Muhammad Sharshar, Ahmad Mohammad Saber, Davor Svetinovic, Amr M. Youssef, Deepa Kundur, Ehab F. El-Saadany

Transfer Learning in Pre-Trained Large Language Models for Malware Detection Based on System Calls

In the current cybersecurity landscape, protecting military devices such as communication and battlefield management systems against sophisticated cyber attacks is crucial. Malware exploits vulnerabilities through stealth methods, often…

Cryptography and Security · Computer Science 2024-05-16 Pedro Miguel Sánchez Sánchez, Alberto Huertas Celdrán, Gérôme Bovet, Gregorio Martínez Pérez

The More, the Better? A Study on Collaborative Machine Learning for DGA Detection

Domain generation algorithms (DGAs) prevent the connection between a botnet and its master from being blocked by generating a large number of domain names. Promising single-data-source approaches have been proposed for separating benign…

Cryptography and Security · Computer Science 2021-09-27 Arthur Drichel, Benedikt Holmes, Justus von Brandt, Ulrike Meyer