Related papers: Dialz: A Python Toolkit for Steering Vectors

Understanding Reasoning in Thinking Language Models via Steering Vectors

Recent advances in large language models (LLMs) have led to the development of thinking language models that generate extensive internal reasoning chains before producing responses. While these models achieve improved performance,…

Machine Learning · Computer Science 2025-10-23 Constantin Venhoff, Iván Arcuschin, Philip Torr, Arthur Conmy, Neel Nanda

DELM: a Python toolkit for Data Extraction with Language Models

Large Language Models (LLMs) have become powerful tools for annotating unstructured data. However, most existing workflows rely on ad hoc scripts, making reproducibility, robustness, and systematic evaluation difficult. To address these…

Information Retrieval · Computer Science 2025-09-26 Eric Fithian, Kirill Skobelev

ADVISER: A Toolkit for Developing Multi-modal, Multi-domain and Socially-engaged Conversational Agents

We present ADVISER - an open-source, multi-domain dialog system toolkit that enables the development of multi-modal (incorporating speech, text and vision), socially-engaged (e.g. emotion recognition, engagement level prediction and…

Computation and Language · Computer Science 2020-05-06 Chia-Yu Li, Daniel Ortega, Dirk Väth, Florian Lux, Lindsey Vanderlyn, Maximilian Schmidt, Michael Neumann, Moritz Völkel, Pavel Denisov, Sabrina Jenne, Zorica Kacarevic, Ngoc Thang Vu

Personalized Steering of Large Language Models: Versatile Steering Vectors Through Bi-directional Preference Optimization

Researchers have been studying approaches to steer the behavior of Large Language Models (LLMs) and build personalized LLMs tailored for various applications. While fine-tuning seems to be a direct solution, it requires substantial…

Computation and Language · Computer Science 2024-07-31 Yuanpu Cao, Tianrong Zhang, Bochuan Cao, Ziyi Yin, Lu Lin, Fenglong Ma, Jinghui Chen

dalex: Responsible Machine Learning with Interactive Explainability and Fairness in Python

The increasing amount of available data, computing power, and the constant pursuit for higher performance results in the growing complexity of predictive models. Their black-box nature leads to opaqueness debt phenomenon inflicting…

Machine Learning · Computer Science 2021-10-12 Hubert Baniecki, Wojciech Kretowicz, Piotr Piatyszek, Jakub Wisniewski, Przemyslaw Biecek

Distill: Domain-Specific Compilation for Cognitive Models

This paper discusses our proposal and implementation of Distill, a domain-specific compilation tool based on LLVM to accelerate cognitive models. Cognitive models explain the process of cognitive function and offer a path to human-like…

Programming Languages · Computer Science 2022-01-17 Jan Vesely, Raghavendra Pradyumna Pothukuchi, Ketaki Joshi, Samyak Gupta, Jonathan D. Cohen, Abhishek Bhattacharjee

DIAL: Direct Iterative Adversarial Learning for Realistic Multi-Turn Dialogue Simulation

Realistic user simulation is crucial for training and evaluating multi-turn dialogue systems, yet creating simulators that accurately replicate human behavior remains a significant challenge. An effective simulator must expose the failure…

Computation and Language · Computer Science 2026-05-07 Ziyi Zhu, Olivier Tieleman, Caitlin A. Stamatis, Luka Smyth, Thomas D. Hull, Daniel R. Cahn, Jinghong Chen, Matteo Malgaroli

SDialog: A Python Toolkit for End-to-End Agent Building, User Simulation, Dialog Generation, and Evaluation

We present SDialog, an MIT-licensed open-source Python toolkit that unifies dialog generation, evaluation and mechanistic interpretability into a single end-to-end framework for building and analyzing LLM-based conversational agents. Built…

Computation and Language · Computer Science 2026-05-12 Sergio Burdisso, Séverin Baroudi, Yanis Labrak, David Grunert, Pawel Cyrta, Yiyang Chen, Srikanth Madikeri, Thomas Schaaf, Esaú Villatoro-Tello, Ahmed Hassoon, Ricard Marxer, Petr Motlicek

SDialog: A Python Toolkit for End-to-End Agent Building, User Simulation, Dialog Generation, and Evaluation

We present SDialog, an MIT-licensed open-source Python toolkit that unifies dialog generation, evaluation and mechanistic interpretability into a single end-to-end framework for building and analyzing LLM-based conversational agents. Built…

Artificial Intelligence · Computer Science 2025-12-15 Sergio Burdisso, Séverin Baroudi, Yanis Labrak, David Grunert, Pawel Cyrta, Yiyang Chen, Srikanth Madikeri, Esaú Villatoro-Tello, Thomas Schaaf, Ricard Marxer, Petr Motlicek

AI Steerability 360: A Toolkit for Steering Large Language Models

The AI Steerability 360 toolkit is an extensible, open-source Python library for steering LLMs. Steering abstractions are designed around four model control surfaces: input (modification of the prompt), structural (modification of the…

Computation and Language · Computer Science 2026-03-10 Erik Miehling, Karthikeyan Natesan Ramamurthy, Praveen Venkateswaran, Irene Ko, Pierre Dognin, Moninder Singh, Tejaswini Pedapati, Avinash Balakrishnan, Matthew Riemer, Dennis Wei, Inge Vejsbjerg, Elizabeth M. Daly, Kush R. Varshney

SteerX: Disentangled Steering for LLM Personalization

Large language models (LLMs) have shown remarkable success in recent years, enabling a wide range of applications, including intelligent assistants that support users' daily life and work. A critical factor in building such assistants is…

Computation and Language · Computer Science 2025-10-28 Xiaoyan Zhao, Ming Yan, Yilun Qiu, Haoting Ni, Yang Zhang, Fuli Feng, Hong Cheng, Tat-Seng Chua

Beyond Testers' Biases: Guiding Model Testing with Knowledge Bases using LLMs

Current model testing work has mostly focused on creating test cases. Identifying what to test is a step that is largely ignored and poorly supported. We propose Weaver, an interactive tool that supports requirements elicitation for guiding…

Computation and Language · Computer Science 2023-10-17 Chenyang Yang, Rishabh Rustogi, Rachel Brower-Sinning, Grace A. Lewis, Christian Kästner, Tongshuang Wu

Difference-Guided Reasoning: A Temporal-Spatial Framework for Large Language Models

Large Language Models (LLMs) are important tools for reasoning and problem-solving, while they often operate passively, answering questions without actively discovering new ones. This limitation reduces their ability to simulate human-like…

Computational Engineering, Finance, and Science · Computer Science 2025-09-26 Hong Su

Letting Tutor Personas "Speak Up" for LLMs: Learning Steering Vectors from Dialogue via Preference Optimization

With the emergence of large language models (LLMs) as a powerful class of generative artificial intelligence (AI), their use in tutoring has become increasingly prominent. Prior works on LLM-based tutoring typically learn a single tutor…

Computation and Language · Computer Science 2026-02-10 Jaewook Lee, Alexander Scarlatos, Simon Woodhead, Andrew Lan

Improving Instruction-Following in Language Models through Activation Steering

The ability to follow instructions is crucial for numerous real-world applications of language models. In pursuit of deeper insights and more powerful capabilities, we derive instruction-specific vector representations from language models…

Computation and Language · Computer Science 2025-04-15 Alessandro Stolfo, Vidhisha Balachandran, Safoora Yousefi, Eric Horvitz, Besmira Nushi

LangFair: A Python Package for Assessing Bias and Fairness in Large Language Model Use Cases

Large Language Models (LLMs) have been observed to exhibit bias in numerous ways, potentially creating or worsening outcomes for specific groups identified by protected attributes such as sex, race, sexual orientation, or age. To help…

Computation and Language · Computer Science 2025-01-30 Dylan Bouchard, Mohit Singh Chauhan, David Skarbrevik, Viren Bajaj, Zeya Ahmad

DIESEL -- Dynamic Inference-Guidance via Evasion of Semantic Embeddings in LLMs

In recent years, large language models (LLMs) have had great success in tasks such as casual conversation, contributing to significant advancements in domains like virtual assistance. However, they often generate responses that are not…

Computation and Language · Computer Science 2025-03-11 Ben Ganon, Alon Zolfi, Omer Hofman, Inderjeet Singh, Hisashi Kojima, Yuval Elovici, Asaf Shabtai

EasyEdit2: An Easy-to-use Steering Framework for Editing Large Language Models

In this paper, we introduce EasyEdit2, a framework designed to enable plug-and-play adjustability for controlling Large Language Model (LLM) behaviors. EasyEdit2 supports a wide range of test-time interventions, including safety, sentiment,…

Computation and Language · Computer Science 2025-09-16 Ziwen Xu, Shuxun Wang, Kewei Xu, Haoming Xu, Mengru Wang, Xinle Deng, Yunzhi Yao, Guozhou Zheng, Huajun Chen, Ningyu Zhang

dynsight: an Open Python Platform for Simulation and Experimental Trajectory Data Analysis

The study of complex many-body systems via analysis of the trajectories of the units that dynamically move and interact within them is a non-trivial task. The workflow for extracting meaningful information from the raw trajectory data is…

Materials Science · Physics 2025-10-31 Simone Martino, Matteo Becchi, Andrew Tarzia, Daniele Rapetti, Giovanni M. Pavan

Steering Protein Language Models

Protein Language Models (PLMs), pre-trained on extensive evolutionary data from natural proteins, have emerged as indispensable tools for protein design. While powerful, PLMs often struggle to produce proteins with precisely specified…

Biomolecules · Quantitative Biology 2025-09-15 Long-Kai Huang, Rongyi Zhu, Bing He, Jianhua Yao