Computer Science

Deep Binarized Photonic Reservoir Computing for Ultrafast Multimedia Signal Processing

We present a deep photonic neural network architecture based on ultrafast binary optical modulation from a digital micro-mirror device (DMD), optical scattering in a random medium, high-speed photodetection with a CMOS sensor, and…

Neural and Evolutionary Computing · Computer Science 2026-05-29 Muhammad Waqar Iqbal, Mohamad Alassir, Nicolas Marsal, Damien Rontani

Audio Jailbreaks in Large Audio-Language Models: Taxonomy, Attack-Defense Analysis, and Cost-Aware Evaluation

Large Audio Language Models (LALMs) expand jailbreak risks from token-level prompting to the full speech perception-to-reasoning pipeline, where unsafe behavior can be induced through semantics, acoustic style, signal artifacts, or internal…

Sound · Computer Science 2026-05-29 Bo-Han Feng, Yu-Hsuan Li Liang, Chien-Feng Liu, You-Hsuan Chang, Yun-Nung Chen

HoliTok: A Continuous Holistic Tokenization with Robust Dual Capabilities of Speech Generation and Understanding

Unified speech foundation models require a holistic tokenization space that is both learnable by language models and decodable into high-quality waveforms. Existing speech tokenizers, however, often fail to satisfy these requirements…

Sound · Computer Science 2026-05-29 Bohan Li, Shi Lian, Hankun Wang, Yiwei Guo, Yu Xi, Zhihan Li, Da Zheng, Colin Zhang, Kai Yu

Selection Hyper-heuristics Can Automatically Adjust the Learning Period to Optimally Solve Pseudo-Boolean Problems

The Random Gradient hyper-heuristic was recently shown to be able to learn the optimal neighbourhood size when optimizing the LeadingOnes benchmark via the Randomised Local Search (RLS) meta-heuristic. However, for this to happen, a…

Neural and Evolutionary Computing · Computer Science 2026-05-29 Benjamin Doerr, Pietro S. Oliveto, John Alasdair Warwicker

COMET: Concept Space Dissection of the Modality Gap in Audio-Text Multimodal Contrastive Embeddings

Contrastive Language-Audio Pretraining (CLAP) models are widely used for audio understanding and support modality-agnostic condition swapping in many zero-shot applications. However, their performance is heavily affected by the modality gap…

Sound · Computer Science 2026-05-29 Yonggang Zhu, Liting Gao, Aidong Men, Wenwu Wang

Audio Deepfake Detection with Half-Truth Localisation Using Cross-Attentive Feature Fusion

Audio deepfake detection is well-studied as a binary problem, but partially manipulated speech, where a short synthesised segment is spliced into an otherwise genuine utterance, poses a harder and more realistic threat. Detecting such…

Sound · Computer Science 2026-05-29 S. Sutharya, Remya K. Sasi

Evolutionary Rule Extraction from Corporate Default Prediction Models

Small and medium-sized enterprises (SMEs) represent the majority of firms in most economies and often face financial constraints and higher vulnerability to financial distress. Predicting SME default is therefore crucial for financial…

Neural and Evolutionary Computing · Computer Science 2026-05-29 Desirè Fabbretti, Matteo Pasquino, Elia Pacioni, Caterina Lucarelli, Davide Calvaresi

Runtime Analysis of a Compact Genetic Algorithm on a Truly Multi-valued OneMax Function

Recently, the runtime analysis of multi-valued estimation-of-distribution algorithms in the framework of Ben Jedidia et al. (TCS 2024) has made significant advancements. However, almost all existing analyses are limited to multi-valued…

Neural and Evolutionary Computing · Computer Science 2026-05-29 Martin S. Krejca, Carsten Witt

EvoGM: Learning to Merge LLMs via Evolutionary Generative Optimization

Evolutionary model merging provides a powerful framework for the automated, training-free composition of LLMs through parameter-space search. However, existing methods predominantly rely on stochastic, hand-crafted operators that overlook…

Neural and Evolutionary Computing · Computer Science 2026-05-29 Tao Jiang, Xinmeng Yu, Chenhao Yi, Yiling Wu, Yan Li, Ran Cheng, Dongmei Jiang, Jianguo Zhang

ChildVox: A Speech, Audio, and Large Audio-Language Model Benchmark in Understanding and Characterizing Sound across Childhood

We present ChildVox, a novel benchmark for characterizing the diverse acoustic signals through which children communicate. Specifically, ChildVox follows the full developmental trajectory from birth through school age, covering…

Sound · Computer Science 2026-05-29 Tiantian Feng, Anfeng Xu, Xuan Shi, Aditya Kommineni, Shakhrul Iman Siam, Megan Micheletti, Zhonghao Shi, Helen Tager-Flusberg, Mi Zhang, Lynn K. Perry, Catherine Lord, Daniel Messinger, Shrikanth Narayanan

WASHH: An Anchor-Aware Whale-Guided Selection Hyper-Heuristic for Continuous Optimization and SVC Configuration

Learning-assisted algorithm design often has to make reliable search decisions under small evaluation budgets, where committing to a single metaheuristic can be unreliable. We propose WASHH, a Whale-guided Adaptive Selection Hyper-Heuristic…

Neural and Evolutionary Computing · Computer Science 2026-05-29 Yifu Zhao, Xiaofan Zou, Junhao Wei, Yanxiao Li, Baili Lu, Zhenhong Peng, Dexing Yao, Haochen Li, Qinbin He, Sio-Kei Im, Xu Yang, Yapeng Wang

EVA-Bench: A New End-to-end Framework for Evaluating Voice Agents

Voice agents, artificial intelligence systems that conduct spoken conversations to complete tasks, are increasingly deployed across enterprise applications. However, no existing benchmark jointly addresses two core evaluation challenges:…

Sound · Computer Science 2026-05-29 Tara Bogavelli, Gabrielle Gauthier Melançon, Katrina Stankiewicz, Oluwanifemi Bamgbose, Fanny Riols, Hoang H. Nguyen, Raghav Mehndiratta, Lindsay Devon Brin, Joseph Marinier, Hari Subramani, Anil Madamala, Sridhar Krishna Nemala, Srinivas Sunkara

MedMosaic: A Challenging Large Scale Benchmark of Diverse Medical Audio

Medical audio data is difficult to collect due to privacy regulations and high annotation costs arising from domain expertise. Thus, existing benchmarks tend to underrepresent complex medical audio scenarios. To address this challenge, we…

Sound · Computer Science 2026-05-29 Harshit Rajgarhia, Shuubham Ojha, Asif Shaik, Akhil Pothanapalli, Rachuri Lokesh, Abhishek Mukherji, Prasanna Desikan

Architecture-Induced Recoverability Bias in Differentiable Symbolic Regression

Symbolic regression aims to recover closed-form expressions from numerical data, but in differentiable symbolic regression the recovered expression depends not only on the grammar but also on the fixed architecture through which variables…

Neural and Evolutionary Computing · Computer Science 2026-05-29 Chakshu Gupta, Theodore J. LaGrow

BEAT: Tokenizing and Generating Symbolic Music by Uniform Temporal Steps

Tokenizing music to fit the general framework of language models is a compelling challenge, especially considering the diverse symbolic structures in which music can be represented (e.g., sequences, grids, and graphs). To date, most…

Sound · Computer Science 2026-05-29 Lekai Qian, Haoyu Gu, Jingwei Zhao, Ziyu Wang

Beyond LLMs, Sparse Distributed Memory, and Neuromorphics <A Hyper-Dimensional SRAM-CAM "VaCoAl" for Ultra-High Speed, Ultra-Low Power, and Low Cost>

This paper reports an unexpected finding: in a deterministic hyperdimensional computing (HDC) architecture that inverts the conventional role of Galois-field algebra -- employing it not for error correction toward a unique answer but as…

Neural and Evolutionary Computing · Computer Science 2026-05-29 Hiroyuki Chuma, Kanji Otsuka, Yoichi Sato

Beyond Silicon: Materials, Mechanisms, and Methods for Physical Neural Computing

Physical implementations of neural computation now extend far beyond silicon hardware, encompassing substrates such as memristive devices, photonic circuits, mechanical metamaterials, microfluidic networks, chemical reaction systems, and…

Neural and Evolutionary Computing · Computer Science 2026-05-29 Stefan Fischer, Nihat Ay, Olaf Landsiedel, Esfandiar Mohammadi, Sebastian Otte, Bernd-Christian Renner, Nele Rußwinkel

EvA: An Evidence-First Audio Understanding Paradigm for LALMs

Large Audio Language Models (LALMs) still struggle in complex acoustic scenes because they often fail to preserve task-relevant acoustic evidence before reasoning begins. We identify this error pattern as the evidence bottleneck:…

Sound · Computer Science 2026-05-29 Xinyuan Xie, Shunian Chen, Zhiheng Liu, Yuhao Zhang, Zhiqiang Lv, Liyin Liang, Benyou Wang

AG-REPA: Causal Layer Selection for Representation Alignment in Audio Flow Matching

REPresentation Alignment (REPA) improves the training of generative flow models by aligning intermediate hidden states with pretrained teacher features, but its effectiveness in token-conditioned audio Flow Matching critically depends on…

Sound · Computer Science 2026-05-29 Pengfei Zhang, Tianxin Xie, Minghao Yang, Li Liu

OmniCustom: Sync Audio-Video Customization Via Joint Audio-Video Generation Model

Existing mainstream video customization methods focus on generating identity-consistent videos based on given reference images and textual prompts. Benefiting from the rapid advancement of joint audio-video generation, this paper proposes a…

Sound · Computer Science 2026-05-29 Maomao Li, Zhen Li, Kaipeng Zhang, Guosheng Yin, Zhifeng Li, Dong Xu