Related papers: Improving Audio Caption Fluency with Automatic Err…

200 papers

Automated audio captioning (AAC) is the task of automatically creating textual descriptions (i.e. captions) for the contents of a general audio signal. Most AAC methods are using existing datasets to optimize and/or evaluate upon. Given the…

Sound · Computer Science 2021-07-19 Jan Berg, Konstantinos Drossos

Automated audio captioning is a cross-modal translation task that aims to generate natural language descriptions for given audio clips. This task has received increasing attention with the release of freely available datasets in recent…

Audio and Speech Processing · Electrical Eng. & Systems 2022-09-28 Xinhao Mei, Xubo Liu, Mark D. Plumbley, Wenwu Wang

Automated Audio Captioning (AAC) is the task of generating natural language descriptions given an audio stream. A typical AAC system requires manually curated training data of audio segments and corresponding text caption annotations. The…

Audio and Speech Processing · Electrical Eng. & Systems 2023-09-15 Soham Deshmukh, Benjamin Elizalde, Dimitra Emmanouilidou, Bhiksha Raj, Rita Singh, Huaming Wang

Automated audio captioning (AAC), a task that mimics human perception as well as innovatively links audio processing and natural language processing, has seen much progress over the last few years. AAC requires recognizing contents such…

Sound · Computer Science 2023-11-17 Xuenan Xu, Zeyu Xie, Mengyue Wu, Kai Yu

Automated audio captioning (AAC) is an audio-to-text task to describe audio contents in natural language. Recently, the advancements in large language models (LLMs), with improvements in training approaches for audio encoders, have opened…

Sound · Computer Science 2024-06-26 Jizhong Liu, Gang Li, Junbo Zhang, Heinrich Dinkel, Yongqing Wang, Zhiyong Yan, Yujun Wang, Bin Wang

Automated audio captioning (AAC) is the task of automatically generating textual descriptions for general audio signals. A captioning system has to identify various information from the input signal and express it with natural language.…

Machine Learning · Computer Science 2021-10-15 Benno Weck, Xavier Favory, Konstantinos Drossos, Xavier Serra

One of the problems with automated audio captioning (AAC) is the indeterminacy in word selection corresponding to the audio event/scene. Since one acoustic event/scene can be described with several words, it results in a combinatorial…

Audio and Speech Processing · Electrical Eng. & Systems 2020-08-11 Yuma Koizumi, Ryo Masumura, Kyosuke Nishida, Masahiro Yasuda, Shoichiro Saito

In recent years, datasets of paired audio and captions have enabled remarkable success in automatically generating descriptions for audio clips, namely Automated Audio Captioning (AAC). However, it is labor-intensive and time-consuming to…

Sound · Computer Science 2023-09-22 Theodoros Kouzelis, Vassilis Katsouros

Automated Audio captioning (AAC) is a cross-modal translation task that aims to use natural language to describe the content of an audio clip. As shown in the submissions received for Task 6 of the DCASE 2021 Challenges, this problem has…

Audio and Speech Processing · Electrical Eng. & Systems 2021-11-23 Xubo Liu, Qiushi Huang, Xinhao Mei, Tom Ko, H Lilian Tang, Mark D. Plumbley, Wenwu Wang

Automated audio captioning (AAC) has developed rapidly in recent years, involving acoustic signal processing and natural language processing to generate human-readable sentences for audio clips. The current models are generally based on the…

Sound · Computer Science 2021-10-13 Zhongjie Ye, Helin Wang, Dongchao Yang, Yuexian Zou

Automated audio captioning is a multi-modal translation task that aims to generate textual descriptions for a given audio clip. In this paper we propose a full Transformer architecture that utilizes Patchout as proposed in [1], significantly…

The analysis, processing, and extraction of meaningful information from sounds all around us is the subject of the broader area of audio analytics. Audio captioning is a recent addition to the domain of audio analytics, a cross-modal…

Audio and Speech Processing · Electrical Eng. & Systems 2023-05-04 Sandeep Kothinti, Dimitra Emmanouilidou

Automated audio captioning (AAC) aims to generate informative descriptions for various sounds from nature and/or human activities. In recent years, AAC has quickly attracted research interest, with state-of-the-art systems now relying on a…

Automated Audio captioning (AAC) is a cross-modal task that generates natural language to describe the content of input audio. Most prior works usually extract single-modality acoustic features and are therefore sub-optimal for the…

Sound · Computer Science 2022-04-13 Chen Chen, Nana Hou, Yuchen Hu, Heqing Zou, Xiaofeng Qi, Eng Siong Chng

Automated audio captioning (AAC) aims at generating summarizing descriptions for audio clips. Multitudinous concepts are described in an audio caption, ranging from local information such as sound events to global information like acoustic…

Sound · Computer Science 2021-02-24 Xuenan Xu, Heinrich Dinkel, Mengyue Wu, Zeyu Xie, Kai Yu

Automated Audio Captioning (AAC) aims to generate natural textual descriptions for input audio signals. Recent progress in audio pre-trained models and large language models (LLMs) has significantly enhanced audio understanding and textual…

Audio and Speech Processing · Electrical Eng. & Systems 2024-10-15 Wenxi Chen, Ziyang Ma, Xiquan Li, Xuenan Xu, Yuzhe Liang, Zhisheng Zheng, Kai Yu, Xie Chen

Automated Audio Captioning (AAC) aims to develop systems capable of describing an audio recording using a textual sentence. In contrast, Audio-Text Retrieval (ATR) systems seek to find the best matching audio recording(s) for a given…

Computation and Language · Computer Science 2023-08-30 Etienne Labbé, Thomas Pellegrini, Julien Pinquier

The goal of audio captioning is to translate input audio into its description using natural language. One of the problems in audio captioning is the lack of training data due to the difficulty in collecting audio-caption pairs by crawling…

Audio and Speech Processing · Electrical Eng. & Systems 2020-12-15 Yuma Koizumi, Yasunori Ohishi, Daisuke Niizumi, Daiki Takeuchi, Masahiro Yasuda

The Automated Audio Captioning (AAC) task asks models to generate natural language descriptions of an audio input. Evaluating these machine-generated audio captions is a complex task that requires considering diverse factors, among them,…

Computation and Language · Computer Science 2025-08-12 Tsung-Han Wu, Joseph E. Gonzalez, Trevor Darrell, David M. Chan

The Automated Audio Captioning (AAC) task aims to describe an audio signal using natural language. To evaluate machine-generated captions, the metrics should take into account audio events, acoustic scenes, paralinguistics, signal…

Sound · Computer Science 2024-11-06 Satvik Dixit, Soham Deshmukh, Bhiksha Raj