Related papers: Generating Sample-Based Musical Instruments Using …

200 papers

We introduce the text-to-instrument task, which aims at generating sample-based musical instruments based on textual prompts. Accordingly, we propose InstrumentGen, a model that extends a text-prompted generative audio framework to…

Audio and Speech Processing · Electrical Eng. & Systems 2023-11-09 Shahan Nercessian, Johannes Imort

Generative models have thrived in computer vision, enabling unprecedented image processes. Yet the results in audio remain less advanced. Our project targets real-time sound synthesis from a reduced set of high-level parameters, including…

Sound · Computer Science 2019-06-25 Adrien Bitton, Philippe Esling, Antoine Caillon, Martin Fouilleul

Machine-learning techniques have been recently used with spectacular results to generate artefacts such as music or text. However, these techniques are still unable to capture and generate artefacts that are convincingly structured. In this…

Artificial Intelligence · Computer Science 2017-03-03 Pierre Roy, Alexandre Papadopoulos, François Pachet

While most music generation models use textual or parametric conditioning (e.g. tempo, harmony, musical genre), we propose to condition a language model based music generation system with audio input. Our exploration involves two distinct…

Sound · Computer Science 2024-07-31 Simon Rouard, Yossi Adi, Jade Copet, Axel Roebel, Alexandre Défossez

We tackle the problem of generating audio samples conditioned on descriptive text captions. In this work, we propose AudioGen, an auto-regressive generative model that generates audio samples conditioned on text inputs. AudioGen operates…

How does textual representation of audio relate to the Large Language Model's (LLMs) learning about the audio world? This research investigates the extent to which LLMs can be prompted to generate audio, despite their primary training in…

We demonstrate how conditional generation from diffusion models can be used to tackle a variety of realistic tasks in the production of music in 44.1kHz stereo audio with sampling-time guidance. The scenarios we consider include…

Sound · Computer Science 2023-12-06 Mark Levy, Bruno Di Giorgi, Floris Weers, Angelos Katharopoulos, Tom Nickson

We tackle the task of conditional music generation. We introduce MusicGen, a single Language Model (LM) that operates over several streams of compressed discrete music representation, i.e., tokens. Unlike prior work, MusicGen is comprised…

Progress in the task of symbolic music generation may be lagging behind other tasks like audio and text generation, in part because of the scarcity of symbolic training data. In this paper, we leverage the greater scale of audio music data…

In recent years, text-to-audio systems have achieved remarkable success, enabling the generation of complete audio segments directly from text descriptions. While these systems also facilitate music creation, the element of human creativity…

Sound · Computer Science 2025-04-15 Weixuan Yuan, Qadeer Khan, Vladimir Golkov

Deep generative models have recently achieved impressive performance in speech and music synthesis. However, compared to the generation of those domain-specific sounds, generating general sounds (such as siren, gunshots) has received less…

Audio and Speech Processing · Electrical Eng. & Systems 2021-10-07 Xubo Liu, Turab Iqbal, Jinzheng Zhao, Qiushi Huang, Mark D. Plumbley, Wenwu Wang

This paper presents a novel approach to neural instrument sound synthesis using a two-stage semi-supervised learning framework capable of generating pitch-accurate, high-quality music samples from an expressive timbre latent space. Existing…

Sound · Computer Science 2025-10-07 Christian Limberg, Fares Schulz, Zhe Zhang, Stefan Weinzierl

Adapting learning materials to the level of skill of a student is important in education. In the context of music training, one essential ability is sight-reading -- playing unfamiliar scores at first sight -- which benefits from…

Sound · Computer Science 2025-09-23 Pedro Ramoneda, Masahiro Suzuki, Akira Maezawa, Xavier Serra

Sequence modeling with neural networks has led to powerful models of symbolic music data. We address the problem of exploiting these models to reach creative musical goals, by combining with human input. To this end we generalise previous…

Artificial Intelligence · Computer Science 2017-10-03 Christian Walder, Dongwoo Kim

LLM-powered code generation has the potential to revolutionize creative coding endeavors, such as live-coding, by enabling users to focus on structural motifs over syntactic details. In such domains, when prompting an LLM, users may benefit…

Multimedia · Computer Science 2025-09-25 Sam Kouteili, Hiren Madhu, George Typaldos, Mark Santolucito

Diffusion models have shown promising results in cross-modal generation tasks, including text-to-image and text-to-audio generation. However, generating music, as a special type of audio, presents unique challenges due to limited…

Sound · Computer Science 2023-08-04 Ke Chen, Yusong Wu, Haohe Liu, Marianna Nezhurina, Taylor Berg-Kirkpatrick, Shlomo Dubnov

Generative artificial intelligence models can be a valuable aid to music composition and live performance, both to aid the professional musician and to help democratize the music creation process for hobbyists. Here we present a novel…

Sound · Computer Science 2022-09-22 Ignacio J. Tripodi

Despite advances in deep algorithmic music generation, evaluation of generated samples often relies on human evaluation, which is subjective and costly. We focus on designing a homogeneous, objective framework for evaluating samples of…

Automatic music generation is an interdisciplinary research topic that combines computational creativity and semantic analysis of music to create automatic machine improvisations. An important property of such a system is allowing the user…

Sound · Computer Science 2020-03-03 Ke Chen, Gus Xia, Shlomo Dubnov

In this paper we propose a novel model for unconditional audio generation based on generating one audio sample at a time. We show that our model, which profits from combining memory-less modules, namely autoregressive multilayer…