Related papers: MIDI-LAB, a Powerful Visual Basic Program for Crea…

200 papers

We present MIDI-LLM, an LLM for generating multitrack MIDI music from free-form text prompts. Our approach expands a text LLM's vocabulary to include MIDI tokens, and uses a two-stage training recipe to endow text-to-MIDI abilities. By…

Sound · Computer Science 2025-11-07 Shih-Lun Wu, Yoon Kim, Cheng-Zhi Anna Huang

Symbolic music research plays a crucial role in music-related machine learning, but MIDI data can be complex for those without musical expertise. To address this issue, we present MidiTok Visualizer, a web application designed to facilitate…

Generative models guided by text prompts are becoming increasingly popular. However, no text-to-MIDI models currently exist due to the lack of a captioned MIDI dataset. This work aims to enable research that combines LLMs with symbolic…

Audio and Speech Processing · Electrical Eng. & Systems 2025-08-08 Jan Melechovsky, Abhinaba Roy, Dorien Herremans

Current methods for creating drum loop audio in digital music production, such as using one-shot samples or resampling, often demand non-trivial effort from creators. While recent generative models achieve high fidelity and adhere to text,…

In this paper, we introduce Foley Music, a system that can synthesize plausible music for a silent video clip about people playing musical instruments. We first identify two key intermediate representations for a successful video to music…

Computer Vision and Pattern Recognition · Computer Science 2020-07-22 Chuang Gan, Deng Huang, Peihao Chen, Joshua B. Tenenbaum, Antonio Torralba

Recent advances in multimodal large language models (MLLM) for audio music have demonstrated strong capabilities in music understanding, yet symbolic music, a fundamental representation of musical structure, remains unexplored. In this…

Multimedia · Computer Science 2026-01-30 Meng Yang, Jon McCormack, Maria Teresa Llano, Wanchao Su, Chao Lei

Multimodal music generation aims to produce music from diverse input modalities, including text, videos, and images. Existing methods use a common embedding space for multimodal fusion. Despite their effectiveness in other modalities, their…

Computer Vision and Pattern Recognition · Computer Science 2024-12-13 Baisen Wang, Le Zhuo, Zhaokai Wang, Chenxi Bao, Wu Chengjing, Xuecheng Nie, Jiao Dai, Jizhong Han, Yue Liao, Si Liu

Musical expression requires control of both what notes are played and how they are performed. Conventional audio synthesizers provide detailed expressive controls, but at the cost of realism. Black-box neural audio synthesis and…

Musicians mostly have to rely on their ears when they want to analyze what they play, for example to detect errors. Since hearing is sequential, it is not possible to quickly get an overview of one or multiple recordings of a whole…

Human-Computer Interaction · Computer Science 2026-03-26 Frank Heyen, Michael Sedlmair

Music videos, as a prevalent form of multimedia entertainment, deliver engaging audio-visual experiences to audiences and have gained immense popularity among singers and fans. Creators can express their interpretations of music naturally…

Human-Computer Interaction · Computer Science 2025-04-25 Chuer Chen, Shengqi Dang, Yuqi Liu, Nanxuan Zhao, Yang Shi, Nan Cao

Synthesizers are powerful tools that allow musicians to create dynamic and original sounds. Existing commercial interfaces for synthesizers typically require musicians to interact with complex low-level parameters or to manage large…

Human-Computer Interaction · Computer Science 2024-02-22 Stephen Brade, Bryan Wang, Mauricio Sousa, Gregory Lee Newsome, Sageev Oore, Tovi Grossman

While end-to-end lyrics-to-song models offer convenience for casual users, professional songwriters require score-to-song systems that allow them to retain authorship over the core melody. However, existing score-to-song methods are limited…

This paper is a survey and an analysis of different ways of using deep learning (deep artificial neural networks) to generate musical content. We propose a methodology based on five dimensions for our analysis: Objective - What musical…

Sound · Computer Science 2019-08-09 Jean-Pierre Briot, Gaëtan Hadjeres, François-David Pachet

We present a Python library, called Midi Miner, that can calculate tonal tension and classify different tracks. MIDI (Musical Instrument Digital Interface) is a hardware and software standard for communicating musical events between digital…

Sound · Computer Science 2020-05-27 Rui Guo, Dorien Herremans, Thor Magnusson

Visuals can enhance our experience of music, owing to the way they can amplify the emotions and messages conveyed within it. However, creating music visualizations is a complex, time-consuming, and resource-intensive process. We introduce…

Human-Computer Interaction · Computer Science 2023-09-29 Vivian Liu, Tao Long, Nathan Raw, Lydia Chilton

Rapid advancements in artificial intelligence have significantly enhanced generative tasks involving music and images, employing both unimodal and multimodal approaches. This research develops a model capable of generating music that…

Sound · Computer Science 2024-09-13 Tanisha Hisariya, Huan Zhang, Jinhua Liang

We present MIDInfinite, a web application capable of generating symbolic music using a large-scale generative AI model locally on commodity hardware. Creating this demo involved porting the Anticipatory Music Transformer, a large…

Sound · Computer Science 2024-11-15 Xun Zhou, Charlie Ruan, Zihe Zhao, Tianqi Chen, Chris Donahue

Generating music is an interesting and challenging problem in the field of machine learning. Mimicking human creativity has been popular in recent years, especially in the field of computer vision and image processing. With the advent of…

Sound · Computer Science 2020-11-03 Ashish Ranjan, Varun Nagesh Jolly Behera, Motahar Reza

Deep learning-based probabilistic models of musical data are producing increasingly realistic results and promise to enter creative workflows of many kinds. Yet they have been little-studied in a performance setting, where the results of…

Sound · Computer Science 2024-03-20 Victor Shepardson, Jack Armitage, Thor Magnusson

With the rise of artificial intelligence in recent years, its application to creative domains, including music, has increased rapidly. Many existing systems apply machine learning approaches to the problem of…

Human-Computer Interaction · Computer Science 2025-04-22 Renaud Bougueng Tchemeube, Jeff Ens, Philippe Pasquier