Related papers: Structure-informed Positional Encoding for Music G…

The Power of Fragmentation: A Hierarchical Transformer Model for Structural Segmentation in Symbolic Music Generation

Symbolic Music Generation relies on the contextual representation capabilities of the generative model, where the most prevalent approach is the Transformer-based model. The learning of musical context is also related to the structural…

Sound · Computer Science 2022-07-12 Guowei Wu, Shipei Liu, Xiaoya Fan

Motifs, Phrases, and Beyond: The Modelling of Structure in Symbolic Music Generation

Modelling musical structure is vital yet challenging for artificial intelligence systems that generate symbolic music compositions. This literature review dissects the evolution of techniques for incorporating coherent structure, from…

Sound · Computer Science 2024-03-14 Keshav Bhandari, Simon Colton

F-StrIPE: Fast Structure-Informed Positional Encoding for Symbolic Music Generation

While music remains a challenging domain for generative models like Transformers, recent progress has been made by exploiting suitable musically-informed priors. One technique to leverage information about musical structure in Transformers…

Sound · Computer Science 2025-02-18 Manvi Agarwal, Changhong Wang, Gael Richard

Controllable deep melody generation via hierarchical music structure representation

Recent advances in deep learning have expanded possibilities to generate music, but generating a customizable full piece of music with consistent long-term structure remains a challenge. This paper introduces MusicFrameworks, a hierarchical…

Sound · Computer Science 2021-09-03 Shuqi Dai, Zeyu Jin, Celso Gomes, Roger B. Dannenberg

Pop Music Transformer: Beat-based Modeling and Generation of Expressive Pop Piano Compositions

A great number of deep learning based models have been recently proposed for automatic music composition. Among these models, the Transformer stands out as a prominent approach for generating expressive classical piano performance with a…

Sound · Computer Science 2020-08-11 Yu-Siang Huang, Yi-Hsuan Yang

Improving Transformers using Faithful Positional Encoding

We propose a new positional encoding method for a neural network architecture called the Transformer. Unlike the standard sinusoidal positional encoding, our approach is based on solid mathematical grounds and has a guarantee of not losing…

Machine Learning · Computer Science 2024-05-17 Tsuyoshi Idé, Jokin Labaien, Pin-Yu Chen

Melody Infilling with User-Provided Structural Context

This paper proposes a novel Transformer-based model for music score infilling, to generate a music passage that fills in the gap between given past and future contexts. While existing infilling approaches can generate a passage that…

Sound · Computer Science 2022-10-07 Chih-Pin Tan, Alvin W. Y. Su, Yi-Hsuan Yang

Positional Encoding in Transformer-Based Time Series Models: A Survey

Recent advancements in transformer-based models have greatly improved time series analysis, providing robust solutions for tasks such as forecasting, anomaly detection, and classification. A crucial element of these models is positional…

Machine Learning · Computer Science 2026-05-07 Habib Irani, Vangelis Metsis

Supervised Learning for Game Music Segmentation

At present, neural network-based models, including transformers, struggle to generate memorable and readily comprehensible music from unified and repetitive musical material due to a lack of understanding of musical structure. Consequently,…

Sound · Computer Science 2026-01-21 Shangxuan Luo, Joshua Reiss

Deep Learning Techniques for Music Generation -- A Survey

This paper is a survey and an analysis of different ways of using deep learning (deep artificial neural networks) to generate musical content. We propose a methodology based on five dimensions for our analysis: Objective - What musical…

Sound · Computer Science 2019-08-09 Jean-Pierre Briot, Gaëtan Hadjeres, François-David Pachet

Generating music with sentiment using Transformer-GANs

The field of Automatic Music Generation has seen significant progress thanks to the advent of Deep Learning. However, most of these results have been produced by unconditional models, which lack the ability to interact with their users, not…

Sound · Computer Science 2022-12-22 Pedro Neves, Jose Fornari, João Florindo

Encoding Musical Style with Transformer Autoencoders

We consider the problem of learning high-level controls over the global structure of generated sequences, particularly in the context of symbolic music generation with complex language models. In this work, we present the Transformer…

Sound · Computer Science 2020-07-01 Kristy Choi, Curtis Hawthorne, Ian Simon, Monica Dinculescu, Jesse Engel

Score Transformer: Generating Musical Score from Note-level Representation

In this paper, we explore the tokenized representation of musical scores using the Transformer model to automatically generate musical scores. Thus far, sequence models have yielded fruitful results with note-level (MIDI-equivalent)…

Sound · Computer Science 2021-12-02 Masahiro Suzuki

Evaluating Disentangled Representations for Controllable Music Generation

Recent approaches in music generation rely on disentangled representations, often labeled as structure and timbre or local and global, to enable controllable synthesis. Yet the underlying properties of these embeddings remain underexplored.…

Sound · Computer Science 2026-02-17 Laura Ibáñez-Martínez, Chukwuemeka Nkama, Andrea Poltronieri, Xavier Serra, Martín Rocamora

Hierarchical Symbolic Pop Music Generation with Graph Neural Networks

Music is inherently made up of complex structures, and representing them as graphs helps to capture multiple levels of relationships. While music generation has been explored using various deep generation techniques, research on…

Audio and Speech Processing · Electrical Eng. & Systems 2024-09-13 Wen Qing Lim, Jinhua Liang, Huan Zhang

Musical Composition Style Transfer via Disentangled Timbre Representations

Music creation involves not only composing the different parts (e.g., melody, chords) of a musical work but also arranging/selecting the instruments to play the different parts. While the former has received increasing attention, the latter…

Audio and Speech Processing · Electrical Eng. & Systems 2019-06-03 Yun-Ning Hung, I-Tung Chiang, Yi-An Chen, Yi-Hsuan Yang

Compound Word Transformer: Learning to Compose Full-Song Music over Dynamic Directed Hypergraphs

To apply neural sequence models such as the Transformers to music generation tasks, one has to represent a piece of music by a sequence of tokens drawn from a finite set of pre-defined vocabulary. Such a vocabulary usually involves tokens…

Sound · Computer Science 2021-01-08 Wen-Yi Hsiao, Jen-Yu Liu, Yin-Cheng Yeh, Yi-Hsuan Yang

Do we need more complex representations for structure? A comparison of note duration representation for Music Transformers

In recent years, deep learning has achieved formidable results in creative computing. When it comes to music, Transformer-based models are one viable option for music generation. However, while Transformer models are popular for music…

Sound · Computer Science 2024-10-15 Gabriel Souza, Flavio Figueiredo, Alexei Machado, Deborah Guimarães

Transformer with Tree-order Encoding for Neural Program Generation

While a considerable number of semantic parsing approaches have employed RNN architectures for code generation tasks, there have been only a few attempts to investigate the applicability of Transformers for this task. Including hierarchical…

Computation and Language · Computer Science 2022-06-28 Klaudia-Doris Thellmann, Bernhard Stadler, Ricardo Usbeck, Jens Lehmann

Inducing Transformer's Compositional Generalization Ability via Auxiliary Sequence Prediction Tasks

Systematic compositionality is an essential mechanism in human language, allowing the recombination of known parts to create novel expressions. However, existing neural models have been shown to lack this basic ability in learning symbolic…

Computation and Language · Computer Science 2021-10-01 Yichen Jiang, Mohit Bansal