Generative Modeling of Bach-Style Symbolic Music: A Comparative Study of Autoregressive, Latent-Variable, and Adversarial Approaches

2026-06-11 • Sound

Sound • Machine Learning

AI summary

The authors compare three families of generative models for producing Bach-style piano music from symbolic (MIDI) note data. Autoregressive LSTMs with attention, which generate music one step at a time, produce the most musically coherent pieces. Among latent-variable models, vector-quantized VAEs mitigate posterior collapse and yield more structured outputs than conventional recurrent VAEs. GANs capture local pitch patterns but are harder to train and reproduce Bach's style less reliably. The work outlines the relative strengths and failure modes of each approach for symbolic music generation.

Generative modeling • Bach-style music • MIDI • Autoregressive LSTM • Attention mechanism • Variational Autoencoder (VAE) • Vector quantization • Generative Adversarial Networks (GANs) • Polyphony • Posterior collapse

Authors

Kyuil Lee, Dezhi Yu, Yongkang Huang

Abstract

We study generative modeling of Bach-style symbolic piano music using a shared MIDI corpus and three model families: autoregressive LSTMs with attention, latent-variable models including recurrent VAEs and vector-quantized VAEs, and generative adversarial networks. We compare their ability to model polyphonic note sequences, learn useful latent representations, and generate stylistically coherent compositions. Our experiments show that the autoregressive LSTM with attention produces the most musically coherent samples, while vector quantization helps mitigate posterior collapse and yields more structured outputs than conventional recurrent VAEs. The adversarial approach captures local pitch patterns but remains difficult to train and generalizes less reliably to Bach's style. These results highlight the relative strengths and failure modes of autoregressive, latent-variable, and adversarial approaches for symbolic music generation.
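
To make the vector quantization step mentioned in the abstract concrete, the following is a minimal sketch of the nearest-codebook lookup used in vector-quantized VAEs in general. It is an illustration under stated assumptions, not the authors' implementation: the function names `squared_distance` and `quantize`, the two-dimensional latents, and the three-entry codebook are all hypothetical, and training details such as the codebook and commitment losses are omitted.

```python
# Illustrative sketch only: names, shapes, and values are assumptions,
# not taken from the paper.
from typing import List, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    latents: Sequence[Sequence[float]],
    codebook: Sequence[Sequence[float]],
) -> Tuple[List[int], List[List[float]]]:
    """Replace each continuous latent vector with its nearest codebook entry.

    Returns the chosen codebook indices (the discrete code) and the
    corresponding codebook vectors (what a decoder would receive).
    """
    indices: List[int] = []
    quantized: List[List[float]] = []
    for z in latents:
        # Pick the codebook index with the smallest distance to z.
        best = min(
            range(len(codebook)),
            key=lambda k: squared_distance(z, codebook[k]),
        )
        indices.append(best)
        quantized.append(list(codebook[best]))
    return indices, quantized


# Hypothetical 3-entry codebook and three encoder outputs (one per time step).
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]
latents = [[0.1, -0.2], [0.9, 1.2], [-0.8, 1.7]]

indices, quantized = quantize(latents, codebook)
print(indices)    # discrete code sequence
print(quantized)  # vectors passed on to the decoder
```

In this setup the decoder only ever sees codebook entries, so the latent is a sequence of discrete indices rather than a continuous Gaussian sample as in a conventional VAE.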
