26
MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic Music Generation and Accompaniment Hao-Wen Dong*, Wen-Yi Hsiao*, Li-Chia Yang, Yi-Hsuan Yang Research Center of IT Innovation, Academia Sinica Demo Page https://salu133445.github.io/musegan/ *these authors contributed equally to this work

MuseGAN: Multi-track Sequential Generative Adversarial ... · Data. LPD (Lakh Pianoroll Dataset) 。 >170,000. multi-track piano-rolls 。Derived fromLakh MIDI Dataset 。Mainly pop

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: MuseGAN: Multi-track Sequential Generative Adversarial ... · Data. LPD (Lakh Pianoroll Dataset) 。 >170,000. multi-track piano-rolls 。Derived fromLakh MIDI Dataset 。Mainly pop

MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic Music Generation and AccompanimentHao-Wen Dong*, Wen-Yi Hsiao*, Li-Chia Yang, Yi-Hsuan YangResearch Center of IT Innovation, Academia Sinica

Demo Page https://salu133445.github.io/musegan/

*these authors contributed equally to this work

Page 2: MuseGAN: Multi-track Sequential Generative Adversarial ... · Data. LPD (Lakh Pianoroll Dataset) 。 >170,000. multi-track piano-rolls 。Derived fromLakh MIDI Dataset 。Mainly pop

Outline。Goals & Challenges。Data。Proposed Model。Results & Evaluation。Future Works

Source Code https://github.com/salu133445/museganDemo Page https://salu133445.github.io/musegan/

2

Page 3: MuseGAN: Multi-track Sequential Generative Adversarial ... · Data. LPD (Lakh Pianoroll Dataset) 。 >170,000. multi-track piano-rolls 。Derived fromLakh MIDI Dataset 。Mainly pop

Goals

Generate pop music。of multiple tracks

。in piano-roll format

。using GAN with CNNs

[Source Code]https://github.com/salu133445/musegan

[Demo Page]https://salu133445.github.io/musegan/

3

Page 4: MuseGAN: Multi-track Sequential Generative Adversarial ... · Data. LPD (Lakh Pianoroll Dataset) 。 >170,000. multi-track piano-rolls 。Derived fromLakh MIDI Dataset 。Mainly pop

Challenge IMultitrack Interdependency

vocal

piano

bassdrums

strings

music & clip by phycause

Multi-track GAN

4

Page 5: MuseGAN: Multi-track Sequential Generative Adversarial ... · Data. LPD (Lakh Pianoroll Dataset) 。 >170,000. multi-track piano-rolls 。Derived fromLakh MIDI Dataset 。Mainly pop

Challenge IIMusic Texture

melody

chord(harmony)

Convolutional Neural Networks

5

Page 6: MuseGAN: Multi-track Sequential Generative Adversarial ... · Data. LPD (Lakh Pianoroll Dataset) 。 >170,000. multi-track piano-rolls 。Derived fromLakh MIDI Dataset 。Mainly pop

Challenge IIITemporal Structure

paragraph 1 paragraph 2 paragraph 3

phrase 1 phrase 2 phrase 3 phrase 4

bar 1 bar 2 bar 3 bar 4

beat 1 beat 2 beat 3 beat 4

step 1 step 2 ··· step 24

song

phrase 2

4/4 time

6

Page 7: MuseGAN: Multi-track Sequential Generative Adversarial ... · Data. LPD (Lakh Pianoroll Dataset) 。 >170,000. multi-track piano-rolls 。Derived fromLakh MIDI Dataset 。Mainly pop

Challenge IIITemporal Structure

bar 1 bar 2 bar 3 bar 4

beat 1 beat 2 beat 3 beat 4

step 1 step 2 ··· step 24

phrase 2 Fixed Structure

Convolutional Neural Networks

4/4 time

7

Page 8: MuseGAN: Multi-track Sequential Generative Adversarial ... · Data. LPD (Lakh Pianoroll Dataset) 。 >170,000. multi-track piano-rolls 。Derived fromLakh MIDI Dataset 。Mainly pop

Data Representation

pitch

time

Bar 1 Bar 2 Bar 3 Bar 4time step

8

Piano-roll

polyphonic multi-track

(with symbolic timing)

Page 9: MuseGAN: Multi-track Sequential Generative Adversarial ... · Data. LPD (Lakh Pianoroll Dataset) 。 >170,000. multi-track piano-rolls 。Derived fromLakh MIDI Dataset 。Mainly pop

Data Representation

pitch

time

Piano-roll

Bar 1 Bar 2 Bar 3 Bar 4polyphonic multi-track

(with symbolic timing)

9

A3

t0 t1

Page 10: MuseGAN: Multi-track Sequential Generative Adversarial ... · Data. LPD (Lakh Pianoroll Dataset) 。 >170,000. multi-track piano-rolls 。Derived fromLakh MIDI Dataset 。Mainly pop

Data RepresentationMulti-track Piano-roll

pitch

time

tracks

polyphonic multi-track

(with symbolic timing)

10

Page 11: MuseGAN: Multi-track Sequential Generative Adversarial ... · Data. LPD (Lakh Pianoroll Dataset) 。 >170,000. multi-track piano-rolls 。Derived fromLakh MIDI Dataset 。Mainly pop

Data Representation

11

96 time steps

84pitches 5 tracks

4 bars

a 4×96×84×5 tensor

Drums

GuitarPiano

Strings

Bass

Page 12: MuseGAN: Multi-track Sequential Generative Adversarial ... · Data. LPD (Lakh Pianoroll Dataset) 。 >170,000. multi-track piano-rolls 。Derived fromLakh MIDI Dataset 。Mainly pop

Data

LPD (Lakh Pianoroll Dataset)。>170,000 multi-track piano-rolls。Derived from Lakh MIDI Dataset。Mainly pop songs

Pypianoroll (Python package)。Manipulation & Visualization。Efficient Save/Load。Parse/Write MIDI files。On PYPI (pip installable)

[Dataset]https://salu133445.github.io/musegan/dataset

[Pypianoroll]https://salu133445.github.io/pypianoroll/

12

Page 13: MuseGAN: Multi-track Sequential Generative Adversarial ... · Data. LPD (Lakh Pianoroll Dataset) 。 >170,000. multi-track piano-rolls 。Derived fromLakh MIDI Dataset 。Mainly pop

Generative Adversarial Networks

X

real data

Gz~p(z) G(z)

random noise fake dataGenerator

D real/fake

Discriminator

4-bar phrases of 5 tracks

critic(wgan-gp)

13

Page 14: MuseGAN: Multi-track Sequential Generative Adversarial ... · Data. LPD (Lakh Pianoroll Dataset) 。 >170,000. multi-track piano-rolls 。Derived fromLakh MIDI Dataset 。Mainly pop

MuseGAN – An Overview

Gtemp

4 latent variables1 random noise

temporalgenerator

bargenerator

4 piano-roll matrices

Gbar

14

Page 15: MuseGAN: Multi-track Sequential Generative Adversarial ... · Data. LPD (Lakh Pianoroll Dataset) 。 >170,000. multi-track piano-rolls 。Derived fromLakh MIDI Dataset 。Mainly pop

Bar Generator

MuseGAN

zzzzz

zzzz

zzzz

GGGGG

15

Page 16: MuseGAN: Multi-track Sequential Generative Adversarial ... · Data. LPD (Lakh Pianoroll Dataset) 。 >170,000. multi-track piano-rolls 。Derived fromLakh MIDI Dataset 。Mainly pop

MuseGAN

z

Bar Generator

zzzzzzzzz

zzzz

16

GGGGG

No Coordination

Coordination

track-dependent

track-independent

Page 17: MuseGAN: Multi-track Sequential Generative Adversarial ... · Data. LPD (Lakh Pianoroll Dataset) 。 >170,000. multi-track piano-rolls 。Derived fromLakh MIDI Dataset 。Mainly pop

zzzzz

MuseGAN

z

Bar GeneratorGz

GGGGG

zzzz

zzzzzzzzz

zzzz

17

GGGGG

Page 18: MuseGAN: Multi-track Sequential Generative Adversarial ... · Data. LPD (Lakh Pianoroll Dataset) 。 >170,000. multi-track piano-rolls 。Derived fromLakh MIDI Dataset 。Mainly pop

zzzzz

MuseGAN

z

Bar GeneratorGz

GGGGG

zzzz

zzzzzzzzz

zzzz

18

GGGGG

Page 19: MuseGAN: Multi-track Sequential Generative Adversarial ... · Data. LPD (Lakh Pianoroll Dataset) 。 >170,000. multi-track piano-rolls 。Derived fromLakh MIDI Dataset 。Mainly pop

TimeDependent Independent

TrackDependent Melody Groove

Independent Chords Style

zzzzz

MuseGAN

z

Bar GeneratorGz

GGGGG

zzzz

zzzzzzzzz

zzzz

19

GGGGG

ChordsStyle

Melody

Groove

Page 20: MuseGAN: Multi-track Sequential Generative Adversarial ... · Data. LPD (Lakh Pianoroll Dataset) 。 >170,000. multi-track piano-rolls 。Derived fromLakh MIDI Dataset 。Mainly pop

Results

More Samples on Demo Pagehttps://salu133445.github.io/musegan/

Sample 1 Sample 2

20

BassDrumsGuitarStringsPiano

Step 0 Step 700 Step 2500 Step 6000 Step 7900

Drum pattern

Chords

Bass Line

Page 21: MuseGAN: Multi-track Sequential Generative Adversarial ... · Data. LPD (Lakh Pianoroll Dataset) 。 >170,000. multi-track piano-rolls 。Derived fromLakh MIDI Dataset 。Mainly pop

Objective MetricsUPC

step

QN

step

UPC number of used pitch classes per bar

QN ratio of qualified notes

Monitor the Training

21

step2000 4000 6000 8000104

106

108

1010

1012

0

Negative Critic Loss

Page 22: MuseGAN: Multi-track Sequential Generative Adversarial ... · Data. LPD (Lakh Pianoroll Dataset) 。 >170,000. multi-track piano-rolls 。Derived fromLakh MIDI Dataset 。Mainly pop

User Study

H: harmoniousR: rhythmicMS: musically structuredC: coherentOR: overall rating

composer

jamming

hybrid

22

Page 23: MuseGAN: Multi-track Sequential Generative Adversarial ... · Data. LPD (Lakh Pianoroll Dataset) 。 >170,000. multi-track piano-rolls 。Derived fromLakh MIDI Dataset 。Mainly pop

Summary。MuseGAN

◦ a novel GAN for multi-track sequence generation

◦ multi-track, polyphonic music

◦ human-AI cooperative scenario (see the paper)

。Lakh Pianoroll Dataset (LPD) (new dataset!!)

。Pypianoroll (new package!!)

23

Page 24: MuseGAN: Multi-track Sequential Generative Adversarial ... · Data. LPD (Lakh Pianoroll Dataset) 。 >170,000. multi-track piano-rolls 。Derived fromLakh MIDI Dataset 。Mainly pop

Future Works

Full Song Generation

bar 1 bar 2 bar 3 bar 4

beat 1 beat 2 beat 3 beat 4

step 1 step 2 ··· step 24

phrase 2

paragraph 1 paragraph 2 paragraph 3

phrase 1 phrase 2 phrase 3 phrase 4

song

Hierarchical Temporal Structure

24

Page 25: MuseGAN: Multi-track Sequential Generative Adversarial ... · Data. LPD (Lakh Pianoroll Dataset) 。 >170,000. multi-track piano-rolls 。Derived fromLakh MIDI Dataset 。Mainly pop

Future Works

Cross-modal Generation。Music + Video。Music + Lyrics。Video + Text

25

Page 26: MuseGAN: Multi-track Sequential Generative Adversarial ... · Data. LPD (Lakh Pianoroll Dataset) 。 >170,000. multi-track piano-rolls 。Derived fromLakh MIDI Dataset 。Mainly pop

Q&A MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic Music

Generation and Accompaniment

Source Code https://github.com/salu133445/museganDemo Page https://salu133445.github.io/musegan/