Midi 120: Unsupervised learning and fine-tuning #4
Getting somewhere 🚀

The model
I picked a learning rate of 3e-05 (see the Model section below). The model has learnt to formulate notes with pitch-time-velocity sequences. We are actually able to listen to model predictions! I will probably try to make the dashboard look more like Tomek's BERT inference dashboard to highlight notes predicted by the model.

Some of the changes:

PEP 526
I started using variable annotations from PEP 526 where I find them helpful (a short illustration follows after the config block below).

eos_token_id
I was using the default `eos_token_id`; the config is now:

```python
config = T5Config(
vocab_size=vocab_size(train_cfg),
decoder_start_token_id=start_token_id,
pad_token_id=pad_token_id,
eos_token_id=pad_token_id,
use_cache=False,
d_model=train_cfg.model.d_model,
d_kv=train_cfg.model.d_kv,
d_ff=train_cfg.model.d_ff,
num_layers=train_cfg.model.num_layers,
num_heads=train_cfg.model.num_heads,
)
```
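As mentioned above, here is a minimal illustration of PEP 526 variable annotations; the names are made up for the example, not taken from this repository:

```python
from typing import Optional

# PEP 526 lets you annotate a variable right at its assignment...
masking_probability: float = 0.15
pad_token_id: int = 0
checkpoint_path: Optional[str] = None

# ...or declare its type without binding a value yet.
best_loss: float
```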
T5 denoising
As described in Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer, the T5 model is pre-trained on a denoising objective with sentinel tokens as masks.
In this implementation, `masking_probability * len(sequence)` tokens are masked at random. Each mask is then replaced with a sentinel token of increasing id, so every sentinel token within a given sequence is unique. If several <MASK> tokens appear one after another, they are replaced with a single sentinel token. The target corresponding to this source sequence is the sequence of tokens that were masked in the source; the remaining tokens are represented by sentinel tokens (see the sketch after the illustration below).
Representation of this masking method on text from the above paper:

Model
The default model is a full-sized 44M-parameter T5 model, trained with a 3e-05 learning rate and a 0.15 note masking probability for 4 epochs on maestro-v1-sustain.
See its run here.
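For orientation, a hedged sketch of how such a model and optimizer could be set up with HuggingFace Transformers; the dimensions below are placeholders standing in for `train_cfg.model.*` (not the values behind the 44M-parameter model), and only the learning rate and the config fields quoted earlier come from this PR:

```python
import torch
from transformers import T5Config, T5ForConditionalGeneration

# Placeholder dimensions, NOT the PR's actual train_cfg values.
config = T5Config(
    vocab_size=400,
    d_model=512,
    d_kv=64,
    d_ff=2048,
    num_layers=6,
    num_heads=8,
    decoder_start_token_id=0,
    pad_token_id=0,
    eos_token_id=0,
    use_cache=False,
)
model = T5ForConditionalGeneration(config)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)  # learning rate from this PR

# Sanity-check the parameter count against the quoted 44M figure.
print(f"{sum(p.numel() for p in model.parameters()):,} parameters")
```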
Dashboard
To download the model, run
Then, you can look at model predictions using