7-11

Things to try

  • Baselines
    • HMM
    • RTRBM
    • RNN-RBM
    • N-gram language model: might be hard to sample, but could be used to score outputs
  • Interpolating between two sound bites
    • Run forward and backward LSTMs, combine their hidden states, and emit from the combined states
    • Multiple ways to combine hidden states:
      • Concatenate at each time step and use directly. Might be bad because distant states are neglected.
      • NMT by Jointly Learning to Align and Translate: an attention mechanism takes a weighted combination of all the hidden states being interpolated over.
  • Constraining length of output
    • Motivation: implausibly long phrases are unrealistic
    • Can introduce a countdown to 0 in the training and sampling procedures
  • t-SNE of the learned neural embedding: do similar chords map to nearby points? (sketched after this list)
  • Plot activations of the hidden state over time
    • How does it know a chord has at most 4 notes? (I'm expecting to see a "chord-end" memory cell)
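A minimal sketch of the t-SNE idea, assuming the trained model's embedding weights were exported as a (vocab_size, wordvec_size) NumPy array; the wordvec.npy path is hypothetical:

```python
import numpy as np
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

embedding = np.load('wordvec.npy')  # hypothetical export of the embedding matrix
coords = TSNE(n_components=2, perplexity=30).fit_transform(embedding)

plt.scatter(coords[:, 0], coords[:, 1], s=5)
plt.title('t-SNE of learned note/chord embeddings')
plt.show()
```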

Discussion topics

  • Modulation: how do we get an LSTM to do it?
  • Interpolation: how do we combine hidden states and emit? How do we train?
  • Constraining length: sanity check; how do we carry information forward across phrases?

6-5

Keras notes

  • To do a convolutional/time-distributed operation, TimeDistributed assumes the 1st axis (excluding the sample axis 0) is the time dimension. This means Permute should be used to satisfy that assumption (see the sketch below)

  • For some reason my sharing of embedding matrices is only supported by the tensorflow backend...
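A minimal sketch of the Permute-then-TimeDistributed pattern, assuming Keras; the 64-feature/100-timestep shapes are made up for illustration:

```python
from keras.models import Sequential
from keras.layers import Permute, TimeDistributed, Dense

model = Sequential()
# Incoming tensor is (batch, features=64, time=100); Permute swaps the two
# non-batch axes so TimeDistributed sees time on axis 1 as it expects.
model.add(Permute((2, 1), input_shape=(64, 100)))
model.add(TimeDistributed(Dense(32)))  # the same Dense applied at every timestep
```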

6-4

"Modelling, Visualising and Summarising Documents with a Single Convolutional Neural Network", Denil et al 2014

  • Music is also a sequence built up of individual measures, phrases, parts, etc., amenable to time-invariant convolution
  • Try building a convolutional representation for music, then put discriminative classifiers on top (sketched below)
    • e.g. Bach vs XYZ, major vs minor key
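A hypothetical sketch of that idea, assuming a recent Keras and a piano-roll style input with 128 pitch features per timestep (both the encoding and the layer sizes are illustrative):

```python
from keras.models import Sequential
from keras.layers import Conv1D, GlobalMaxPooling1D, Dense

model = Sequential()
model.add(Conv1D(64, 3, activation='relu', input_shape=(None, 128)))  # time-invariant convolution
model.add(GlobalMaxPooling1D())            # pool variable-length music to a fixed-size vector
model.add(Dense(1, activation='sigmoid'))  # discriminative head, e.g. major vs minor
model.compile(loss='binary_crossentropy', optimizer='adam')
```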

Pitch Classes

  • Don't really matter... 0.96 accuracy on pitch classes vs 0.95 without
  • Future experiments should include octaves, since including them significantly improves generated output

6-3

Many papers use pitch classes (i.e. mod 12), removing octave information...
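Concretely, a pitch class is just the MIDI pitch number mod 12:

```python
def pitch_class(midi_pitch):
    # Octave information is discarded: C4 (MIDI 60) and C5 (MIDI 72) both map to 0.
    return midi_pitch % 12

assert pitch_class(60) == pitch_class(72) == 0
```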

6-2

Things to try:

  • Segment phrases based on fermatas (see the sketch after this list)
  • Encode (pitch, duration, chord) like (Lichtenwalter 2009)
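A minimal sketch of the fermata segmentation, assuming music21 (Bach marks phrase endings in the chorales with fermatas); bwv269 is just an example chorale:

```python
from music21 import corpus, expressions

score = corpus.parse('bach/bwv269')
soprano = score.parts[0]
phrases, current = [], []
for note in soprano.flat.notes:
    current.append(note)
    if any(isinstance(e, expressions.Fermata) for e in note.expressions):
        phrases.append(current)  # a fermata closes the current phrase
        current = []
```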

Spent the rest of the day setting up Keras and TensorFlow; they seem easier to use for building new models...

5-30

Hyperparam opt over constant-timestep input monophonic models

Previous experiments used one input token per (pitch|REST, duration) event; these experiments expand that into (pitch|REST) tokens emitted at constant duration intervals (see the sketch below).
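An illustration of the encoding change with a hypothetical helper, using 4 frames per quarter note:

```python
def to_frames(events, frames_per_quarter=4):
    # Expand (pitch, duration) events into frame-level tokens at a constant timestep.
    frames = []
    for pitch, quarter_length in events:  # pitch is a MIDI number or 'REST'
        frames += [pitch] * int(quarter_length * frames_per_quarter)
    return frames

# A quarter-note C4 followed by an eighth-note rest becomes six constant-duration tokens:
assert to_frames([(60, 1.0), ('REST', 0.5)]) == [60, 60, 60, 60, 'REST', 'REST']
```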

Best result:
{u'num_layers': 1.0,
 u'rnn_size': 512.03541493830664,
 u'seq_length': 16.002948219489955,
 u'val_loss': 0.15576986968517303,
 u'wordvec_size': 6.0277702732883034}
seq_length rnn_size val_loss wordvec_size num_layers
16.0029 512.035 0.15577 6.02777 1
16.0029 471.29 0.175634 60.1177 1
16.0029 512.035 0.194531 1 1
11.3981 512.035 0.216492 128.825 1
11.1927 122.295 0.222993 128.825 2.13249
16.0029 512.035 0.241461 128.825 1
6.25468 460.95 0.247955 128.596 2.88079
6.92759 72.8999 0.272705 128.825 8.00018
4.00037 22.6282 0.306297 11.3501 2.82846
14.4828 13.1409 0.522982 1 1
1.00973 391.191 0.527153 59.2535 6.22639
15.3933 1 0.934034 128.825 1
15.9987 1 0.998908 128.825 1
1.8637 1.47925 1.16437 94.7932 5.22018
1 1 1.30671 1 1
16.0029 1 1.37111 1.35772 7.85394
1.85749 312.957 1.43217 27.7185 1.2719
1.15772 512.035 2.05152 128.825 3.09914
3.79838 512.035 3.99939 3.2196 8.00018
4.28328 1.08049 9.4367 128.825 7.73301
1.89024 1.44436 13.1816 12.317 8.00018

Hyperparam optimization over note-based monophonic modeling

  • Used Spearmint to do hyperparameter optimization over major-key soprano monophonic LSTM models
  • Best result: val_loss=1.13967 with seq_length=6.94253, rnn_size=29.5404, wordvec_size=126.366, num_layers=1.00082, all floored before use (see the sketch after this list)
    • Sampling with temp=0.8 yielded believable melody lines
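Making the flooring explicit: Spearmint proposes continuous values, so integer hyperparameters are floored before the model is built.

```python
import math

best = {'seq_length': 6.94253, 'rnn_size': 29.5404,
        'wordvec_size': 126.366, 'num_layers': 1.00082}
floored = {k: int(math.floor(v)) for k, v in best.items()}
assert floored == {'seq_length': 6, 'rnn_size': 29, 'wordvec_size': 126, 'num_layers': 1}
```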

All Spearmint results:

val_loss seq_length rnn_size wordvec_size num_layers
1.81279 1 1 1 1
1.25221 4.00037 22.6282 11.3501 2.82846
1.54454 16.0029 86.9708 128.825 7.15475
1.91175 12.0003 10.0163 11.2472 6.57259
9.7702 1.03734 438.864 114.248 7.57239
3.72573 16.0029 1 128.825 1
10.5524 2.18817 142.192 1 2.83128
1.94992 1.34606 1 17.2647 8.00018
4.68856 16.0029 1 4.0772 1
3.09284 1 54.9462 21.2293 3.10958
1.84045 1 1.54607 12.446 2.11687
1.45445 16.0029 512.035 14.0071 1.09933
20.4365 2.49828 512.035 8.91201 1.17965
1.92739 15.4095 210.947 11.8957 3.7034
3.39995 7.74704 1 13.2716 6.58817
1.44211 16.0029 512.035 16.0497 1
1.83048 1 1 2.04846 1
4.70739 16.0029 1 38.4966 8.00018
3.84151 16.0029 1 128.825 1
1.41617 16.0029 512.035 128.825 8.00018
6.90625 5.17655 512.035 97.6266 8.00018
1.82316 3.89022 1.28814 1 1
1.58859 9.94785 512.035 128.825 1
1.61969 12.3247 512.035 128.825 1
1.33559 10.284 26.3633 128.825 6.37936
1.68185 4.7608 2.06105 128.825 1.00727
1.21108 4.29163 13.2033 128.825 1.62681
1.69945 1.39915 37.5867 2.38933 5.68261
2.26959 1.30102 7.32666 1 3.71789
1.20485 6.03521 14.0379 128.825 1
13.4733 4.98242 27.8648 1 8.00018
1.82551 4.37409 1.1129 128.825 1.0037
1.99838 3.55431 1 1.22266 1
1.57108 1 14.1768 3.16177 8.00018
1.36905 3.86123 4.4941 128.825 1
1.27876 5.44864 50.7731 128.825 1
1.67601 16.0029 483.363 128.825 6.73011
2.17016 1.02489 1 1.51668 6.55008
1.92507 1.52215 1 128.825 5.84352
1.17033 4.04997 48.6965 128.825 1.18371
1.45292 1 47.8396 128.825 7.77974
1.46929 16.0029 91.7265 128.825 5.99395
1.19416 4.06726 27.956 128.825 1.28507
2.51698 1.60321 20.3026 1 7.90567
4.53957 16.0029 512.035 1 7.85759
1.8674 1 1.72167 128.825 7.97761
2.04807 1.34766 3.16714 128.825 7.541
2.05087 10.6888 348.024 128.825 4.01615
2.03063 1.45055 1 3.80915 7.99896
4.48324 1.39391 338.201 18.1749 8.00018
1.8183 1.44086 1 1 1
2.18131 1.37276 1 1 4.3693
2.73249 1.48716 138.966 1.01189 7.2744
2.16369 1.41637 1 128.825 7.77549
2.24593 1.29825 1 1.00003 7.99566
2.49271 1 1.26242 1 7.51788
1.43529 1.17257 19.4415 128.825 8.00016
1.85919 4.48557 51.0752 1 1
1.27045 6.94479 85.2623 128.825 2.02529
1.29271 4.3174 41.9469 128.825 2.71045
1.16407 6.15871 41.2952 128.825 1
1.179 4.48587 53.2404 128.825 1
5.25998 3.96968 42.0234 1.14234 1
1.92197 3.77989 1 128.825 1.23153
1.67657 4.6666 23.0382 3.14412 1
14.7627 5.04117 174.806 1.52616 1
1.1712 5.66671 40.1667 128.825 1
2.09475 4.43712 16.4887 2.62326 1
1.3684 4.60999 4.36251 128.825 1
1.87167 3.95252 1 123.594 8.00018
2.0122 4.28938 1 1 1
1.72728 5.43109 4.92454 120.619 1
1.72963 3.97031 1 128.825 1
3.21394 4.18858 50.1616 123.625 8.00018
2.30705 3.70729 1 1 8.00018
1.75668 3.58032 118.189 93.7408 8.00018
2.83069 1.05197 78.8487 1.12416 8.00018
3.01681 2.96283 53.8838 126.457 8.00018
2.69333 1 12.5461 1.04687 5.61842
6.16394 16.0029 512.034 3.00566 7.79508
2.50471 3.20547 1 1.00527 8.00018
2.74393 2.86051 1 1.58627 1
1.94912 3.55173 321.378 128.825 8.00018
2.50607 3.1362 1 1.2272 7.86776
2.50117 3.40996 1 2.59141 8.00018
2.01436 1.9717 1 1.84658 8.00018
1.74399 2.13574 1 2.74339 1
1.55385 3.77291 19.4764 3.40497 1
1.84827 1 1 1.20563 1.00026
1.48099 16.0029 305.927 116.831 8.00018
3.25906 4.1078 218.305 128.825 8.00018
1.88984 3.52137 1 1.01016 1.00719
1.95496 3.10587 205.784 128.825 8.00018
2.53949 6.43489 24.0808 10.4327 5.12246
1.18219 16.0029 61.1378 128.735 2.79494
1.62688 1.01011 144.572 128.825 8.00018
1.19234 10.6007 30.2878 128.825 1.90145
1.89839 16.0029 501.778 115.486 3.64192
3.02298 6.336 48.863 1 1.01751
2.94935 4.87841 65.0807 1.58859 1.03422
1.48481 1 32.8238 5.21677 6.51185
2.57488 1 196.637 99.4593 8.00018
1.18709 2.31219 29.1906 16.6164 1
3.3022 1.00246 126.95 1 8.00018
1.35802 1.20229 35.7395 128.825 8.00018
1.17939 2.98678 21.6642 111.966 1
2.22157 1.37443 17.3358 4.96489 1.0013
1.45561 16.0029 18.6679 126.261 4.40354
1.33853 16.0029 269.284 128.097 1.01187
1.36142 15.922 42.1897 7.82053 4.28042
1.58634 3.79205 14.2155 7.52816 1.00302
2.07173 4.96631 35.1168 1 1.06558
2.16536 1.02073 60.2927 2.89752 7.69905
1.19761 4.99027 20.8433 128.825 1.00068
1.21183 3.25201 31.8954 121.463 4.74495
1.15955 7.68755 52.693 128.825 1
1.19747 15.9547 40.3288 106.232 4.79994
1.14678 14.3189 105.651 128.825 1
1.31329 4.75247 39.5341 6.5095 1
1.28955 3.83953 12.3442 128.825 6.88565
1.25079 3.35264 31.9617 128.499 7.30232
3.54208 2.79402 24.04 116.388 6.24677
1.23928 3.70084 38.0085 121.191 6.69755
1.18921 2.31037 7.9934 128.825 1
1.16043 3.37842 33.9173 128.253 1
1.55291 3.4025 90.3102 128.825 5.64659
1.19088 3.09136 47.4656 116.573 1
1.31964 3.57266 16.7673 127.438 8.00018
1.13287 2.12181 16.7073 127.608 1
1.29983 3.42929 54.3889 124.45 5.2666
1.89918 3.58951 1 2.99075 4.84539
1.15648 4.54977 24.7907 114.179 1
1.18002 3.16433 26.1107 109.408 1
1.28697 2.23085 11.4699 6.17508 1
1.47203 16.0029 424.294 96.665 8.00018
1.35923 15.7712 180.292 108.152 8.00018
1.22905 3.56232 23.4472 116.798 6.1748
1.55541 1.72011 93.4516 101.304 8.00018
2.04159 16.0029 40.1776 1 1.00011
1.1952 2.49004 20.0365 114.219 1
2.77794 15.5213 512.035 1 1.02812
1.61996 14.5473 512.035 4.56864 1
1.21182 3.58192 30.8165 128.573 7.61866
1.46842 16.0029 212.909 126.77 7.25935
1.23573 3.89539 16.4708 123.302 4.63252
1.68854 16.0029 93.8211 2.23815 1.01339
1.24737 15.7682 51.6474 117.842 1
1.18779 5.80308 23.8137 114.688 1
1.519 3.55924 80.2821 114.28 7.09611
1.26767 15.9138 163.389 128.463 1
1.58908 8.73408 60.4034 3.12543 1
1.13967 6.94253 29.5404 126.366 1.00082
1.47346 1.02767 15.9882 3.0994 6.65049
1.19142 3.78845 111.306 128.825 1.00895
6.06288 15.3437 490.379 1 8.00018
1.87639 4.9916 491.995 128.825 1
1.93319 4.25991 508.601 128.825 1
1.68181 5.70063 509.011 128.825 1
3.56743 5.02613 504.939 128.825 8.00018
1.88407 4.5674 511.515 128.825 1
1.97464 13.1559 510.938 1.60573 1.10128
1.53288 16.0029 454.708 128.478 1
3.16029 4.69018 509.507 1.20061 1
2.17807 16.0029 499.117 1.5671 1
4.56358 5.62898 456.094 120.225 8.00018
1.57462 16.0029 487.641 128.825 1.02787
1.76454 5.19752 17.6369 1.58856 1.01913
4.76278 1 511.678 1 8.00018
1.93808 2.43203 511.584 128.825 1
1.71001 2.74913 509.772 128.825 1
17.6859 3.12348 511.251 1.74208 1.23072
2.75805 4.46449 22.5683 1.67204 1
1.79666 16.0029 379.387 2.68668 1.00074
2.28637 5.92394 1 1.62136 1.46851
1.5856 2.8623 356.665 2.92364 1
1.75186 2.21219 1 2.15511 1.21592
18.9152 2.12912 168.995 1.21215 1
3.93017 4.75664 1 1.2334 6.75532
2.02585 1 1 1.65368 8.00018
1.82365 1 1 1.18676 2.68117
2.45338 16.0029 324.993 1.64182 1.00067
3.94181 2.77051 371.001 1.01043 1
2.58022 5.28293 1 1.00899 8.00018
2.32257 2.42847 488.999 75.7214 1
2.48066 5.19708 1 1.01857 1
1.83206 1 1 2.5297 1
1.98198 16.0029 239.976 1.80369 1
3.59905 2.21228 1 1.00908 8.00018
1.90486 5.60672 1 1.01175 1.68969
4.08377 11.4198 1 1.01033 1.99797
2.19399 3.45414 1 1.01273 6.39969
2.12843 5.91437 1 1.01445 1
5.06573 12.2589 252.052 1.01313 2.42166
1.89684 3.38058 1 1.01567 1.00369
1.82683 1.79312 1 1.61672 1
2.09055 1.99713 7.66155 1.02651 1.0045
3.26499 2.60936 298.33 1.03542 1
38.0935 1 387.262 1.02453 1
1.79081 2.47266 1 1.02802 1.30136
2.70179 4.08058 1 1.00678 5.66322
3.70049 5.01941 125.481 1.01641 8.00018
2.52746 2.26491 1 1.04334 8.00018
2.6144 1.34182 109.675 1.00672 7.98652
1.89099 1.51636 1.02817 1.06104 3.43684
5.01127 3.55865 188.044 1.04538 1
3.43548 1 69.1306 1.07355 8.00018
7.10486 6.85572 512.035 1.17991 2.29812
1.87445 16.0029 507.561 1.13841 1
1.63515 2.55714 340.268 105.369 1.88211
6.87511 1.59641 510.92 65.2621 1
7.40836 3.42253 329.605 1.02056 5.19144
1.44172 3.06819 436.638 124.987 1.04117
1.85288 1.63065 53.3601 1.0173 8.00018
1.97156 4.89706 70.355 1.01513 1
1.54226 16.0029 511.366 73.8461 1.01515
3.87939 5.92443 510.453 1.05408 1.48474
1.51564 3.17769 500.43 128.825 1.04849
2.05485 1 4.23242 1.12558 8.00018
1.97433 15.804 10.7363 1.35953 1.0445
1.42677 3.22435 343.821 128.825 1.10866
2.69979 2.62094 1 1 8.00018
2.80693 16.0029 508.23 1.2509 7.44169
5.56769 1.75418 115.663 82.5968 1.02226
10.0386 2.70432 376.212 90.4778 8.00018
1.75714 5.97219 511.494 128.825 1
3.61418 5.74625 501.833 1.16609 1
1.58739 3.34706 476.397 85.2811 1
1.54583 2.99822 228.601 108.462 1
3.28737 16.0029 54.0864 1.08781 8.00018
1.61078 16.0029 510.841 113.284 8.00018
7.69706 1 512.035 1 7.44544
2.67493 16.0029 441.365 1.77608 8.00018
1.74966 16.0029 50.4464 1.31085 1.07266
3.7585 16.0029 189.338 1.04997 8.00018
1.81978 2.38928 509.248 127.419 1
1.35456 3.49442 92.3896 126.596 2.37375
1.66493 2.31178 1 37.7945 3.80823
1.82544 2.72484 502.978 128.731 1
1.75726 2.87181 479.845 128.825 1
1.68781 15.0497 509.078 128.825 2.46445
1.46165 3.95094 496.274 128.056 1
1.62109 2.77635 325.136 128.444 1
9.84507 2.36965 1 1.22999 3.43455
2.12618 8.06339 37.5034 1.55497 1.03319
19.9971 3.70609 512.035 1 1.22997
2.50406 3.6462 1 1.07039 8.00018
1.74508 2.69366 126.026 1.01994 1.0059
1.84615 1.3313 1 1.006 1
6.35803 3.40397 70.1067 1.006 1
18.1778 2.2708 125.054 1.02172 8.00018
1.8263 1.6561 1 2.18745 2.95551
1.6426 1.85016 7.80418 107.429 3.65741
1.99685 2.53031 1 1 1.00799
1.82727 1.8859 1 1.01417 1.07867
10.267 2.8795 217.669 1.03202 1.09872
1.73753 2.7155 23.3139 1.00842 1
3.24402 3.02396 441.701 1.26887 1
1.34272 3.44249 238.292 128.825 1
1.93469 3.04653 1 1 1
1.36159 2.59927 163.243 124.008 1
1.81301 2.57622 510.034 128.825 1.00085
1.52424 3.09678 488.866 93.7029 1
2.34485 3.53712 492.5 105.515 8.00018
3.94889 1.54106 61.0157 1.00655 1
2.03146 3.30615 38.6263 1.53452 1
2.21316 4.7681 44.6425 1.00104 1
2.57991 1.8712 1 1.04351 8.00018

5-29

  • Will try:
    • Train on all voices
    • Split major/minor pieces apart
    • Model only the duration

Observations

  • Low validation loss doesn't imply good perceptual quality. On the contrary, overfit models tended to yield more realistic samples
  • Subsetting to only major/minor pieces significantly improves sample quality
  • Training on all four parts significantly improves performance over using just the soprano, but introduces obvious non-melodic passages (e.g. periods of rest)

5-28

  • Improved preprocessing using bachbot get_chorales
    • Get corpus with music21
    • Transpose to Cmaj/Amin (is there a standard way to do this? a sketch follows this list)
    • Strip all information except (Note+Octave|Rest, Duration)
    • Write processed data to bachbot/scratch/{bwv_id}-mono.txt
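A minimal sketch of the transposition step, assuming music21: analyze the key, then move the tonic to C for major or A for minor (bwv269 is an example):

```python
from music21 import corpus, interval, pitch

score = corpus.parse('bach/bwv269')
key = score.analyze('key')
target = pitch.Pitch('C') if key.mode == 'major' else pitch.Pitch('A')
transposed = score.transpose(interval.Interval(key.tonic, target))
```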

Results with new preprocessing

seq_length wordvec_size num_layers rnn_size dropout batchnorm lr nepoch final_train_loss final_val_loss
8 64 2 256 0 1 2e-3 30 0.238247 1.5794
8 64 2 128 0 1 2e-3 50 0.349 1.367
4 64 2 128 0 1 2e-3 50 0.288 1.434
4 32 2 128 0 1 2e-3 50 0.2527 1.8538
8 32 2 32 0 1 2e-3 50 1.044 1.191
8 32 2 64 0 1 2e-3 50 0.7539 1.236
8 64 2 32 0 1 2e-3 50 1.027 1.190
2 64 2 32 0 1 2e-3 50 0.783344 1.25899
4 64 2 32 0 1 2e-3 50 1.064 1.197
8 64 1 32 0 1 2e-3 50 1.022 1.188
8 64 1 32 0 1 2e-3 50 1.096 1.186
8 64 3 32 0 1 2e-3 50 0.989 1.186
8 64 3 32 0 1 2e-3 50 0.953 1.183
8 64 4 32 0 1 2e-3 50 1.0104 1.2274
8 64 4 64 0 1 2e-3 50 1.0165 1.2038
8 64 4 64 0.5 1 2e-3 27.51 1.392 1.4355
8 64 4 64 0.5 0 2e-3 25.10 1.807 1.851
6 64 3 32 0 1 2e-3 50 0.9304 1.2137
8 64 3 16 0 1 2e-3 50 1.264 1.2311
12 64 3 32 0 1 2e-3 50 1.030 1.1909

Generative results don't sound too realistic...

Try overfitting a model and sampling

seq_length=8, wordvec=128, num_layers=2, rnn_size=256, dropout=0, batchnorm=1, lr=2e-3

  • Sounds much better with an overfit LSTM and temperature=0.98... (sampling sketched below)
    • Maybe generalizable modeling isn't a good criterion...
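A minimal sketch of temperature sampling, assuming `probs` is the model's softmax output over the vocabulary at one timestep (T < 1 sharpens the distribution; T = 1 recovers it unchanged):

```python
import numpy as np

def sample_with_temperature(probs, temperature=0.98):
    # Rescale log-probabilities by 1/temperature, then renormalize and sample.
    probs = np.asarray(probs, dtype=np.float64)
    logp = np.log(probs) / temperature
    p = np.exp(logp - logp.max())  # subtract max for numerical stability
    p /= p.sum()
    return np.random.choice(len(p), p=p)
```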

5-25

  • Added extract_melody, which extracts the 0th part from a music21.stream.Score and assumes it is the melody

  • Music representation:

    • Since music21 cannot output kern, use musicXML output (see the sketch after this list)
    • We currently include all header and dynamics info; should we strip that?
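A minimal sketch of that export path, assuming music21 (bwv269 and melody.xml are illustrative):

```python
from music21 import corpus

score = corpus.parse('bach/bwv269')
melody = score.parts[0]                    # extract_melody: assume part 0 is the melody
melody.write('musicxml', fp='melody.xml')  # MusicXML output, since music21 cannot emit kern
```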

Results on musicXML monophonic melody

seq_length wordvec_size num_layers rnn_size dropout batchnorm lr nepoch final_train_loss final_val_loss
500 64 2 256 0 1 2e-3 16.19 0.022378 0.029262
50 64 2 256 0 1 2e-3 13.41 0.028490 0.032692
100 64 2 256 0 1 2e-3 13.41 0.028490 0.032692

Results on kern format data

seq_length wordvec_size num_layers rnn_size dropout batchnorm lr nepoch final_train_loss final_val_loss
50 64 2 256 0 0 2e-3 51 0.443295 0.619
500 64 2 256 0 1 2e-3 21.45 0.4094 0.5779
500 64 2 256 0 1 2e-3 31.00 0.440350 0.572764
500 64 2 256 0 1 1e-2 28.73 0.287570 0.6176
50 64 2 256 0 1 1e-2 13.65 0.390861 0.6316

5-23

  • wordvec_size=64 appears to perform best; use these as defaults in the future:
    • rnn_size=256
    • num_layers=2
    • wordvec_size=64

5-22-overnight

  • Training interrupted by cudnn recompilation
  • Results suggest val_loss is lowest with rnn_size=256, num_layers=2

5-5

  • Training on entire corpus
    • BAD: kern format has K voices => each line has K space-delimited notes
    • This suggests the output should be a K-dimensional vector rather than character-by-character

  • Training on just chorales