深度学习 | 吴恩达序列模型专项课程第一周编程作业实验3 --- Jazz Music Generator - 好文

吴恩达深度学习专项课程的所有实验均采用iPython Notebooks实现，不熟悉的朋友可以提前使用一下Notebooks。

完整代码(中文翻译版带答案) <https://pan.baidu.com/s/1WAyQO2mykyTZQMkGtTKV_w>

目录

1.实验综述
<https://blog.csdn.net/sdu_hao/article/details/87867549#1.%E5%AE%9E%E9%AA%8C%E7%BB%BC%E8%BF%B0>

2.导入必要的包
<https://blog.csdn.net/sdu_hao/article/details/87867549#2.%E5%AF%BC%E5%85%A5%E5%BF%85%E8%A6%81%E7%9A%84%E5%8C%85>

3.问题描述
<https://blog.csdn.net/sdu_hao/article/details/87867549#3.%E9%97%AE%E9%A2%98%E6%8F%8F%E8%BF%B0>

4.构建模型
<https://blog.csdn.net/sdu_hao/article/details/87867549#4.%E6%9E%84%E5%BB%BA%E6%A8%A1%E5%9E%8B>

5.生成音乐
<https://blog.csdn.net/sdu_hao/article/details/87867549#5.%E7%94%9F%E6%88%90%E9%9F%B3%E4%B9%90>

6.总结
<https://blog.csdn.net/sdu_hao/article/details/87867549#6.%E6%80%BB%E7%BB%93%C2%A0>

<https://blog.csdn.net/sdu_hao/article/details/87867549#%E2%80%8B>

1.实验综述

2.导入必要的包
from __future__ import print_function import IPython import sys from music21
import * import numpy as np from grammar import * from qa import * from
preprocess import * from music_utils import * #music_utils.py from data_utils
import * #data_utils.py from keras.models import load_model, Model from
keras.layers import Dense, Activation, Dropout, Input, LSTM, Reshape, Lambda,
RepeatVector from keras.initializers import glorot_uniform from keras.utils
import to_categorical from keras.optimizers import Adam from keras import
backend as K
3.问题描述

* 数据集
我们将在一个爵士乐语料库上训练我们的模型. 下面先听一段训练集中音频片段:
IPython.display.Audio('./data/30s_seq.mp3') #播放音频

X, Y, n_values, indices_values = load_music_utils() print('shape of X:',
X.shape) print('number of training examples:', X.shape[0]) print('Tx (length of
sequence):', X.shape[1]) print('total # of unique values:', n_values)
print('Shape of Y:', Y.shape)

* 模型结构

4.构建模型

n_a = 64

reshapor = Reshape((1,78)) LSTM_cell = LSTM(n_a,return_state=True) #定义LSTM单元
有n_a个隐藏单元 densor = Dense(n_values,activation='softmax')#定义输出层 n_values个单元
使用softmax多分类

# GRADED FUNCTION: djmodel def djmodel(Tx, n_a, n_values): """ 实现模型 Arguments:
Tx -- 一个训练样本/序列的长度，需要的时间步骤 n_a -- LSTM单元中隐藏单元的数量 n_values --
输出层的单元数，不同音符/音乐值的数量 Returns: model -- Keras模型 """ # 定义输入的placeholder
传入一个训练样本/序列的维度 X(?,Tx=30,n_values=78) X = Input(shape=(Tx, n_values)) #
定义LSTM的初始隐藏状态和单元状态 a0 = Input(shape=(n_a,), name='a0') c0 = Input(shape=(n_a,),
name='c0') a = a0 c = c0 # Step 1: 创建存储预测输出的列表 outputs = [] # Step 2:
对?个样本在每个时间步骤上进行迭代 for t in range(Tx): # Step 2.A: 从X中选择第t个时间步骤的输入tensor x =
Lambda(lambda X: X[:,t,:])(X) #(?,78) # Step 2.B: 对x进行转型 x = reshapor(x)
#(?,78) -> (?,1,78) # Step 2.C: 把当前时间步骤的x输入到LSTM单元 a, _, c =
LSTM_cell(x,initial_state=[a,c]) # Step 2.D: 把当前时间步骤产生的隐藏状态a
通过一个输出层(softmax)得到预测输出 out = densor(a) # Step 2.E: 把out添加到 "outputs"中
outputs.append(out) # Step 3: 创建模型实例 model =
Model(inputs=[X,a0,c0],outputs=outputs) return model #预处理后每个训练样本/序列都是等长的
都有需要30个时间步骤 #便于同时对多个样本进行向量化处理 #n_a LSTM单元中隐藏单元的数量 #n_values 输出层单元数
不同的音乐值/音符(序列的基本组成单元)的个数 #第一步：创建模型 model = djmodel(Tx=30,n_a=64,n_values=78)
#第二步：编译模型 #选择一个优化算法 Adam #可以使用默认参数也可以自己设置采用学习率衰减decay为衰减率 opt =
Adam(lr=0.01,beta_1=0.9,beta_2=0.999,decay=0.01) #多分类使用
categorical_crossentropy 损失指标采用accuracy
model.compile(optimizer=opt,loss='categorical_crossentropy',metrics=['accuracy'])
#初始化隐藏状态和单元状态为0 m = 60 #所有训练样本数 60 a0 = np.zeros((m,n_a)) c0 = np.zeros((m,n_a))

#第三步训练模型 model.fit([X,a0,c0],list(Y),epochs=100,batch_size=20)
#默认batch_size维训练集的大小此时采用batch GD。 #可以设置batch_size的大小采用 mini-batch GD

5.生成音乐

我们现在已经训练完了模型，该模型已经学习了jazz乐的模式。现在让我们用这个模型来生成新的音乐。

* 预测和采样

# GRADED FUNCTION: music_inference_model def music_inference_model(LSTM_cell,
densor, n_values = 78, n_a = 64, Ty = 100): """ 使用model()中训练好的 "LSTM_cell" 和
"densor" 生成一系列值. Arguments: LSTM_cell -- model()中训练好的"LSTM_cell" , Keras 层对象
densor -- model()中训练好的"densor" , Keras 层对象 n_values -- 整数所有不同的音乐值数 n_a --
LSTM隐藏单元数 Ty -- 整数用于生成的时间步骤 Returns: inference_model -- Keras 模型实例 """ #
定义模型的输入 x0 = Input(shape=(1, n_values)) # 初始化decoder LSTM的隐藏状态 a0 =
Input(shape=(n_a,), name='a0') c0 = Input(shape=(n_a,), name='c0') a = a0 c =
c0 x = x0 # Step 1: 创建一个空的 "outputs" 列表，存储预测值 outputs = [] # Step 2: 遍历 Ty个时间步骤
，每个时间步骤生成一个值 for t in range(Ty): # Step 2.A: 执行一步 LSTM_cell a, _, c =
LSTM_cell(x,initial_state=[a,c]) # Step 2.B: 对 LSTM_cell输出的隐藏状态a 经过Dense
得到当前时间步骤上的输出 out = densor(a) # Step 2.C: 将预测结果out添加到outputs列表中. out.shape =
(None, 78) outputs.append(out) # Step 2.D: 根据 "out"选择下一时间步骤LSTM的输入,
one-hot表示形式 x = Lambda(one_hot)(out) # Step 3: 创建模型实例 inference_model =
Model(inputs = [x0,a0,c0],outputs=outputs) return inference_model
运行下列代码定义inference模型。这个模型被硬编码生成50(可以设置)个值(音乐值)
inference_model =
music_inference_model(LSTM_cell,densor,n_values=78,n_a=64,Ty=50)
最后需要初始化x和LSTM状态变量a，c为0向量。
x_initializer = np.zeros((1, 1, 78)) a_initializer = np.zeros((1, n_a))
c_initializer = np.zeros((1, n_a))

# GRADED FUNCTION: predict_and_sample def predict_and_sample(inference_model,
x_initializer = x_initializer, a_initializer = a_initializer, c_initializer =
c_initializer): """ Predicts the next value of values using the inference
model. Arguments: inference_model -- Keras model instance for inference time
x_initializer -- numpy array of shape (1, 1, 78), one-hot vector initializing
the values generation a_initializer -- numpy array of shape (1, n_a),
initializing the hidden state of the LSTM_cell c_initializer -- numpy array of
shape (1, n_a), initializing the cell state of the LSTM_cel Returns: results --
numpy-array of shape (Ty, 78), matrix of one-hot vectors representing the
values generated indices -- numpy-array of shape (Ty, 1), matrix of indices
representing the values generated """ # Step 1: Use your inference model to
predict an output sequence given x_initializer, a_initializer and
c_initializer. pred =
inference_model.predict([x_initializer,a_initializer,c_initializer]) # Step 2:
Convert "pred" into an np.array() of indices with the maximum probabilities
indices = np.argmax(np.array(pred),axis=-1) # Step 3: Convert indices to
one-hot vectors, the shape of the results should be (1, ) results =
to_categorical(indices,num_classes=x_initializer.shape[-1]) ### END CODE HERE
### return results, indices
* 生成音乐

out_stream = generate_music(inference_model)

IPython.display.Audio('./data/30s_trained_model.mp3')
6.总结

热门工具换一换