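I have been studying residual networks recently, and they work remarkably well: even a deep network converges quickly.
The code here builds a 17-layer network that reaches over 96% accuracy within 5 epochs.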

loss: the loss value; acc: the accuracy

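However, I noticed a few problems:
1. During training, the loss first decreases and then keeps increasing, while acc keeps rising.
2. With PReLU units, the loss increases faster, and acc drops if training runs for a long time.
3. Comparing ELU units with PReLU units, the loss increases more slowly with ELU and acc keeps improving, whereas acc with PReLU declines in the later stages.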
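
One possible reading of the first observation, offered as an assumption rather than a diagnosis of this particular run: cross-entropy punishes confident mistakes very heavily, so the mean loss can go up even while more samples are classified correctly. The sketch below uses made-up probabilities for a two-class case (the lists `early` and `late` are hypothetical, not taken from the training logs) to show that both trends can happen at once.

```python
import math

def cross_entropy(p_true):
    """Per-sample cross-entropy, given the probability assigned to the true class."""
    return -math.log(p_true)

def summarize(p_true_list):
    """Return (mean loss, accuracy) for two-class predictions.

    A sample counts as correct when the true class gets probability > 0.5.
    """
    n = len(p_true_list)
    mean_loss = sum(cross_entropy(p) for p in p_true_list) / n
    accuracy = sum(p > 0.5 for p in p_true_list) / n
    return mean_loss, accuracy

# Hypothetical probabilities assigned to the true class for 10 samples.
early = [0.6] * 8 + [0.4] * 2   # cautious model: 8 right, 2 only slightly wrong
late = [0.99] * 9 + [1e-6]      # confident model: 9 right, 1 wrong with near certainty

early_loss, early_acc = summarize(early)
late_loss, late_acc = summarize(late)
print("early: loss=%.3f acc=%.2f" % (early_loss, early_acc))
print("late:  loss=%.3f acc=%.2f" % (late_loss, late_acc))
```

In this toy case the later model has both the higher accuracy and the higher mean loss, because a single near-certain mistake dominates the average. Whether that is what happens in the run above would need to be checked against the per-sample predictions.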

I avoided ReLU mainly out of concern about the dying ReLU problem.
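
As a small illustration of that concern, the sketch below compares the derivative of ReLU and ELU with respect to the pre-activation value. The helper names `relu_grad` and `elu_grad` are made up for this example, and `alpha=1.0` is assumed for ELU. For negative inputs the ReLU derivative is exactly zero, so a unit stuck there receives no gradient, while the ELU derivative stays positive.

```python
import math

def relu_grad(z):
    """Derivative of ReLU at pre-activation z (taken as 0 at z == 0)."""
    return 1.0 if z > 0 else 0.0

def elu_grad(z, alpha=1.0):
    """Derivative of ELU at pre-activation z: 1 for z > 0, alpha * exp(z) otherwise."""
    return 1.0 if z > 0 else alpha * math.exp(z)

for z in (-3.0, -1.0, 2.0):
    print("z=%5.1f  relu'=%.3f  elu'=%.3f" % (z, relu_grad(z), elu_grad(z)))
```

The ELU gradient for negative inputs is small but never zero, which is why a unit can still recover from a bad region.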

Full code
import tensorflow as tf
import tensorlayer as tl

sess = tf.InteractiveSession()

# Prepare the data
X_train, y_train, X_val, y_val, X_test, y_test = tl.files.load_mnist_dataset(shape=(-1, 784))

# Define the placeholders
x = tf.placeholder(tf.float32, shape=[None, 784], name='x')
y_ = tf.placeholder(tf.int64, shape=[None, ], name='y_')

# Define the model
network = tl.layers.InputLayer(x, name='input_layer')
# res_a always holds the tensor that the next skip connection adds back
res_a = network = tl.layers.DenseLayer(network, n_units=200, act=tf.nn.elu, name='relu1')

# Residual block 1
network = tl.layers.DenseLayer(network, n_units=200, act=tf.nn.elu, name='relu2')
network = tl.layers.DenseLayer(network, n_units=200, act=tf.nn.elu, name='relu3')
res_a = network = tl.layers.ElementwiseLayer([network, res_a], combine_fn=tf.add, name='res_add1')

# Residual block 2
network = tl.layers.DenseLayer(network, n_units=200, act=tf.nn.elu, name='relu4')
network = tl.layers.DenseLayer(network, n_units=200, act=tf.nn.elu, name='relu5')
res_a = network = tl.layers.ElementwiseLayer([network, res_a], combine_fn=tf.add, name='res_add2')

# Residual block 3
network = tl.layers.DenseLayer(network, n_units=200, act=tf.nn.elu, name='relu6')
network = tl.layers.DenseLayer(network, n_units=200, act=tf.nn.elu, name='relu7')
res_a = network = tl.layers.ElementwiseLayer([network, res_a], combine_fn=tf.add, name='res_add3')

# Residual block 4
network = tl.layers.DenseLayer(network, n_units=200, act=tf.nn.elu, name='relu8')
network = tl.layers.DenseLayer(network, n_units=200, act=tf.nn.elu, name='relu9')
res_a = network = tl.layers.ElementwiseLayer([network, res_a], combine_fn=tf.add, name='res_add4')

# Residual block 5
network = tl.layers.DenseLayer(network, n_units=200, act=tf.nn.elu, name='relu10')
network = tl.layers.DenseLayer(network, n_units=200, act=tf.nn.elu, name='relu11')
res_a = network = tl.layers.ElementwiseLayer([network, res_a], combine_fn=tf.add, name='res_add5')

# Output layer: raw logits, no activation
network = tl.layers.DenseLayer(network, n_units=10, act=tf.identity, name='output_layer')

# Define the loss function and the evaluation metric
# tl.cost.cross_entropy uses tf.nn.sparse_softmax_cross_entropy_with_logits()
# internally, so the softmax is applied inside the loss
y = network.outputs
cost = tl.cost.cross_entropy(y, y_, name='cost')
correct_prediction = tf.equal(tf.argmax(y, 1), y_)
acc = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
y_op = tf.argmax(tf.nn.softmax(y), 1)

# Define the optimizer
train_params = network.all_params
train_op = tf.train.AdamOptimizer(learning_rate=0.003, beta1=0.9, beta2=0.999,
                                  epsilon=1e-08, use_locking=False).minimize(cost, var_list=train_params)

# Initialize all parameters in the session
tl.layers.initialize_global_variables(sess)

# Print model information
network.print_params()
network.print_layers()

# Train the model
tl.utils.fit(sess, network, train_op, cost, X_train, y_train, x, y_,
             acc=acc, batch_size=500, n_epoch=500, print_freq=5,
             X_val=X_val, y_val=y_val, eval_train=False)

# Evaluate the model
tl.utils.test(sess, network, acc, X_test, y_test, x, y_, batch_size=None, cost=cost)

# Save the model to an .npz file
tl.files.save_npz(network.all_params, name='model.npz')
sess.close()
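
The repeated pattern in the model above (two dense ELU layers followed by an element-wise add with the block input) can be sketched without any framework. This is only a plain-Python illustration under assumed toy sizes: `dense`, `residual_block` and the 2-unit zero-weight layers are hypothetical and not part of TensorLayer. It shows one reason skip connections help deep stacks: when the dense layers contribute nothing, the block simply passes its input through.

```python
import math

def elu(z):
    """ELU activation with alpha = 1."""
    return z if z > 0 else math.exp(z) - 1.0

def dense(inputs, weights, biases):
    """Fully connected layer with ELU; weights[j][i] connects input i to output j."""
    return [elu(sum(w * v for w, v in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def residual_block(inputs, layers):
    """Apply the dense layers in order, then add the block input back (skip connection)."""
    out = inputs
    for weights, biases in layers:
        out = dense(out, weights, biases)
    return [o + s for o, s in zip(out, inputs)]

# Hypothetical 2-unit layer whose weights and biases are all zero.
zero_layer = ([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])
h = [0.5, -1.25]
print(residual_block(h, [zero_layer, zero_layer]))  # prints [0.5, -1.25]
```

Note that the add only works because every hidden layer has the same width, which is also why the script above keeps `n_units=200` throughout.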