python - How do I use the official batch normalization layer in TensorFlow?

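I am trying to use batch normalization to train my neural network in TensorFlow, but it is unclear to me how to use the official layer implementation of Batch Normalization (note that this is different from the one in the API).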

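After some painful digging through their GitHub issues, it seems that one needs a tf.cond to use it properly, plus a 'reuse=True' flag so that the BN shift and scale variables are properly reused. After figuring that out, I wrote up a short description of what I believe is the right way to use it here.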

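Now I have written a short script to test it (just a single layer and a ReLU, it is hard to make it any smaller than this). However, I am not 100% sure how to test it. Right now my code runs with no error messages but unexpectedly returns NaN. That lowers my confidence that the code I gave in the other post is right. Or maybe the network I have is just weird. Either way, does anyone know what is wrong? Here is the code: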

import tensorflow as tf
# download and install the MNIST data automatically
from tensorflow.examples.tutorials.mnist import input_data
from tensorflow.contrib.layers.python.layers import batch_norm as batch_norm

def batch_norm_layer(x,train_phase,scope_bn):
    bn_train = batch_norm(x,decay=0.999,center=True,scale=True,is_training=True,reuse=None,# is this right?
    trainable=True,scope=scope_bn)

    bn_inference = batch_norm(x,is_training=False,reuse=True,scope=scope_bn)

    z = tf.cond(train_phase,lambda: bn_train,lambda: bn_inference)
    return z

def get_NN_layer(x,input_dim,output_dim,scope,train_phase):
    with tf.name_scope(scope+'vars'):
        W = tf.Variable(tf.truncated_normal(shape=[input_dim,output_dim],mean=0.0,stddev=0.1))
        b = tf.Variable(tf.constant(0.1,shape=[output_dim]))
    with tf.name_scope(scope+'Z'):
        z = tf.matmul(x,W) + b
    with tf.name_scope(scope+'BN'):
        if train_phase is not None:
            z = batch_norm_layer(z,train_phase,scope+'BN_unit')
    with tf.name_scope(scope+'A'):
        a = tf.nn.relu(z) # (M x D1) = (M x D) * (D x D1)
    return a

mnist = input_data.read_data_sets("MNIST_data/",one_hot=True)
# placeholder for data
x = tf.placeholder(tf.float32,[None,784])
# placeholder that switches BN to training mode (True) or inference mode (False)
train_phase = tf.placeholder(tf.bool,name='phase_train')
# variables for parameters
hiden_units = 25
layer1 = get_NN_layer(x,input_dim=784,output_dim=hiden_units,scope='layer1',train_phase=train_phase)
# create model
W_final = tf.Variable(tf.truncated_normal(shape=[hiden_units,10],stddev=0.1))
b_final = tf.Variable(tf.constant(0.1,shape=[10]))
y = tf.nn.softmax(tf.matmul(layer1,W_final) + b_final)

### training
y_ = tf.placeholder(tf.float32,[None,10])
cross_entropy = tf.reduce_mean( -tf.reduce_sum(y_ * tf.log(y),reduction_indices=[1]) )
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
    steps = 3000
    for iter_step in range(steps):
        #feed_dict_batch = get_batch_feed(X_train,Y_train,M,phase_train)
        batch_xs,batch_ys = mnist.train.next_batch(100)
        # Collect model statistics
        if iter_step%1000 == 0:
            batch_xstrain,batch_ystrain = batch_xs,batch_ys #simulates train data
            batch_xcv,batch_ycv = mnist.test.next_batch(5000) #simulates CV data
            batch_xtest,batch_ytest = mnist.test.next_batch(5000) #simulates test data
            # do inference
            train_error = sess.run(fetches=cross_entropy,feed_dict={x: batch_xs,y_:batch_ys,train_phase: False})
            cv_error = sess.run(fetches=cross_entropy,feed_dict={x: batch_xcv,y_:batch_ycv,train_phase: False})
            test_error = sess.run(fetches=cross_entropy,feed_dict={x: batch_xtest,y_:batch_ytest,train_phase: False})

            def do_stuff_with_errors(*args):
                print(args)
            do_stuff_with_errors(train_error,cv_error,test_error)
        # Run Train Step
        sess.run(fetches=train_step,feed_dict={x: batch_xs,y_:batch_ys,train_phase: True})
    # list of booleans indicating correct predictions
    correct_prediction = tf.equal(tf.argmax(y,1),tf.argmax(y_,1))
    # accuracy
    accuracy = tf.reduce_mean(tf.cast(correct_prediction,tf.float32))
    print(sess.run(accuracy,feed_dict={x: mnist.test.images,y_: mnist.test.labels,train_phase: False}))

When I run it, I get:

Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
(2.3474066,2.3498712,2.3461707)
(0.49414295,0.88536006,0.91152304)
(0.51632041,0.393666,nan)
0.9296

It used to be that all of the last values were NaN, and now only a few of them are. Is everything fine, or am I being paranoid?

Solution

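I am not sure whether this will solve your problem. The documentation for BatchNorm is not very easy to use or informative, so here is a short recap of how to use a simple BatchNorm: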

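First, you define your BatchNorm layer. If you want to use it after an affine/fully connected layer, you can do this (just an example, the order can differ as you wish):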

...
inputs = tf.matmul(inputs,W) + b
inputs = tf.layers.batch_normalization(inputs,training=is_training)
inputs = tf.nn.relu(inputs)
...
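
To make the role of the training flag more concrete, here is a minimal pure-Python sketch of what batch normalization computes for a single feature column. It is an illustration only, not TensorFlow's implementation; the function names batch_norm_train and batch_norm_inference and the eps value are assumptions made for this example.

```python
import math

def batch_norm_train(values, gamma=1.0, beta=0.0, eps=1e-3):
    """Training mode: normalize with the mean/variance of the current batch."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    normed = [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in values]
    return normed, mean, var

def batch_norm_inference(values, moving_mean, moving_var,
                         gamma=1.0, beta=0.0, eps=1e-3):
    """Inference mode: normalize with stored moving statistics instead."""
    return [gamma * (v - moving_mean) / math.sqrt(moving_var + eps) + beta
            for v in values]

batch = [1.0, 2.0, 3.0, 4.0]

# Training mode uses the statistics of this very batch.
normed, mean, var = batch_norm_train(batch)

# Inference mode with untouched initial statistics (mean 0, variance 1)
# barely changes the input at all.
inferred = batch_norm_inference(batch, moving_mean=0.0, moving_var=1.0)
```

The two modes only agree once the stored moving statistics have caught up with the batch statistics seen during training, which is why the same layer needs a flag to switch between them.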

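The function tf.layers.batch_normalization creates internal variables (the moving mean and moving variance) in addition to the trainable scale and shift. The ops that update these moving statistics are not run automatically; they are added to the tf.GraphKeys.UPDATE_OPS collection. As such, you have to create your optimizer's training op as follows (after all layers have been defined!):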

...
extra_update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(extra_update_ops):
    trainer = tf.train.AdamOptimizer() 
    updateModel = trainer.minimize(loss,global_step=global_step)
...
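
As a rough illustration of why those update ops matter, the sketch below imitates the moving-average update in plain Python. The helper update_moving and the batch statistics are hypothetical values chosen for the example; the momentum of 0.99 and the initial values (mean 0, variance 1) are assumed to mirror the defaults of tf.layers.batch_normalization.

```python
def update_moving(moving, batch_stat, momentum=0.99):
    """One exponential-moving-average step, as run once per training step."""
    return momentum * moving + (1.0 - momentum) * batch_stat

# Hypothetical statistics that every training batch happens to have.
batch_mean, batch_var = 5.0, 4.0

# Case 1: the update runs alongside every training step.
moving_mean, moving_var = 0.0, 1.0
for _ in range(1000):
    moving_mean = update_moving(moving_mean, batch_mean)
    moving_var = update_moving(moving_var, batch_var)

# Case 2: the update is never run, so the statistics keep their initial values.
stale_mean, stale_var = 0.0, 1.0
```

In the first case the stored statistics converge towards the batch statistics, so inference behaves like training. In the second case inference normalizes with mean 0 and variance 1 no matter what the network saw during training, which typically shows up as good training loss and poor evaluation results.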

You can read more about it here. I know it is a bit late to answer your question, but it might help other people who run into BatchNorm problems in TensorFlow!
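
Separately, one possible cause of the occasional NaN in the question's output (an assumption, nothing in the post confirms it) is the hand-written loss: taking tf.log of a softmax output yields -inf once a probability underflows to 0, and 0 * -inf is NaN. The plain-Python sketch below shows the usual remedy of computing log-softmax directly from the logits; log_softmax and cross_entropy are hypothetical helper names for this example. In TensorFlow, tf.nn.softmax_cross_entropy_with_logits takes the pre-softmax logits and serves the same purpose.

```python
import math

def log_softmax(logits):
    """Log of softmax, computed by shifting with the max so exp() cannot overflow."""
    m = max(logits)
    shifted = [z - m for z in logits]
    log_sum = math.log(sum(math.exp(s) for s in shifted))
    return [s - log_sum for s in shifted]

def cross_entropy(one_hot, logits):
    """Cross-entropy of one example from its one-hot label and raw logits."""
    return -sum(t * lp for t, lp in zip(one_hot, log_softmax(logits)))

# Extreme logits: the softmax probability of the true class underflows to 0,
# so log(softmax(...)) would be -inf, yet the loss below stays finite.
logits = [1000.0, 0.0, -1000.0]
label = [0.0, 0.0, 1.0]
loss = cross_entropy(label, logits)
```

The loss is large because the prediction is confidently wrong, but it remains a finite number that gradient descent can still work with.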
