TensorFlow - regularization with L2 loss, how to apply to all weights, not just last one?

A shorter and scalable way of doing this would be:

vars   = tf.trainable_variables() 
lossL2 = tf.add_n([ tf.nn.l2_loss(v) for v in vars ]) * 0.001

This basically sums the l2_loss of all your trainable variables. You could also build a dictionary (or list) that contains only the variables you want to penalize and reuse the second line above on it. You can then add lossL2 to your softmax cross-entropy value to get your total loss.
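
For context, here is a minimal TF 1.x sketch of wiring this into the total loss; the names logits and tf_train_labels and the learning rate 0.5 are assumptions for illustration, not from the answer above:

import tensorflow as tf

beta = 0.001  # regularization strength (illustrative value)

# Cross-entropy term; `logits` and `tf_train_labels` are assumed to already exist in your graph.
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=tf_train_labels))

# L2 term: sum tf.nn.l2_loss over every trainable variable in the graph.
lossL2 = tf.add_n([tf.nn.l2_loss(v) for v in tf.trainable_variables()]) * beta

# Total loss that the optimizer actually minimizes.
loss = cross_entropy + lossL2
optimizer = tf.train.GradientDescentOptimizer(0.5).minimize(loss)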

Edit: As mentioned by Piotr Dabkowski, the code above will also regularise biases. This can be avoided by adding an if statement in the second line:

lossL2 = tf.add_n([ tf.nn.l2_loss(v) for v in vars
                    if 'bias' not in v.name ]) * 0.001

The same filter can be used to exclude other variables, as sketched below.
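
For example, a sketch that also skips (hypothetical) batch-norm parameters; the substrings 'bias' and 'BatchNorm' are assumptions and should match your own variable naming:

# Exclude any trainable variable whose name contains one of these substrings.
excluded = ('bias', 'BatchNorm')
lossL2 = tf.add_n([tf.nn.l2_loss(v) for v in tf.trainable_variables()
                   if not any(s in v.name for s in excluded)]) * 0.001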


hidden_weights, hidden_biases, out_weights, and out_biases are all the model parameters that you are creating. You can add L2 regularization to ALL these parameters as follows:

loss = (tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
    logits=out_layer, labels=tf_train_labels)) +
    0.01*tf.nn.l2_loss(hidden_weights) +
    0.01*tf.nn.l2_loss(hidden_biases) +
    0.01*tf.nn.l2_loss(out_weights) +
    0.01*tf.nn.l2_loss(out_biases))
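
For reference, a sketch of how those parameters might be defined for a single hidden layer; the shapes, tf_train_dataset, and the ReLU are assumptions, not part of the answer:

# Hypothetical single-hidden-layer network: 784-dim inputs, 1024 hidden units, 10 classes.
hidden_weights = tf.Variable(tf.truncated_normal([784, 1024]))
hidden_biases  = tf.Variable(tf.zeros([1024]))
out_weights    = tf.Variable(tf.truncated_normal([1024, 10]))
out_biases     = tf.Variable(tf.zeros([10]))

hidden_layer = tf.nn.relu(tf.matmul(tf_train_dataset, hidden_weights) + hidden_biases)
out_layer    = tf.matmul(hidden_layer, out_weights) + out_biases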

As noted by @Keight Johnson, to avoid regularizing the biases:

loss = (tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
    logits=out_layer, labels=tf_train_labels)) +
    0.01*tf.nn.l2_loss(hidden_weights) +
    0.01*tf.nn.l2_loss(out_weights))

In fact, we usually do not regularize bias terms (intercepts). So, I go for:

loss = (tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
    logits=out_layer, labels=tf_train_labels)) +
    0.01*tf.nn.l2_loss(hidden_weights) +
    0.01*tf.nn.l2_loss(out_weights))

Penalizing the intercept term is usually unnecessary: since the intercept is simply added to the output values, regularizing it only shifts the predictions by a constant. Including the penalty or not will not change the results, but it does cost some extra computation.
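
Written out, the objective in the last snippet corresponds to something like the following (notation is illustrative, not from the original answer; note that tf.nn.l2_loss(t) computes half the sum of squares):

J = \mathrm{CE}\bigl(\mathrm{softmax}(\texttt{out\_layer}),\ \texttt{tf\_train\_labels}\bigr)
    + 0.01 \left( \tfrac{1}{2}\lVert \texttt{hidden\_weights} \rVert_2^2
    + \tfrac{1}{2}\lVert \texttt{out\_weights} \rVert_2^2 \right)

with no penalty term for hidden_biases or out_biases, since the intercepts only add a constant offset to the outputs.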