Description
System information
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): N/A
- TensorFlow version and how it was installed (source or binary): N/A
- TensorFlow-Addons version and how it was installed (source or binary): N/A
- Python version: N/A
- Is GPU used? (yes/no): N/A
Describe the bug
The bug is in the tutorial notebook https://colab.research.google.com/github/tensorflow/addons/blob/master/docs/tutorials/networks_seq2seq_nmt.ipynb
The loss is not reduced correctly. tf.reduce_mean divides the masked sum by the total number of elements, including the masked (padding) positions, so the reported loss is too small. The mean should be taken over the non-masked elements only. This line should be replaced:
    loss = tf.reduce_mean(loss)
with this:
    loss = tf.math.reduce_sum(loss) / tf.math.reduce_sum(mask)
With this change, the result matches keras.metrics.SparseCategoricalCrossentropy(from_logits=True), as expected.
For reference, the current function in the tutorial is:
def loss_function(real, pred):
    # real shape = (BATCH_SIZE, max_length_output)
    # pred shape = (BATCH_SIZE, max_length_output, tar_vocab_size)
    cross_entropy = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True, reduction='none')
    loss = cross_entropy(y_true=real, y_pred=pred)
    mask = tf.logical_not(tf.math.equal(real, 0))  # output 0 for y=0 else output 1
    mask = tf.cast(mask, dtype=loss.dtype)
    loss = mask * loss
    loss = tf.reduce_mean(loss)
    return loss
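To make the difference between the two reductions concrete, here is a small worked example in plain Python. The labels and per-token loss values are made up for illustration, and nested lists stand in for the tensors; it is a sketch of the arithmetic only, not the tutorial's code.

```python
# Hypothetical batch: 2 sequences of length 4; label 0 marks padding,
# as in the tutorial's loss_function.
labels = [[5, 3, 0, 0],
          [7, 0, 0, 0]]
# Hypothetical per-token cross-entropy values (what reduction='none' would return).
token_loss = [[2.0, 1.0, 0.7, 0.9],
              [3.0, 0.4, 0.6, 0.8]]

# 1.0 for real tokens, 0.0 for padding.
mask = [[1.0 if y != 0 else 0.0 for y in row] for row in labels]

# Zero out the loss at padded positions.
masked = [[m * l for m, l in zip(m_row, l_row)]
          for m_row, l_row in zip(mask, token_loss)]

total = sum(sum(row) for row in masked)    # sum of losses over real tokens
n_all = sum(len(row) for row in masked)    # all positions, padding included
n_real = sum(sum(row) for row in mask)     # real tokens only

mean_over_all = total / n_all    # what reduce_mean computes after masking
mean_over_real = total / n_real  # reduce_sum(loss) / reduce_sum(mask)

print(mean_over_all, mean_over_real)
```

In this example only 3 of the 8 positions are real tokens, so dividing by 8 understates the loss per token. The gap grows with the amount of padding in the batch.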
Activity
guillaumekln commented on Dec 28, 2021
Yes, the loss is not correctly reduced. Could you send a PR with your change?
MrinalTyagi commented on Jan 28, 2022
@guillaumekln I would like to contribute to this if no one is working on it.
martingoodson commented on Jan 28, 2022
MrinalTyagi commented on Jan 28, 2022
Sorry, I thought it was available for contribution.