Bugfixes in training.py regarding batch_multiplier
Fixed logging of the batch loss if batch_multiplier > 1:
- Previously only the last portion of size batch_size of an effective batch (batch_multiplier * batch_size) was logged.
- Previously every portion (batch_size) was written to the tb_writer, not just the correctly computed loss of the full effective batch (batch_multiplier * batch_size). This now behaves the same as in the batch_multiplier = 1 case.

Fixed a bug occurring if train_data is not divisible by (batch_multiplier * batch_size) and batch_multiplier > 1:
- Previously, in this case the leftover training examples at the end of an epoch could not fill a full effective batch (batch_multiplier * batch_size). Therefore no update was carried out and the leftover gradients were summed on top of those of the first batch of the next epoch. This also resulted in a faulty scaling of the loss normalization. This now behaves properly by reducing batch_multiplier for the last batch of an epoch to match the number of leftover examples.
- The way the losses are accumulated in _train_batch has changed to reduce the rounding error that occurred from dividing every portion of the batch loss by batch_multiplier. Now only the final sum is divided by batch_multiplier (see the sketch below).
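For illustration, here is a minimal sketch of the corrected accumulation behaviour, assuming a PyTorch-style training loop; `run_epoch`, `model`, `optimizer`, `batch_iter` and `log_tensorboard` are hypothetical stand-ins, not the actual training.py API:

```python
def run_epoch(model, optimizer, batch_iter, batch_multiplier, log_tensorboard):
    """Sketch of gradient accumulation with a shrinking last multiplier.

    `model(batch)` is assumed to return the un-normalized loss of one
    portion of size batch_size; `log_tensorboard(tag, value)` stands in
    for writing a scalar to the tensorboard writer.
    """
    batches = list(batch_iter)
    optimizer.zero_grad()
    accumulated_loss = 0.0   # sum of portion losses for the current effective batch
    count = 0                # portions processed since the last update

    for i, batch in enumerate(batches):
        # Portions remaining in this epoch, including the current one.
        leftover = len(batches) - i
        # Shrink the multiplier for the final, incomplete effective batch so
        # the leftover gradients still trigger an update within this epoch
        # instead of leaking into the next epoch.
        multiplier = min(batch_multiplier, count + leftover)

        loss = model(batch)                # un-normalized portion loss
        (loss / multiplier).backward()     # normalize by the actual multiplier
        accumulated_loss += loss.item()    # sum first, divide only once below
        count += 1

        if count == multiplier:
            optimizer.step()
            optimizer.zero_grad()
            # Log the loss of the whole effective batch, once per update,
            # dividing the accumulated sum a single time to limit rounding error.
            log_tensorboard("train/batch_loss", accumulated_loss / multiplier)
            accumulated_loss = 0.0
            count = 0
```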