LanguagePairDataset and BacktranslationDataset changes for semi-supervised task setup (#330)
Summary:
Pull Request resolved: https://github.com/pytorch/fairseq/pull/330

As part of the semi-supervised task setup (https://github.com/pytorch/translate/pull/243), this diff adds the ability for LanguagePairDataset to remove EOS from the source or append EOS to the target. BacktranslationDataset requires this functionality in order to use translations as source data.

This diff also changes BacktranslationDataset so that it works on GPU: back-translated sentences are transferred back to CPU so that LanguagePairDataset can collate them.

Reviewed By: liezl200

Differential Revision: D10846294

fbshipit-source-id: b015ecb5fcef26fba507c30f8a4992bdbc54899f
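The EOS handling described in the summary can be illustrated with a minimal sketch. This is not the code from the diff: it works on plain Python lists of token IDs rather than tensors, and the function name `adjust_eos`, its keyword arguments, and the EOS ID of 2 are assumptions chosen for illustration only.

```python
# Hypothetical end-of-sentence token ID; the real value comes from the dictionary in use.
EOS = 2


def adjust_eos(source, target, eos=EOS,
               remove_eos_from_source=False, append_eos_to_target=False):
    """Return copies of source/target token lists with EOS adjusted.

    Illustrative helper only, not part of fairseq.
    """
    src = list(source)
    tgt = list(target)
    # Drop a trailing EOS from the source, e.g. when the source is a
    # generated translation that already ends with EOS.
    if remove_eos_from_source and src and src[-1] == eos:
        src = src[:-1]
    # Add EOS to the target only if it is not already terminated.
    if append_eos_to_target and (not tgt or tgt[-1] != eos):
        tgt.append(eos)
    return src, tgt


src, tgt = adjust_eos([5, 6, EOS], [7, 8],
                      remove_eos_from_source=True,
                      append_eos_to_target=True)
```

Both checks are conditional on the current last token, so applying the helper twice to the same pair gives the same result as applying it once.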