Add LegacyDistributedDataParallel in place of no_c10d (#370)
Summary:
This should bring back the speedup with --update-freq that we reported in the Scaling Neural Machine Translation paper.

Pull Request resolved: https://github.com/pytorch/fairseq/pull/370

Differential Revision: D13100281

Pulled By: myleott

fbshipit-source-id: 4a81b51bb7390a197add314a4be5512bbf68c085