Load an XLM model into the transformer encoder / decoder for MT training (#629)
Summary:
Pull Request resolved: https://github.com/pytorch/fairseq/pull/629

Also adds GeLU as an alternative to ReLU for the activation function in transformer layers.

Reviewed By: lematt1991

Differential Revision: D14689851

fbshipit-source-id: 7ec81fa34bc7bd0e1e43b337847ae932dcbf8b15
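As a rough illustration of the activation swap (a minimal sketch, not the code from this diff), selecting GeLU in place of ReLU for a transformer feed-forward block could look like the following; the helper names `gelu` and `get_activation_fn` are hypothetical:

```python
import math
import torch
import torch.nn.functional as F

def gelu(x: torch.Tensor) -> torch.Tensor:
    # Exact GeLU: x * Phi(x), where Phi is the standard normal CDF,
    # computed here via the error function.
    return x * 0.5 * (1.0 + torch.erf(x / math.sqrt(2.0)))

def get_activation_fn(name: str):
    # Pick the activation used inside the transformer FFN block,
    # so GeLU drops in wherever ReLU was previously called.
    if name == "relu":
        return F.relu
    elif name == "gelu":
        return gelu
    raise RuntimeError(f"unsupported activation: {name}")
```

GeLU can also be computed with a tanh-based approximation; either form is a drop-in replacement for ReLU, so the shape of the feed-forward block does not change. For the XLM side of the change, fairseq's usual pattern is a model architecture plus a command-line flag pointing at the pretrained checkpoint, but the precise flag names are defined in the diff itself.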