Add gelu and gelu_fast as possible activation functions (#653)
Summary:
Pull Request resolved: https://github.com/pytorch/fairseq/pull/653

After this diff, you can train a transformer model with --activation-fn 'relu', 'gelu', or 'gelu_fast'.

gelu_fast is the default implementation in https://github.com/hendrycks/GELUs/blob/master/mnist_fcn.py#L72-L77.
gelu is the alternate implementation in https://github.com/hendrycks/GELUs/blob/master/mnist_fcn.py#L72-L77 and the default implementation in https://github.com/facebookresearch/XLM.

Reviewed By: pipibjc

Differential Revision: D14966006

fbshipit-source-id: 94e95fb99bd548ba47cf23b4999265c7b6833fc1
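For reference, a minimal PyTorch sketch of the two variants, based on the formulations in the linked references; the exact fairseq code may differ in naming and details:

import math
import torch

def gelu(x):
    # Erf-based GELU (the XLM default): x * Phi(x),
    # where Phi is the standard normal CDF.
    return 0.5 * x * (1.0 + torch.erf(x / math.sqrt(2.0)))

def gelu_fast(x):
    # Tanh approximation of GELU (Hendrycks & Gimpel):
    # 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
    return 0.5 * x * (1.0 + torch.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x.pow(3))))

The variant is selected through the new --activation-fn flag, e.g. --activation-fn gelu_fast.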