Add gelu and gelu_fast as possible activation functions (#653)
Summary:
Pull Request resolved: https://github.com/pytorch/fairseq/pull/653

After this diff, you can train a transformer model with --activation-fn 'relu', 'gelu', or 'gelu_fast'.

gelu_fast is the default implementation in https://github.com/hendrycks/GELUs/blob/master/mnist_fcn.py#L72-L77.
gelu is the alternate implementation in https://github.com/hendrycks/GELUs/blob/master/mnist_fcn.py#L72-L77 and the default implementation in https://github.com/facebookresearch/XLM.

Reviewed By: pipibjc

Differential Revision: D14966006

fbshipit-source-id: 94e95fb99bd548ba47cf23b4999265c7b6833fc1
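For reference, a minimal PyTorch sketch of the two variants, based on the formulations in the linked references; the exact fairseq code may differ in naming and details:

import math
import torch

def gelu(x):
    # Erf-based GELU (the XLM default): x * Phi(x),
    # where Phi is the standard normal CDF.
    return 0.5 * x * (1.0 + torch.erf(x / math.sqrt(2.0)))

def gelu_fast(x):
    # Tanh approximation of GELU (Hendrycks & Gimpel):
    # 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
    return 0.5 * x * (1.0 + torch.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x.pow(3))))

The variant is selected through the new --activation-fn flag, e.g. --activation-fn gelu_fast.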