Make TransformerEncoderLayer layer norm names more descriptive
Summary:
I added an upgrade_state_dict function so that checkpoints saved with the old layer norm names still load. The names map as follows:

layer_norms[0] --> self_attn_layer_norm
layer_norms[1] --> final_layer_norm

Reviewed By: pipibjc

Differential Revision: D14689849

fbshipit-source-id: b2809262c11fe9d083e571fa31044798aefd48ce
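The renaming described in the summary can be illustrated with a minimal sketch. It is not the code from this change: the function signature, the `prefix` argument, and the assumption that the old norms were stored as list-style entries (keys such as `layer_norms.0.weight`) are all hypothetical, and plain dicts stand in for real checkpoint tensors.

```python
# Hypothetical sketch of the key renaming; not the actual implementation.
# Assumes old checkpoints store the norms under "<prefix>.layer_norms.<i>.<param>".
LAYER_NORM_MAP = {
    "0": "self_attn_layer_norm",
    "1": "final_layer_norm",
}


def upgrade_state_dict(state_dict, prefix):
    """Rename old layer norm keys under `prefix` in place and return the dict."""
    for old_index, new_name in LAYER_NORM_MAP.items():
        for param in ("weight", "bias"):
            old_key = f"{prefix}.layer_norms.{old_index}.{param}"
            if old_key in state_dict:
                # Move the value to the new, descriptive key.
                state_dict[f"{prefix}.{new_name}.{param}"] = state_dict.pop(old_key)
    return state_dict


# Example with placeholder values in place of tensors.
old_checkpoint = {
    "encoder.layers.0.layer_norms.0.weight": "w0",
    "encoder.layers.0.layer_norms.0.bias": "b0",
    "encoder.layers.0.layer_norms.1.weight": "w1",
    "encoder.layers.0.layer_norms.1.bias": "b1",
    "encoder.layers.0.fc1.weight": "unchanged",
}
upgraded = upgrade_state_dict(dict(old_checkpoint), "encoder.layers.0")
```

Keys that already use the new names, or that are unrelated to the layer norms, pass through untouched, so running the sketch on an already upgraded checkpoint is a no-op.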