Commits · v0.6.2 · Simon Will / fairseq

Mar 15, 2019

Myle Ott authored Mar 15, 2019

Summary:
Changelog:
- 998ba4f: Add language models from Baevski & Auli (2018)
- 4294c4f6: Add mixture of experts code from Shen et al. (2019)
- 00493490: Add example for multilingual training
- 48d9afbe: Speed improvements, including fused operators from apex
- 44d27e64: Add Tensorboard support
- d17fa851: Add Adadelta optimizer
- 9e1c880f: Add `FairseqEncoderModel`
- b65c579b: Add `FairseqTask.inference_step` to modularize generate.py
- 2ad1178e: Add back `--curriculum`
- Misc bug fixes and other features

Pull Request resolved: https://github.com/pytorch/fairseq/pull/577

Differential Revision: D14481233

Pulled By: myleott

fbshipit-source-id: 4ff8625ef1c0b24273fc65df7c5658e3c932e8b7

e6422528

Mar 14, 2019

Speed improvements (#531) · 48d9afbe

Myle Ott authored Mar 14, 2019

Summary:
* Add FusedLayerNorm and FusedAdam
* Softmax and zero grad optimizations
Pull Request resolved: https://github.com/pytorch/fairseq/pull/531

Differential Revision: D14218457

Pulled By: myleott

fbshipit-source-id: 5656b2d0152cd85f77dc21ec0e1439ec04b9fa89

48d9afbe

Minor fix for multilingual example shell command (#561) · a24880bd

Wen-Ding Li authored Mar 14, 2019

Summary:
Add `\` to fix the shell command.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/561

Differential Revision: D14460091

Pulled By: myleott

fbshipit-source-id: 3658ca41e69bcd00d4ad8ec2d79ddcc6a8de586e

a24880bd

Mar 13, 2019

Enable sampling (#571) · 4d3401b0

Qing Sun authored Mar 12, 2019

Summary:
Pull Request resolved: https://github.com/pytorch/fairseq/pull/571

Enable sampling from Fairseq

Reviewed By: akinh

Differential Revision: D13981666

fbshipit-source-id: 2af1bd67701a73a2c76a9255bd8381d6a7518876

4d3401b0

Mar 12, 2019

Handle 3+ dimensional input in sequence_generator + nits · 860010e9

Dmytro Okhonko authored Mar 12, 2019

Summary: sequence_generator assumes that the model input is a 2D tensor of longs, but it can be something like a 3D tensor of floats. We should be able to handle this as long as the first dimension is the batch size, followed by the source lengths.

Reviewed By: myleott

Differential Revision: D14420044

fbshipit-source-id: bf8b1e42ad1873f7b803c1a377b0af21648db015

860010e9

Adadelta optimizer · d17fa851

Dmytro Okhonko authored Mar 12, 2019

Summary: Add the Adadelta optimizer to fairseq as a wrapper around torch.optim.Adadelta.

Reviewed By: myleott

Differential Revision: D14418635

fbshipit-source-id: 6bf5ec008e905a4a2cbf7415e9492f5eea3ff07f

d17fa851

FairseqEncoderModel · 9e1c880f

Dmytro Okhonko authored Mar 12, 2019

Summary: Base class for encoder-only models. Some models don't have a decoder part.

Reviewed By: myleott

Differential Revision: D14413406

fbshipit-source-id: f36473b91dcf3c835fd6d50e2eb6002afa75f11a

9e1c880f

Mar 11, 2019

Create fairseq_cli_lib · 7fc9a3be

Matt Le authored Mar 11, 2019

Summary: This allows one to call fairseq_cli functions from within python without dispatching to bash.

Reviewed By: myleott

Differential Revision: D14404719

fbshipit-source-id: 044eb652045bb15fc40e72ecbaf6fb10df9f8c61

7fc9a3be

Add missing parentheses in regex expression (#567) · fef4e002

Jose Fonollosa authored Mar 11, 2019

Summary:
The regex pattern without parentheses is not correct. The checkpoints are not sorted in descending order.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/567

Differential Revision: D14404380

Pulled By: myleott

fbshipit-source-id: 98cd0cfa8c92b78a03ffbb94840bc0f7a118eca1

fef4e002
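
Note on fef4e002: the log does not show the actual pattern or the function that was fixed, so the following is only a minimal sketch of why the parentheses matter. The function name `sorted_checkpoints` and the pattern `checkpoint(\d+)\.pt` are assumptions made for illustration. Without a capture group there is no `group(1)` to read the checkpoint number from, so a numeric descending sort cannot be built.

```python
import re


def sorted_checkpoints(filenames, pattern=r"checkpoint(\d+)\.pt"):
    """Return matching filenames, highest checkpoint number first.

    Hypothetical helper; the name and the default pattern are illustrative.
    """
    regex = re.compile(pattern)
    entries = []
    for name in filenames:
        match = regex.fullmatch(name)
        if match is not None:
            # group(1) exists only because the pattern contains parentheses.
            entries.append((int(match.group(1)), name))
    # Sort numerically so that checkpoint10 ranks above checkpoint2.
    return [name for _, name in sorted(entries, reverse=True)]
```

Files that do not match the pattern, such as a hypothetical `checkpoint_best.pt`, are skipped rather than sorted.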

Mar 04, 2019

Try to access sys.stdin.fileno() only at runtime and not during import (#553) · 5869385c

Louis MARTIN authored Mar 04, 2019

Summary:
Accessing sys.stdin.fileno() raises an error in multiple contexts
(pytest, joblib, jupyter...).
Thus accessing it at the top level of the file can cause other scripts
to crash when they import fairseq.
This is why it is moved inside the method of MultiprocessingPdb to only
be accessed at runtime if needed.

See Issue #517
Pull Request resolved: https://github.com/pytorch/fairseq/pull/553

Differential Revision: D14309284

Pulled By: myleott

fbshipit-source-id: 6ca36f2053a86ebc02e2d6f025459c6a78c592e7

5869385c
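
Note on 5869385c: the pattern described in the summary can be sketched roughly as follows. This is not the fairseq `MultiprocessingPdb` code; the class name `LazyStdinPdb` and the helper `_stdin_fd` are hypothetical. The point is only that `sys.stdin.fileno()` is called inside a method, so defining or importing the class never touches stdin.

```python
import os
import pdb
import sys


class LazyStdinPdb(pdb.Pdb):
    """Hypothetical Pdb subclass that looks up stdin's descriptor only when used."""

    @staticmethod
    def _stdin_fd():
        # Evaluated at call time; may raise under pytest, joblib, or Jupyter.
        return sys.stdin.fileno()

    def interaction(self, *args, **kwargs):
        saved = sys.stdin
        # Reopen a duplicate of stdin so the debugger can read from it.
        sys.stdin = os.fdopen(os.dup(self._stdin_fd()))
        try:
            super().interaction(*args, **kwargs)
        finally:
            sys.stdin.close()
            sys.stdin = saved
```

If stdin has no usable file descriptor, the error now surfaces only when the debugger is actually entered, not when the module is imported.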

Add --curriculum (fixes #533) · 2ad1178e

Myle Ott authored Mar 04, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/554

Differential Revision: D14300596

Pulled By: myleott

fbshipit-source-id: f38c8e58daef99d5e4b97dd423e4142e4294a4f0

2ad1178e

Mar 02, 2019

Fix Pdb · 1fd0a6f6

Myle Ott authored Mar 02, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/551

Differential Revision: D14295227

Pulled By: myleott

fbshipit-source-id: 404f2a2697a62ce0dbf22e5ab2e1cf932acc83ac

1fd0a6f6

Mar 01, 2019

Fixed the issue that no space in string converted from tensor · 88bf8b56

James King authored Mar 01, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/548

Differential Revision: D14286021

Pulled By: myleott

fbshipit-source-id: 7c725304185e63787220371a812ec860e178872c

88bf8b56

Use --workers for validation sets in preprocess.py · 66262a38

Myle Ott authored Mar 01, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/550

Differential Revision: D14286008

Pulled By: myleott

fbshipit-source-id: 6055acf98023fdd01f85ac3d7c4e7fb786e54389

66262a38

Refactor BERTDataset to the more general MaskedLMDataset · 92a6c548

Kartikay Khandelwal authored Feb 28, 2019

Summary: The current BERTDataset has a lot of the components needed for generic MaskedLM training, but it is too restrictive in the assumptions it makes: two blocks being masked, the special tokens used for the sentence embedding and the separator, etc. In this diff I refactor this dataset and at the same time make some of the parameters configurable, including the probabilities associated with masking.

Reviewed By: rutyrinott

Differential Revision: D14222467

fbshipit-source-id: e9f78788dfe7f56646ba09c62967c4c0bd30aed8

92a6c548
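
Note on 92a6c548: the diff itself is not shown here, so the following is only a sketch of what "configurable masking probabilities" could look like for a masked LM. The function `mask_tokens` and its parameters (`mask_prob`, `leave_unmasked_prob`, `random_token_prob`, `mask_token`) are assumptions for illustration, not the MaskedLMDataset interface.

```python
import random


def mask_tokens(tokens, vocab, mask_token="<mask>", mask_prob=0.15,
                leave_unmasked_prob=0.1, random_token_prob=0.1, rng=None):
    """Return (masked_tokens, targets); targets is None where nothing is predicted.

    Hypothetical sketch; all parameter names are illustrative.
    """
    rng = rng or random.Random()
    masked, targets = [], []
    for tok in tokens:
        if rng.random() >= mask_prob:
            # Token is not selected for prediction.
            masked.append(tok)
            targets.append(None)
            continue
        targets.append(tok)
        roll = rng.random()
        if roll < leave_unmasked_prob:
            masked.append(tok)                 # keep the original token
        elif roll < leave_unmasked_prob + random_token_prob:
            masked.append(rng.choice(vocab))   # swap in a random token
        else:
            masked.append(mask_token)          # replace with the mask symbol
    return masked, targets
```

Passing a seeded `random.Random` as `rng` makes the masking reproducible.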

ignore data files in .gitignore · 4d59517f

JingboWang1997 authored Feb 28, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/546

Differential Revision: D14272808

Pulled By: myleott

fbshipit-source-id: e993450354e7d7561b14b56c12d4859a8ee7121b

4d59517f

Feb 28, 2019

Deprecate _aggregate_logging_outputs · 8a8df81d

Myle Ott authored Feb 28, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/498

Differential Revision: D14024524

Pulled By: myleott

fbshipit-source-id: 1b0be4bb212dbab41ea0959ac34020832ff00645

8a8df81d

Move string line encoding logic from tokenizer to Dictionary (unified diff). (#541) · f296824f

Vladimir Karpukhin authored Feb 28, 2019

Summary:
Pull Request resolved: https://github.com/pytorch/fairseq/pull/541

Just a combo of the stacked pair D14057943 & D14176011.
Made this a separate diff because there seems to be some issue with porting a stacked change into the GitHub repo.

Differential Revision: D14251048

fbshipit-source-id: 0a47f534a69d6ab2ebe035fba40fd51748cccfb8

f296824f

Add test for mixture of experts · bc919276

Myle Ott authored Feb 28, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/543

Differential Revision: D14259481

Pulled By: myleott

fbshipit-source-id: fcb0a150b8e851cf86ea5ed1f083f56e1600588e

bc919276

Add sacrebleu to requirements · 139e3a3c

Myle Ott authored Feb 28, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/542

Differential Revision: D14258895

Pulled By: myleott

fbshipit-source-id: 950a840e1d001a472be8d4737c9e4de5224137b3

139e3a3c

Extract after skipping download for LM example script · 19b6e8bf

Jo Chuang authored Feb 28, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/538

Differential Revision: D14258736

Pulled By: myleott

fbshipit-source-id: ca16355e4c4700fc8eecf2c9374ec170bca826a4

19b6e8bf

Feb 26, 2019

Support LM generation from interactive.py (fixes #526) · 98daf039

Myle Ott authored Feb 25, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/528

Differential Revision: D14218377

Pulled By: myleott

fbshipit-source-id: facb0a32f6aebf56a4fea7259080394ad2d2d846

98daf039

Multilingual training example (#527) · 00493490

Myle Ott authored Feb 25, 2019

Summary:
* Add example for multilingual translation on IWSLT'17
* Match dataset ordering for multilingual_translation and translation
* Fix bug with LegacyDistributedDataParallel when calling forward of sub-modules
Pull Request resolved: https://github.com/pytorch/fairseq/pull/527

Differential Revision: D14218372

Pulled By: myleott

fbshipit-source-id: 2e3fe24aa39476bcc5c9af68ef9a40192db34a3b

00493490

Add Tensorboard support (#530) · 44d27e64

Myle Ott authored Feb 25, 2019

Summary:
Enable with the `--tensorboard-logdir` option.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/530

Differential Revision: D14218430

Pulled By: myleott

fbshipit-source-id: e7a54f66f928e3bb02ae03fda09b22fa4fa7d053

44d27e64

Misc fixes · 65c1903e

Myle Ott authored Feb 25, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/529

Differential Revision: D14218384

Pulled By: myleott

fbshipit-source-id: 5d2cbb1f56ea42e9929785aff4a5ae5f44d13724

65c1903e

Feb 24, 2019

Add scoring script for Mixture of Experts · 94fedf00

Myle Ott authored Feb 23, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/523

Differential Revision: D14200060

Pulled By: myleott

fbshipit-source-id: a2e3d6ec7c6b9cacc9f44565d2b91e65b580b084

94fedf00

Feb 23, 2019

Update README for Mixture of Experts paper · 392bdd6c

Myle Ott authored Feb 22, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/522

Differential Revision: D14194672

Pulled By: myleott

fbshipit-source-id: 4ff669826c4313de6f12076915cfb1bd15289ef0

392bdd6c

Feb 22, 2019

Add code for mixture of experts (#521) · 4294c4f6

Myle Ott authored Feb 22, 2019

Summary:
Code for the paper: [Mixture Models for Diverse Machine Translation: Tricks of the Trade (Shen et al., 2019)](https://arxiv.org/abs/1902.07816).
Pull Request resolved: https://github.com/pytorch/fairseq/pull/521

Differential Revision: D14188021

Pulled By: myleott

fbshipit-source-id: ed5b1ed5ad9a582359bd5215fa2ea26dc76c673e

4294c4f6

Modularize generate.py (#351) · b65c579b

Myle Ott authored Feb 22, 2019

Summary:
Pull Request resolved: https://github.com/pytorch/translate/pull/351

This makes it easier for tasks to plug in to generate.py/interactive.py
Pull Request resolved: https://github.com/pytorch/fairseq/pull/520

Differential Revision: D14183881

Pulled By: myleott

fbshipit-source-id: ede5e53ddc1215ed3b12b8f1eba048c946913c33

b65c579b

Feb 19, 2019

moving masking logic to collate · 08e866f9

Ruty Rinott authored Feb 19, 2019

Summary: Move masking logic to data_utils

Reviewed By: kartikayk, jingfeidu

Differential Revision: D14098403

fbshipit-source-id: c7b7e811ab48b9c5a12662dc1e2f2ed694724176

08e866f9

Feb 16, 2019

Merge internal changes · 9998bbfa

Myle Ott authored Feb 15, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/505

Differential Revision: D14110201

Pulled By: myleott

fbshipit-source-id: 099ce61fa386c016f3a1d1815c6fe1a9a6c9005d

9998bbfa

Feb 12, 2019

Add onnx_trace argument for learned embeddings (#492) · 184629a7

Juan Miguel Pino authored Feb 12, 2019

Summary:
Pull Request resolved: https://github.com/pytorch/fairseq/pull/492

This argument was missing so we cannot export Transformer if we use learned positional embeddings. See also https://github.com/pytorch/translate/pull/335

Reviewed By: jhcross

Differential Revision: D13984781

fbshipit-source-id: 2187377e952ff587e07237de312c5b68f7d68891

184629a7

Feb 09, 2019

Add fairseq to PyPI (#495) · fbd4cef9

Myle Ott authored Feb 08, 2019

Summary:
- fairseq can now be installed via pip: `pip install fairseq`
- command-line tools are globally accessible: `fairseq-preprocess`, `fairseq-train`, `fairseq-generate`, etc.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/495

Differential Revision: D14017761

Pulled By: myleott

fbshipit-source-id: 10c9f6634a3056074eac2f33324b4f1f404d4235

fbd4cef9

Feb 07, 2019

stitch preprocessing pipeline · cea0e4b9

Ruty Rinott authored Feb 06, 2019

Summary:
1. add call to binarization to complete preprocessing pipeline
2. add ability to specify task to select the dictionary, and add a bert task
3. Get rid of function calls that are no longer needed after moving functions from fairseq here

Reviewed By: jingfeidu

Differential Revision: D13977842

fbshipit-source-id: ec9bbb4e98e62e12c20ba68bb52b8bcc94aee91d

cea0e4b9

Feb 06, 2019

Add CheckpointManager to keep avg checkpoint weights in memory to reduce disk... · c49c292c

Wei Ho authored Feb 06, 2019

Add CheckpointManager to keep avg checkpoint weights in memory to reduce disk read when averaging + various checkpoint refactoring

Summary: Pull Request resolved: https://github.com/pytorch/translate/pull/315

Reviewed By: akinh

Differential Revision: D13510446

fbshipit-source-id: 22a6594af9253130a93e638285a47183a974e0de

c49c292c
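
Note on c49c292c: the CheckpointManager code is not part of this log. As a rough sketch of the idea in the title, one way to avoid re-reading checkpoints from disk is to keep a running sum of the most recent ones in memory. The class `RunningCheckpointAverage` below is hypothetical and uses plain floats in place of tensors.

```python
from collections import deque


class RunningCheckpointAverage:
    """Hypothetical helper: averages the last `max_kept` checkpoints in memory."""

    def __init__(self, max_kept):
        self.max_kept = max_kept
        self.kept = deque()  # parameter dicts currently inside the window
        self.total = {}      # running per-parameter sum over the window

    def add(self, params):
        # Fold the new checkpoint into the running sum.
        for name, value in params.items():
            self.total[name] = self.total.get(name, 0.0) + value
        self.kept.append(params)
        # Once the window is full, subtract the oldest checkpoint.
        if len(self.kept) > self.max_kept:
            oldest = self.kept.popleft()
            for name, value in oldest.items():
                self.total[name] -= value

    def average(self):
        n = len(self.kept)
        return {name: s / n for name, s in self.total.items()}
```

Each new checkpoint then costs one addition and at most one subtraction per parameter, with no disk reads for the older checkpoints.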

Feb 05, 2019

Add standalone binaries · 829bd8ce

Myle Ott authored Feb 05, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/489

Differential Revision: D13956810

Pulled By: myleott

fbshipit-source-id: 61ace179d1d3790226c38b3f3e47f5452b5ec514

829bd8ce

Feb 01, 2019

Support custom Dictionary implementations in 'preprocess.py' (#448) · bbb4120b

Davide Caroselli authored Feb 01, 2019

Summary:
The `preprocess.py` script has been refactored in order to:

1. Use the `options` module for command-line argument parsing. This gives `preprocess.py` the ability to load custom modules with the `--user-dir` flag (already implemented in all other binaries)
2. Dictionary loading and building code has moved to the Task implementation. This allows custom Dictionary classes to be used during the data generation step.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/448

Differential Revision: D13674819

Pulled By: myleott

fbshipit-source-id: b40648a98ed6c08284577e5ec25876e018d8c822

bbb4120b

Jan 30, 2019

Do distributed init after data loading · ec6f8ef9

Myle Ott authored Jan 30, 2019

Summary:
FACEBOOK

This switches back to torch.multiprocessing.spawn, instead of directly calling fb_train.par using a subprocess.Process. This has the advantage that exceptions are propagated properly. It also moves the distributed_init part to happen after data loading, which gets around the timeout issue.

The downside of this approach is that it's not so easy to pipe stdout to multiple places, which was nice when using the sweep.py scripts. I'm still working on a fix for that.

Reviewed By: rutyrinott, ngoyal2707

Differential Revision: D13873224

fbshipit-source-id: 08d593233b8d23590c01c723363630a79804a8b0

ec6f8ef9

Add --input option to interactive.py to support reading from file · 3dce7c9f

Myle Ott authored Jan 30, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/484

Differential Revision: D13880636

Pulled By: myleott

fbshipit-source-id: 984b2e1c3b281c28243102eb971ea45ec891d94e

3dce7c9f

Merge internal changes (#483) · 42be3ebd

Myle Ott authored Jan 30, 2019

Summary:
Changelog:
- `4889802`: can now detokenize sentencepiece output with `--remove-bpe=sentencepiece` (fixes #331). Also added `--sacrebleu` for computing detokenized BLEU.
- `0d76427`: fix assertion error when training language model with dataset containing empty sentences
- minor bug and style fixes
Pull Request resolved: https://github.com/pytorch/fairseq/pull/483

Differential Revision: D13867899

Pulled By: myleott

fbshipit-source-id: 25c940b847fe270262ac8f5ac838407b3977fdda

42be3ebd
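
Note on 42be3ebd: sentencepiece marks the start of each word with U+2581 ("▁"), so detokenizing its output amounts to dropping the spaces between pieces and turning the markers back into spaces. The sketch below illustrates that step only; the function name `detokenize_sentencepiece` is an assumption and the fairseq implementation may differ in detail.

```python
def detokenize_sentencepiece(line):
    """Join sentencepiece pieces back into plain text (illustrative sketch)."""
    # Spaces separate pieces; U+2581 marks where a real word boundary was.
    return line.replace(" ", "").replace("\u2581", " ").strip()
```

For example, the pieces `▁Hello ▁wor ld .` are joined back into `Hello world.`.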