- Feb 09, 2019
-
-
Myle Ott authored
Summary: - fairseq can now be installed via pip: `pip install fairseq` - command-line tools are globally accessible: `fairseq-preprocess`, `fairseq-train`, `fairseq-generate`, etc. Pull Request resolved: https://github.com/pytorch/fairseq/pull/495 Differential Revision: D14017761 Pulled By: myleott fbshipit-source-id: 10c9f6634a3056074eac2f33324b4f1f404d4235
-
- Feb 07, 2019
-
-
Ruty Rinott authored
Summary: 1. add call to binarization to complete preprocessing pipeline 2. add ability to specify task to select the dictionary, and add a bert task 3. Get rid of function calls that are no longer needed after moving functions from fairseq here Reviewed By: jingfeidu Differential Revision: D13977842 fbshipit-source-id: ec9bbb4e98e62e12c20ba68bb52b8bcc94aee91d
-
- Feb 06, 2019
-
-
Wei Ho authored
Add CheckpointManager to keep avg checkpoint weights in memory to reduce disk read when averaging + various checkpoint refactoring Summary: Pull Request resolved: https://github.com/pytorch/translate/pull/315 Reviewed By: akinh Differential Revision: D13510446 fbshipit-source-id: 22a6594af9253130a93e638285a47183a974e0de
-
- Feb 05, 2019
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/489 Differential Revision: D13956810 Pulled By: myleott fbshipit-source-id: 61ace179d1d3790226c38b3f3e47f5452b5ec514
-
- Feb 01, 2019
-
-
Davide Caroselli authored
Summary: The `preprocess.py` script has been refactored in order to: 1. Use the `options` module for command line arguments parsing. This will give to `preprocess.py` the ability to load custom modules with `--user-dir` flag (already implemented to all other binaries) 2. Dictionary loading and building code has moved to Task implementation. This allows custom Dictionary classes to be used during the data generation step. Pull Request resolved: https://github.com/pytorch/fairseq/pull/448 Differential Revision: D13674819 Pulled By: myleott fbshipit-source-id: b40648a98ed6c08284577e5ec25876e018d8c822
-
- Jan 30, 2019
-
-
Myle Ott authored
Summary: FACEBOOK This switches back to torch.multiprocessing.spawn, instead of directly calling fb_train.par using a subprocess.Process. This has the advantage that exceptions are propagated properly. It also moves the distributed_init part to happen after data loading, which gets around the timeout issue. The downside of this approach is that it's not so easy to pipe stdout to multiple places, which was nice when using the sweep.py scripts. I'm still working on a fix for that. Reviewed By: rutyrinott, ngoyal2707 Differential Revision: D13873224 fbshipit-source-id: 08d593233b8d23590c01c723363630a79804a8b0
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/484 Differential Revision: D13880636 Pulled By: myleott fbshipit-source-id: 984b2e1c3b281c28243102eb971ea45ec891d94e
-
Myle Ott authored
Summary: Changelog: - `4889802`: can now remove detokenize sentencepiece output with `--remove-bpe=sentencepiece` (fixes #331). Also added `--sacrebleu` for computing detokenized BLEU. - `0d76427`: fix assertion error when training language model with dataset containing empty sentences - minor bug and style fixes Pull Request resolved: https://github.com/pytorch/fairseq/pull/483 Differential Revision: D13867899 Pulled By: myleott fbshipit-source-id: 25c940b847fe270262ac8f5ac838407b3977fdda
-
- Jan 29, 2019
-
-
Jingfei Du authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/482 With this change, we can use different dictionary classes when calling build_dictionary and build_and_save_dictionary Reviewed By: liaimi Differential Revision: D13855100 fbshipit-source-id: 62e6db310b5f078e05c547d2671252233be7b7f0
-
- Jan 25, 2019
-
-
Myle Ott authored
Summary: Changelog: - `e330f56`: Add code for the "Pay Less Attention with Lightweight and Dynamic Convolutions" paper - `5e3b98c`: Add scripts for computing tokenized BLEU with compound splitting and sacrebleu - update READMEs - misc fixes Pull Request resolved: https://github.com/pytorch/fairseq/pull/473 Differential Revision: D13819717 Pulled By: myleott fbshipit-source-id: f2dc12ea89a436b950cafec3593ed1b04af808e9
-
Xian Li authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/474 Reviewed By: theweiho, akinh Differential Revision: D13701447 fbshipit-source-id: 34036dce7601835b605e3b169210edc7a6715de6
-
Lucio Dery authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/472 Implementation of "Adafactor: Adaptive Learning Rates with Sublinear Memory Cost" (https://arxiv.org/abs/1804.04235) Differential Revision: D13388049 fbshipit-source-id: 24ad30f4bac248e6aeaced5064bb83784058f03d
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/471 Differential Revision: D13818918 Pulled By: myleott fbshipit-source-id: d3b8dc50e81ee1d2dcc5efc5815998be8461085f
-
- Jan 24, 2019
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/470 Differential Revision: D13803964 Pulled By: myleott fbshipit-source-id: 91b66599e9a539833fcedea07c608b349ba3b449
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/469 Differential Revision: D13802945 Pulled By: myleott fbshipit-source-id: b6976506a8336b96ee40505c4a7638541cc99c95
-
Davide Caroselli authored
Summary: When opening text files without specifying the encoding (i.e. `open(path, "r")` or `open(path, "w")`), python3 will use the preferred locale encoding (`locale.getpreferredencoding()`) so the result is platform dependent and can change from one machine to another. I believe fairseq should enforce its standard (UTF-8 seems like the best choice to me). This pull request explicity specify UTF-8 encoding when reading text files. Pull Request resolved: https://github.com/pytorch/fairseq/pull/460 Differential Revision: D13802525 Pulled By: myleott fbshipit-source-id: 672fd55707ee559ab36d74bc1c24026166ea2367
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/468 Differential Revision: D13802590 Pulled By: myleott fbshipit-source-id: e374e38e74dc91bda0579ae41e26289fb0ba56a2
-
vufg authored
Summary: Although both are supported by Python 3.6, I think it would be better to unify the usage of string format function. Pull Request resolved: https://github.com/pytorch/fairseq/pull/467 Differential Revision: D13802506 Pulled By: myleott fbshipit-source-id: 5c4877547b1c4ca806ab54c80ae483cfbaa7827a
-
frankang authored
Summary: Fix iterating from the beginning bug when initializing the GroupedIterator. (https://github.com/pytorch/fairseq/issues/441) Correct filter criterion for dict type sentence size. (https://github.com/pytorch/fairseq/issues/451) Pull Request resolved: https://github.com/pytorch/fairseq/pull/455 Differential Revision: D13725646 Pulled By: myleott fbshipit-source-id: e698fa6f9b45460f95a75c9e9976a3aa3b6aa523
-
- Jan 17, 2019
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/454 Differential Revision: D13708565 Pulled By: myleott fbshipit-source-id: 5cd0e07e3e1885eef14e3a5e8074f24cf4bde632
-
Myle Ott authored
Summary: There was a very subtle bug here
😢 When we recently removed this line (7633129b), it meant that the learning rate scheduler didn't get initialized until after the first update. Unfortunately pytorch optimizers store the learning rate in their internal state, so some learning rate schedulers use their `__init__` method to reset the learning rate to some sane initial value. This is especially problematic for LR schedulers that include a warmup, where the Optimizer is likely to contain the peak learning rate at initialization, and it's only in the LR scheduler's `__init__` that the (much smaller) warmup value is set. For example, the inverse_sqrt scheduler resets the learning rate upon initialization: https://github.com/pytorch/fairseq/blob/7853818c2e33a63ec17a31bcfe20e4fc75d94130/fairseq/optim/lr_scheduler/inverse_square_root_schedule.py#L48-L50 **Impact:** For the last ~1.5 weeks, the first training update would use the optimizer's default learning rate instead of the initial rate set by the LR scheduler. All subsequent updates used the correct learning rates. This primarily affects LR schedulers with warmups. Pull Request resolved: https://github.com/pytorch/fairseq/pull/453 Differential Revision: D13704453 Pulled By: myleott fbshipit-source-id: a946da30100f837c66bdc6b9b77b014ab4eb8764
-
- Jan 16, 2019
-
-
Davide Caroselli authored
Summary: On a multi-gpu training scenario, the `train.py` script spawns new processes with `torch.multiprocessing.spawn`. Unfortunately those child processes don't inherit the modules imported with `--user-dir`. This pull request fixes this problem: custom module import in now explicit on every `main()` function. Pull Request resolved: https://github.com/pytorch/fairseq/pull/449 Differential Revision: D13676922 Pulled By: myleott fbshipit-source-id: 520358d66155697885b878a37e7d0484bddbc1c6
-
Myle Ott authored
Summary: This is useful for averaging the last N checkpoints, ending at some "best" checkpoint. Pull Request resolved: https://github.com/pytorch/fairseq/pull/452 Differential Revision: D13695407 Pulled By: myleott fbshipit-source-id: 5d9d2bff3706834f01501e9259834c77fb335817
-
Ruty Rinott authored
Summary: optimizing memory use of token_block_dataset by replacing python data structures with numpy arrays. applying needed parts from D13498973, instead of rebasing it on changes Reviewed By: edunov Differential Revision: D13678485 fbshipit-source-id: c0c827a8b95834a6a5456476040ebdc8e42136d4
-
- Jan 15, 2019
-
-
Davide Caroselli authored
Summary: Correct help message was obfuscated by the transient `ArgumentParser` used only for eagerly read `--user-dir` flag. To reproduce just try: ```bash python3 train.py --help ``` Pull Request resolved: https://github.com/pytorch/fairseq/pull/446 Differential Revision: D13674731 Pulled By: myleott fbshipit-source-id: b9503a4d7ef26405be630d31c0ca02386d783031
-
Davide Caroselli authored
Summary: Command line option --user-dir documented in docs/overview.rst Pull Request resolved: https://github.com/pytorch/fairseq/pull/447 Differential Revision: D13674744 Pulled By: myleott fbshipit-source-id: 17049ee5c9f692f5298ef9fa7381ee583f269cde
-
- Jan 14, 2019
-
-
Davide Caroselli authored
Summary: Following discussion on official fairseq (https://github.com/pytorch/fairseq/issues/438), I added the `--user-dir` option to the command line. The user can now specify a path in order to import a custom module with proprietary tasks, architectures and so on. Pull Request resolved: https://github.com/pytorch/fairseq/pull/440 Differential Revision: D13651721 Pulled By: myleott fbshipit-source-id: 38b87454487f1ffa5eaf19c4bcefa0b3b15a8f43
-
Huihui Fan authored
Summary: minor fixes: 1- adding fairseq logo 2- encoder padding for fconv self att 3- legacy ddp change Pull Request resolved: https://github.com/pytorch/fairseq/pull/442 Differential Revision: D13651715 Pulled By: myleott fbshipit-source-id: ac93c80f1dbffdfe03fbd4b8a8ea527aecb576a7
-
- Jan 10, 2019
-
-
Wei Ho authored
Summary: https://github.com/pytorch/fairseq/blob/master/fairseq/trainer.py#L164 calls `train()` without any argument Reviewed By: myleott Differential Revision: D13599203 fbshipit-source-id: 3a096a6dd35a7a3f8309fbda3b54a36f606475e3
-
- Jan 09, 2019
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/439 Differential Revision: D13608151 Pulled By: myleott fbshipit-source-id: 198b84995a6329f8329829cc91184d88f1eab947
-
Art Matsak authored
Summary: https://einstein.ai/research/the-wikitext-long-term-dependency-language-modeling-dataset is not longer valid, redirects to a blog post listing page. Pull Request resolved: https://github.com/pytorch/fairseq/pull/436 Differential Revision: D13607961 Pulled By: myleott fbshipit-source-id: 1a1074ffcbc454e29bc9d5aed84fdf2089a224bc
-
- Jan 07, 2019
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/433 Differential Revision: D13588032 Pulled By: myleott fbshipit-source-id: 0e5ff361e27b206c4490264f0f51863367499e81
-
- Jan 05, 2019
-
-
Myle Ott authored
-
Myle Ott authored
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/translate/pull/283 Pull Request resolved: https://github.com/pytorch/fairseq/pull/428 Differential Revision: D13564190 Pulled By: myleott fbshipit-source-id: 3b62282d7069c288f5bdd1dd2c120788cee4abb5
-
- Dec 28, 2018
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/425 Differential Revision: D13558340 Pulled By: myleott fbshipit-source-id: dff8c77027e821d8c80bfbd6a6ccce9ca1a44b78
-
Myle Ott authored
Summary: This was broken in 03a57dec. Pull Request resolved: https://github.com/pytorch/fairseq/pull/424 Differential Revision: D13557540 Pulled By: myleott fbshipit-source-id: 62deda5353032aff20d35d046b0bb843da44d27c
-
Paul Michel authored
Summary: BacktranslationDataset would throw an error when the underlying dataset was an IndexedCachedDataset because prefetching was not handled correctly. This fixes the error. Pull Request resolved: https://github.com/pytorch/fairseq/pull/410 Differential Revision: D13557539 Pulled By: myleott fbshipit-source-id: 398ab59a3ebdbf1c666d862b9f905654eece800c
-
- Dec 26, 2018
-
-
Myle Ott authored
Summary: - 04cc608: Add `--match-source-len` option to generate.py to for sequence-tagging tasks - 19f1a40: Add `--no-repeat-ngram-size` option to generate.py for ngram blocking Pull Request resolved: https://github.com/pytorch/fairseq/pull/422 Differential Revision: D13548445 Pulled By: myleott fbshipit-source-id: 26d1ae83993e428fcb020dac5ae358b0e36233d9
-
Emanuele Bugliarello authored
Summary: Add argument `--no-token-positional-embeddings` to TransformerModel (currently only available in TransformerLanguageModel) to disable positional embeddings. Pull Request resolved: https://github.com/pytorch/fairseq/pull/421 Differential Revision: D13548450 Pulled By: myleott fbshipit-source-id: b352c702ed1609e3b84d9a8404941d3274a7f883
-