Commit 1c3e56d6 authored by friebolin

Update usage

parent e2af5040
For `<COMMAND>` you must enter one of the commands you find in the list below.
| Command | Functionality | Arguments |
| ------- | ------------- |-----------|
| <center> **General** </center>|
|🔛 **`--architecture`** | Defines which model is used. | Choose `bert-base-uncased` or `roberta`.|
|🔛 **`--model_type`** | How to initialize the classification model. | Choose `separate` or `one`. ⚠️ *TMix* only works with `one`.|
|🔛 **`--tokenizer`**|Which tokenizer to use when preprocessing the datasets.|Choose `swp` for our tokenizer, `li` for the tokenizer of Li et al. [^6], or `salami` for the tokenizer used by another [student project](https://gitlab.cl.uni-heidelberg.de/salami-hd/salami/-/tree/master/).|
|**`-tc`**/**`--tcontext`**|Whether or not to preprocess the training set with context.||
|**`-vc`**/**`--vcontext`**|Whether or not to preprocess the test set with context.||
|🔛 **`-max`**/**`--max_length`**|Defines the maximum sequence length when tokenizing the sentences.|Typically choose $256$ or $512$.|
|**`-sd`**/**`--save_directory`**|This option specifies the destination directory for the output results of the run.||
|**`-msp`**/**`--model_save_path`**|This option specifies the destination directory for saving the model.|We recommend saving models in [Code/saved_models](Code/saved_models).|
|**`--masking`**|Whether or not to mask the target word.||
|🌐 **`--mixlayer`**| Specifies in which layer the interpolation takes place. Only select one layer at a time. | Choose from $\{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11\}$.|
|🍸, 🌐 **`-lambda`**/**`--lambda_value`**|Specifies the lambda value for the interpolation in *MixUp* and *TMix*.|Choose any value between $0$ and $1$, `type=float`.|
| <center> **MixUp** specific </center>||
|🍸 **`-mixup`**/**`--mix_up`**| Whether or not to use *MixUp*. If yes, please specify `-lambda` and `-mixepoch`.| |
|🍸 **`-mixepoch`**/**`--mixepoch`**|Specifies the epoch(s) in which to apply *MixUp*.|Default is `None`.|
| <center> **TMix** specific </center>||
|🌐 **`--tmix`**| Whether or not to use *TMix*. If yes, please specify `-mixlayer` and `-lambda`.| |
| <center> **Datasets** specific </center>||
|🔛 **`-t`**/**`--train_dataset`**|Defines which dataset is chosen for training.|Choose any of the datasets from [original_datasets](data/original_datasets), [fused_datasets](data/fused_datasets) or [paraphrases](data/paraphrases).|
|🔛 **`-v`**/**`--test_dataset`**|Defines which dataset is chosen for testing.|Choose from ["semeval_test.txt"](data/original_datasets/semeval_test.txt), ["companies_test.txt"](data/original_datasets/companies_test.txt) or ["relocar_test.txt"](data/original_datasets/relocar_test.txt).|
|**`--imdb`**| Whether or not to use the [IMDB](https://huggingface.co/datasets/imdb) dataset. Note that this is only relevant for validating our *TMix* implementation.|⚠️ Only works with `BERT`.|
|🔛 **`-b`**/**`--batch_size`**|Defines the batch size for the training process.|Default is $32$.|
|🔛 **`-tb`**/**`--test_batch_size`**|Specifies the batch size for the test process.|Default is $16$.|
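The `-lambda`/`--lambda_value` flag controls the linear interpolation that both *MixUp* and *TMix* rely on. As a minimal sketch of the idea only (the project's actual implementation mixes embeddings or hidden states inside the model; the function name and plain-list vectors here are illustrative assumptions), interpolating two feature vectors with a given lambda looks like:

```python
def interpolate(h_a, h_b, lam):
    """Blend two feature vectors: lam * h_a + (1 - lam) * h_b.

    `lam` is the value passed via -lambda/--lambda_value (0 <= lam <= 1).
    This is an illustrative sketch, not the project's actual code.
    """
    return [lam * x + (1 - lam) * y for x, y in zip(h_a, h_b)]

# With lam = 0.75 the first input dominates the mixture.
print(interpolate([1.0, 0.0], [0.0, 1.0], 0.75))  # [0.75, 0.25]
```

In *MixUp* the same lambda also mixes the label vectors of the two examples; in *TMix* the interpolation is applied to the hidden states of the layer selected with `--mixlayer`.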