diff --git a/Code/train.py b/Code/train.py
index acd07f2217d64ae2ef8565d73b0885ce1b6e61a4..516fb0c89a64dfea035c4d06faace5aabeb39984 100644
--- a/Code/train.py
+++ b/Code/train.py
@@ -27,22 +27,23 @@ def train(model, name,train_dataset, test_dataset, seed, batch_size, test_batch_
 """Train loop for models. Iterates over epochs and batches and gives inputs to model.
 After training, call evaluation.py for evaluation of finetuned model.
 Params:
- model: model out of models.py
- name: str
- train_dataset: Dataset
- test_dataset: Dataset
- seed: int
- batch_size:
- test_batch_size:
- num_epochs: int
- imdb: bool
- mixup: bool
- lambda_value: float
- mixepoch:int
- tmix: bool
- mixlayer: int in {0, 11}
- learning_rate: float
- mlp_leaning_rate:float
+
+ model: model out of models.py -> WordClassificationModel, BertForWordClassification or RobertaForWordClassification
+ name: str -> specifies the architecture of the model (either bert-base-uncased or roberta-base)
+ train_dataset: Dataset -> train dataset as torch Dataset object (created in preprocess.py)
+ test_dataset: Dataset -> test dataset as torch Dataset object (created in preprocess.py)
+ seed: int -> random seed
+ batch_size: -> batch size for training
+ test_batch_size: -> batch size for testing
+ num_epochs: int -> number of epochs
+ imdb: bool -> whether or not the IMDB dataset is used
+ mixup: bool -> whether or not to use mixup in training
+ lambda_value: float -> if mixup or tmix is selected, which lambda value to use
+ mixepoch: int -> specifies in which epoch to use mixup
+ tmix: bool -> whether or not tmix is used in training (used to differentiate between mixing in training and not mixing in evaluation)
+ mixlayer: int in {0, 11} -> which layer to mix in tmix
+ learning_rate: float -> learning rate for the Bert/Roberta model or WordClassificationModel, including the linear classifier
+ mlp_leaning_rate: float -> separate learning rate for the multi-layer perceptron
 Returns:
 Evaluation Results for train and test dataset in Accuracy, F1, Precision and Recall"""
diff --git a/README.md b/README.md
index b359be4162dc2fd243b932a05da278fb40494672..09e8f071c1460da75e3d31fc37ee942ed22fee74 100644
--- a/README.md
+++ b/README.md
@@ -172,15 +172,15 @@ pip install -r requirements.txt
 ***
 ## ⚙️ Usage <a name="usage"></a>
-🚀 Launch our application by following the steps below:
-
-[welche argumente genau?]
+#### 🚀 Launch our application by following the steps below:
 ```bash
 ./main.py <COMMAND> <ARGUMENTS>...
 ```
-For `<COMMAND>` you must enter one of the commands you find in the list below, where you can also find an overview about necessary `<ARGUMENTS>`.
+For `<COMMAND>` you must enter one of the commands from the list below, where you can also find an overview of the possible `<ARGUMENTS>`.
+
+ℹ️ The icons indicate whether a command is mandatory (🔛). 🍸 marks commands that are mandatory for *MixUp*, and 🌠 marks commands that are mandatory for *TMix*.
 | Command | Functionality | Arguments |
 | ------- | ------------- |-----------|
@@ -190,18 +190,18 @@ For `<COMMAND>` you must enter one of the commands you find in the list below, w
 |🔛 **`--tokenizer`**|Which tokenizer to use when preprocessing the datasets.|Choose `swp` for our tokenizer, `li ` for the tokenizer of Li et al. [^6], or `salami` for the tokenizer used by another [student project](https://gitlab.cl.uni-heidelberg.de/salami-hd/salami/-/tree/master/)|
 |**`-tc`**/**`--tcontext`**|Whether or not to preprocess the training set with context.||
 |**`-vc`**/**`--vcontext`**|Whether or not to preprocess the test set with context.||
-|🔛 **`-max`**/**`--max_length`**|Defines the maximum sequence length when tokenizing the sentences.|⚠️ Always choose 256 for *TMix* and 512 for the other models.|
+|🔛 **`-max`**/**`--max_length`**|Defines the maximum sequence length when tokenizing the sentences.|Typically choose $256$ or $512$.|
 |🔛 **`--train_loop`**|Defines which train loop to use.|Choose `swp` for our train loop implementation and `salami` for the one of the [salami](https://gitlab.cl.uni-heidelberg.de/salami-hd/salami) student project.|
 |🔛 **`-e`**/**`--epochs`**|Number of epochs for training.||
 |🔛 **`-lr`**/**`--learning_rate`**|Learning rate for training.|`type=float`|
 |**`-lrtwo`**/**`--second_learning_rate`**| Separate learning rate for multi layer perceptron.|Default is `None`.|
 |**`--mlp`**| Whether or not to use two layer MLP as classifier.| |
 |🔛 **`-rs`**/**`--random_seed`**|Random seed for initialization of the model.|Default is $42$.|
-|🔛 **`-sd`**/**`--save_directory`**|This option specifies the destination directory for the output results of the run.||
+|**`-sd`**/**`--save_directory`**|This option specifies the destination directory for the output results of the run.||
 |**`-msp`**/**`--model_save_path`**|This option specifies the destination directory for saving the model.|We recommend saving models in [Code/saved_models](Code/saved_models).|
 |**`--masking`**|Whether or not to mask the target word.||
 |🌠 **`--mixlayer`**| Specify in which `layer` the interpolation takes place. Only select one layer at a time. | Choose from ${0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11}$ |
-|🍸, 🌠 **`-lambda`**/**`--lambda_value`**|Speficies the lambda value for interpolation of *MixUp* and *TMix*|Default is $0.4$, `type=float`|
+|🍸, 🌠 **`-lambda`**/**`--lambda_value`**|Specifies the lambda value for interpolation of *MixUp* and *TMix*.|Choose any value between $0$ and $1$, `type=float`|
 | <center>| **MixUp** specific </center>||
 |🍸 **`-mixup`**/**`--mix_up`**| Whether or not to use *MixUp*. If yes, please specify `lambda` and `-mixepoch`| |
 |🍸 **`-mixepoch`**/**`--mixepoch`**|Specifies the epoch(s) in which to apply *MixUp*.|Default is `None`|
@@ -215,17 +215,28 @@ For `<COMMAND>` you must enter one of the commands you find in the list below, w
 |🔛 **`-tb`**/**`--test_batch_size`**|Specifies the batch size for the test process.|Default is $16$.|
-extra: BT and inference
+#### 📠 If you want to use our *backtranslation* code, you must execute the following:
+
+```bash
+python3 Code/backtranslate.py
+```
+
+#### 🎥 If you want to see a demo of our model, you can enter your own sentence and let the model predict whether a target word is used in its `literal` or `non-literal` sense:
+
+```bash
+python3 inference.py
+```
+<img src="documentation/images/demo.png" width="80%" height="80%">
-[ADD screenshot of demo?]
 ***
 ## 🯠 Code-Structure <a name="code-structure"></a>
 - ⚙️ [`requirements.txt`](requirements.txt): All necessary modules to install.
-- 📱 [`main.py`](main.py): Our main code file which does ...
-- 💻 [`Code`](code): Here, you can find all code files for our different models and data augmentation methods.
+- 📱 [`main.py`](main.py): Our main code file is responsible for organizing input options and calling the necessary functions to preprocess datasets, train the model, and evaluate it on a test set.
+- 🎥 [`inference.py`](inference.py): Run a demo version of our model to test if an input sentence contains a metonymy.
+- 💻 [`Code`](code): Here, you can find all code files for our different models and data augmentation methods, as well as a [`submit_template.sh`](Code/submit_template.sh).
 - 📀 [`data`](data): Find all datasets in this folder.
 - 🗂️ [`backtranslations`](data/backtranslations): Contains unfiltered generated paraphrases.
 - 🗂️ [`fused_datasets`](data/fused_datasets): Contains original datasets fused with filtered paraphrases. Ready to be used for training the models.
diff --git a/documentation/images/demo.png b/documentation/images/demo.png
new file mode 100644
index 0000000000000000000000000000000000000000..ccb82dbd4d2bf4870f3936341bfc166394448572
Binary files /dev/null and b/documentation/images/demo.png differ
diff --git a/main.py b/main.py
index 8a33f4ff0965ece565502accf64130a78ee37176..a623948f8e42b5fc9cf088810b1b45a9466b6672 100644
--- a/main.py
+++ b/main.py
@@ -208,8 +208,7 @@ if __name__ == "__main__":
         "-lambda",
         "--lambda_value",
         help="speficies the lambda value for mixup",
-        type=float,
-        default=0.4)
+        type=float)
     parser.add_argument(
         "-mixepoch",
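Since this patch removes the argparse default of `0.4` for `--lambda_value` while the README now only asks for a value between $0$ and $1$, a *MixUp* or *TMix* run has to pass `-lambda` explicitly. The invocation below is an illustrative sketch, not a command taken from the repository: it only combines flags that appear in the README table above, the concrete values are placeholders, the boolean switches are assumed to be plain on/off flags, and mandatory arguments from table rows outside this diff excerpt may additionally be required.

```bash
# Illustrative MixUp run: placeholder values, using only flags documented in the README table above.
# -lambda must now be passed explicitly, because this patch removes its 0.4 default in main.py.
./main.py --tokenizer swp --train_loop swp \
          -max 512 -e 3 -lr 2e-5 -rs 42 -tb 16 \
          -sd results/ \
          -mixup -mixepoch 2 -lambda 0.4
```

A *TMix* run would analogously need an explicit `-lambda` together with a single `--mixlayer` value.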