Commit 0dbfd3ef authored by umlauf
Parents: cf8f7764 45536356
@@ -27,22 +27,23 @@ def train(model, name,train_dataset, test_dataset, seed, batch_size, test_batch_
"""Train loop for models. Iterates over epochs and batches and gives inputs to model. After training, call evaluation.py for evaluation of finetuned model.
Params:
-    model: model out of models.py
-    name: str
-    train_dataset: Dataset
-    test_dataset: Dataset
-    seed: int
-    batch_size:
-    test_batch_size:
-    num_epochs: int
-    imdb: bool
-    mixup: bool
-    lambda_value: float
-    mixepoch:int
-    tmix: bool
-    mixlayer: int in {0, 11}
-    learning_rate: float
-    mlp_leaning_rate:float
+    model: model out of models.py -> WordClassificationModel, BertForWordClassification, or RobertaForWordClassification
+    name: str -> specifies the model architecture (either bert-base-uncased or roberta-base)
+    train_dataset: Dataset -> train dataset as torch Dataset object (created in preprocess.py)
+    test_dataset: Dataset -> test dataset as torch Dataset object (created in preprocess.py)
+    seed: int -> random seed
+    batch_size: int -> batch size for training
+    test_batch_size: int -> batch size for testing
+    num_epochs: int -> number of epochs
+    imdb: bool -> whether or not the IMDb dataset is used
+    mixup: bool -> whether or not to use MixUp in training
+    lambda_value: float -> if MixUp or TMix is selected, the lambda value to use
+    mixepoch: int -> specifies in which epoch to use MixUp
+    tmix: bool -> whether or not TMix is used in training (used to differentiate between mixing in training and not mixing in evaluation)
+    mixlayer: int in {0, ..., 11} -> in which layer to mix for TMix
+    learning_rate: float -> learning rate for the BERT/RoBERTa model, or for WordClassificationModel including the linear classifier
+    mlp_leaning_rate: float -> separate learning rate for the multi-layer perceptron
    Returns: Evaluation results (accuracy, F1, precision, and recall) for the train and test dataset."""
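The `mixup`, `tmix`, `mixlayer`, and `lambda_value` parameters above all drive interpolation-based augmentation. As a rough orientation, here is a minimal PyTorch sketch of the two mixing schemes, assuming plain tensors (the textbook formulation, not necessarily this repository's exact implementation):

```python
import torch

def mixup(x_i, x_j, y_i, y_j, lam):
    # MixUp (Zhang et al., 2018): interpolate two inputs and their labels.
    return lam * x_i + (1 - lam) * x_j, lam * y_i + (1 - lam) * y_j

def tmix_hidden(h_i, h_j, lam):
    # TMix-style mixing: interpolate the hidden states of two examples at one
    # chosen encoder layer (cf. mixlayer); later layers only see the mix.
    return lam * h_i + (1 - lam) * h_j

# Toy usage with random "embeddings" and one-hot labels:
x_mix, y_mix = mixup(torch.rand(8, 768), torch.rand(8, 768),
                     torch.tensor([1.0, 0.0]), torch.tensor([0.0, 1.0]),
                     lam=0.4)
```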
@@ -172,15 +172,15 @@ pip install -r requirements.txt
***
## ⚙️ Usage <a name="usage"></a>
-🚀 Launch our application by following the steps below:
-[which arguments exactly?]
+#### 🚀 Launch our application by following the steps below:
```bash
./main.py <COMMAND> <ARGUMENTS>...
```
-For `<COMMAND>` you must enter one of the commands you find in the list below, where you can also find an overview about necessary `<ARGUMENTS>`.
+For `<COMMAND>` you must enter one of the commands from the list below, where you will also find an overview of the possible `<ARGUMENTS>`. A hypothetical example invocation follows the table.
ℹ️ The icons indicate whether a command is mandatory: 🔛 marks commands that are always mandatory, 🍸 marks commands mandatory for *MixUp*, and 🌐 marks commands mandatory for *TMix*.
| Command | Functionality | Arguments |
| ------- | ------------- |-----------|
@@ -190,18 +190,18 @@ For `<COMMAND>` you must enter one of the commands you find in the list below, w
|🔛 **`--tokenizer`**|Which tokenizer to use when preprocessing the datasets.|Choose `swp` for our tokenizer, `li` for the tokenizer of Li et al. [^6], or `salami` for the tokenizer used by another [student project](https://gitlab.cl.uni-heidelberg.de/salami-hd/salami/-/tree/master/)|
|**`-tc`**/**`--tcontext`**|Whether or not to preprocess the training set with context.||
|**`-vc`**/**`--vcontext`**|Whether or not to preprocess the test set with context.||
-|🔛 **`-max`**/**`--max_length`**|Defines the maximum sequence length when tokenizing the sentences.|⚠️ Always choose 256 for *TMix* and 512 for the other models.|
+|🔛 **`-max`**/**`--max_length`**|Defines the maximum sequence length when tokenizing the sentences.|Typically choose $256$ or $512$.|
|🔛 **`--train_loop`**|Defines which train loop to use.|Choose `swp` for our train loop implementation and `salami` for the one of the [salami](https://gitlab.cl.uni-heidelberg.de/salami-hd/salami) student project.|
|🔛 **`-e`**/**`--epochs`**|Number of epochs for training.||
|🔛 **`-lr`**/**`--learning_rate`**|Learning rate for training.|`type=float`|
|**`-lrtwo`**/**`--second_learning_rate`**|Separate learning rate for the multi-layer perceptron.|Default is `None`.|
|**`--mlp`**|Whether or not to use a two-layer MLP as classifier.| |
|🔛 **`-rs`**/**`--random_seed`**|Random seed for initialization of the model.|Default is $42$.|
-|🔛 **`-sd`**/**`--save_directory`**|This option specifies the destination directory for the output results of the run.||
+|**`-sd`**/**`--save_directory`**|This option specifies the destination directory for the output results of the run.||
|**`-msp`**/**`--model_save_path`**|This option specifies the destination directory for saving the model.|We recommend saving models in [Code/saved_models](Code/saved_models).|
|**`--masking`**|Whether or not to mask the target word.||
|🌐 **`--mixlayer`**| Specify in which `layer` the interpolation takes place. Only select one layer at a time. | Choose from ${0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11}$ |
-|🍸, 🌐 **`-lambda`**/**`--lambda_value`**|Specifies the lambda value for interpolation in *MixUp* and *TMix*.|Default is $0.4$, `type=float`|
+|🍸, 🌐 **`-lambda`**/**`--lambda_value`**|Specifies the lambda value for interpolation in *MixUp* and *TMix*.|Choose any value between $0$ and $1$, `type=float`|
| <center>| **MixUp** specific </center>||
|🍸 **`-mixup`**/**`--mix_up`**| Whether or not to use *MixUp*. If yes, please specify `lambda` and `-mixepoch`| |
|🍸 **`-mixepoch`**/**`--mixepoch`**|Specifies the epoch(s) in which to apply *MixUp*.|Default is `None`|
@@ -215,17 +215,28 @@ For `<COMMAND>` you must enter one of the commands you find in the list below, w
|🔛 **`-tb`**/**`--test_batch_size`**|Specifies the batch size for the test process.|Default is $16$.|
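Putting the flags together, a hypothetical invocation might look like this (flag names are taken from the table above; the values are illustrative, not tuned recommendations):

```bash
# Hypothetical run: our tokenizer and train loop, MixUp applied in epoch 2.
./main.py --tokenizer swp --train_loop swp \
    -max 512 -e 5 -lr 2e-5 -rs 42 -tb 16 \
    -sd results/ \
    -mixup -lambda 0.4 -mixepoch 2
```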
#### 📝 If you want to use our *backtranslation* code, you must execute the following:
```bash
python3 Code/backtranslate.py
```
#### 🎥 If you want to see a demo of our model, you can enter your own sentence and let the model predict whether a target word is used in its `literal` or `non-literal` sense:
```bash
python3 inference.py
```
<img src="documentation/images/demo.png" width="80%" height="80%">
***
## 🏯 Code-Structure <a name="code-structure"></a>
- ⚙️ [`requirements.txt`](requirements.txt): All necessary modules to install.
-- 📱 [`main.py`](main.py): Our main code file which does ...
-- 💻 [`Code`](code): Here, you can find all code files for our different models and data augmentation methods.
+- 📱 [`main.py`](main.py): Our main code file; it is responsible for organizing input options and calling the necessary functions to preprocess datasets, train the model, and evaluate it on a test set.
+- 🎥 [`inference.py`](inference.py): Run a demo version of our model to test whether an input sentence contains a metonymy.
+- 💻 [`Code`](code): Here, you can find all code files for our different models and data augmentation methods, as well as a [`submit_template.sh`](Code/submit_template.sh).
- 📀 [`data`](data): Find all datasets in this folder.
- 🗂️ [`backtranslations`](data/backtranslations): Contains unfiltered generated paraphrases.
- 🗂️ [`fused_datasets`](data/fused_datasets): Contains original datasets fused with filtered paraphrases. Ready to be used for training the models.
documentation/images/demo.png (image, 1.13 MiB)
@@ -208,8 +208,7 @@ if __name__ == "__main__":
"-lambda",
"--lambda_value",
help="speficies the lambda value for mixup",
-type=float,
-default=0.4)
+type=float)
parser.add_argument(
"-mixepoch",