As a comparatively safe (i.e., label-preserving) data augmentation strategy, we selected *backtranslation* using the machine translation toolkit Fairseq [^9]. Adapting the approach of Chen et al. [^2a], we use the following pre-trained single models:
Here, $T(x_i)$ and $T(x_j)$ represent the hidden representations of the two instances.
We use a fixed $\lambda$ that is kept constant for the entire training process. The derived instances $\hat{x}$, with the derived labels $\hat{y}$ as their new ground-truth labels, are then fed into the classifier to generate predictions.
The *MixUp* process can be used dynamically during training at any epoch.
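To make the interpolation step concrete, the following is a minimal sketch of this mixing operation with a fixed $\lambda$; the tensor shapes, the value of `lam`, and the helper name `mixup` are illustrative assumptions rather than our actual implementation:

```python
import torch
import torch.nn.functional as F

def mixup(h_i, h_j, y_i, y_j, lam=0.7):
    """Linearly interpolate two hidden representations T(x_i), T(x_j)
    and their one-hot labels with a fixed mixing coefficient lambda."""
    x_hat = lam * h_i + (1.0 - lam) * h_j  # derived instance x_hat
    y_hat = lam * y_i + (1.0 - lam) * y_j  # derived (soft) label y_hat
    return x_hat, y_hat

# toy usage with random representations and binary one-hot labels
h_i, h_j = torch.randn(4, 768), torch.randn(4, 768)
y_i = F.one_hot(torch.tensor([0, 1, 0, 1]), num_classes=2).float()
y_j = F.one_hot(torch.tensor([1, 1, 0, 0]), num_classes=2).float()
x_hat, y_hat = mixup(h_i, h_j, y_i, y_j, lam=0.7)  # (x_hat, y_hat) is passed to the classifier
```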
### 🌐 3. TMix (Feature Space)<a name="tmix"></a>
We use the same fixed $\lambda$ but, in contrast to *MixUp*, *TMix* is applied in all epochs [^2b]. It can be applied dynamically in any layer; we focus our experiments on transformer layers 7 and 9 for interpolation, since these layers have been found to contain syntactic and semantic information.
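As a rough illustration of how hidden states can be interpolated at a fixed transformer layer, here is a sketch built on a Hugging Face BERT encoder; the use of the `transformers` package, the helper name `tmix_forward`, the fixed choice of layer 7, and the example sentences are assumptions for illustration only (MixText [^2b] describes the original procedure):

```python
import torch
from transformers import BertModel, BertTokenizer

def tmix_forward(model, ids_i, ids_j, lam=0.7, mix_layer=7):
    """Encode two inputs separately up to `mix_layer`, interpolate their
    hidden states T(x_i) and T(x_j), then finish the forward pass on the mix.
    Labels would be mixed with the same lambda, as in the MixUp sketch above."""
    h_i = model.embeddings(ids_i)
    h_j = model.embeddings(ids_j)
    for layer in model.encoder.layer[:mix_layer]:   # lower layers: separate passes
        h_i = layer(h_i)[0]
        h_j = layer(h_j)[0]
    h = lam * h_i + (1.0 - lam) * h_j               # interpolate at the chosen layer
    for layer in model.encoder.layer[mix_layer:]:   # upper layers: continue on the mix
        h = layer(h)[0]
    return h                                        # e.g. h[:, 0] as the [CLS] vector

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased").eval()
ids = tokenizer(["a first example sentence", "a second example sentence"],
                padding=True, return_tensors="pt")["input_ids"]
with torch.no_grad():
    mixed = tmix_forward(model, ids[0:1], ids[1:2], lam=0.7, mix_layer=7)
```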
***
## 🗃️ Data <a name="data"></a>
...
...
[^1]:Bayer, Markus, Kaufhold, Marc-André & Reuter, Christian. ["A survey on data augmentation for text classification."](https://arxiv.org/abs/2107.03158) CoRR, 2021.
[^2a]:Chen, Jiaao, Wu, Yuwei & Yang, Diyi. ["Semi-Supervised Models via Data Augmentation for Classifying Interactive Affective Responses."](https://arxiv.org/abs/2004.10972) CoRR, 2020.
[^2b]:Chen, Jiaao, Wu, Yuwei & Yang, Diyi. ["MixText: Linguistically-Informed Interpolation of Hidden Space for Semi-Supervised Text Classification."](https://aclanthology.org/2020.acl-main.194) ACL, 2020.
[^3]:Devlin, Jacob, Chang, Ming-Wei, Lee, Kenton & Toutanova, Kristina. ["BERT: pre-training of deep bidirectional transformers for language understanding."](http://arxiv.org/abs/1810.04805) CoRR, 2018.