Commit e30832d1 authored by umlauf

MixUp in readme

parent e0d8a6c5
......@@ -108,8 +108,23 @@ As a comparatively safe (= label preserving) data augmentation strategy, we sele
### 🍸 2. MixUp (Feature Space)<a name="mixup"></a>
To transform representations of our data in the feature space, we use MixUp, proposed by Zhang et al.[^3], which is based on linear interpolations of input data and corresponding labels in hidden space. We take our inspiration from Sun et al.[^4], who apply this method to the last hidden layer of transformer models. Their dynamic MixUp approach can be restricted to later training epochs, which allows the model to learn good representations first.
Our method adopts the framework of the mixup transformer proposed by Sun et al.[^4]. This approach interpolates the representations of two instances at the last hidden state of the transformer model (in our case, BERT-base-uncased \cite{BERT}).
To derive the interpolated hidden representation and corresponding label, we use the following formulas on the representation of two data samples:
🔡 *Instance interpolation:*
$$\hat{x} = \lambda T(x_i) + (1- \lambda)T(x_j)$$
🏷️ *Label interpolation:*
$$\hat{y} = \lambda T(y_i) + (1- \lambda)T(y_j)$$
Here, $T(x_i)$ and $T(x_j)$
represent the hidden representations of the two instances, $T(y_i)$
and $T(y_j)$ represent their corresponding labels, and $\lambda$ is a mixing coefficient that determines the degree of interpolation.
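
As a rough illustration of the two formulas, the sketch below applies them to toy vectors in plain Python. The function name `mixup` and all values are hypothetical and not taken from our code base; in practice the hidden representations would be tensors from the last hidden state of BERT and the labels would be one-hot vectors.

```python
def mixup(h_i, h_j, y_i, y_j, lam):
    """Linearly interpolate two hidden vectors and their label vectors.

    h_i, h_j: hidden representations T(x_i), T(x_j) as lists of floats
    y_i, y_j: label vectors (e.g. one-hot) as lists of floats
    lam:      mixing coefficient lambda in [0, 1]
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if len(h_i) != len(h_j) or len(y_i) != len(y_j):
        raise ValueError("inputs must have matching lengths")
    # x_hat = lam * T(x_i) + (1 - lam) * T(x_j)
    x_hat = [lam * a + (1.0 - lam) * b for a, b in zip(h_i, h_j)]
    # y_hat = lam * T(y_i) + (1 - lam) * T(y_j)
    y_hat = [lam * a + (1.0 - lam) * b for a, b in zip(y_i, y_j)]
    return x_hat, y_hat


# Toy example (assumed values): two 3-dimensional "hidden states"
# and two one-hot labels for a binary task.
h_i, h_j = [1.0, 0.0, 2.0], [0.0, 4.0, 2.0]
y_i, y_j = [1.0, 0.0], [0.0, 1.0]

x_hat, y_hat = mixup(h_i, h_j, y_i, y_j, lam=0.5)
print(x_hat)  # [0.5, 2.0, 2.0]
print(y_hat)  # [0.5, 0.5]
```

With $\lambda = 0.5$ both instances contribute equally, so the interpolated label is an even split between the two classes.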
We use a fixed $\lambda$, set once for the entire training process. The derived instance $\hat{x}$, with the derived label $\hat{y}$ as its new true label, is then passed to the classifier to generate a prediction.
MixUp can be switched on dynamically at any epoch during training.
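
One possible way to picture the dynamic use of MixUp and the training signal from an interpolated label is sketched below. This is an assumption-laden illustration, not our implementation: the helper names `mixup_active` and `soft_cross_entropy`, the `start_epoch` parameter, and the choice of a soft-label cross-entropy loss are all hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_cross_entropy(logits, soft_label):
    """Cross-entropy between classifier logits and a soft label y_hat.

    Assumption: the interpolated label is treated as a target distribution.
    """
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(soft_label, probs))


def mixup_active(epoch, start_epoch):
    """Return True once training has reached the (hypothetical) start epoch."""
    return epoch >= start_epoch


# MixUp is skipped for the first two epochs and applied afterwards.
schedule = [mixup_active(epoch, start_epoch=2) for epoch in range(4)]
print(schedule)  # [False, False, True, True]

# Loss for an evenly mixed label when the classifier is undecided.
loss = soft_cross_entropy([0.0, 0.0], [0.5, 0.5])
print(round(loss, 4))  # 0.6931
```

Delaying MixUp in this way matches the motivation given above: the model first learns representations on unmixed data and only later sees interpolated instances.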
***
## 🗃️ Data <a name="data"></a>
......