# Using cross-lingual embeddings trained on Twitter and a BiLSTM for the hate speech classification task of GermEval2018
We present several cross-lingual approaches to hate speech detection based on
(pseudo-)cross-lingual embeddings trained on Twitter data.
This repository provides the embeddings, systems and tools used for the software
project course in the summer of 2018.
## Getting Started
These instructions will get you a copy of the project up and running on your
local machine for development and testing purposes. See deployment for notes on
how to deploy the project on a live system.
### Prerequisites
Python 3.6 is highly recommended for running this system. With older versions
of Python, some modules provided in this repository may need manual code changes
to run.
```
# Create and activate a virtual environment inside the project directory
cd kernseife
python3.6 -m venv env
source env/bin/activate
# Install the dependencies into the virtual environment
pip3.6 install -r requirements.txt
```
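To confirm that the active interpreter matches the recommendation above, a quick check along the following lines may help. This is a minimal sketch and not part of the repository: the helper name `meets_recommended_version` and the constant `RECOMMENDED` are hypothetical, and the 3.6 threshold is taken from the recommendation in this section.

```python
import sys

# Threshold taken from the recommendation in this README (assumption: 3.6 or newer is fine).
RECOMMENDED = (3, 6)


def meets_recommended_version(version_info=None, minimum=RECOMMENDED):
    """Return True if the given (or current) interpreter version is at least `minimum`.

    Hypothetical helper for illustration; it is not provided by this repository.
    """
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor), e.g. (3, 5) < (3, 6).
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    if not meets_recommended_version():
        current = "{}.{}".format(*sys.version_info[:2])
        print("Warning: running Python " + current + "; some modules may need manual changes.")
```

Running the snippet inside the activated virtual environment prints a warning only when the interpreter is older than the recommended version.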
## Training Embeddings
We used train_skipgram.py by Michael Egger (See *Build With*) to train our