From f9995c9863829a6d3ef18d06e7a1d9a3d1859463 Mon Sep 17 00:00:00 2001
From: Utaemon Toyota <toyota@cl.uni-heidelberg.de>
Date: Wed, 27 Feb 2019 20:56:04 +0100
Subject: [PATCH] update README with preprocessing for method 1

---
 Senseval_Prep/README.md | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/Senseval_Prep/README.md b/Senseval_Prep/README.md
index a650bf4..f937bc6 100644
--- a/Senseval_Prep/README.md
+++ b/Senseval_Prep/README.md
@@ -6,7 +6,12 @@ Softwareprojekt WS2018/19
 Betreuerin: Prof. Dr. Anette Frank
 Graph Embedding Propagation
 
-# Senseval Preprocessing
+# Senseval Preprocessing for Method 1
+
+This script provides preprocessed data for our Word Sense Disambiguation Method 1. It produces JSON files for SensEval-2 and SensEval-3. Each file contains one list per sentence, holding tuples of a lemmatized, lowercased word and its corresponding WordNet 3.0 POS tag.
+The output is two JSON files: one with the preprocessed SensEval-2 dataset and one with the preprocessed SensEval-3 dataset.
+
+# Senseval Preprocessing for Method 2
 
 This is an implementation to provide preprocessed data for our Word Sense Disambiguation Method 2. The skript will produce pkl-files for each document in Senseval2/3 named as the document name.
 From provided Senseval-english-allword-test-data and their Penntree Bank annotations only the useful information will be filtered out. Lemmas which are not included in glossmappings or listed in stopwords will be deleted. For multiword-expressions, only the tag for the head-token will be saved. Information about their satellites will be discarded.
@@ -29,16 +34,21 @@ gloss_mapping.txt
 stopwords.txt
 - includes stopwords, which will be filtered out
 
-Python3 skript
+Python3 scripts
 - senseval_preprocessing.py
+- preprocess_senseval_method1.py
 
 ## Dependencies
 re 	- for regular expression matching
-pickle 	- for saving the resulting lists in a pkl-file
+json	- for saving the results for WSD method 1
+pickle 	- for saving the resulting lists in a pkl-file for WSD method 2
 nltk	- WordNetLemmatizer from NLTK for lemmatizing
 
-## Running Instructions
-python3 senseval_preprocessing.py [-s] [-g] [-v]
+## Running Instructions Method 1
+python[3] preprocess_senseval_method1.py
+
+## Running Instructions Method 2
+python[3] senseval_preprocessing.py [-s] [-g] [-v]
         -s / --stopwords    Path to txt-file with stopwords
         -g / --gloss        Path to txt-file with gloss mappings
         -v / --version      valid input: 2 or 3 for senseval 2 / 3
-- 
GitLab