From 3e4e9c27ad8e2c8053849f84b06165f6e0417b30 Mon Sep 17 00:00:00 2001
From: toyota <toyota@cl.uni-heidelberg.de>
Date: Fri, 30 Mar 2018 18:31:17 +0200
Subject: [PATCH] fix typos

---
 README.md     | 6 +++---
 lib/README.md | 9 ++++-----
 2 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/README.md b/README.md
index 6babca7..35ba020 100644
--- a/README.md
+++ b/README.md
@@ -95,7 +95,7 @@ After running the system you'll have the output file in your project folder.
 ### RUN THE SYSTEM:
 
 
-git clone https://gitlab.cl.uni-heidelberg.de/semantik_project/wsd_chernenko_schindler_toyota.git
+git clone https://gitlab.cl.uni-heidelberg.de/semantik_project/wsi_chernenko_schindler_toyota.git
 
 cd bin
 
@@ -108,7 +108,7 @@ python3 chertoy.py /home/tatiana/Desktop/FSemantik_Projekt /test /topics.txt /re
 
 ### Other files:
 
-* Performances_Table.pdf.pdf - a performance table with F1, RI, ARI and JI values of the baseline and 40 experiments (incl. CHERTOY) on the trial data.
+* Performances_Table.pdf - a performance table with F1, RI, ARI and JI values of the baseline and 40 experiments (incl. CHERTOY) on the trial data.
 
 * bin
 
@@ -120,7 +120,7 @@ The folder experiments contains an implementation of the baseline and 40 differe
 
 * lib
 
-The folder contains code for preprocessing Wikipedia Dataset to train own sent2vec models for the experiments and a README file. Our preprocessed Wikipedia 2017 dataset and two self-trained models of the Wikipedia 2017 dataset, that we used in our experiments with sent2vec, are provided on /proj/toyota on the server of the Institut.
+The folder contains code for preprocessing the Wikipedia dataset to train our own sent2vec models for the experiments, plus a README file. Our preprocessed Wikipedia 2017 dataset and two self-trained models of the Wikipedia 2017 dataset, which we used in our experiments with sent2vec, are provided in /proj/toyota on the server of the Institute of Computational Linguistics Heidelberg.
 Other models that we used during our experiments can be found in sense2vec and sent2vec repositories.
 
 * experiments
diff --git a/lib/README.md b/lib/README.md
index b904bd5..1cbec2b 100644
--- a/lib/README.md
+++ b/lib/README.md
@@ -8,15 +8,14 @@ This is an implementation to provide necessary pre-processing steps for modeling
 
 Download Wikipedia Dump
 - Wikipedia Dumps for the english language is provided on https://meta.wikimedia.org/wiki/Data_dump_torrents#English_Wikipedia
-- In our model we used enwiki-20170820-pages-articles-multistream.xml.bz2 (14.1 GiB)
+- For our model we used enwiki-20170820-pages-articles-multistream.xml.bz2 (14.1 GiB)
 
 Dependencies:
 - wikiExtractor: http://attardi.github.io/wikiextractor
 - fasttext: https://github.com/facebookresearch/fastText
 - sent2vec: https://github.com/epfml/sent2vec
 
-
-First of all the wikipedia text needs to be extracted from the provided XML.
+First, the Wikipedia text needs to be extracted from the provided XML.
 -extracted file: enwiki-20170820-pages-articles-multistream.xml (21.0GB)
 
 From the XML the plain text will be extracted using wikiExtractor:
@@ -25,9 +24,9 @@ WikiExtractor.py -o OUTPUT-DIRECTORY INPUT-XML-FILE
 _Example_
 WikiExtractor.py -o /wikitext enwiki-20170820-pages-articles-multistream.xml
 
-WikiExtractor will create several directories AA, AB, AC, ...,  CH with a total size of 6.2GB. Each directory contains 100 txt documents (besides CH -> 82).
+WikiExtractor will create several directories AA, AB, AC, ..., CH with a total size of 6.2GB. Each directory contains 100 .txt documents (except CH, which contains 82).
 Each article begins with an ID such as <doc id="12" url="https://en.wikipedia.org/wiki?curid=12" title="Anarchism">. Also comments in Parentheses are provided.
-Using preprocess_wikitext.py we delete all IDs, parentheses with their content and also quotes like ' or " and getting a plain wikipedia text. The text file contain one sentence per line. 
+Using preprocess_wikitext.py we delete all IDs, parentheses with their content, and quotes like ' or " to obtain plain Wikipedia text. The output text file contains one sentence per line.
 
 _Usage_
 python3 preprocess_wikitext.py input_directory_path output_txt_file_path
-- 
GitLab