Commit 4924b49a authored by chernenko

Update chertoy.py

parent 764e7387
@@ -11,9 +11,11 @@
 ------------------- DESCRIPTION -------------------
-The pipeline system implements the 17th variant of the system for the WSI (word sense induction) task (Task 11 at SemEval 2013), which showed the best performance on the trial data.
+The pipeline system implements the 17th variant of the system for the WSI (word sense induction) task (Task 11 at SemEval 2013),
+which showed the best performance on the trial data.
-The system creates semantically related clusters from the given snippets (the text fragments we get back from the search engine) for each pre-defined ambiguous topic.
+The system creates semantically related clusters from the given snippets (the text fragments we get back from the search engine)
+for each pre-defined ambiguous topic.
 ------------------- METHODS -------------------
@@ -22,7 +24,8 @@ For the WSI purposes it uses the following methods:
 - For pre-processing: tokenization + punctuation removal
 - Language model: sense2vec (paper: https://arxiv.org/abs/1511.06388, code: https://github.com/explosion/sense2vec)
 - Compositional semantics: vector mixture model (BOW (bag-of-words) representation with summarization for each snippet)
-- Clustering: Mean Shift clustering with sklearn.cluster (http://scikit-learn.org/stable/modules/generated/sklearn.cluster.MeanShift.html#sklearn.cluster.MeanShift) with default parameters.
+- Clustering: Mean Shift clustering with sklearn.cluster
+(http://scikit-learn.org/stable/modules/generated/sklearn.cluster.MeanShift.html#sklearn.cluster.MeanShift) with default parameters.
 """
 import sys
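
The methods listed in the docstring can be sketched roughly as follows. This is not the code from chertoy.py, only a minimal illustration under stated assumptions: `TOY_VECTORS` is a hypothetical two-dimensional lookup table standing in for a trained sense2vec model, "summarization" is read here as an element-wise sum of token vectors, and the helper names `tokenize`, `snippet_vector`, and `cluster_snippets` are invented for this example.

```python
import string

# Hypothetical toy embeddings standing in for sense2vec vectors.
# Assumption: the real pipeline looks tokens up in a trained sense2vec model.
TOY_VECTORS = {
    "apple": [1.0, 0.0],
    "fruit": [0.9, 0.1],
    "iphone": [0.0, 1.0],
    "phone": [0.1, 0.9],
}

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def tokenize(snippet):
    """Pre-processing: lowercase, remove punctuation, split on whitespace."""
    return snippet.lower().translate(_PUNCT_TABLE).split()


def snippet_vector(snippet, vectors=TOY_VECTORS, dim=2):
    """Vector mixture: element-wise sum of the vectors of all known tokens."""
    total = [0.0] * dim
    for token in tokenize(snippet):
        vec = vectors.get(token)
        if vec is not None:  # tokens without a vector are skipped
            total = [a + b for a, b in zip(total, vec)]
    return total


def cluster_snippets(snippets):
    """Cluster snippet vectors with MeanShift using its default parameters."""
    # Imported lazily so the helpers above also work without scikit-learn.
    from sklearn.cluster import MeanShift

    features = [snippet_vector(s) for s in snippets]
    return MeanShift().fit(features).labels_.tolist()
```

With default parameters, `MeanShift` estimates its bandwidth from the distances between the input vectors, so `cluster_snippets` only gives meaningful labels when it receives a realistic number of snippets per topic, not a handful of toy sentences.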