Commit 9038e14b authored by toyota

add README and add argparse to cora.py

parent 95fcf1c4
......@@ -8,6 +8,7 @@ Graph Embedding Propagation
# Cora Node Classification
To evaluate the trained graph and its embeddings, a node classification task is run. First, the Cora data is imported into a networkX graph, which is saved to a pickle file so that it can be used to train the embeddings with our EP-SP algorithm. Afterwards, the trained embeddings are evaluated with LibLinear L2-regularized logistic regression as provided by sklearn.
Graph building is implemented in cora.py, the evaluation in node_classification.py.
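As a rough illustration of the graph-building step described above, the following sketch reads Cora-style node and edge lines into a networkX graph and round-trips it through pickle. It is not the code in cora.py: the helper name `build_cora_graph`, the attribute names `features` and `label`, the choice of an undirected graph, and the tiny inline sample are all assumptions made for this example. It also assumes the usual Cora layout (tab-separated `<paper_id> <word features> <label>` lines in cora.content and `<cited_id> <citing_id>` lines in cora.cites).

```python
import io
import pickle as pkl

import networkx as nx


def build_cora_graph(content_lines, cites_lines):
    """Build an undirected graph from Cora-style lines (hypothetical helper)."""
    graph = nx.Graph()
    for line in content_lines:
        fields = line.strip().split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        # First field is the paper id, last is the class label,
        # everything in between is the binary word-feature vector.
        paper_id, *features, label = fields
        graph.add_node(paper_id, features=[int(x) for x in features], label=label)
    for line in cites_lines:
        fields = line.strip().split("\t")
        if len(fields) != 2:
            continue
        cited, citing = fields
        graph.add_edge(citing, cited)
    return graph


# Tiny in-memory sample standing in for cora.content and cora.cites (made up).
content = io.StringIO("1\t0\t1\tTheory\n2\t1\t0\tNeural_Networks\n")
cites = io.StringIO("1\t2\n")
graph = build_cora_graph(content, cites)

# Pickle round trip, here into a buffer instead of a file on disk.
buffer = io.BytesIO()
pkl.dump(graph, buffer)
buffer.seek(0)
restored = pkl.load(buffer)
```

Storing the features as plain Python lists keeps the pickled graph independent of numpy; whether cora.py does the same is not visible in this diff.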
# Required Data
- Cora dataset saved in cora_data for building the graph
......@@ -33,7 +34,10 @@ For node_classification.py
# Running instructions
For cora.py
...
python3 cora.py [-n] [-e] [-o]
-n / --nodes Path to cora file containing nodes
-e / --edges Path to cora file containing edges
-o / --output Path where the graph should be saved
For node_classification.py
python3 node_classification.py [-g] [-e] [-s] [-i] [-n]
......
"""
@project: Software Projekt @ Heidelberg University, Institute for Computational Linguistics
@requirements: cora data, numpy, networkX, pickle
@info
Getting a networkx graph from Cora. Graph can be saved in txt file. CARE: numpy-arrays are converted to lists due to errors (NumPy array is not JSON serializable).
Getting a networkx graph from Cora. Graph will be saved in a pickle file.
@usage
get_graph(path_nodes="/cora_data/cora.content", path_edges="/cora_data/cora.cites")
-> return graph with nodes and edges
To write the graph informations in file:
def write_graph_to_file(path_nodes="/cora_data/cora.content", path_edges="/cora_data/cora.cites", path_output_graph = "")
python3 cora.py [-n] [-e] [-o]
-n / --nodes Path to cora file containing nodes
-e / --edges Path to cora file containing edges
-o / --output Path where the graph should be saved
As a module (used in node_classification.py) you can access the graph with
read_pickle_graph("path_to_cora_graph")
"""
import argparse
import networkx as nx
import numpy as np
import pickle as pkl
......@@ -116,8 +122,11 @@ def read_pickle_graph(path = "graph.pkl"):
        graph = pkl.load(f)
    return graph
if __name__ == "__main__":
    # execute only if run as a script
    get_graph(path_nodes="/cora_data/cora.content", path_edges="/cora_data/cora.cites")
    get_init_emb(rand_type="normal_random", dimension = 128, quantity=1433)
    parser = argparse.ArgumentParser(description="Script for building cora graph.")
    parser.add_argument("-n", "--nodes", default="/cora_data/cora.content", help="path to file containing cora nodes")
    parser.add_argument("-e", "--edges", default="/home/utaemon/SP/cora/cora.cites", help="path to file containing edges/citations")
    parser.add_argument("-o", "--output", default="", help="path where the graph should be saved")
    args = parser.parse_args()
    write_pickle_graph_file(path_nodes=args.nodes, path_edges=args.edges, path_output_graph=args.output)
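To see how the three options behave, here is a minimal, self-contained sketch of an equivalent parser called with an explicit argument list instead of the real command line. The `build_parser` wrapper, the relative default paths, and the example arguments are assumptions for illustration; only the option names are taken from the commit.

```python
import argparse


def build_parser():
    """Return a parser with the same three options as cora.py (hypothetical wrapper)."""
    parser = argparse.ArgumentParser(description="Script for building cora graph.")
    # Defaults below are placeholders, not the defaults used in the commit.
    parser.add_argument("-n", "--nodes", default="cora_data/cora.content",
                        help="path to file containing cora nodes")
    parser.add_argument("-e", "--edges", default="cora_data/cora.cites",
                        help="path to file containing edges/citations")
    parser.add_argument("-o", "--output", default="",
                        help="path where the graph should be saved")
    return parser


# Passing a list to parse_args() avoids reading sys.argv, which is handy for testing.
args = build_parser().parse_args(["-n", "data/cora.content", "--output", "graph.pkl"])
print(args.nodes, args.edges, args.output)
```

Short and long option forms can be mixed freely, and any option that is omitted falls back to its default.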