Implementation of "GraphSGAN", a GAN-based semi-supervised learning algorithm for graph data.
Paper: Semi-supervised Learning on Graphs with Generative Adversarial Nets
unzip cora.dataset.zip
The code was written for Python==2.7 and pytorch~=3. Running it in other environments may require minor changes.
python GraphSGAN.py --cuda
will run an example, the benchmark cora task.
The programme takes FeatureGraphDataset as input and cora.dataset is built from FeatureGraphDataset.py. You can build your own FeatureGraphDataset.
Early stopping and tuned hyperparameters are not included in this minimal release. You can choose them using your own validation set.
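Since early stopping is left to the user, here is a minimal sketch of one common way to do it: stop once validation accuracy has not improved for a fixed number of epochs. The `EarlyStopper` class, its `patience` parameter and the toy validation curve are assumptions made for illustration; none of them are part of GraphSGAN.py.

```python
class EarlyStopper(object):
    """Track validation accuracy and signal when it stops improving.

    Hypothetical helper, not part of the GraphSGAN release.
    """

    def __init__(self, patience=10):
        self.patience = patience      # epochs to wait without improvement
        self.best_acc = float("-inf")
        self.best_epoch = -1
        self.bad_epochs = 0

    def step(self, epoch, val_acc):
        """Record one epoch; return True when training should stop."""
        if val_acc > self.best_acc:
            self.best_acc = val_acc
            self.best_epoch = epoch
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Toy validation curve standing in for real per-epoch validation accuracy.
val_curve = [0.70, 0.78, 0.81, 0.80, 0.79, 0.80, 0.82]
stopper = EarlyStopper(patience=3)
stopped_at = None
for epoch, acc in enumerate(val_curve):
    if stopper.step(epoch, acc):
        stopped_at = epoch
        break
```

In a real run you would call `step` with the accuracy measured on your validation split after each epoch and keep the model saved at `best_epoch`.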
rm -rf models logfile
deletes the saved models and logfile so that you can retrain.
You can visualize the information in logfile with tensorboard.
The expected accuracy of the example is above 0.83.
- Build FeatureGraphDataset for a new dataset. This class takes the following init parameters:
features: numpy ndarray, [[f1, f2, ...], [f1, f2, ...]]
label: numpy ndarray, [0, 1, 2, 0, ...]
adj: dict of (int, list of int), {0: [1, 2], 1: [0, 3], ...}
- Load embeddings for the dataset.
We recommend using the read_embeddings method to read embeddings from a file.
The first line of the embeddings file contains two integers: n and dim.
Each of the next n lines contains dim + 1 numbers: the node number first, followed by the dim embedding values.
Example:
4 2
0 0.123 0.233
1 0.720 -0.121
2 0.778 -0.921
3 0.161 -0.775
- Set the splits.
Call setting(label_num_per_class, test_num).
- Replace dataset in GraphSGAN.py with the newly built dataset.
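As a rough illustration of the inputs described in the steps above, the sketch below builds an adj dict from an edge list and writes and parses text in the embeddings file layout. All three function names are hypothetical helpers, not part of the repository; in particular, `parse_embeddings` is not the read_embeddings method. Treating edges as undirected is also an assumption.

```python
# Hypothetical helpers for illustration; not part of the GraphSGAN repository.

def build_adj(num_nodes, edges):
    """Return {node: sorted neighbour list}, assuming undirected edges."""
    adj = {node: set() for node in range(num_nodes)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return {node: sorted(neigh) for node, neigh in adj.items()}


def format_embeddings(embeddings):
    """Serialize {node: [floats]} as 'n dim' followed by one line per node."""
    dim = len(next(iter(embeddings.values())))
    lines = ["%d %d" % (len(embeddings), dim)]
    for node in sorted(embeddings):
        values = " ".join("%.3f" % x for x in embeddings[node])
        lines.append("%d %s" % (node, values))
    return "\n".join(lines) + "\n"


def parse_embeddings(text):
    """Parse the layout back into {node: [floats]}, checking n and dim."""
    lines = text.strip().splitlines()
    n, dim = (int(tok) for tok in lines[0].split())
    if len(lines) - 1 != n:
        raise ValueError("header says %d nodes, found %d" % (n, len(lines) - 1))
    result = {}
    for line in lines[1:]:
        tokens = line.split()
        if len(tokens) != dim + 1:
            raise ValueError("expected %d values per line" % (dim + 1))
        result[int(tokens[0])] = [float(tok) for tok in tokens[1:]]
    return result


adj = build_adj(4, [(0, 1), (0, 2), (1, 3)])
text = format_embeddings({0: [0.123, 0.233], 1: [0.720, -0.121]})
```

The parser rejects a file whose header count does not match the number of node lines, which is an easy mistake to make when writing such files by hand.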
The SCAN_DIS paper tests the performance of GraphSGAN on Pubmed, Flickr and BlogCatalog:
| Dataset | Macro-F1 | Micro-F1 | Acc |
|---|---|---|---|
| Pubmed | .839 | .842 | .841 |
| Flickr | .697 | .715 | .702 |
| BlogCatalog | .698 | .703 | .719 |
Although we are not responsible for these results, we think they are a worthwhile reference.