GATAS

Implementation of Graph Representation Learning Network via Adaptive Sampling: http://arxiv.org/abs/2006.04637

The algorithm represents each node by reducing the representations of its neighbours with attention. Representations of multi-step neighbours incorporate the properties of the paths that lead to them. Neighbours are sampled using learnable depth coefficients.
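To make the attention-based reduction concrete, here is a minimal sketch in NumPy. It uses plain scaled dot-product attention as a stand-in, which is an assumption: the attention function in gatas/ may differ, and the names attend, node and neighbour_reprs are hypothetical.

```python
import numpy as np


def attend(query, neighbours):
    """Reduce neighbour vectors into one vector with dot-product attention.

    query: (d,) representation of the target node.
    neighbours: (n, d) representations of its sampled neighbours.
    """
    # One score per neighbour, scaled by sqrt(d) as in scaled dot-product attention.
    scores = neighbours @ query / np.sqrt(query.shape[0])
    scores -= scores.max()  # stabilise the softmax numerically
    weights = np.exp(scores) / np.exp(scores).sum()  # non-negative, sums to 1
    # The node representation is the attention-weighted sum of its neighbours.
    return weights @ neighbours, weights


rng = np.random.default_rng(0)
node = rng.normal(size=4)                  # hypothetical target node vector
neighbour_reprs = rng.normal(size=(3, 4))  # hypothetical neighbour vectors
reduced, weights = attend(node, neighbour_reprs)
```

The output has the same size as a single neighbour vector regardless of how many neighbours were sampled, which is what lets the reduction handle neighbourhoods of varying size.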

Overview

This repository is organized as follows:

data/ contains the necessary files for the Cora, PubMed, Citeseer, PPI, Twitter and YouTube datasets.
framework/ contains helper libraries for model development, training and evaluation.
gatas/ contains the implementation of GATAS.
node_classification/ contains a node label classifier using the model.
link_prediction/ contains a link predictor using the model.

Instructions

First, we must create a CSR binary representation of the graph, in which the stored values are the edge type indices. For the Cora, Citeseer and PubMed datasets, these representations are precomputed and placed in data/. For PPI, the representation can be computed with:

python3 -m node_classification.datasets.ppi --path data/ppi
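As an illustration of what a CSR representation with edge types as values looks like, the following sketch builds the three CSR arrays by hand for a toy graph. The edge list and the helper neighbours are hypothetical, and the on-disk format written by the dataset scripts is not shown here and may differ.

```python
import numpy as np

# Hypothetical toy graph: (source, target, edge_type) triples.
edges = [(0, 1, 0), (0, 2, 1), (2, 0, 0), (2, 1, 1)]
num_nodes = 3

# CSR needs the entries grouped by source row.
edges.sort()
sources = np.array([s for s, _, _ in edges])

# indptr[i]:indptr[i + 1] is the slice of `indices` and `data` that belongs to row i.
indptr = np.zeros(num_nodes + 1, dtype=np.int64)
np.add.at(indptr, sources + 1, 1)  # count the outgoing edges of each node
indptr = np.cumsum(indptr)

indices = np.array([t for _, t, _ in edges], dtype=np.int64)  # target of each edge
data = np.array([e for _, _, e in edges], dtype=np.int64)     # edge type of each edge


def neighbours(node):
    """Return (targets, edge_types) for the outgoing edges of `node`."""
    start, end = indptr[node], indptr[node + 1]
    return indices[start:end], data[start:end]
```

Looking up the neighbours of a node is a single slice, which is why CSR suits repeated neighbourhood queries. Note that a sparse-matrix library would treat a stored value of 0 as an explicit zero, so an implementation may need to offset the edge type indices; whether this repository does so is not stated here.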

For the Twitter and YouTube datasets, the CSR representation can be computed with:

python3 -m link_prediction.datasets.gatne --path {path to dataset} --num-edge-types {number of edge types}

These scripts also collect and preprocess the node features when available, and create dataset splits with inputs and targets for each task. Once we have a CSR graph representation, we can compute the transition probabilities by running:

python3 -m gatas.transitions --path {path to dataset} --num_steps {number of steps}
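For intuition about multi-step transition probabilities, the sketch below row-normalises a small dense adjacency matrix and takes its successive powers. This is only an assumption about the general idea: the function name transition_probabilities is hypothetical, and gatas.transitions works from the CSR data and may compute or store the probabilities differently.

```python
import numpy as np


def transition_probabilities(adjacency, num_steps):
    """Return [P^1, ..., P^num_steps] for the row-normalised adjacency P."""
    degrees = adjacency.sum(axis=1, keepdims=True)
    # Rows without outgoing edges stay all-zero instead of dividing by zero.
    one_step = np.divide(
        adjacency, degrees, out=np.zeros_like(adjacency), where=degrees > 0
    )
    steps = [one_step]
    for _ in range(num_steps - 1):
        # P^(k+1)[i, j] is the probability of reaching j from i in k + 1 steps.
        steps.append(steps[-1] @ one_step)
    return steps


# Hypothetical directed toy graph with three nodes (float dtype is required).
adjacency = np.array([
    [0., 1., 1.],
    [1., 0., 0.],
    [0., 1., 0.],
])
steps = transition_probabilities(adjacency, num_steps=3)
```

A dense matrix keeps the example short; on real graphs the same computation would be done with sparse structures, since the powers of a dense matrix do not scale to large node counts.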

GATAS has two components: NeighbourSampler and NeighbourAggregator. NeighbourSampler can be initialized with a path so the precomputed transition data can be used:

from gatas.sampler import NeighbourSampler

neighbour_sampler = NeighbourSampler.from_path(num_steps=3, path='data/ppi')
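The following sketch illustrates one way depth coefficients could shape neighbour sampling: each k-step transition row is weighted by a softmax over per-step coefficients, and neighbours are drawn from the resulting mixture. This is an assumption made for illustration, not the NeighbourSampler implementation; sample_neighbours and its parameters are hypothetical, and in the model the coefficients would be learned rather than fixed.

```python
import numpy as np


def sample_neighbours(node, steps, coefficients, num_samples, rng):
    """Sample neighbours of `node` from a mixture of k-step transition rows.

    steps: list of (n, n) k-step transition matrices.
    coefficients: (num_steps,) unnormalised depth weights.
    """
    # Softmax turns the coefficients into positive weights that sum to 1.
    depth_weights = np.exp(coefficients - coefficients.max())
    depth_weights /= depth_weights.sum()
    # Weight each step's transition row by its depth weight and add them up.
    mixture = sum(w * p[node] for w, p in zip(depth_weights, steps))
    mixture = mixture / mixture.sum()
    # Sampling with replacement is a simplification made for this sketch.
    return rng.choice(len(mixture), size=num_samples, replace=True, p=mixture)


# Hypothetical one-step and two-step transition matrices for three nodes.
one_step = np.array([[0., .5, .5], [1., 0., 0.], [0., 1., 0.]])
steps = [one_step, one_step @ one_step]
coefficients = np.array([0.0, -1.0])  # favours one-step neighbours
rng = np.random.default_rng(0)
sampled = sample_neighbours(0, steps, coefficients, num_samples=5, rng=rng)
```

Raising the coefficient of a step shifts the sampling mass towards neighbours at that depth, which is the sense in which the neighbourhood becomes adaptive.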

NeighbourAggregator can receive a matrix of node features and can be initialized as follows:

import numpy as np
from gatas.aggregator import NeighbourAggregator

node_features = np.load('data/ppi/node_embeddings.npy')

neighbour_aggregator = NeighbourAggregator(
    input_noise_rate=0.,
    dropout_rate=0.,
    num_nodes=node_features.shape[0],
    num_edge_types=neighbour_sampler.num_edge_types,
    num_steps=3,
    edge_type_embedding_size=5,
    node_embedding_size=None,
    layer_size=256,
    num_attention_heads=10,
    node_features=node_features,
)

We can call neighbour_aggregator with the output of neighbour_sampler. This pattern is used in the node classification and link prediction tasks. You can train those models with:

python3 -m node_classification.train --data-path {path to dataset}

or:

python3 -m link_prediction.train --data-path {path to dataset}

Additional parameters can be passed to either command through the command line. Run with --help to list them:

python3 -m node_classification.train --help

Reference

@misc{andrade2020graph,
    title={Graph Representation Learning Network via Adaptive Sampling},
    author={Anderson de Andrade and Chen Liu},
    year={2020},
    eprint={2006.04637},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

License

MIT
