GAMMA

A graph pattern mining framework for large graphs on gpu. This is the source code of "GAMMA: A Graph Pattern Mining Framework for Large Graphs on GPU" (ICDE 2023), by Lin Hu, Prof. Lei Zou, Prof. Tamer Ozsu.

Please cite our paper, if you use our source code.

@inproceedings{
  title={GAMMA: A Graph Pattern Mining Framework for Large Graphs on GPU},
  author={Lin Hu, Lei Zou, Tamer Ozsu},
  booktitle={2023 IEEE 39th International Conference on Data Engineering (ICDE)},
  pages={273--286},
  year={2023},
  organization={IEEE}
}

ABSTRACT

Graph pattern mining (GPM) is getting increasingly important recently. There are many parallel frameworks for GPM, many of which suffer from performance. GPU is a power- ful option for graph processing, which has excellent potential for performance improvement; however, parallel GPM algorithms produce a large number of intermediate results, limiting GPM implementations on GPU.

In this paper, we present GAMMA, an out-of-core GPM framework on GPU, and it makes full use of host memory to process large graphs. Specifically, GAMMA adopts a self- adaptive implicit host memory access manner to achieve high bandwidth, which is transparent to users. GAMMA provides flexible and effective interfaces for users to build their algorithms. We also propose several optimizations over primitives provided by GAMMA in the out-of-core GPU system. Experimental results show that GAMMA has scalability advantages in graph size over the state-of-the-art by an order of magnitude, and is also faster than existing GPM systems and some dedicated GPU algorithms of specific graph mining problems.

REQUIREMENTS

GCC 5.3.0
CUDA toolkit 9.0

INPUTS

GAMMA use CSR (compressed sparse rows) as the data structure of graphs. It has two parts: OFFSET and VALUES. The former records the offsets of all adjacency lists in CSR, the latter is the set of all adjacency lists.

The input file of a dataset has two files, data.col and data.dst, which are both binary files. data.col starts with the vertex number of input graph, then follows the OFFSET. So it is as:

${vertex_number} (4 bytes)
${offset_1} (8 bytes)
${offset_2} (8 bytes)
${offset_3} (8 bytes)
… …

data.dst starts with the edge number of input graph, then follows the VALUES. So it is as:

${value_1} (4_byte)
${value_2} (4_byte)
${value_3} (4_byte)
… …

BUILD

$ cd ${GAMMA_ROOT}
$ make

RUN

K-Clique counting.

$ ./kcl ${data} $k debug

Subgraph matching.

./sm ${data} ${query_name} debug

Frequent pattern mining.

./fsm ${data} ${pattern_length} ${minimum_support} debug

Name	Name	Last commit message	Last commit date
Latest commit History 7 Commits
GAMMA_fullversion.pdf	GAMMA_fullversion.pdf	Add files via upload	Jul 21, 2022
Makefile	Makefile	init project	Jan 27, 2022
README.md	README.md	Update README.md	Dec 9, 2024
accessMode.cuh	accessMode.cuh	init project	Jan 27, 2022
aggregrate.cuh	aggregrate.cuh	init project	Jan 27, 2022
clock.h	clock.h	init project	Jan 27, 2022
embedding.cuh	embedding.cuh	init project	Jan 27, 2022
expand.cuh	expand.cuh	init project	Jan 27, 2022
fsm.cu	fsm.cu	init project	Jan 27, 2022
graph.cuh	graph.cuh	init project	Jan 27, 2022
kcl.cu	kcl.cu	init project	Jan 27, 2022
log.cpp	log.cpp	init project	Jan 27, 2022
log.h	log.h	init project	Jan 27, 2022
log.o	log.o	init project	Jan 27, 2022
queryGraph.cuh	queryGraph.cuh	Add files via upload	Nov 30, 2022
sm.cu	sm.cu	Add files via upload	Nov 30, 2022
utils.h	utils.h	Add files via upload	Nov 30, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

GAMMA

About

Releases

Packages

Contributors 3

Languages

pkumod/GAMMA

Folders and files

Latest commit

History

Repository files navigation

GAMMA

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 3

Languages

Packages