Features drop when using to_networkx() and resulting networkx graph is incompatible with torch_geometric from networkx #196

Open
khaled3ttia opened this issue Mar 9, 2022 · 2 comments
Labels
Bug Something isn't working

Comments

@khaled3ttia


I'm trying to use the devmap dataset graphs with a PyTorch Geometric model.
I started with the provided script to generate the graphs from the devmap dataset here.

To Reproduce

Then, I read each of the graph files and try to process them as follows:

import pathlib
import programl as pg
from programl.util.py import pbutil
from programl.proto import program_graph_pb2
from torch_geometric.utils.convert import from_networkx

filepath = 'path/to/pbfile.pb'

# Load graph from pb file
graph = pbutil.FromFile(pathlib.Path(filepath), program_graph_pb2.ProgramGraph())

# convert ProgramGraph to networkx graph
nx_graph = pg.to_networkx(graph)

# ISSUE: print networkx graph, now all features such as wgsize, transfer_bytes,... are gone
print(list(nx_graph.nodes(data=True)))

# ANOTHER ISSUE: cannot read in torch geometric
pyg_graph = from_networkx(nx_graph)

The first error I get is at the invocation of from_networkx():
ValueError: Not all nodes contain the same attributes

Looking at the nodes, I found that the first node has different attributes from the others, so I deleted it to get past this error:
nx_graph.remove_node(0)
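Before removing a node, it may help to confirm which nodes actually differ. The following is a minimal sketch in plain Python that groups node ids by the set of attribute names they carry. It assumes the input has the shape of list(nx_graph.nodes(data=True)), that is, (node_id, attribute_dict) pairs. The function name and the example node data are hypothetical and are not taken from ProGraML or the attached graph.

```python
from collections import defaultdict


def group_nodes_by_attribute_keys(nodes_with_data):
    """Group node ids by the sorted tuple of attribute names they carry."""
    groups = defaultdict(list)
    for node_id, attrs in nodes_with_data:
        groups[tuple(sorted(attrs))].append(node_id)
    return dict(groups)


# Hypothetical stand-in for list(nx_graph.nodes(data=True)).
example_nodes = [
    (0, {"type": 0, "text": "root"}),
    (1, {"type": 0, "text": "add", "block": 0}),
    (2, {"type": 1, "text": "i32", "block": 0}),
]

groups = group_nodes_by_attribute_keys(example_nodes)
for keys, node_ids in groups.items():
    print(keys, node_ids)
```

If more than one group is printed, the nodes do not share the same attributes, which is the condition from_networkx() rejects.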

After that, I get a second error:
RuntimeError: Could not infer dtype of dict

This happens when torch_geometric.utils.convert.from_networkx() processes the values for the key 'features' in the networkx representation. I traced it back and found that the key 'blocks' is processed fine because it is a list of integers. However, it fails on 'features', which is a dict of dicts {'full_text': str, 'function': int, 'text': int, 'type': int}
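One possible workaround, sketched here in plain Python, is to flatten nested dicts into top-level keys and keep only numeric values before calling from_networkx(). This is an assumption about what a usable preprocessing step might look like, not ProGraML's or PyTorch Geometric's own method. The helper names and the example attribute dict are hypothetical.

```python
NUMERIC_TYPES = (bool, int, float)


def flatten_attrs(attrs, prefix=""):
    """Flatten nested dicts, joining nested key names with an underscore."""
    flat = {}
    for key, value in attrs.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten_attrs(value, prefix=f"{name}_"))
        else:
            flat[name] = value
    return flat


def is_numeric(value):
    """True for a number, or a list/tuple containing only numbers."""
    if isinstance(value, NUMERIC_TYPES):
        return True
    return isinstance(value, (list, tuple)) and all(
        isinstance(item, NUMERIC_TYPES) for item in value
    )


def numeric_attrs(attrs):
    """Flatten the attributes, then drop every non-numeric value."""
    return {k: v for k, v in flatten_attrs(attrs).items() if is_numeric(v)}


# Hypothetical node attributes shaped like the ones described above.
example = {
    "blocks": [0, 1],
    "features": {"full_text": "add i32", "function": 0, "text": 3, "type": 0},
}
cleaned = numeric_attrs(example)
print(cleaned)
```

To apply this to a graph, each node's data dict could be replaced in place, for example by calling clear() and then update(cleaned) on nx_graph.nodes[n]. List-valued attributes whose lengths differ between nodes may still need padding before they can be converted.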

Is there a workaround for these two issues?
(1) The features being dropped when using to_networkx()
(2) The torch_geometric from_networkx() routine failing on the graph returned by to_networkx(ProgramGraph)

Thanks!

Environment

  • ProGraML version : 0.3.2
  • How you installed ProGraML (source, pip): pip
  • OS: Ubuntu 20.04.3
  • Python version: 3.9.7
@khaled3ttia khaled3ttia added the Bug Something isn't working label Mar 9, 2022
@ChrisCummins
Owner

Hi @khaled3ttia, thanks for the report. Could you please attach one of the .pb files that is causing issues here so that I can run your repro script on it?

Cheers,
Chris

@khaled3ttia
Author

Thanks for your reply. Sure, here is one of the files (I had to zip it because GitHub wouldn't accept it otherwise):
graph.zip
This file and the others I am using were generated by tasks/devmap/dataset/create.py
