Description
When a package is imported, any code inside its __init__.py is executed. That file is meant to hold package initialization code, but it can contain arbitrary code. The Java analogue is a static initializer, which runs when a class is first loaded.
The results of the initialization code are supposed to be made available to the importing script. Currently, as a result of #163, we add artificial code that loads the subpackages so that they are available to importing scripts. However, we don't do anything about the real code in __init__.py; that code may be importing the subpackages explicitly.
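To make the runtime behavior concrete, here is a minimal sketch that builds a throwaway package on disk and imports it. The package name `demo_init_pkg`, its `INITIALIZED` flag, and the `sub` submodule are hypothetical and exist only for illustration; they are not part of the project under analysis.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_package(root: Path) -> None:
    """Write a hypothetical package whose __init__.py runs real code."""
    pkg = root / "demo_init_pkg"
    pkg.mkdir()
    (pkg / "__init__.py").write_text(
        "INITIALIZED = True\n"
        "from . import sub  # explicit import of a submodule\n"
    )
    (pkg / "sub.py").write_text("VALUE = 42\n")


def import_demo_package():
    """Import the package and read back what its __init__.py defined."""
    with tempfile.TemporaryDirectory() as tmp:
        build_demo_package(Path(tmp))
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            module = importlib.import_module("demo_init_pkg")
            # Both names were bound by code in __init__.py, not by the importer.
            return module.INITIALIZED, module.sub.VALUE
        finally:
            sys.path.remove(tmp)


initialized, value = import_demo_package()
print(initialized, value)
```

Everything the importer sees here was bound by the code in `__init__.py`, which is the behavior the analysis would need to reflect.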
Suppose we have the following code in a script:
# tests/GNN/nodes_graph_classfication/train_gcn.py
from nlpgnn.models import GCNLayer
Currently, we are expecting that:
1. nlpgnn/models/__init__.py is empty.
2. GCNLayer.py exists in nlpgnn/models.
Each of these is false. There is no such file GCNLayer.py in nlpgnn/models. Furthermore, nlpgnn/models/__init__.py is non-empty; in fact, it has the following code:
# nlpgnn/models/__init__.py
from .GCN import *
In nlpgnn/models/GCN.py, we then have class GCNLayer:
# nlpgnn/models/GCN.py
class GCNLayer(tf.keras.Model):
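As a sanity check on what Python itself does with this layout, the sketch below recreates it under a hypothetical package name, `demo_nlpgnn`, with a plain class standing in for the `tf.keras.Model` subclass. It assumes nothing about the real nlpgnn sources beyond the two lines quoted above, and it uses `importlib.import_module` plus attribute access in place of the `from ... import GCNLayer` statement.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def load_reexported_class():
    """Recreate the layout above and look GCNLayer up on the package."""
    with tempfile.TemporaryDirectory() as tmp:
        models = Path(tmp) / "demo_nlpgnn" / "models"
        models.mkdir(parents=True)
        (models.parent / "__init__.py").write_text("")
        # Same statement as in nlpgnn/models/__init__.py.
        (models / "__init__.py").write_text("from .GCN import *\n")
        # Stand-in for the real class; no TensorFlow dependency here.
        (models / "GCN.py").write_text("class GCNLayer:\n    pass\n")
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            pkg = importlib.import_module("demo_nlpgnn.models")
            return pkg, pkg.GCNLayer
        finally:
            sys.path.remove(tmp)


models_pkg, layer_cls = load_reexported_class()
print(layer_cls.__module__)        # where the class is actually defined
print(hasattr(models_pkg, "GCN"))  # the submodule is bound on the package too
```

At runtime the wildcard import binds both the submodule `GCN` and the class `GCNLayer` on the package, so an importer can reach the class without a GCNLayer.py file existing.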
In the IR for nlpgnn/models/__init__.py, we make available the name GCN:
callees of node __init__.py : []
IR of node 42, context CallStringContext: [ com.ibm.wala.FakeRootClass.fakeRootMethod()V@82 ]
<Code body of function Lscript nlpgnn/models/__init__.py>
...
0 global:global script nlpgnn/models/__init__.py = v1<no information>
...
21 v13 = global:global script nlpgnn/models/GCN.py__init__.py [46->660] (line 5)
22 putfield v1.< PythonLoader, LRoot, GCN, <PythonLoader,LRoot> > = v13__init__.py [46->660] (line 5)
But we don't do that for GCNLayer. Instead, what happens is that the explicit import drops off the face of the earth:
130 v272 = global:global script nlpgnn/models/GCN.py__init__.py [46->660] (line 5) [272=[*]]
131 v274 = fieldref v272.v259:#* __init__.py [46->660] (line 5) [274=[*]272=[*]]
In other words, v1 is stored in a global and available to any importing scripts, while v274 isn't. Consequently, the import code in this file is never reflected in the importer. So, in the IR of tests/GNN/nodes_graph_classfication/train_gcn.py:
callees of node train_gcn.py : [import, range, zip, GradientTape, EarlyStopping, MaskAccuracy, MaskCategoricalCrossentropy, trampoline3, trampoline4, trampoline4, trampoline4, trampoline4, trampoline4, gradient]
IR of node 31, context CallStringContext: [ com.ibm.wala.FakeRootClass.fakeRootMethod()V@60 ]
<Code body of function Lscript tests/GNN/nodes_graph_classfication/train_gcn.py>
...
102 v268 = global:global script nlpgnn/models/__init__.pytrain_gcn.py [2:0] -> [56:64] [268=[GCNLayer]]
103 v270 = fieldref v268.v254:#GCNLayer train_gcn.py [2:0] -> [56:64] [270=[GCNLayer]268=[GCNLayer]]
We correctly assign v268, but v270 is empty:
[Node: <Code body of function Lscript tests/GNN/nodes_graph_classfication/train_gcn.py> Context: CallStringContext: [ com.ibm.wala.FakeRootClass.fakeRootMethod()V@60 ], v270] --> []