Step 1 - Transpiler #595
Merged
This was linked to issues Oct 27, 2023
tnowotny approved these changes Jan 8, 2024
I feel it is time now to start merging (as you suggested). Since we talked this through, no particular thoughts about deal-breaking problems have occurred to me. So I will approve these stages now.
Transpiler
This is the biggest change in GeNN 5. Previously, user code had some 'preprocessor-level' transformations applied and was then passed straight to the CUDA/C++ compiler. However, this approach had a lot of issues.
This PR solves this by implementing a pretty basic source-to-source transpiler which I have to say works rather nicely. Integrating this involved rewriting a large proportion of GeNN's code generator which definitely improved it but does mean reviewing the change is a nightmarish proposition. However, hopefully the following explains the ideas and highlights some of the potentially-controversial aspects.
Implementation
Transpiling code from GeNN code strings to backend-specific code is done in 4 stages. The implementation of the first two stages is heavily inspired by the first section of https://craftinginterpreters.com/contents.html, which has been my key reading for this journey into CS 😄 Also, because GeNN code strings are very short compared to a general-purpose compiler's input, and because stages 2-4 happen only on merged groups, there is no real need for any of this to be super-high performance, so the implementations all aim for simplicity rather than cutting-edge compiler design.
1 - Scanning
The scanner (https://github.com/genn-team/genn/blob/transpiler/src/genn/genn/transpiler/scanner.cc) converts strings to vectors of tokens (https://github.com/genn-team/genn/blob/transpiler/include/genn/genn/transpiler/token.h) to make all subsequent processing simpler.
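To illustrate the idea of the scanning stage, here is a minimal sketch in Python. The token categories, regular expressions and the `scan` name are all invented for this example; GeNN's real scanner is C++ and handles far more of the language.

```python
# Toy scanner: turn a code string into a flat list of (type, lexeme) tokens
# so that later stages never have to deal with raw characters again.
import re

TOKEN_SPEC = [
    ("NUMBER",     r"\d+\.\d+[fd]?|\d+"),   # 30, 30.0, 30.0f, 30.0d
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("OP",         r"[+\-*/=;()<>!]=?|[{},\[\]&]"),
    ("SKIP",       r"\s+"),                  # whitespace is discarded
]
TOKEN_RE = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_SPEC))

def scan(source):
    """Convert a code string into a list of (type, lexeme) tuples."""
    tokens = []
    for match in TOKEN_RE.finditer(source):
        kind = match.lastgroup
        if kind != "SKIP":
            tokens.append((kind, match.group()))
    return tokens
```

For example, `scan("V += 0.5f;")` yields the identifier `V`, the operator `+=`, the number `0.5f` and the semicolon as separate tokens.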
2 - Parsing
The parser (https://github.com/genn-team/genn/blob/transpiler/src/genn/genn/transpiler/parser.cc) turns sequences of tokens into an Abstract Syntax Tree consisting of expressions (https://github.com/genn-team/genn/blob/transpiler/include/genn/genn/transpiler/expression.h) and statements (https://github.com/genn-team/genn/blob/transpiler/include/genn/genn/transpiler/statement.h). This is implemented as a Recursive Descent Parser (https://en.wikipedia.org/wiki/Recursive_descent_parser), where the C++ call stack takes on a lot of the heavy lifting.
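The recursive-descent idea can be sketched in a few lines of Python. The grammar (just `term (("+"|"-") term)*`) and the tuple-shaped AST nodes are invented for illustration; GeNN's parser covers a large subset of C.

```python
# Toy recursive-descent parser: tokens in, a small AST of nested tuples out.
# In the real C++ implementation, the call stack does the equivalent work.
def parse_expression(tokens):
    """Parse 'term (("+"|"-") term)*' where a term is a number or name."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def term():
        # A term is a single NUMBER or IDENTIFIER token
        nonlocal pos
        kind, lexeme = tokens[pos]
        pos += 1
        return (kind.lower(), lexeme)

    node = term()
    while peek() and peek()[1] in ("+", "-"):
        op = tokens[pos][1]
        pos += 1
        # Fold each operator into a left-associative binary node
        node = ("binary", op, node, term())
    return node
```

So `parse_expression([("NUMBER", "1"), ("OP", "+"), ("IDENTIFIER", "x")])` produces a `("binary", "+", ...)` node with the two operands as children.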
3 - Type-checking
A large proportion of the compile errors you get in C are the result of type checking so, to achieve the dream of users not having to deal with real compiler errors, the transpiler needs a type checker (https://github.com/genn-team/genn/blob/transpiler/src/genn/genn/transpiler/typeChecker.cc). Basically, what this does is 'visit' (https://en.wikipedia.org/wiki/Visitor_pattern) the AST and, for each expression, recursively determine its type, checking that the children's types are valid for each operation, e.g. that you're not assigning to a `const` variable. Because, as I mention later, function overloading is supported, this stage also emits a dictionary of expressions to types to allow the pretty printer to pick the correct function implementation.
4 - Pretty printing
For the current backends we need to go from the AST back to C-like code and this is done by the pretty printer (https://github.com/genn-team/genn/blob/transpiler/src/genn/genn/transpiler/prettyPrinter.cc). Like the type checker, it recursively visits the nodes of the AST but, rather than doing analysis on them, it just prints out the C code. One semi-smart thing it does do is add an underscore in front of all variables declared in user code, thus fixing #385.
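Both passes can be illustrated as visitors over a toy tuple-shaped AST. Everything here is invented for the sketch: the node shapes, the function names and the crude type-promotion rule; GeNN's actual visitors are C++ and much more complete.

```python
# Two toy AST passes: a type checker that fills a dictionary of
# expression -> type (as described above), and a pretty printer that
# emits C-like code, underscoring user variables.
def check_types(node, env, types):
    """Return the expression's type and record it in `types` per node."""
    kind = node[0]
    if kind == "number":
        t = "float" if node[1].endswith("f") else "double"
    elif kind == "identifier":
        t = env[node[1]]            # unknown names raise KeyError
    elif kind == "binary":
        left = check_types(node[2], env, types)
        right = check_types(node[3], env, types)
        # crude stand-in for C's usual arithmetic conversions
        t = "double" if "double" in (left, right) else left
    else:
        raise ValueError(kind)
    types[id(node)] = t             # the expression -> type dictionary
    return t

def pretty_print(node):
    """Emit C-like code for a node; precedence/parentheses are ignored."""
    kind = node[0]
    if kind == "number":
        return node[1]
    if kind == "identifier":
        return "_" + node[1]        # underscore-prefix user variables
    if kind == "binary":
        return f"{pretty_print(node[2])} {node[1]} {pretty_print(node[3])}"
```

Running both on `V + 1.0f` with `V` declared as `float` annotates every sub-expression as `float` and prints `_V + 1.0f`.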
Language
- There's no preprocessor (you could argue for allowing `#define` stuff in user code but I don't think it's worth it - the same effect can easily be achieved by combining bits of code in Python)
- The old `$(xx)` syntax for referencing GeNN stuff is no longer necessary at all and the `$(xx, arg1, arg2)` function syntax I added doesn't hold water grammatically so, currently, there is some code which strips this out (https://github.com/genn-team/genn/blob/transpiler/src/genn/genn/gennUtils.cc#L30-L58) before transpiling to improve backward compatibility somewhat, although I'm tempted to move this to PyGeNN.
- You can't write `const int *egpSubset = &egp[offset];` and instead have to do `const int *egpSubset = egp + offset;` but, personally, I think that's ok.
- Function overloading is supported so, e.g., `sin(30.0f)` will resolve to the floating-point rather than double-precision version.
- Previously, literals like `30.0` were always treated as the scalar type but this is kind of annoying if you're writing mixed-precision code. Now, `30.0` will be treated as scalar but `30.0f` will always be treated as float and `30.0d` will always be treated as double.
- `long` is compiler-specific (even on 64-bit systems, it's 32-bit on Windows and 64-bit on Linux). The transpiler now guarantees an LP64 data model where `int` is 32-bit and `long` is 64-bit by always generating code with sized types, i.e. `int32_t`.
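The literal and integer-width rules above amount to a small mapping, sketched here in Python. The function name is invented, and suffixed or non-decimal integer literals are deliberately ignored in this sketch.

```python
def literal_type(lexeme, scalar_type):
    """Map a numeric literal to a type under the rules described above."""
    if lexeme.endswith("f"):
        return "float"           # 30.0f is always float
    if lexeme.endswith("d"):
        return "double"          # 30.0d is always double
    if "." in lexeme:
        return scalar_type       # unsuffixed floating literals are 'scalar'
    return "int32_t"             # LP64: plain integer literals are 32-bit
```

So in a model whose scalar type is `float`, `30.0` resolves to `float`, while the same string in a double-precision model resolves to `double`.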
Integration
One of the reasons I chose to build all this from scratch rather than e.g. leverage LLVM is that the whole process is tightly integrated with the rest of GeNN. The scanner gets run on code strings when `NeuronGroup`, `SynapseGroup` and friends get constructed and the tokenised representation is then used in place of all the ad-hoc regular expressions for stuff like determining whether any of the RNG functions have been referenced in a code string. The type system used by the type checker is also used in place of strings to represent types throughout GeNN (the only exception is "scalar", which gets replaced with the actual type when it's encountered in the parser, type-checker and pretty-printer). This means that, rather than adding stars to strings, you can do stuff like:

```c++
auto type = Type::Uint32.createPointer();
```
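A structured type representation of this kind is easy to picture. The sketch below mimics the `Type::Uint32.createPointer()` example in Python; the class layout, the `isPointer` property and the `pointee` attribute are invented here and are not GeNN's actual API.

```python
# Hypothetical structured type: pointers are created by method call
# rather than by appending '*' to a string.
class Type:
    def __init__(self, name, pointee=None):
        self.name = name
        self.pointee = pointee      # the pointed-to Type, or None

    def createPointer(self):
        # Build a new pointer type that remembers what it points at
        return Type(self.name + "*", pointee=self)

    @property
    def isPointer(self):
        return self.pointee is not None

Uint32 = Type("uint32_t")
```

The payoff over strings is that code can ask structured questions (`isPointer`, what the pointee is) instead of parsing `*` characters back out of a name.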
One of the increasingly nasty parts of GeNN was the whole merged group class hierarchy mess, which meant that logic about what to do with a given merged group was scattered between the code that added fields to the merged group structure and the code that actually generated the code. The answer to this is to build the structures 'lazily', only adding fields when they are required. Both the type checker and the pretty printer have the concept of 'environments', which are basically scopes with stuff defined in them; in the case of the type checker, what matters is the type, e.g. `const int*`, and, in the case of the pretty printer, how things should be displayed. These environments (https://github.com/genn-team/genn/blob/transpiler/include/genn/genn/code_generator/environment.h) extend outwards from the transpiler to form a replacement for the old `Substitutions` class and a lot of the functionality that was in `GroupMerged`, and provide various helpers for correctly populating the merged structures as you generate code, e.g.

```c++
groupEnv.printLine("const unsigned int npre = $(_col_length)[$(id_post)];");
```

will mark the struct field corresponding to `_col_length` as required (the `_` syntax here means that these variables aren't exposed to user code but are only used internally). Another issue was that a lot of unused code got generated and expensive index-calculation code got duplicated (e.g. this finally fixes #47), so bits of initialisation code can now be attached to variables you add to the environment and are only generated if the variable is referenced; e.g. the postsynaptic index will only be read from memory into a register if it's required.
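The lazy-environment idea can be sketched in Python. Only the `printLine` name and the `$(name)` substitution come from the example above; the class shape and field-tracking details are invented for illustration.

```python
# Toy lazy environment: fields are registered up front but only marked
# as required (and so only added to the merged struct) when a printed
# line actually references them via $(name).
import re

class Environment:
    def __init__(self, fields):
        self.fields = fields        # name -> how it should be displayed
        self.required = set()       # names actually referenced so far
        self.lines = []

    def printLine(self, code):
        def substitute(match):
            name = match.group(1)
            self.required.add(name)             # lazily mark the field
            return self.fields[name]            # unknown names KeyError
        self.lines.append(re.sub(r"\$\((\w+)\)", substitute, code))
```

Registering `_row_length` alongside `_col_length` but only printing a line that uses `_col_length` leaves `_row_length` out of the required set, so no struct field or initialisation code would be emitted for it.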
Other inclusions
Due to the long time it's taken me to tie this down, sadly, this PR also includes a bunch of other stuff as well as various syntactic improvements that it made sense to include as I reimplemented the code generation for various features.
Structural plasticity
The syntax I originally developed for this in the GeNN 4.XX version rather creaked at the seams but, using the new transpiler functionality, I've implemented a `for_each_synapse` language extension that behaves like a normal for-loop (admittedly one where stuff like `id_post` magically appears inside it) rather than a scary macro.
Python feature tests
Some are still outstanding, waiting on future PRs, but the majority of the feature tests are now ported to PyGeNN + pytest. I've tried to merge similar tests together into larger models to reduce the time it takes to run the test suite and have implemented variants like with/without delay and with/without batching using parameterisation (https://docs.pytest.org/en/7.3.x/how-to/parametrize.html). As you might imagine, this was a very painful process but it did find a lot of bugs and the result is way less cumbersome and actually tests PyGeNN properly! @tnowotny, one thing that came out of this is that we weren't statistically testing whether random numbers from discrete distributions, i.e. the binomial distribution we added in #498, are generated correctly. I think a chi-squared test is the right test for this but I struggled to figure out how to use it against a series of samples which might result in a 'gappy' histogram, if you see what I mean.
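On the 'gappy histogram' question: one common remedy before applying a chi-squared goodness-of-fit test is to pool adjacent bins until each pooled bin's expected count reaches a rule-of-thumb threshold (usually 5). A plain-Python sketch of that idea follows; the function name is invented, it assumes at least one pooled bin reaches the threshold, and a p-value would then come from the chi-squared CDF (e.g. `scipy.stats.chi2`).

```python
def pooled_chi_squared(observed, expected, min_expected=5.0):
    """Chi-squared statistic after pooling low-expectation bins."""
    pooled_obs, pooled_exp = [], []
    acc_obs = acc_exp = 0.0
    for o, e in zip(observed, expected):
        # Accumulate bins until the expected count is large enough
        acc_obs += o
        acc_exp += e
        if acc_exp >= min_expected:
            pooled_obs.append(acc_obs)
            pooled_exp.append(acc_exp)
            acc_obs = acc_exp = 0.0
    if acc_exp > 0:                  # fold any remainder into the last bin
        pooled_obs[-1] += acc_obs
        pooled_exp[-1] += acc_exp
    stat = sum((o - e) ** 2 / e for o, e in zip(pooled_obs, pooled_exp))
    # degrees of freedom: pooled bins - 1 (minus any fitted parameters)
    return stat, len(pooled_obs) - 1
```

Pooling makes the test's asymptotic chi-squared distribution assumption reasonable even when individual tail bins of the binomial are nearly empty.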
Syntax simplification
There will be more of this to come as some stuff has got a bit convoluted but, for now:
- Some old syntax has been removed; you can use the `for_each_synapse` structure described above to do similar.
- `GLOBALG` and `INDIVIDUALG` confuse almost all new users and are really only used with `StaticPulse` weight update models; the same functionality can be achieved with a `StaticPulseConstantWeight` version with the weight as a parameter.
- I've renamed all the 'obvious' `SynapseMatrixType` variants so you just choose `SPARSE`, `DENSE`, `TOEPLITZ` or `PROCEDURAL` (with `DENSE_PROCEDURALG` and `PROCEDURAL_KERNELG` for more unusual options).
Future
Aside from preventing users from doing things that the compiler would allow but which don't actually work in GeNN, generating rather nicer code and giving users better error messages, this doesn't actually do a whole lot yet. However, with the AST representation, a whole load of things become possible. The first target will be generating the nasty semi-vectorised code you need to get good half-precision performance in CUDA.