Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add INLINE to define inline functions and methods #1029

Closed
wants to merge 32 commits into from

Conversation

digikar99
Copy link
Contributor

@digikar99 digikar99 commented Oct 20, 2023

This PR is a sketch to add inlinable functions and methods to coalton. List of tasks to complete before merge:

  • Basic function inlining
  • Basic method inlining
  • Parsing errors
  • Basic tests
  • Inline/Non-inline recursive and mutually recursive definitions: we don't want a stackoverflow
  • Confirm slot naming conventions: is inline-p okay, or should they be renamed to inline? Although, I hope it doesn't create confusion with set-method-inline and friends.
  • Any suggestions about whether some code changes are good or bad?
  • (Optional) Inline cases when the number of arguments differ from the number of parameters
  • (Optional) Any related trivial features
  • Ready for merge

The intended use is as follows:

(coalton-toplevel
  (declare foo (single-float -> single-float))
  (inline) ; <--- THIS ATTRIBUTE
  (define (foo x)
    (lisp single-float (x) (cl:+ x x))))

(coalton-toplevel
  (define (foo-caller x)
    (foo x)))

Without foo being inline, the disassembly of foo-caller would have included a call to foo as follows:

COALTON-USER> (disassemble 'foo-caller)

; disassembly for FOO-CALLER
; Size: 53 bytes. Origin: #x540AAB0D                          ; FOO-CALLER
; 0D:       498B4510         MOV RAX, [R13+16]                ; thread.binding-stack-pointer
; 11:       488945F8         MOV [RBP-8], RAX
; 15:       4883EC10         SUB RSP, 16
; 19:       488BD6           MOV RDX, RSI
; 1C:       B902000000       MOV ECX, 2
; 21:       48892C24         MOV [RSP], RBP
; 25:       488BEC           MOV RBP, RSP
; 28:       E8D54F33FC       CALL #x503DFB02                  ; #<FDEFN FOO>
; 2D:       480F42E3         CMOVB RSP, RBX
; 31:       488B75F0         MOV RSI, [RBP-16]
; 35:       80FA19           CMP DL, 25
; 38:       7503             JNE L0
; 3A:       C9               LEAVE
; 3B:       F8               CLC
; 3C:       C3               RET
; 3D: L0:   CC55             INT3 85                          ; OBJECT-NOT-SINGLE-FLOAT-ERROR
; 3F:       08               BYTE #X08                        ; RDX(d)
; 40:       CC10             INT3 16                          ; Invalid argument count trap

But now that foo can be inlined, the disassembly of foo-caller is:

COALTON-USER> (disassemble 'foo-caller)

; disassembly for FOO-CALLER
; Size: 31 bytes. Origin: #x540AAE7F                          ; FOO-CALLER
; 7F:       498B4510         MOV RAX, [R13+16]                ; thread.binding-stack-pointer
; 83:       488945F8         MOV [RBP-8], RAX
; 87:       0F28D1           MOVAPS XMM2, XMM1
; 8A:       F30F58D1         ADDSS XMM2, XMM1
; 8E:       660F7ED2         MOVD EDX, XMM2
; 92:       48C1E220         SHL RDX, 32
; 96:       80CA19           OR DL, 25
; 99:       C9               LEAVE
; 9A:       F8               CLC
; 9B:       C3               RET
; 9C:       CC10             INT3 16                          ; Invalid argument count trap

This not only extends to regular functions, but also to class method instances.

(coalton-toplevel
  (define-class (bar :a)
    (add (:a -> :a -> :a))))

(coalton-toplevel
    (define-instance (bar single-float)
    (inline) ; <--- THIS ATTRIBUTE
    (define (add x y)
      (lisp single-float (x) (cl:+ x x)))))

(coalton-toplevel
  (declare add-caller (single-float -> single-float -> single-float))
  (define (add-caller x y)
    (add x y)))

Effectively, this brings us inlinable static dispatch with zero runtime overhead for calling simple functions.


In an earlier PR, it was suggested that a coalton:inline declaration may be added to define inline functions and methods. I felt a slight inclination towards a coalton:define-inline form instead of a coalton:inline declaration. Nonetheless, if a declaration seems better, I will attempt to modify the PR accordingly. I am adding it as an attribute for the moment instead of any separate toplevel form. If someone has a preference for a toplevel form, they can always write a macro. The case for converting it into a CL-like declaration for more granular control remains.

This is still a sketch since I haven't looked over every case carefully. I'd be also be delighted if this facility could be achieved by more minimal changes.

@stylewarning
Copy link
Member

I haven't reviewed but I just wanted to leave a few pre-comments.

  1. Amazing job diving in and making such a broad change. That's a lot of work, and I definitely want to see this PR go through.

  2. I suspect we will want a few declarations. One thing we want to do that CL doesn't do is heuristic inlining. I don't know what design we'd choose but things like small lambdas that don't escape or unexported functions might be heuristically inlined by default. (A fully fledged heuristic inliner isn't necessary for this PR, but probably the groundwork for making it happen is.)

Excited to see this. Inlining will be such a huge performance lever in Coalton that could transform it from being "inadequate" to being "faster than Lisp by default".

@digikar99
Copy link
Contributor Author

digikar99 commented Oct 21, 2023

Thank you for the pre-review! I'm still thinking what the right thing to do would be. But yes, I definitely want to see inlining in coalton.

One main concern (EDIT: Fixed below using a node-let) I have with this approach is that it depends on the implementation to perform the inlining - see the following changes in src/codegen/program.lisp :

   (loop :for (name . node) :in bindings
         :if (node-abstraction-p node)
           :append (list
                    (if (node-abstraction-inline-p node)
                        `(declaim (inline ,name))
                        `(declaim (notinline ,name)))
                    (compile-function name node env)
                    `(setf
                      ,name

This means that suppose coalton:inline and coalton:notinline were made similar to cl:inline and cl:notinline like declarations. In that case, granular inlining as in coalton::(let (...) (declare (not/inline ...)) ...) will need to emit (cl:declare (cl:not/inline ...)) declarations. I want to hear others' thoughts on this.

As I understand, avoiding this would require some more work, so that inlining can be performed within coalton compilation without depending on the implementation.

@stylewarning
Copy link
Member

We should definitely attempt to inline ourselves (at the AST level) so that further type and monomorphizer/specializer optimizations. Maybe in specific last-mile optimizations we would use the Lisp inliner. (I'm happy Lisp has inlining, but it's a very sharp and specific tool.)

@eliaslfox
Copy link
Collaborator

Hey, thanks for working on this.

Wether a function is marked inline should be tracked in the function-environment instead storing a flag on each ast node. The function environment is set for functions here. It looks like this isn't being set for method definitions, but it can be set in this loop.

I would prefer to use attributes instead of adding a new toplevel form:

;; toplevel functions
(inline)
(define (f x) ...)

;; methods
(define-instance (C T)
  (inline)
  (define (m x) ...))

The inliner should be written as a separate optimization pass instead of being merged with the inline-method pass. The purpose of the method inlining pass is to directly call methods when the type class instance being used is known at the call site.

Lastly, the inliner should generate fresh variable names instead of reusing the names of the functions's parameter names. This way we avoid issues with overlapping variable name. You can generate variable names with gentemp, and rewrite the variable references in an ast node with ast substutions.

@digikar99
Copy link
Contributor Author

Thanks for the extensive feedback! I will rework this over the next week or two and push the commits for further review.

Yes, now, I'm in favour of attributes over a new toplevel form. If a user wants, they can write a macro which expands into the attribute+form:

(defmacro define-inline (name &body body)
  `(progn
     (inline)
     (define ,name ,@body)))

However, in the longer run, if we wish to provide more granular SBCL-like control over inlining, we might need something akin to declarations.

(declaim (inline foo))
(defun foo (a)
  (+ a a))
(declaim (notinline foo))

(defun foo-inline-caller (b)
  (declare (inline foo))
  (foo b))

(defun foo-mixed-caller (b)
  (declare (inline foo))
  (+ (foo b)
     (locally (declare (notinline foo)) ; <------ Are coalton attributes suitable for this?
       (foo b))))

@digikar99
Copy link
Contributor Author

Wether a function is marked inline should be tracked in the function-environment instead storing a flag on each ast node. The function environment is set for functions here. It looks like this isn't being set for method definitions, but it can be set in this loop.

It looks like creating a function-inline-environment might be a better idea, both with respect to minimal code changes as well as the granular control over inlining as outlined above.

Without that, it seems we will need to change function-env-entry structure to include an inline-p slot in addition to the existing name and arity. Doing this requires changing all the places where the function-env-entry objects are being created to set the :inline-p slot. One such example is this update-function-env function in codegen/transformations.lisp. However, the value of inline-p itself requires more information. This information could either be obtained by adding a parameter inline-bindings to update-function-env itself, but this seems a bit unclean. Another way is to obtain this information from split-binding-definitions, but this requires information about inlining to be present in the AST itself, which I understand you prefer to be avoided.

Let me know if I should proceed by adding a function-inline-environment or if some other way is preferable.

@eliaslfox
Copy link
Collaborator

In every place set-function is called except in update-function-env whether the function should be inline is known at the call site. update-function-env can instead use a different env updater to set the arity without changing inline-p:

(define-env-updater set-function-arity (env symbol arity)
  (declare (type environment env)
           (type symbol symbol)
           (type fixnum arity))
  (let ((prev (lookup-function env symbol :no-error t)))
    (update-environment
     env
     :function-environment (immutable-map-set
                            (environment-function-environment env)
                            symbol
                            (make-function-env-entry
                             :name symbol
                             :arity arity
                             :inline-p (and prev (function-env-entry-inline-p prev)))
                            #'make-function-environment))))

@digikar99
Copy link
Contributor Author

Sorry, I've been busy. It's been great to see lots of activity on coalton recently!

I might be free to dive into this in detail in another month. Is there any suggestion to break the changes required for this into multiple parts? That way, each part will be easier to review and merge.

Copy link
Collaborator

@eliaslfox eliaslfox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey! Don't worry about breaking this up, it's pretty reviewable.

Sorry I dropped the ball on reviewing this.

This needs some tests, but other than that and a couple nits this looks good to merge.

(node-type node))
:value name)
:rands rands)))))))
;; FIXME: What kind of frontend code evokes DIRECT-APPLICATION
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pass turns regular calls into direct calls when it's known that regular function calls can be used.

(when (node-variable-p rator)
(let* ((name (node-variable-value rator))
(code (tc:lookup-code env name :no-error t))
;; FIXME: We need to lookup inline-p in the environment rather than the AST
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this FIXME is done

src/parser/toplevel.lisp Outdated Show resolved Hide resolved
@@ -358,7 +360,9 @@
:if (not (zerop method-arity))
:do (setf env (tc:set-function env method-name (tc:make-function-env-entry
:name method-name
:arity method-arity)))
:arity method-arity
;; FIXME: Always declare methods to be inline?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should do this

@@ -334,7 +334,9 @@
:if (not (zerop class-arity))
:do (setf env (tc:set-function env codegen-sym (tc:make-function-env-entry
:name codegen-sym
:arity class-arity)))
:arity class-arity
;; FIXME: Under what circumstances would we want fundep to be inline?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Class constructor functions can't be inlined by the Coalton inliner because they don't appear in the code env.

src/typechecker/define-type.lisp Outdated Show resolved Hide resolved
src/typechecker/expression.lisp Outdated Show resolved Hide resolved
@stylewarning stylewarning changed the title Add COALTON:DEFINE-INLINE to define inline functions and methods Add INLINE to define inline functions and methods Jun 13, 2024
@kartik-s
Copy link
Collaborator

Hi, great PR! I wanted to jump in with some thoughts about laying the groundwork to implement heuristic inlining as mentioned by @stylewarning.

In addition to this (INLINE) attribute, which tells the optimizer to inline all applications of the given function/method globally, what about having an attribute that behaves similar to the (MONOMORPHIZE) attribute, i.e. it tells the optimizer to:

  1. Traverse the call graph rooted by the annotated function/method, collecting candidate applications.
  2. For each candidate application, check if the application matches some heuristic function of the operator, operands, and potentially the context of the application (e.g. the function/method containing the application). If it does, inline it.
  3. Iterate to fixed point, or until some stopping condition is detected by the (stateful) heuristic function.

This could be implemented by mostly copying the existing code for the monomorphizer, tweaking CANDIDATE-SELECTION and REWRITE-CALLSITES appropriately (e.g. REWRITE-CALLSITES would contain the AST splicing/substitution implemented in this PR). We could also factor out this "select + rewrite candidate applications" and then (re)implement the monomorphizer and this inliner using the common logic.

@stylewarning
Copy link
Member

@kartik-s I think these are great ideas, but I think for this PR, we should rename inline to always-inline, and then come up with a smarter inlining plan (in the spirit of your comment) as a separate consideration.

@digikar99 Would it be possible to get this MR rebased, if it's not too much trouble?

@digikar99
Copy link
Contributor Author

digikar99 commented Jun 22, 2024

Sorry, I missed the last review by eliaslfox! I will be putting this on priority for the upcoming week or two.

I think heuristic inlining and a monomorphize-like inline should rather be in a separate PR. Even implementing basic inline-ing seems to require a fair amount of changes.

The currently missing "feature" for basic inlining is method inlining, that will allow inlining the add method of the single-float instance below:

(coalton-toplevel
  (define-class (bar :a)
    (add (:a -> :a -> :a))))

(coalton-toplevel
    (define-instance (bar single-float)
    (inline) ; <--- THIS ATTRIBUTE
    (define (add x y)
      (lisp single-float (x) (cl:+ x x)))))

(coalton-toplevel
  (declare add-caller (single-float -> single-float -> single-float))
  (define (add-caller x y)
    (add x y)))

Earlier I had mistakenly made changes in src/typechecker/define-class.lisp to achieve the same. But it seems the appropriate place for changes might be src/typechecker/define-instance.lisp. May be no changes are required to define-instance.lisp either, and instead, setting the INLINE-P-TABLE in ENTRY-POINT would do the job.


I particularly need a review or suggestion on the ENTRY-POINT in src/entry.lisp. I've added an INLINE-P-TABLE which is populated from the contents of the parsed program. The INLINE-P slot in FUNCTION-ENV-ENTRY would be directly filled from INLINE-P-TABLE. This approach works for normal functions defined using toplevel DEFINE.

However, this approach runs into an issue while dealing with instance-methods. The inline attribute is present in the parsed object, but the parsed object does not contain the codegen-sym that should be marked as inline in INLINE-P-TABLE. The TY-INSTANCES and TOPLEVEL-INSTANCES contain codegen-sym, but they do not contain information about INLINE-P - or atleast my effort would be to avoid polluting them with the attributes. My current issue is: what is a good way to go from method objects of (parser:program-instances program) to the codegen-syms corresponding to those methods?

Or, is creating and filling the INLINE-P-TABLE in ENTRY-POINT a bad idea?

@digikar99
Copy link
Contributor Author

digikar99 commented Jun 23, 2024

Basic method inlining is now in place too! That leaves

  1. Error handling and tests
  2. Slot naming conventions: is inline-p okay, or should they be renamed to inline? Although, I hope it doesn't create confusion with set-method-inline and friends.
  3. Any other suggestions about some code change is a good or bad change.
  4. Further optimization: do we want to inline the case when the number of arguments differ from the number of parameters?
  5. Related trivial features (eg: inline attribute inside define-class)
  6. Anything else I might be overlooking.

Once those are in place, we can move the focus to "monomorphize-like" inline everything according to a heuristic, may be in a different PR.

(in-package #:coalton-impl/codegen/constant-propagation)

(defun constant-var-value (var constant-bindings)
(cdr (assoc var constant-bindings)))
Copy link
Collaborator

@amorphedstar amorphedstar Jul 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when checking that value exists later you need the whole cell, not just its cdr, which might still exist but be nil.
you can do this similar to gethash:

Suggested change
(cdr (assoc var constant-bindings)))
(let ((x (assoc var constant-bindings)))
(if (null x)
(values nil nil) ; not found
(values x t))))

(and then change usage of this function accordingly)
(edit: actually, if the cdr is always a node (if it exists) then I guess this would be unnecessary)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if only there were a type system (:

@digikar99
Copy link
Contributor Author

I think this is ready for further review now - I need opinions and suggestions to resolve the FIXMEs and the remaining tasks mentioned in the top comment.

@amorphedstar
Copy link
Collaborator

The main inlining pass is now merged in #1193, so for this PR, now it suffices to just keep the changes for handling syntax (inline) with the table and inline-p stuff. In heuristic-inline-applications in optimizer.lisp, we will just need the inline-p lookup and then (funcall heuristic code) can be replaced with (or inline-p (funcall heuristic code)). (Also, the current inline-applications will need to be updated to call heuristic-inline-applications with heuristic (constantly nil) when settings:*coalton-heuristic-inlining* is nil, instead of just skipping inlining entirely.)

Also, with the new actions in traverse.lisp, you should be able to update constant propagation to use traverse, passing the list the same way args are passed to *traverse* in heuristic-inline-applications and make-binding-list-actions.

@digikar99
Copy link
Contributor Author

Oops, I again missed an update here for quite a while! I have decreased my repository watch-level to a more appropriate level now!

Given the state of this PR and the updates that have taken place in coalton to enable inlining, I think I will issue two separate PRs for

  1. constant-propagation
  2. inlining syntax and connecting syntax with the inline and constant-propagation pass

@stylewarning
Copy link
Member

Oops, I again missed an update here for quite a while! I have decreased my repository watch-level to a more appropriate level now!

Given the state of this PR and the updates that have taken place in coalton to enable inlining, I think I will issue two separate PRs for

  1. constant-propagation

  2. inlining syntax and connecting syntax with the inline and constant-propagation pass

Great ideas!

@stylewarning
Copy link
Member

Closed in favor of the series of PRs that followed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants