Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add syntax table property for comment semantic tokens. #3696

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

JimDBh
Copy link

@JimDBh JimDBh commented Aug 26, 2022

This is important to let emacs understand that certain parts of the code (e.g. wrapped in C marcos) are not in effect and should be by-passed for things like forward-sexp.

For example, if we have the following code in emacs with c-mode:

#include <stdio.h>

#define SOME_MACRO 1

int myFun(int par) { // => This bracelet has no matching
  // ...
#if SOME_MACRO
  if (par == 0) { // => This incorrectly matches with the bracelet at the end of function
    printf("example check 0\n");
#else
  if (par == 1) { // => This should have been ignored by forward-sexp
    printf("example check 1\n");
#endif
    // ...
  }
  return 0;
} // => this bracelet matches with the wrong one

void someFun() {
  // Do something.
  myFun(1);
  return; // => beginning-of-defun won't work here
}

With this patch, these should work fine.

This is important to let emacs understand that certain parts of
the code (e.g. wrapped in C marcos) are not in effect and should
be by-passed for things like forward-sexp.
@yyoncho
Copy link
Member

yyoncho commented Aug 26, 2022

@sebastiansturm @ericdallo - willing to review this one?

Copy link
Member

@yyoncho yyoncho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I am not familiar with this code, I will let the others review it. Meanwhile, a few elisp nits:

lsp-semantic-tokens.el Outdated Show resolved Hide resolved
lsp-semantic-tokens.el Outdated Show resolved Hide resolved
Copy link
Member

@ericdallo ericdallo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good but I'd like an OK from @sebastiansturm as well 😅

@sebastiansturm
Copy link
Contributor

sorry for being late, have been away for a week. The feature looks very useful, though I'm not sure how to test it. I've set lsp-semantic-tokens-set-comment-syntax to t, and lsp-semantic-tokens--put-comment-syntax is being called, but comment-search-forward fails to return a match (in a small C++ buffer with an ifdef-disabled part that correctly gets fontified using lsp-face-semhl-comment); searching for the regexp \s< doesn't turn up anything either. Any hints for me on how I might verify that the feature is actually working?
I'll add some preliminary comments to the review section now

(setq lsp-semantic-tokens-set-comment-syntax t))
(font-lock-flush))

(defun lsp-semantic-tokens--get-overlapping-comments (beg end)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with the way this is used, I think it'd make sense to always return non-nil beg and end, with beg or end remaining unmodified if they don't happen to overlap with comment tokens. Maybe the function could also be renamed to something like extend-region-to-include-comment-tokens or something like that (plus the prefix, of course), in my opinion that would be slightly more intuitive

(cons prev-beg next-end)))

(defun lsp-semantic-tokens--remove-comment-syntax-strict (beg end)
"Remove all commnet syntax strictly in (BEG END), even if they
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor typo (commnet)

(let ((beg-match (prop-match-beginning loc))
(end-match (prop-match-end loc)))
(remove-text-properties beg-match end-match '(lsp-semantic-token--comment-beg))
(cl-loop for i from beg-match below end-match do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it necessary to loop here (i.e., can't beg-match and end-match simply be passed to {put,remove}-text-propert{y,ies}?

(remove-text-properties i (1+ i) '(lsp-semantic-token--previous-syntax-table)))))
;; Remove comment ends
(goto-char end)
(cl-do ((loc (text-property-search-backward
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the body form of cl-do here is very similar to the one above, might make sense to extract it (or not, given that it's not very long -- I'm not sure)

(setq beg new-beg))
(when new-end
(setq end new-end)))
(lsp-semantic-tokens--remove-comment-syntax-strict beg end)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given that lsp--semantic-tokens-fontify removes comment syntax right at the outset, this call to lsp-semantic-tokens--remove-comment-syntax-strict might needlessly degrade performance? Or is this call necessary for some reason I don't see right now?

@JimDBh
Copy link
Author

JimDBh commented Aug 29, 2022

Thanks for the review! I'll address the code reviews next. As for testing the code, in my local tests, comment-search forward does find the "disabled" lines. Also paren/bracelet matchings are ignoring them. I'm using the provide snippet in this MR to test. Could you let me know what code you used for testing so I can reproduce? Thanks!

@sebastiansturm
Copy link
Contributor

this is what I'm using (with clangd, lsp-semantic-tokens-set-comment-syntax set to t):

int main() {
  int x = 0;
#ifdef NOT_DEFINED
  int y = 1;
#endif
  return x;
}

lsp-semantic-tokens--put-comment-syntax gets called with the right begin/end arguments, no luck with comment-search-forward though

@JimDBh
Copy link
Author

JimDBh commented Aug 29, 2022

Thanks! However I was able to do comment-search-forward from the start of the buffer, and it ends up at the "i" character in the #ifdef macro (not sure why it wouldn't just be at the "#" sign, need to look into this a bit more)

@JimDBh
Copy link
Author

JimDBh commented Aug 30, 2022

Upon further testing with a very large c-mode file with MANY such marcos, this patch makes lsp-mode lag a lot. The culprit seems to be text-property-search-forward/backward. I will see if we can keep a buffer-local cache of the added comment start/end pairs, and use that in the functions. In this way it should be a faster hopefully, and we also don't have to rely on text-property-search-forward, which is not available for 26.x.

@yyoncho
Copy link
Member

yyoncho commented Nov 21, 2022

In this way it should be a faster hopefully, and we also don't have to rely on text-property-search-forward, which is not available for 26.x.

I think we can go ahead and drop 26.x at this point. Not sure if this helps solving your issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants