Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix crash by protecting tm_ctags_*() functions against TM_PARSER_NONE #3865

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

techee
Copy link
Member

@techee techee commented May 4, 2024

This can happen when in the filetype configuration we set

[settings]
tag_parser=

This then leads to this crash in the ctags code:

#0  countKinds (kcb=0x0) at main/kind.c:230
#1  0x0000fffff7e802c8 in countLanguageKinds (language=language@entry=-2)
    at main/parse.c:300
#2  0x0000fffff7e15f88 in tm_ctags_get_lang_kinds (lang=lang@entry=-2)
    at tm_ctags.c:467
#3  0x0000fffff7e173e8 in read_ctags_file
    (file_tags=0xaaaaaad7e260, lang=-2, tags_file=0xaaaaaaf86de0 "/usr/local/share/geany/tags/std.py.tags") at tm_source_file.c:313
#4  tm_source_file_read_tags_file
    (tags_file=tags_file@entry=0xaaaaaaf86de0 "/usr/local/share/geany/tags/std.py.tags", mode=-2) at tm_source_file.c:552
#5  0x0000fffff7e1a4dc in tm_workspace_load_global_tags
    (tags_file=tags_file@entry=0xaaaaaaf86de0 "/usr/local/share/geany/tags/std.py.tags", mode=<optimized out>) at tm_workspace.c:379
#6  0x0000fffff7c9937c in symbols_load_global_tags
    (tags_file=0xaaaaaaf86de0 "/usr/local/share/geany/tags/std.py.tags", ft=ft@entry=0xaaaaab4cdf40) at symbols.c:166
#7  0x0000fffff7c997bc in load_user_tags (ft_id=GEANY_FILETYPES_PYTHON)
    at symbols.c:1377
#8  symbols_global_tags_loaded (file_type_idx=<optimized out>) at symbols.c:193
#9  0x0000fffff7c53bdc in document_load_config
    (doc=0xaaaaac1c90d0, type=0xaaaaab4cdf40, filetype_changed=<optimized out>)
    at document.c:2820
#10 0x0000fffff7c53cc8 in document_set_filetype
    (doc=0xaaaaac1c90d0, type=0xaaaaab4cdf40) at document.c:2859
#11 0x0000fffff7c55ac4 in document_open_file_full (doc=<optimized out>, 
    doc@entry=0x0, filename=<optimized out>, pos=pos@entry=0, readonly=readonly@entry=0, ft=ft@entry=0x0, forced_enc=forced_enc@entry=0x0) at document.c:1490
#12 0x0000fffff7c55b00 in document_open_file
    (locale_filename=<optimized out>, readonly=readonly@entry=0, ft=ft@entry=0x0, forced_enc=forced_enc@entry=0x0) at document.c:904

Until there's some LSP support in Geany, setting tag_parser to the empty value is necessary to disable ctags parsing so it doesn't clash with the LSP implementation.

This can happen when in the filetype configuration we set

[settings]
tag_parser=

This then leads to a crash in the ctags code.
@techee techee added the bug label May 4, 2024
@elextr
Copy link
Member

elextr commented May 5, 2024

What does upstream say about the changes?

What is the setting set to if the tag_parser= is not present? And can tag_parser= mean that?

@techee
Copy link
Member Author

techee commented May 5, 2024

What does upstream say about the changes?

The change is in our sources, not the upstream ones.

What is the setting set to if the tag_parser= is not present?

For most languages the parser is hard-coded using the table here:

static void init_builtin_filetypes(void)

I think only a few non-builtin filetypes like JSON use the tag_parser option to specify the parser by name. But if the name is invalid or not provided, we shouldn't crash which this PR tries to do.

And can tag_parser= mean that?

Mean what?

@techee
Copy link
Member Author

techee commented May 5, 2024

For most languages the parser is hard-coded using the table here: ...
I think only a few non-builtin filetypes like JSON use the tag_parser option to specify the parser by name. But if the name is invalid or not provided, we shouldn't crash which this PR tries to do.

And also the hard-coded parsers can be overridden by this option so by specifying e.g. tag_parser=Python for the C filetype, we'd start using the Python parser for it (doesn't make much sense in this case). So by specifying tag_parser= we override the C parser to "no parser" so TM is completely disabled for C and LSP doesn't interfere with it.

@elextr
Copy link
Member

elextr commented May 5, 2024

Sorry, I wasn't clear, many of the custom filetypes have no parser and don't crash. So what do they set the tag_parser to? And can tag_parser= be set to the same value that successfully works for custom filetypes.

@techee
Copy link
Member Author

techee commented May 5, 2024

Sorry, I wasn't clear, many of the custom filetypes have no parser and don't crash. So what do they set the tag_parser to?

The crash only happened because Geany was trying to load the global tags file for Python (data/tags/std.py.tags) and it was Python for which I set tag_parser=. The custom filetypes that don't use tag_parser also set the parser to TM_PARSER_NONE but don't have any global tags file so the crash doesn't happen for them.

I could have also fixed this particular crash by checking the parser for TM_PARSER_NONE inside e.g. tm_workspace_load_global_tags () but I think it's a good idea to protect all the ctags-interfacing functions against this value.

And can tag_parser= be set to the same value that successfully works for custom filetypes.

Well, I think we should be able to allow users disable some ctags parser if it e.g. causes crashes and tag_parser= is the way to do it currently. And setting the parser of say the Pascal filetype to the C parser isn't a good idea as the C parser will be constantly confused by Pascal sources.

@elextr
Copy link
Member

elextr commented May 5, 2024

The custom filetypes that don't use tag_parser also set the parser to TM_PARSER_NONE but don't have any global tags file so the crash doesn't happen for them.

Ok, it didn't make sense that TM_PARSER_NONE worked for custom, but not built-in, but in fact that we are doing something that just luck says doesn't happen to be used for custom filetypes. Presumably if someone made a hand made global tags file for a custom filetype it would crash as well. So yeah it needs to be protected.

So the question goes back to, since the proposed changes are to ctags will upstream accept the changes so we don't have to keep a patch to ctags?

@techee
Copy link
Member Author

techee commented May 5, 2024

So the question goes back to, since the proposed changes are to ctags will upstream accept the changes so we don't have to keep a patch to ctags?

See the first answer here: #3865 (comment) :-). The file tm_ctags.c is our source that hides the ctags "API" behind some thin layer and it is only this file that includes ctags headers so we don't pollute the rest of our sources with symbols included from ctags.

@elextr
Copy link
Member

elextr commented May 5, 2024

The file tm_ctags.c is our source that hides the ctags "API" behind some thin layer and it is only this file that includes ctags headers so we don't pollute the rest of our sources with symbols included from ctags.

Too easy then.

@techee techee added the lsp Related to LSP API and plugins label May 11, 2024
@techee techee added this to the 2.1 milestone May 11, 2024
Copy link
Member

@b4n b4n left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, apart possibly the inline comments.
Sad we have to do that in several functions, but makes sense.

Comment on lines +492 to +497
kindDefinition *def;

if (lang == TM_PARSER_NONE)
return "unknown";

def = getLanguageKindForLetter(lang, kind);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe something like this to avoid repeating the fallback value?

Suggested change
kindDefinition *def;
if (lang == TM_PARSER_NONE)
return "unknown";
def = getLanguageKindForLetter(lang, kind);
kindDefinition *def = NULL;
if (lang != TM_PARSER_NONE)
def = getLanguageKindForLetter(lang, kind);

Comment on lines +505 to +510
kindDefinition *def;

if (lang == TM_PARSER_NONE)
return '-';

def = getLanguageKindForName(lang, name);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto here, maybe:

Suggested change
kindDefinition *def;
if (lang == TM_PARSER_NONE)
return '-';
def = getLanguageKindForName(lang, name);
kindDefinition *def = NULL;
if (lang != TM_PARSER_NONE)
def = getLanguageKindForName(lang, name);

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug lsp Related to LSP API and plugins
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants