C: CodeQL seems to be confused by __attribute__((weak)) #18806

Open
randomdude opened this issue Feb 17, 2025 · 1 comment
Labels
question Further information is requested

Comments

@randomdude

Hi all. I'm seeing some unexpected behavior, and I can't explain it - perhaps I'm misunderstanding C, or perhaps I've found a bug in CodeQL.

I create a project comprising two source files. The first, a.c:

#include <stdio.h>
void __attribute__((weak)) foo() { printf("The weak func"); }
void main() { }

And the second, b.c:

#include <stdio.h>
void foo() { printf("The strong func"); }

I build these with the Makefile (supplied for completeness, excuse my verbosity):

a:
        gcc -o a a.c b.c

As you can see, foo is defined twice: once as a weak symbol in a.c and once as an ordinary (strong) symbol in b.c.

I then run the following CodeQL query to list all functions, and an attribute of each.

from Function f select f, f.getAnAttribute().toString()

I run it via the following commands:

./codeql database create a --language=c --overwrite --source-root /home/aliz/a/
./codeql database analyze  --format=csv --output=results a foo --rerun

This results in the following unexpected output:

"foo","foo","error","weak","/a.c","2","28","2","30"
"foo","foo","error","weak","/b.c","2","6","2","8"

As you can see, both functions have been identified, but both have been marked with the weak attribute. I expected only the first, the one in a.c, to be marked as weak.
I initially thought this was due to some subtle C behavior beyond my understanding, but on closer inspection it does appear that CodeQL has 'mixed up' the two functions. I ran the following query, intended to list the string literals for each function:

from
        StringLiteral str,
        Function f
where
    str.getEnclosingFunction() = f
select f, str.getValue()

This results in the following:

"foo","foo","error","The weak func
The strong func","/a.c","2","28","2","30"
"foo","foo","error","The weak func
The strong func","/b.c","2","6","2","8"

Here, both functions are reported, but each is reported as referencing both strings, which appears incorrect to me. I would have expected something like the following:

"foo","foo","error","The weak func","/a.c","2","28","2","30"
"foo","foo","error","The strong func","/b.c","2","6","2","8"

Can anyone shed some light on the issue here? Have I really stumbled into a CodeQL bug, or is this due to some wizard-level C behavior? Thanks for any help!

randomdude added the question label Feb 17, 2025
@jketema
Contributor

jketema commented Feb 18, 2025

Hi @randomdude,

We do not attempt in any way to simulate ELF's weak linking semantics. This means you'll end up with two copies of the foo function in the database (instead of just the strongly linked one). CodeQL cannot really differentiate between the two copies (they will have the same name in the database), which gives the behavior you're seeing. The FunctionDeclarationEntry elements for the two definitions should still be appropriately separated, I believe.
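
As a rough illustration of the FunctionDeclarationEntry suggestion, a query along the following lines might recover the per-file pairing of definitions and strings. This is only a sketch and has not been run against the database above. It assumes the standard `cpp` library, and it assumes that the string literals of each definition sit in the same file as that definition, which holds for a.c and b.c here but is not guaranteed in general.

```ql
import cpp

// Sketch: pair each *definition* of a function with the string literals
// found in the same file, instead of relying on the merged Function entity.
from FunctionDeclarationEntry fde, StringLiteral str
where
  // Only consider entries that carry a body, not bare declarations.
  fde.isDefinition() and
  // Both copies of foo map to the same Function, so this alone is not enough...
  str.getEnclosingFunction() = fde.getFunction() and
  // ...the file comparison is what tells the two definitions apart (assumption).
  str.getFile() = fde.getFile()
select fde, fde.getFile().getBaseName(), str.getValue()
```

The design choice is to key on the declaration entry's location rather than on the Function itself, since the two copies of foo share one name in the database.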
