[core] CPD is always case sensitive #4396
Comments
The Apex tokenizer has an option to be case-insensitive:
#4397 changes this option into a language property: https://github.com/pmd/pmd/pull/4397/files#diff-9320afd0816587cbe6b47f1b793f39e1987484e35efedf59c3e63453877e12fdR46
jsotuyod added and then removed the needs:pmd7-revalidation label ("The issue hasn't yet been retested vs PMD 7 and may be stale") on Apr 2, 2024.
Affects PMD Version: 6.x
Description:
Some languages like PL/SQL or the new T-SQL (#4390) are case-insensitive. Tokenizing works correctly for them, i.e. the lexers are agnostic to casing. JavaCC has a grammar option for this (IGNORE_CASE), and ANTLR has had one since 4.10 (caseInsensitive).
However, when we convert the original tokens into CPD TokenEntries, we don't seem to use the token kind; we use the original token text, which preserves the original casing. It's therefore very easy to evade duplication detection for these languages by just changing the casing:
results correctly in:
since file1.plsql and file2.plsql are identical.
However, comparing file1.plsql and file3.plsql which differ only in casing, shows no duplications:
I think this problem affects both JavaCC- and ANTLR-based languages.
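To illustrate the behavior described above, here is a minimal sketch of how folding the token text before comparison would make the two differently cased files match. It is not PMD's actual API: the class `CaseFoldSketch`, the methods `normalize` and `normalizeAll`, and the `caseSensitive` flag are hypothetical names, and the token lists are made-up stand-ins for the lexer output of file1.plsql and file3.plsql.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch, not PMD code: shows the effect of folding case
// on the token text before it is used for duplicate detection.
public class CaseFoldSketch {

    /** Returns the text that would be compared for a single token. */
    static String normalize(String image, boolean caseSensitive) {
        // Locale.ROOT avoids locale-specific surprises (e.g. Turkish dotless i).
        return caseSensitive ? image : image.toLowerCase(Locale.ROOT);
    }

    /** Applies normalize to every token image in order. */
    static List<String> normalizeAll(List<String> images, boolean caseSensitive) {
        List<String> result = new ArrayList<>();
        for (String image : images) {
            result.add(normalize(image, caseSensitive));
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up token images standing in for two files that differ only in casing.
        List<String> file1 = Arrays.asList("SELECT", "name", "FROM", "employees");
        List<String> file3 = Arrays.asList("select", "NAME", "from", "Employees");

        // Comparing raw token text: the sequences differ, so no duplicate is found.
        System.out.println(normalizeAll(file1, true).equals(normalizeAll(file3, true)));

        // Comparing folded token text: the sequences are identical.
        System.out.println(normalizeAll(file1, false).equals(normalizeAll(file3, false)));
    }
}
```

A real fix would probably need to exempt tokens whose casing is significant even in a case-insensitive language, such as string literals; the sketch folds everything for brevity.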