Skip to content

Conversation

@agrawalreetika
Copy link
Member

@agrawalreetika agrawalreetika commented Jul 26, 2025

Description

Add quotes to column names and disable UpperCase conversion in BaseJdbcClient

This PR has 2 commits -

  1. Disable UpperCase conversion when the case-sensitive flag is enabled
  2. Add quotes to column names for renameColumn and dropColumn in BaseJdbcClient. Also adding storesUpperCaseIdentifiers() check for dropColumn, similar to renameColumn & addColumn
  3. Disable UpperCase conversion when case-sensitve flag is enabled for Oracle connector

Motivation and Context

  1. Disable UpperCase conversion when the case-sensitive flag is enabled
    As per the JDBC specification, storesUpperCaseIdentifiers() only applies to unquoted identifiers. Since Presto already quotes all identifiers when interacting with JDBC connectors, the result of this method should not influence identifier casing when caseSensitiveNameMatchingEnabled is true.

According to the JDBC specification:

/**
 * Retrieves whether this database treats mixed case *unquoted* SQL identifiers
 * as case insensitive and stores them in *upper case*.
 */
boolean storesUpperCaseIdentifiers();

This PR updates BaseJdbcClient to respect the newly introduced caseSensitiveNameMatchingEnabled flag when interacting with JDBC metadata.

Presto wraps all identifiers (schemas, tables, columns) with connector-specific delimiters (e.g., "MyTable"), which makes them case-sensitive as per the SQL standard. However, the existing logic blindly uppercases schema and table names when the JDBC DatabaseMetaData.storesUpperCaseIdentifiers() method returns true.

  1. Add a quote to column names for renameColumn and dropColumn, as most of the JDBC connectors' case-sensitivity depends on it
  2. Disable UpperCase conversion when case-sensitve flag is enabled for Oracle connector

Impact

Improve BaseJdbcClient for case-sensitive scenarios

Test Plan

Contributor checklist

  • Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

== NO RELEASE NOTE ==

@agrawalreetika agrawalreetika requested a review from a team as a code owner July 26, 2025 08:45
@prestodb-ci prestodb-ci added the from:IBM PR from IBM label Jul 26, 2025
@prestodb-ci prestodb-ci requested review from a team, BryanCutler and wanglinsong and removed request for a team July 26, 2025 08:45
@agrawalreetika agrawalreetika self-assigned this Jul 26, 2025
@prestodb-ci
Copy link
Contributor

@agrawalreetika imported this issue as lakehouse/presto #25628

Copy link
Contributor

@BryanCutler BryanCutler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @agrawalreetika this looks ok, but I'm wondering since storesUpperCaseIdentifiers only applies to unquoted identifiers, and presto wraps all identifiers, then should we even be checking storesUpperCaseIdentifiers at all?

@agrawalreetika
Copy link
Member Author

agrawalreetika commented Jul 30, 2025

storesUpperCaseIdentifiers

Thanks for your reveiw @BryanCutler . I am very sure, why is it kept in there.
I kept that as it is to keep the backward compatibility and have the feature as it is when case-sensitve flag is not on. That may require a spearate discussion if we have to decommision that altogther.

@BryanCutler
Copy link
Contributor

I kept that as it is to keep the backward compatibility and have the feature as it is when case-sensitve flag is not on. That may require a spearate discussion if we have to decommision that altogther.

It does sounds like it is already disabled by quoting the identifiers, but I'm not too sure, so there should be a larger discussion to make sure all cases are covered. Another thing I don't quite understand is if the db returns true for storesUpperCaseIdentifiers, then shouldn't that take precedence over caseSensitiveNameMatchingEnabled and presto should continue to capitalize the identifiers anyway?

@agrawalreetika
Copy link
Member Author

My understanding here is that when caseSensitiveNameMatchingEnabled we should keep the identifier as it is since Presto wraps the identifier with a delimiter before sending it to the actual data source.

And as storesUpperCaseIdentifiers should be considered for the cases when there is no delimiter as per documentation, so storesUpperCaseIdentifiers check is breaking case-sensitve senarios when caseSensitiveNameMatchingEnabled is enabled.
As per my understanding storesUpperCaseIdentifiers check should be removed from JDBC as we have delimtier added to identifeir always from Presto for the JDBC-based connector. But I haven't made any changes related to that to keep the Presto legacy behaviour as it is.

@BryanCutler
Copy link
Contributor

And as storesUpperCaseIdentifiers should be considered for the cases when there is no delimiter as per documentation, so storesUpperCaseIdentifiers check is breaking case-sensitve senarios when caseSensitiveNameMatchingEnabled is enabled.

But you say Presto will always wrap with a delimiter, so what cases would this be needed?

As per my understanding storesUpperCaseIdentifiers check should be removed from JDBC as we have delimtier added to identifeir always from Presto for the JDBC-based connector. But I haven't made any changes related to that to keep the Presto legacy behaviour as it is.

It is a change of behavior because if storesUpperCaseIdentifiers and caseSensitiveNameMatchingEnabled are true, then the identifiers are not capitalized anymore.

It seems ok to me, if all identifiers are quoted now then the result of storesUpperCaseIdentifiers doesn't really matter.
My only concern is these additions make the checks even more confusing and might be more difficult to straighten out in the future. I'll approve, but I think it would be a good idea to create a github issue about possibly removing storesUpperCaseIdentifiers check altogether and make a note of this in the code.

BryanCutler
BryanCutler previously approved these changes Aug 6, 2025
String sql = format(
"ALTER TABLE %s RENAME TO %s",
quoted(catalogName, oldTable.getSchemaName(), oldTableName),
quoted(newTableName));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are the changes here covered in other case-sensitive related tests?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oracle tests are diabled in Presto as of now, @namya28 is working on enabling it. I can add case-senetive related test class once its enabled. I tested this scenario with my oracle instance for now.

DatabaseMetaData metadata = connection.getMetaData();
String newTableName = newTable.getTableName();
String oldTableName = oldTable.getTableName();
if (metadata.storesUpperCaseIdentifiers() && !caseSensitiveNameMatchingEnabled) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was the storesUpperCaseIdentifiers just missing for Oracle before or is there something different with it? Please add these Oracle changes explicitly to the PR description.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OracleClient had renameTable implementation which had storesUpperCaseIdentifiers missing and converting table names to uppercase by default only for rename operation w/o check. So added it here. I can separate out this in different commit

String remoteSchema = toRemoteSchemaName(session, identity, connection, schemaTableName.getSchemaName());
String remoteTable = toRemoteTableName(session, identity, connection, remoteSchema, schemaTableName.getTableName());
if (uppercase) {
if (uppercase && !caseSensitiveNameMatchingEnabled) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we have one field for "caseInsensitiveNameMatching" and another "caseSensitiveNameMatchingEnabled". what are they each for?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review @rschlussel

caseInsensitiveNameMatching was originally introduced to support case-insensitive matching of schema and table names, useful for databases like MySQL. It works by converting identifiers to lowercase before matching, but this can cause issues when databases contain identifiers that differ only by case (e.g., Test vs TEST).

The new caseSensitiveNameMatchingEnabled flag offers broader support for case-sensitive handling of all identifiers — schema, table, and column names — via the normalizeIdentifier API. When enabled, Presto preserves the case of user-supplied identifiers, allowing connectors to apply database-specific case sensitivity rules. When disabled, Presto defaults to lowercasing identifiers. More details about this in here

@agrawalreetika
Copy link
Member Author

@BryanCutler PR is now rebased on the latest master also added a description for all 3 commits. Please take a look.
@rschlussel, please review the PR when you get a chance. Thanks!

Copy link
Member

@hantangwangd hantangwangd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the change, overall looks good to me exception one little question. By the way, is the Oracle test enabled?

quoted(handle.getCatalogName(), handle.getSchemaName(), handle.getTableName()),
jdbcColumn.getColumnName(),
newColumnName);
quoted(jdbcColumn.getColumnName()),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not very sure, but do we need to execute toUpperCase for jdbcColumn.getColumnName() in the above if clause as well?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hantangwangd So I tried checking this with Oracle -
The data returned in JdbcTableHandle & JdbcColumnHandle is from an existing DB (Oracle in this case) so its returned as UPPERCASE for the database -
image

So I think its conversion is not added already in BaseJdbcClient already renameColumn & dropColumn but I think there is no harm adding schema, table and column name conversion inside metadata.storesUpperCaseIdentifiers() check even for these 2 operations as well like its done in addColumn
Lmk what do you think?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And about enabling tests for Oracle, its being done as part of #25762

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, agree with you that adding these conversions into the storesUpperCaseIdentifiers() check clause brings no harm, but just make the code more consistent and easier to understand.

And about enabling tests for Oracle, its being done as part of #25762

Thanks for the message. Got it!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made changes, please review when you get a chance.

Copy link
Member

@hantangwangd hantangwangd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix, LGTM!

@agrawalreetika agrawalreetika merged commit dd1cf90 into prestodb:master Oct 1, 2025
74 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

from:IBM PR from IBM

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants