-
Notifications
You must be signed in to change notification settings - Fork 5.5k
feat(plugin-oracle): Implemented getTableStatistics for oracle connector #26120
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Reviewer's GuideThis PR extends OracleClient by overriding getTableStatistics to fetch and parse table-level and column-level statistics from Oracle’s DBA_TAB_STATISTICS and ALL_TAB_COLUMNS views, converting them into Presto’s TableStatistics and ColumnStatistics via helper methods and enriched logging. Sequence diagram for getTableStatistics data retrieval and conversionsequenceDiagram
participant OC as OracleClient
participant CF as connectionFactory
participant DB as OracleDB
participant TS as TableStatistics
participant CS as ColumnStatistics
OC->>CF: openConnection(JdbcIdentity)
CF->>DB: Connect
OC->>DB: Execute SQL for table stats (DBA_TAB_STATISTICS)
DB-->>OC: Return NUM_ROWS, AVG_ROW_LEN, LAST_ANALYZED
OC->>DB: Execute SQL for column stats (ALL_TAB_COLUMNS)
DB-->>OC: Return column stats
OC->>CS: Build ColumnStatistics for each column
OC->>TS: Build TableStatistics with column stats and row count
OC-->>Caller: Return TableStatistics
Class diagram for updated OracleClient with getTableStatisticsclassDiagram
class OracleClient {
+getTableStatistics(session, handle, columnHandles, tupleDomain) TableStatistics
-getColumnStaticsSql(handle) String
-toDouble(number) double
}
OracleClient --|> BaseJdbcClient
class TableStatistics {
+builder()
+empty()
+setColumnStatistics(columnStatisticsMap)
+setRowCount(rowCount)
}
class ColumnStatistics {
+builder()
+setDataSize(dataSize)
+setNullsFraction(nullsFraction)
+setDistinctValuesCount(distinctValuesCount)
+setRange(DoubleRange)
}
class DoubleRange {
+DoubleRange(low, high)
}
OracleClient --> TableStatistics
OracleClient --> ColumnStatistics
ColumnStatistics --> DoubleRange
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there - I've reviewed your changes - here's some feedback:
- Wrap the PreparedStatement and ResultSet objects in try-with-resources (not just the Connection) to ensure all JDBC resources are properly closed.
- Use SLF4J’s {} placeholders instead of %s in log messages so parameters are substituted correctly.
- Avoid building SQL via String.format with unescaped identifiers—use bind parameters or proper identifier quoting to prevent SQL injection or quoting errors.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Wrap the PreparedStatement and ResultSet objects in try-with-resources (not just the Connection) to ensure all JDBC resources are properly closed.
- Use SLF4J’s {} placeholders instead of %s in log messages so parameters are substituted correctly.
- Avoid building SQL via String.format with unescaped identifiers—use bind parameters or proper identifier quoting to prevent SQL injection or quoting errors.
## Individual Comments
### Comment 1
<location> `presto-oracle/src/main/java/com/facebook/presto/plugin/oracle/OracleClient.java:152-154` </location>
<code_context>
}
+ @Override
+ public TableStatistics getTableStatistics(ConnectorSession session, JdbcTableHandle handle, List<JdbcColumnHandle> columnHandles, TupleDomain<ColumnHandle> tupleDomain)
+ {
+ try {
+ Preconditions.checkNotNullOrEmpty(handle.getSchemaName(), "schema name");
+ Preconditions.checkNotNullOrEmpty(handle.getTableName(), "table name");
</code_context>
<issue_to_address>
**issue:** Consider handling the case where NUM_ROWS is zero to avoid division by zero.
Guard against numRows being zero when computing nullsFraction to prevent runtime errors and ensure accurate statistics.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
presto-oracle/src/main/java/com/facebook/presto/plugin/oracle/OracleClient.java
Show resolved
Hide resolved
| public TableStatistics getTableStatistics(ConnectorSession session, JdbcTableHandle handle, List<JdbcColumnHandle> columnHandles, TupleDomain<ColumnHandle> tupleDomain) | ||
| { | ||
| try { | ||
| Preconditions.checkNotNullOrEmpty(handle.getSchemaName(), "schema name"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a requireNonNull should suffice for this
presto-oracle/src/main/java/com/facebook/presto/plugin/oracle/OracleClient.java
Outdated
Show resolved
Hide resolved
| "FROM ALL_TAB_COLUMNS\n" + | ||
| "WHERE OWNER = '%s'\n" + | ||
| " AND TABLE_NAME = '%s' AND COLUMN_NAME = '%s') "); | ||
| StringBuffer sqlQueryBuffer = new StringBuffer(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Why not StringBuilder ?
presto-oracle/src/main/java/com/facebook/presto/plugin/oracle/OracleClient.java
Outdated
Show resolved
Hide resolved
presto-oracle/src/main/java/com/facebook/presto/plugin/oracle/OracleClient.java
Outdated
Show resolved
Hide resolved
presto-oracle/src/main/java/com/facebook/presto/plugin/oracle/OracleClient.java
Outdated
Show resolved
Hide resolved
|
@ScrapCodes fyi, there is an effort to re-enable Oracle tests here - #25762 |
78b7109 to
79706f4
Compare
314094b to
dca512e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there - I've reviewed your changes and found some issues that need to be addressed.
- Consider using try-with-resources for PreparedStatements and ResultSets to ensure they are always closed and avoid resource leaks.
- Build SQL queries using parameterized statements or proper identifier quoting instead of String.format to prevent SQL injection and handle edge cases.
- Use '{}' placeholders in Logger calls instead of '%s' for consistent formatting with the logging framework.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Consider using try-with-resources for PreparedStatements and ResultSets to ensure they are always closed and avoid resource leaks.
- Build SQL queries using parameterized statements or proper identifier quoting instead of String.format to prevent SQL injection and handle edge cases.
- Use '{}' placeholders in Logger calls instead of '%s' for consistent formatting with the logging framework.
## Individual Comments
### Comment 1
<location> `presto-oracle/src/main/java/com/facebook/presto/plugin/oracle/OracleClient.java:150` </location>
<code_context>
}
+ @Override
+ public TableStatistics getTableStatistics(ConnectorSession session, JdbcTableHandle handle, List<JdbcColumnHandle> columnHandles, TupleDomain<ColumnHandle> tupleDomain)
+ {
+ try {
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Consider closing PreparedStatements explicitly to avoid resource leaks.
PreparedStatements (preparedStatement and preparedStatementCol) should also be closed explicitly to ensure all JDBC resources are released, especially under high query loads.
Suggested implementation:
```java
try (Connection connection = connectionFactory.openConnection(JdbcIdentity.from(session));
PreparedStatement preparedStatement = getPreparedStatement(session, connection, sql)) {
```
If there is another PreparedStatement (e.g., `preparedStatementCol`) created in this method, you should also include it in the try-with-resources statement:
try (Connection connection = ...;
PreparedStatement preparedStatement = ...;
PreparedStatement preparedStatementCol = ...) {
// use preparedStatement and preparedStatementCol
}
If the method creates PreparedStatements in different scopes, ensure each is closed in a try-with-resources or explicitly in a finally block.
</issue_to_address>
### Comment 2
<location> `presto-oracle/src/main/java/com/facebook/presto/plugin/oracle/OracleClient.java:174` </location>
<code_context>
+ PreparedStatement preparedStatementCol = getPreparedStatement(session, connection, getColumnStaticsSql(handle));
+ resultSet = preparedStatementCol.executeQuery();
+ Map<ColumnHandle, ColumnStatistics> columnStatisticsMap = new HashMap<>();
+ Map<String, JdbcColumnHandle> columnHandleMap = Maps.uniqueIndex(columnHandles, JdbcColumnHandle::getColumnName);
+ while (resultSet.next() && numRows > 0) {
+ String columnName = resultSet.getString("COLUMN_NAME");
</code_context>
<issue_to_address>
**issue:** Handle missing column names gracefully when populating columnStatisticsMap.
If columnHandleMap.get(columnName) returns null, this null key will be added to columnStatisticsMap, which may cause errors later. Please add a null check and handle such cases appropriately.
</issue_to_address>
### Comment 3
<location> `presto-oracle/src/main/java/com/facebook/presto/plugin/oracle/OracleClient.java:239` </location>
<code_context>
+ // varchar.
+ LOG.debug(e, "error while decoding : %s", number);
+ }
+ return NaN;
+ }
+
</code_context>
<issue_to_address>
**issue (bug_risk):** Use Double.NaN for clarity and consistency.
'NaN' is undefined here; replace it with 'Double.NaN' to prevent compilation errors.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
presto-oracle/src/main/java/com/facebook/presto/plugin/oracle/OracleClient.java
Show resolved
Hide resolved
dca512e to
d1aec45
Compare
d1aec45 to
d9f17eb
Compare
|
@ScrapCodes please fix the release note. |
Description
There is a separate PR enabling tests, once that is in - we can work on the tests.
Following unsolved problems exist:
Motivation and Context
Stats can improve the plans involving Oracle connector.
Impact
Test Plan
Contributor checklist