-
Notifications
You must be signed in to change notification settings - Fork 5.5k
Add support for ALTER COLUMN SET DATA TYPE in the Iceberg connector #25418
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
Codenotify: Notifying subscribers in CODENOTIFY files for diff 709047e...7b915f5.
|
|
|
85658a0 to
fb22b19
Compare
steveburnett
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! (docs)
Pull branch, local doc build, looks good. Thanks!
|
@imjalpreet imported this issue as lakehouse/presto #25418 |
hantangwangd
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the work, the whole change overall looks good! Just a few things to address:
1. As I see, the ShortDecimalType -> LongDecimalType conversion (e.g. decimal(10, 2) -> decimal(24, 2)) might cause problems in PARQUET format Iceberg table.
2. Since this change primarily focuses on supporting ALTER COLUMN TYPE in Iceberg, we should add test cases in Iceberg covering all type conversions (int->long, float->double, decimal->decimal, v2-compatible referring here), verifying that both query results and SHOW CREATE TABLE outputs meet expectations. An example test case may look as follows:
@Test
public void testAlterColumnType()
{
testWithAllFileFormats(this::testAlterColumnType);
}
private void testAlterColumnType(Session session, FileFormat fileFormat)
{
String schemaName = session.getSchema().get();
String tableName = "test_alter_column_type_" + randomTableSuffix();
assertUpdate("CREATE TABLE " + tableName + "(a int, b real, c decimal(4, 2)) with (\"write.format.default\" = '" + fileFormat + "')");
assertUpdate("INSERT INTO " + tableName + " VALUES(1, 1.1, 1.21)", 1);
assertUpdate("INSERT INTO " + tableName + " VALUES(2, 2.2, 2.34)", 1);
assertQuery("SELECT * from " + tableName, "VALUES(1, 1.1, 1.21), (2, 2.2, 2.34)");
validateShowCreateTable(tableName,
ImmutableList.of(
columnDefinition("a", "integer"),
columnDefinition("b", "real"),
columnDefinition("c", "decimal(4,2)")),
getCustomizedTableProperties(ImmutableMap.of(
"write.format.default", "'" + fileFormat + "'",
"location", "'" + getLocation(schemaName, tableName) + "'")));
assertUpdate("ALTER TABLE " + tableName + " ALTER COLUMN a SET DATA TYPE bigint");
assertUpdate("ALTER TABLE " + tableName + " ALTER COLUMN b SET DATA TYPE double");
assertUpdate("ALTER TABLE " + tableName + " ALTER COLUMN c SET DATA TYPE decimal(10, 2)");
assertQuery("SELECT * from " + tableName, "VALUES(1, 1.1, 1.21), (2, 2.2, 2.34)");
validateShowCreateTable(tableName,
ImmutableList.of(
columnDefinition("a", "bigint"),
columnDefinition("b", "double"),
columnDefinition("c", "decimal(10,2)")),
getCustomizedTableProperties(ImmutableMap.of(
"write.format.default", "'" + fileFormat + "'",
"location", "'" + getLocation(schemaName, tableName) + "'")));
assertUpdate("ALTER TABLE " + tableName + " ALTER COLUMN c SET DATA TYPE decimal(24, 2)");
assertQuery("SELECT * from " + tableName, "VALUES(1, 1.1, 1.21), (2, 2.2, 2.34)");
validateShowCreateTable(tableName,
ImmutableList.of(
columnDefinition("a", "bigint"),
columnDefinition("b", "double"),
columnDefinition("c", "decimal(24,2)")),
getCustomizedTableProperties(ImmutableMap.of(
"write.format.default", "'" + fileFormat + "'",
"location", "'" + getLocation(schemaName, tableName) + "'")));
dropTable(getSession(), tableName);
}
Please feel free to adopt it or modify it to your needs.
pratyakshsharma
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Took first pass. One high level comment, we can add documentation as well around this feature on https://prestodb.io/docs/current/sql/alter-table.html and update Iceberg connector with the limitation clearly stating what data type changes are we supporting currently.
| HiveColumnConverterProvider.DEFAULT_COLUMN_CONVERTER_PROVIDER, | ||
| context.getWarningCollector(), | ||
| context.getRuntimeStats()); | ||
| if (isTableOwner(transaction, identity, metastoreContext, tableName)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the check should be reversed here I think.
| .commit(); | ||
| } | ||
| catch (RuntimeException e) { | ||
| throw new PrestoException(ICEBERG_UNKNOWN_TABLE_TYPE, "Failed to set column type: " + firstNonNull(e.getMessage(), e), e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are we expecting one of them to be null here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why ICEBERG_UNKNOWN_TABLE_TYPE ?
Should be unknown column name or incompatible column type.
| return immediateFuture(null); | ||
| } | ||
|
|
||
| private static List<RowType.Field> getCandidates(Type type, String fieldName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this method is not getting used and can be reverted.
|
|
||
| protected ConnectorContext context; | ||
|
|
||
| protected QueryStateMachine queryStateMachine; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this is not getting used.
| return Optional.empty(); | ||
| } | ||
|
|
||
| public int getCreateTableCallCount() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not getting used.
| return connectorCapabilities; | ||
| } | ||
|
|
||
| public void setConnectorCapabilities(ConnectorCapabilities... connectorCapabilities) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this as well
| protected WarningCollector warningCollector; | ||
|
|
||
| private Session testSession; | ||
| private TableHandle tableHandle; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this can be converted to a local variable.
| { | ||
| private static final String CATALOG_NAME = "catalog"; | ||
| public static final String SCHEMA = "schema"; | ||
| private CatalogManager catalogManager; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please check intellij suggestions here. There are few variables which can be converted to a local variable. Please do the needful
PingLiuPing
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A general question, does Presto support alter column type for other connector without this PR?
PingLiuPing
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please also add test case to cover the iceberg primitive type promotions.
See https://iceberg.apache.org/spec/#schema-evolution
fb22b19 to
7b915f5
Compare
Description
Add support for ALTER COLUMN SET DATA TYPE in the Iceberg connector
Motivation and Context
To address the issue : #25417
Impact
Test Plan
Contributor checklist
Release Notes
Please follow release notes guidelines and fill in the release notes below.