Skip to content

Conversation

@acarpente-denodo
Copy link
Contributor

@acarpente-denodo acarpente-denodo commented Oct 21, 2025

Description

I split the original PR (#26278) into a smaller commit to check if the source-ai bot can analyze it. In the original PR is not possible because it is to big and the sourcery-ai returns the following message:
Sorry @acarpente-denodo, your pull request is larger than the review limit of 150000 diff characters

Engine support for SQL MERGE INTO. The MERGE INTO command inserts or updates rows in a table based on specified conditions.

Syntax:

MERGE INTO target_table [ [ AS ]  target_alias ]
USING { source_table | query } [ [ AS ] source_alias ]
ON search_condition
WHEN MATCHED THEN
    UPDATE SET ( column = expression [, ...] )
WHEN NOT MATCHED THEN
    INSERT [ column_list ]
    VALUES (expression, ...)

Example: MERGE INTO usage to update the sales information for existing products and insert the sales information for the new products in the market.

MERGE INTO product_sales AS s
    USING monthly_sales AS ms
    ON s.product_id = ms.product_id
WHEN MATCHED THEN
    UPDATE SET
        sales = sales + ms.sales
      , last_sale = ms.sale_date
      , current_price = ms.price
WHEN NOT MATCHED THEN
    INSERT (product_id, sales, last_sale, current_price)
    VALUES (ms.product_id, ms.sales, ms.sale_date, ms.price)

The Presto engine commit introduces an enum called RowChangeParadigm, which describes how a connector modifies rows. The iceberg connector will utilize the DELETE_ROW_AND_INSERT_ROW paradigm, as it represents an updated row as a combination of a deleted row followed by an inserted row. The CHANGE_ONLY_UPDATED_COLUMNS paradigm is meant for connectors that support updating individual columns of rows.

Note: Changes were made after reviewing the following Trino PR: trinodb/trino#7126
So, this commit is deeply inspired by Trino's implementation.

Motivation and Context

The MERGE INTO statement is commonly used to integrate data from two tables with different contents but similar structures.
For example, the source table could be part of a production transactional system, while the target table might be located in a data warehouse for analytics.
Regularly, MERGE operations are performed to update the analytics warehouse with the latest production data.
You can also use MERGE with tables that have different structures, as long as you can define a condition to match the rows between them.

Test Plan

Automated tests developed in TestSqlParser, TestSqlParserErrorHandling, TestStatementBuilder, AbstractAnalyzerTest, TestAnalyzer, and TestClassLoaderSafeWrappers classes.

Contributor checklist

  • Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

== RELEASE NOTES ==

General Changes
* Add support for the MERGE INTO command in the Presto engine.

Working in progress

Cherry-pick of trinodb/trino@cee96c3

Co-authored-by: David Stryker <[email protected]>
Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry @acarpente-denodo, your pull request is larger than the review limit of 150000 diff characters

@acarpente-denodo acarpente-denodo deleted the feature/20578_SQL_Support_for_MERGE_INTO_(engine-sourcery-ai) branch October 21, 2025 07:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant