Skip to content

DATEDIFF does not consistently cast params to date #6083

@kyle-cheung

Description

@kyle-cheung

read Snowflake
write DuckDB

Depending on the complexity of the params in a DATEDIFF transpile does not consistently cast the params to the intended datatype, and depending on the date part could produce incorrect results

Snowflake is capable of ingesting varchars as params to DATEDIFF

SELECT 
    datediff(year, TO_CHAR(current_timestamp - interval '5 years', 'YYYY-MM-DD'), '2025-01-01'),
    datediff(year, current_timestamp - interval '5 years', '2025-01-01'),
    datediff(seconds, '2020-01-01', '2025-10-12 00:56:42.345 +0000'),
    datediff(seconds, '2020-01-01', '2025-05-01'::varchar),
    datediff(hours, '12:30'::time, '14:30'::time);
Image

However, DuckDB is not.

Image

SQLGlot will cast varchars to dates sometimes

import sqlglot
from sqlglot.optimizer import optimize


sql = """
SELECT 
    datediff(seconds, '2020-01-01', '2025-05-01'::varchar);
"""

sqlglot.transpile(sql, "snowflake", "duckdb", pretty=True)[0]
> SELECT
  DATE_DIFF('SECOND', CAST('2020-01-01' AS DATE), CAST('2025-05-01' AS TEXT))

In the event that there are actual timestamp values, it could produce incorrect results

# sqlglot.transpile(SELECT datediff('second', '2020-01-01', '2025-10-12 00:56:42.345 +0000') , "snowflake", "duckdb", pretty=True)[0]

SELECT
  DATE_DIFF('SECOND', CAST('2020-01-01' AS DATE), CAST('2025-10-12 00:56:42.345 +0000' AS DATE))

> 182390400 
-- This should be 182393802

Might be safer to cast everything to a TIMESTAMP unless it is explicitly casted to a TIME

Reproducible Code

import sqlglot
from sqlglot.optimizer import optimize


sql = """
SELECT 
    datediff(year, TO_CHAR(current_timestamp - interval '5 years', 'YYYY-MM-DD'), '2025-01-01'),
    datediff(year, current_timestamp - interval '5 years', '2025-01-01'),
    datediff(seconds, '2020-01-01', '2025-10-12 00:56:42.345 +0000'),
    datediff(seconds, '2020-01-01', '2025-05-01'::varchar),
    datediff(hours, '12:30'::time, '14:30'::time);
"""

transpiled_sql = sqlglot.transpile(sql, "snowflake", "duckdb", pretty=True)[0]

>
    SELECT
      DATE_DIFF(
        'YEAR',
        STRFTIME(CURRENT_TIMESTAMP - INTERVAL '5' YEARS, '%Y-%m-%d'),
        CAST('2025-01-01' AS DATE)
      ),
      DATE_DIFF('YEAR', CURRENT_TIMESTAMP - INTERVAL '5' YEARS, CAST('2025-01-01' AS DATE)),
      DATE_DIFF('SECOND', CAST('2020-01-01' AS DATE), CAST('2025-10-12 00:56:42.345 +0000' AS DATE)),
      DATE_DIFF('SECOND', CAST('2020-01-01' AS DATE), CAST('2025-05-01' AS TEXT)),
      DATE_DIFF('HOUR', CAST('12:30' AS TIME), CAST('14:30' AS TIME))

Official Documentation
https://docs.snowflake.com/en/sql-reference/functions/datediff
https://duckdb.org/docs/stable/sql/functions/date.html#date_diffpart-startdate-enddate

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions