Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Case Sensitivity in Schemas #32

Closed
JoelJuntunen opened this issue Mar 11, 2022 · 5 comments · May be fixed by #33
Closed

Case Sensitivity in Schemas #32

JoelJuntunen opened this issue Mar 11, 2022 · 5 comments · May be fixed by #33
Labels
bug Something isn't working

Comments

@JoelJuntunen
Copy link

JoelJuntunen commented Mar 11, 2022

Teradata ANSI mode which the adapter uses is case sensitive. This causes compilation errors if the schemas are defined in BTET mode and have different capitalisations since the adapter uses "where" conditions and won't find unambiguous results.

Steps To Reproduce:

  1. Make a schema and tables with a Teradata SQL tool with capitalisations in the naming.
  2. Create dbt model with lowercase schemas and identifier
  3. Run the dbt model

Code:

  1. CREATE MULTISET TABLE Schema_Name.IDENTIFIER_NAME ,FALLBACK ,
    NO BEFORE JOURNAL,
    NO AFTER JOURNAL,
    CHECKSUM = DEFAULT,
    DEFAULT MERGEBLOCKRATIO,
    MAP = TD_MAP1
    (
    field_name CHAR(1) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL
    )
    PRIMARY INDEX ( field_name );

{{
config(
materialized = 'table'
)
}}

with src as (
select *
from schema_name.identifier_name
)

dbt run -s {name of model}

Expected behavior

A clear and concise description of what you expected to happen.
Run clears without errors and the schemas are identified in a case insensitive way.

Screenshots and log output

If applicable, add screenshots or log output to help explain your problem.
image

The output of dbt --version:

installed version: 1.0.1
   latest version: 1.0.3
Plugins:
  - teradata: 1.0.1.1

The operating system you're using:
Windows 10
The output of python --version:
Python 3.9.9

Possible considerations:
Should the adapter support databases where there might be schemas with only differences being the capitalisation? If yes, this issue should be ignored and a workaround documented.

@JoelJuntunen JoelJuntunen added the bug Something isn't working label Mar 11, 2022
@JoelJuntunen
Copy link
Author

Before taking dbt to our stack, we used case insensitive modes so our developers are not used to having to remember the capitalisations of schemas. It might be that this is an issue we should deal with locally but putting this out here in case others are finding the behaviour undesirable as well.

@JoelJuntunen
Copy link
Author

The fix I made for the pull request is quite crude and suggestions of improvement are very welcome.

@adamtwo
Copy link
Contributor

adamtwo commented Mar 14, 2022

Hi @JoelJuntunen , thanks for submitting the issue. One of the early adopters of dbt-teradata reported the same issue a while ago. Let's have a more public conversation on this issue.

You are right, the current convention is to use exactly the same casing in dbt models as was used in original CREATE TABLE statements. This is in line with the ANSI mode. Yes, it can be confusing for people who have worked with the BTET/Teradata mode where the casing is ignored. On the other hand, the error message is clear and it's easy to fix it by updating the casing in model files.

Now, why don't we lower-case all database names in the the dbt-teradata adapter?

  1. It would be against the ANSI mode that the dbt-teradata adapter documentation advertises. It might be confusing to some users why they don't see correct casing in e.g. dbt-generated documentation.
  2. It's hard to predict what such a change could do to the backward compatibility of the adapter. This is especially true when users override certain core macros like teradata__list_schemas or list_relations_without_caching.
  3. The default behavior is relatively easy to customize by overriding generate_schema_name, e.g. this override should allow using all lower case letters in model files even if the database contains mixed-case names:
    {% macro generate_schema_name(custom_schema_name, node) -%}
      {%- if custom_schema_name is none -%}
        {{ target.schema }}
      {% else %}
        {{ custom_schema_name | trim | lower }}
      {%- endif -%}
    {%- endmacro %}
    

BTW, it's not enough to add lower() in teradata__list_schemas and list_relations_without_caching macros as captured in PR #33 . dbt processes the information schema in other places as well. See the failing tests for details: https://github.com/Teradata/dbt-teradata/runs/5540511873?check_suite_focus=true

So the question is, is the community in favor of changing generate_schema_name to lower-case all names?

@Austin1
Copy link

Austin1 commented Dec 11, 2023

Maybe it's my lack of understand ANSI, but does it only apply to the column/schema names and not values of data? E.g., if we use where ColumnName like ('austin',adamtwo','Joel') and the data has it as ColumnName = AUSTIN ADAMTWO JOEL; will that work in Ansi mode?

Everyone at my company is very used to non-ANSI and getting analytics power users to use DBT with ANSI has proven to create a lot of problems. I'm seeing string functions popup everywhere in our code.

@tallamohan
Copy link
Contributor

Teradata transaction mode (TERA) support added in dbt-teradata v1.7.1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants