Case Sensitivity in Schemas #32

JoelJuntunen · 2022-03-11T15:33:08Z

Teradata ANSI mode which the adapter uses is case sensitive. This causes compilation errors if the schemas are defined in BTET mode and have different capitalisations since the adapter uses "where" conditions and won't find unambiguous results.

Steps To Reproduce:

Make a schema and tables with a Teradata SQL tool with capitalisations in the naming.
Create dbt model with lowercase schemas and identifier
Run the dbt model

Code:

CREATE MULTISET TABLE Schema_Name.IDENTIFIER_NAME ,FALLBACK ,
NO BEFORE JOURNAL,
NO AFTER JOURNAL,
CHECKSUM = DEFAULT,
DEFAULT MERGEBLOCKRATIO,
MAP = TD_MAP1
(
field_name CHAR(1) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL
)
PRIMARY INDEX ( field_name );

with src as (
select *
from schema_name.identifier_name
)

dbt run -s {name of model}

Expected behavior

A clear and concise description of what you expected to happen.
Run clears without errors and the schemas are identified in a case insensitive way.

Screenshots and log output

If applicable, add screenshots or log output to help explain your problem.

The output of dbt --version:

installed version: 1.0.1
   latest version: 1.0.3
Plugins:
  - teradata: 1.0.1.1

The operating system you're using:
Windows 10
The output of python --version:
Python 3.9.9

Possible considerations:
Should the adapter support databases where there might be schemas with only differences being the capitalisation? If yes, this issue should be ignored and a workaround documented.

The text was updated successfully, but these errors were encountered:

JoelJuntunen · 2022-03-11T15:34:51Z

Before taking dbt to our stack, we used case insensitive modes so our developers are not used to having to remember the capitalisations of schemas. It might be that this is an issue we should deal with locally but putting this out here in case others are finding the behaviour undesirable as well.

JoelJuntunen · 2022-03-11T15:52:56Z

The fix I made for the pull request is quite crude and suggestions of improvement are very welcome.

adamtwo · 2022-03-14T16:56:41Z

Hi @JoelJuntunen , thanks for submitting the issue. One of the early adopters of dbt-teradata reported the same issue a while ago. Let's have a more public conversation on this issue.

You are right, the current convention is to use exactly the same casing in dbt models as was used in original CREATE TABLE statements. This is in line with the ANSI mode. Yes, it can be confusing for people who have worked with the BTET/Teradata mode where the casing is ignored. On the other hand, the error message is clear and it's easy to fix it by updating the casing in model files.

Now, why don't we lower-case all database names in the the dbt-teradata adapter?

It would be against the ANSI mode that the dbt-teradata adapter documentation advertises. It might be confusing to some users why they don't see correct casing in e.g. dbt-generated documentation.
It's hard to predict what such a change could do to the backward compatibility of the adapter. This is especially true when users override certain core macros like teradata__list_schemas or list_relations_without_caching.

The default behavior is relatively easy to customize by overriding generate_schema_name, e.g. this override should allow using all lower case letters in model files even if the database contains mixed-case names:

{% macro generate_schema_name(custom_schema_name, node) -%}
  {%- if custom_schema_name is none -%}
    {{ target.schema }}
  {% else %}
    {{ custom_schema_name | trim | lower }}
  {%- endif -%}
{%- endmacro %}

BTW, it's not enough to add lower() in teradata__list_schemas and list_relations_without_caching macros as captured in PR #33 . dbt processes the information schema in other places as well. See the failing tests for details: https://github.com/Teradata/dbt-teradata/runs/5540511873?check_suite_focus=true

So the question is, is the community in favor of changing generate_schema_name to lower-case all names?

Austin1 · 2023-12-11T21:21:35Z

Maybe it's my lack of understand ANSI, but does it only apply to the column/schema names and not values of data? E.g., if we use where ColumnName like ('austin',adamtwo','Joel') and the data has it as ColumnName = AUSTIN ADAMTWO JOEL; will that work in Ansi mode?

Everyone at my company is very used to non-ANSI and getting analytics power users to use DBT with ANSI has proven to create a lot of problems. I'm seeing string functions popup everywhere in our code.

tallamohan · 2024-02-16T02:09:57Z

Teradata transaction mode (TERA) support added in dbt-teradata v1.7.1

JoelJuntunen added the bug Something isn't working label Mar 11, 2022

JoelJuntunen mentioned this issue Mar 11, 2022

Applies LOWER() to schema metadata queries #33

Open

3 tasks

tallamohan closed this as completed Feb 16, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Case Sensitivity in Schemas #32

Case Sensitivity in Schemas #32

JoelJuntunen commented Mar 11, 2022 •

edited

Loading

JoelJuntunen commented Mar 11, 2022

JoelJuntunen commented Mar 11, 2022

adamtwo commented Mar 14, 2022 •

edited

Loading

Austin1 commented Dec 11, 2023

tallamohan commented Feb 16, 2024

Case Sensitivity in Schemas #32

Case Sensitivity in Schemas #32

Comments

JoelJuntunen commented Mar 11, 2022 • edited Loading

Expected behavior

Screenshots and log output

JoelJuntunen commented Mar 11, 2022

JoelJuntunen commented Mar 11, 2022

adamtwo commented Mar 14, 2022 • edited Loading

Austin1 commented Dec 11, 2023

tallamohan commented Feb 16, 2024

JoelJuntunen commented Mar 11, 2022 •

edited

Loading

adamtwo commented Mar 14, 2022 •

edited

Loading