-
Notifications
You must be signed in to change notification settings - Fork 745
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CHAR_LENGTH for postgres #4555
base: main
Are you sure you want to change the base?
CHAR_LENGTH for postgres #4555
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR!
Could you share concrete examples in which CHAR_LENGTH
and LENGTH
differ in Postgres? The psql docs specify that LENGTH
returns the number of characters as does CHAR_LENGTH
, in contrast to e.g MySQL:
mysql> SELECT LENGTH('€'), CHAR_LENGTH('€');
3 1
psql> SELECT LENGTH('€'), LENGTH('€', 'UTF8'), CHAR_LENGTH('€');
1 1 1
OOPS! You're right, I messed something up in my notes. It is now fixed. Also, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Do note that Postgres supports an encoding arg in
LENGTH
, currently this will be dropped afaict. If you want, you can extendexp.Length
to also store that - Could we add a couple Postgres tests as well? They'd look like:
self.validate_identity("LENGTH(x)")
self.validate_identity("LENGTH(x, y)")
self.validate_identity("CHAR_LENGTH(x)", "LENGTH(x)")
self.validate_identity("CHAR_LEN(x)", "LENGTH(x)")
With these we should cover all Postgres roundtrips (the 2nd test requires bullet 1)
@@ -377,6 +377,7 @@ class Parser(parser.Parser): | |||
"GENERATE_SERIES": _build_generate_series, | |||
"JSON_EXTRACT_PATH": build_json_extract_path(exp.JSONExtract), | |||
"JSON_EXTRACT_PATH_TEXT": build_json_extract_path(exp.JSONExtractScalar), | |||
"LEN": lambda args: exp.Length(this=seq_get(args, 0), binary=True), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this also be removed? Can't find LEN
on Postgres's docs, and even if it exists it's probably an alias of LENGTH
which is not binary (?)
I've added support for
CHAR_LEN[GTH]
for postgres. I've also reused the concept ofexp.Length(this='x', binary=True)
from bigquery/snowflake (though I don't think is ever used in those dialects).