-
Notifications
You must be signed in to change notification settings - Fork 1k
feat(snowflake): transpile CORR with NaN-->NULL #6619
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
SQLGlot Integration Test ResultsComparing:
By Dialect
Overallmain: 6173 total, 5610 passed (pass rate: 90.9%), sqlglot version: sqlglot:trey/corr-sf-duckdb: 6173 total, 5610 passed (pass rate: 90.9%), sqlglot version: Difference: No change |
sqlglot/dialects/duckdb.py
Outdated
| return super().window_sql(expression) | ||
|
|
||
| corr_no_null = corr.copy() | ||
| corr_no_null.set("null_on_zero_variance", False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to set this to false here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved this unwrapping and arg setting into a small helper - see what you think
65a57ea to
0d175bc
Compare
| def _maybe_corr_null_to_false( | ||
| expression: t.Union[exp.Filter, exp.Window, exp.Corr], | ||
| ) -> t.Optional[t.Union[exp.Filter, exp.Window, exp.Corr]]: | ||
| expr = expression.copy() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this copy necessary ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A window/filter AST containing CORR could be modified here then passed through to other rendering methods.
I was under the impression that, when modifying an AST, we should copy if it will then be passed to unknown downstream consumers.
It sounds like that's not correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeap, I see the point, ^ this works.
I was thinking that .set doesn't break anything here, so we could skip the copy in some cases.
What do you think @VaggelisD ?
CORRperforms an internal calculation that hasVARIANCE([column])in the denominator. When that occurs, Snowflake returnsNULLwhile DuckDB returnsNaN.To replicate the Snowflake's
NULLin DuckDB, we must wrap theCORRcall inISNAN:Naively wrapping in
ISNANgenerates invalid syntax for windowed and filtered function calls.Therefore, we add special DuckDB generator handling to route
exp.Window,exp.Window(exp.Filter), andexp.Filterto acorr_sqlmethod designed to accept window and filtered expressions.Examples:
SELECT CORR(a, b) OVER (PARTITION BY c)SELECT CASE WHEN ISNAN(CORR(a, b) OVER (PARTITION BY c)) THEN NULL ELSE CORR(a, b) OVER (PARTITION BY c) ENDSELECT CORR(a, b) FILTER (WHERE c > 0)SELECT CASE WHEN ISNAN(CORR(a, b) FILTER(WHERE c > 0)) THEN NULL ELSE CORR(a, b) FILTER(WHERE c > 0) ENDSELECT CORR(a, b) FILTER (WHERE c > 0) OVER (PARTITION BY d)SELECT CASE WHEN ISNAN(CORR(a, b) FILTER(WHERE c > 0) OVER (PARTITION BY d)) THEN NULL ELSE CORR(a, b) FILTER(WHERE c > 0) OVER (PARTITION BY d) END