The rewriter currently generates the subquery using a Postgres-compatible AST, and the individual XXXSerializer classes convert that AST to the target engine's dialect. SQLAlchemy cannot translate arbitrary SQL text from one dialect to another (which is also a non-goal for us), but it can serialize an engine-neutral Python expression graph to any SQL dialect it supports. Switching the rewriter to build such a graph and letting SQLAlchemy serialize it to SQL text would have several benefits:
More maintainable code for complex processing, such as sampling for quantiles.
Easier to support more engines. New dialects would only need to be covered in the grammar enough to be parsed into our AST to be used to drive the rewriter. Since we only support a subset of SQL-92 for differentially private processing, this is easier than trying to make adapters for everything the rewriter might need.
The XXXSerializer classes can be deprecated.
We already have a dependency on SQLAlchemy for Pandas support, so this proposal introduces no new dependencies.
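As a rough illustration of the serialization benefit described above, the sketch below builds one expression graph and compiles it for two engines. It assumes SQLAlchemy 1.4 or later; the `pums` table and its columns are hypothetical and are not taken from the project's metadata.

```python
from sqlalchemy import Column, Float, Integer, MetaData, Table, func, select
from sqlalchemy.dialects import mssql, postgresql

metadata = MetaData()
# Hypothetical table, used only for illustration.
pums = Table("pums", metadata, Column("age", Integer), Column("income", Float))

# One engine-neutral expression graph, built in Python.
stmt = (
    select(
        func.count(pums.c.age).label("n"),
        func.sum(pums.c.income).label("total"),
    )
    .where(pums.c.age > 18)
    .limit(10)
)


def to_sql(statement, dialect):
    """Serialize a statement to SQL text for the given dialect, inlining literals."""
    compiled = statement.compile(
        dialect=dialect, compile_kwargs={"literal_binds": True}
    )
    return str(compiled)


# The same graph serialized for two different engines.
pg_sql = to_sql(stmt, postgresql.dialect())
ms_sql = to_sql(stmt, mssql.dialect())

print(pg_sql)
print(ms_sql)
```

The row limit is the part that differs between the two outputs: the Postgres text uses a `LIMIT` clause while the SQL Server text uses `TOP`, without any dialect-specific code in the statement itself.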
The current rewriter maps column names, including generated intermediate columns, safely across all of the subqueries. SQLAlchemy requires table metadata, similar to, but not the same as, our metadata. For the rewriter to work, we will need to implement a mapping from our metadata to SQLAlchemy's metadata, ensuring that both are compatible with the actual tables supplied in the connection.
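A minimal sketch of the metadata mapping described above might look like the following. The `ColumnInfo` and `TableInfo` classes and the type names in `_TYPE_MAP` are hypothetical stand-ins for our own metadata, not the project's actual classes; only the `Table`, `Column`, and `MetaData` calls are SQLAlchemy's.

```python
from dataclasses import dataclass, field
from typing import List, Optional

from sqlalchemy import Boolean, Column, Float, Integer, MetaData, String, Table


@dataclass
class ColumnInfo:
    """Hypothetical stand-in for one column in our metadata."""
    name: str
    type: str  # one of the keys in _TYPE_MAP


@dataclass
class TableInfo:
    """Hypothetical stand-in for one table in our metadata."""
    name: str
    schema: Optional[str] = None
    columns: List[ColumnInfo] = field(default_factory=list)


# Assumed type names; the real metadata may use different ones.
_TYPE_MAP = {"int": Integer, "float": Float, "string": String, "boolean": Boolean}


def to_sqlalchemy_table(info: TableInfo, metadata: MetaData) -> Table:
    """Register a SQLAlchemy Table that mirrors the given table description."""
    columns = [Column(c.name, _TYPE_MAP[c.type]) for c in info.columns]
    return Table(info.name, metadata, *columns, schema=info.schema)


sa_metadata = MetaData()
pums_info = TableInfo(
    name="pums",
    columns=[ColumnInfo("age", "int"), ColumnInfo("income", "float")],
)
pums = to_sqlalchemy_table(pums_info, sa_metadata)
print(pums.c.keys())
```

The compatibility check against the actual tables is left out here. One option, given a live engine, would be to compare these columns with what `sqlalchemy.inspect(engine).get_columns("pums")` reports.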
A straightforward implementation would be to walk the parsed AST (from our supported and validated subset of SQL-92) and generate the appropriate SQLAlchemy Python statements. In other words, generate a single-purpose Python expression graph and use SQLAlchemy to serialize it to SQL text.
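The walk described above could be sketched as follows, assuming SQLAlchemy 1.4 or later. The node classes (`ColumnRef`, `Literal`, `BinaryOp`, `FuncCall`, `Query`) are simplified, hypothetical stand-ins for our parsed AST and cover only a tiny part of the supported subset.

```python
import operator
from dataclasses import dataclass
from typing import Any, List, Optional

from sqlalchemy import Column, Float, Integer, MetaData, Table, func, literal, select
from sqlalchemy.dialects import postgresql


# Hypothetical, heavily simplified AST node types.
@dataclass
class ColumnRef:
    name: str


@dataclass
class Literal:
    value: Any


@dataclass
class BinaryOp:
    op: str
    left: Any
    right: Any


@dataclass
class FuncCall:
    name: str
    arg: Any
    alias: str


@dataclass
class Query:
    select: List[Any]
    source: str
    where: Optional[Any] = None


_OPS = {">": operator.gt, "<": operator.lt, "=": operator.eq,
        "+": operator.add, "*": operator.mul}


def build_expr(node, table):
    """Recursively translate one AST node into a SQLAlchemy expression."""
    if isinstance(node, ColumnRef):
        return table.c[node.name]
    if isinstance(node, Literal):
        return literal(node.value)
    if isinstance(node, BinaryOp):
        return _OPS[node.op](build_expr(node.left, table), build_expr(node.right, table))
    if isinstance(node, FuncCall):
        return getattr(func, node.name)(build_expr(node.arg, table)).label(node.alias)
    raise TypeError(f"unsupported node: {node!r}")


def build_select(query, metadata):
    """Translate a Query node into a SQLAlchemy SELECT statement."""
    table = metadata.tables[query.source]
    stmt = select(*[build_expr(n, table) for n in query.select])
    if query.where is not None:
        stmt = stmt.where(build_expr(query.where, table))
    return stmt


metadata = MetaData()
Table("pums", metadata, Column("age", Integer), Column("income", Float))

# AST as it might look after parsing: SELECT COUNT(age) AS n FROM pums WHERE age > 18
ast = Query(
    select=[FuncCall("count", ColumnRef("age"), "n")],
    source="pums",
    where=BinaryOp(">", ColumnRef("age"), Literal(18)),
)
stmt = build_select(ast, metadata)
sql = str(stmt.compile(dialect=postgresql.dialect(),
                       compile_kwargs={"literal_binds": True}))
print(sql)
```

Passing a different dialect object to `compile` is the only change needed to target another engine, which is the step the XXXSerializer classes perform today.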
An alternative implementation would be to augment the AST nodes so that each returns the equivalent SQLAlchemy construct, so any parsed AST could be used to generate SQL text in any other dialect. The rewriter would then simply use a parsed AST, or AST fragments, to generate the subquery's SQL text for the target engine. This would allow the rewriter to take a more hybrid style, mixing explicit Python expression graphs with SQL AST fragments.
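One way the hybrid style could look is sketched below, assuming SQLAlchemy 1.4 or later. The `to_sqlalchemy` method name and the `Node`, `ColumnRef`, and `GreaterThan` classes are hypothetical; they only illustrate AST nodes that return SQLAlchemy constructs, which the rewriter then combines with a graph written directly in Python.

```python
from dataclasses import dataclass

from sqlalchemy import Column, Integer, MetaData, Table, func, select
from sqlalchemy.dialects import postgresql


class Node:
    """Hypothetical AST base class; each node can emit a SQLAlchemy construct."""

    def to_sqlalchemy(self, table):
        raise NotImplementedError


@dataclass
class ColumnRef(Node):
    name: str

    def to_sqlalchemy(self, table):
        return table.c[self.name]


@dataclass
class GreaterThan(Node):
    left: Node
    value: int

    def to_sqlalchemy(self, table):
        return self.left.to_sqlalchemy(table) > self.value


# Hypothetical table, used only for illustration.
pums = Table("pums", MetaData(), Column("pid", Integer), Column("age", Integer))

# AST fragment, as if parsed from the user's "WHERE age > 18".
where_fragment = GreaterThan(ColumnRef("age"), 18)

# Explicit Python graph written by the rewriter, reusing the AST fragment.
per_key = (
    select(pums.c.pid, func.count(pums.c.age).label("n"))
    .where(where_fragment.to_sqlalchemy(pums))
    .group_by(pums.c.pid)
    .subquery("per_key")
)
outer = select(func.sum(per_key.c.n).label("total"))

sql = str(outer.compile(dialect=postgresql.dialect(),
                        compile_kwargs={"literal_binds": True}))
print(sql)
```

Here the filter comes from the parsed query while the per-key subquery and outer aggregate are built in code, so intermediate column names such as `n` are resolved by SQLAlchemy through `per_key.c.n` and never handled as strings.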