Use SQLAlchemy for Rewriter

The rewriter currently generates the subquery  using a Postgres-compatible AST, with the individual `XXXSerializer` classes responsible for converting the AST to the target engine's dialect.  Although SQLAlchemy cannot map arbitrarily from one SQL text dialect to another (which is also a non-goal for us), SQLAlchemy can map from a universal Python code graph to any compatible SQL text dialect.  Switching the rewriter to create a Python code graph and then using SQLAlchemy to serialize to SQL text would have some benefits:

* More maintainable code for complex processing, such as sampling for quantiles.
* Easier to support more engines.  New dialects would only need to be covered in the grammar enough to be parsed into our AST to be used to drive the rewriter.  Since we only support a subset of SQL-92 for differentially private processing, this is easier than trying to make adapters for everything the rewriter might need.
* Can deprecate `Serializer`

We already have a dependency on SQLAlchemy for Pandas support, so this proposal introduces no new dependencies.

The current rewriter maps column names, including generated intermediate columns, safely across all of the subqueries.  SQLAlchemy requires table metadata, similar to, but not the same as, our metadata.  For the rewriter to work, we will need to implement a mapping from our metadata to SQLAlchemy's metadata, ensuring that both are compatible with the actual tables supplied in the connection.

A straightforward implementation would be to walk the parsed AST (from our supported and validated subset of SQL-92), and generate the appropriate SQLAlchemy Python statements.  IOW, generate a single-purpose Python execution and use SQLAlchemy to serialize it to SQL text.

An alternate implementation would be to augment the AST to return SQLAlchemy functions, so any parsed AST could be used to generate SQL text in any other dialect.  Then the rewriter would simply use a parsed AST, or AST fragments, to generate the subquery's SQL text for the target engine.  This would allow the rewriter to take a more hybrid style, mixing and matching explicit Python execution graph and SQL AST.




Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Use SQLAlchemy for Rewriter #420

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Use SQLAlchemy for Rewriter #420

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions