Add support for row filtering and column masking access control #24278

Open
BryanCutler opened this issue Dec 18, 2024 · 1 comment

BryanCutler commented Dec 18, 2024

I would like to discuss adding row filtering and column masking to access control as part of governance requirements. This has been discussed several times before, but the discussions haven't reached consensus on an implementation; see #20572, #21913 and #18119.

I propose using the following commits cherry-picked from TrinoDB as a basis:

Additional follow-up commits will be added to retrieve a list of row filters and to get column masks in bulk.

This implementation has existed for a while. It is straightforward, has been in use, and is compatible with current production systems. There are existing extensions for Ranger https://github.com/trinodb/trino/blob/9499dc82f2d23314dbc76b0443bedd121e6400eb/plugin/trino-ranger/src/main/java/io/trino/plugin/ranger/RangerSystemAccessControl.java#L822, OPA https://github.com/trinodb/trino/blob/9499dc82f2d23314dbc76b0443bedd121e6400eb/plugin/trino-opa/src/main/java/io/trino/plugin/opa/OpaAccessControl.java#L732, etc. These could also be ported to Presto.

Expected Behavior or Use Case

Presto Component, Service, or Connector

The access control SPIs need new interfaces to get row filters and column masks. Changes are required in Presto main to apply the filters/masks to queries.

Possible Implementation

Below are the major changes to Presto from this implementation:

Changes to SPI

The major changes to the SPI add access control interfaces for getting row filters and column masks:

    /**
     * Get row filters associated with the given table and identity.
     * <p>
     * Each filter must be a scalar SQL expression of boolean type over the columns in the table.
     *
     * @return the list of filters, or empty list if not applicable
     */
    default List<ViewExpression> getRowFilters(ConnectorTransactionHandle transactionHandle, ConnectorIdentity identity, AccessControlContext context, SchemaTableName tableName)
    {
        return Collections.emptyList();
    }

    /**
     * Bulk method for getting column masks for a subset of columns in a table.
     * <p>
     * Each mask must be a scalar SQL expression of a type coercible to the type of the column being masked. The expression
     * must be written in terms of columns in the table.
     *
     * @return a mapping from columns to masks, or an empty map if not applicable. The keys of the return Map are a subset of {@code columns}.
     */
    default Map<ColumnMetadata, ViewExpression> getColumnMasks(ConnectorTransactionHandle transactionHandle, ConnectorIdentity identity, AccessControlContext context, SchemaTableName tableName, List<ColumnMetadata> columns)
    {
        return Collections.emptyMap();
    }

A new ViewExpression class holds the filter/mask expression:

    public ViewExpression(String identity, Optional<String> catalog, Optional<String> schema, String expression)
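
To make the contract of the two SPI methods concrete, here is a minimal, self-contained sketch of a policy that returns one row filter and one column mask. Everything in it is a stand-in and an assumption: `ViewExpressionSketch` and `DemoPolicy` are hypothetical names, the table `sales.orders`, the `hive` catalog, the `admin` user and the expressions are invented for illustration, and plain strings replace `ConnectorIdentity`, `SchemaTableName` and `ColumnMetadata`. It is not the code from the cherry-picked commits.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for the proposed ViewExpression. Only the constructor
// parameters come from the proposal; the getter is an assumption.
final class ViewExpressionSketch
{
    private final String identity;
    private final Optional<String> catalog;
    private final Optional<String> schema;
    private final String expression;

    ViewExpressionSketch(String identity, Optional<String> catalog, Optional<String> schema, String expression)
    {
        this.identity = identity;
        this.catalog = catalog;
        this.schema = schema;
        this.expression = expression;
    }

    String getExpression()
    {
        return expression;
    }
}

// Hypothetical policy (not a real Presto plugin) that mirrors the shape of
// getRowFilters and getColumnMasks using plain strings.
final class DemoPolicy
{
    private static final String PROTECTED_TABLE = "sales.orders"; // invented table name

    // Like getRowFilters: a list of boolean SQL expressions, or an empty list.
    static List<ViewExpressionSketch> getRowFilters(String user, String tableName)
    {
        if (user.equals("admin") || !tableName.equals(PROTECTED_TABLE)) {
            return Collections.emptyList();
        }
        return Collections.singletonList(
                new ViewExpressionSketch(user, Optional.of("hive"), Optional.of("sales"), "region = 'EMEA'"));
    }

    // Like getColumnMasks: the keys of the result are a subset of the requested columns.
    static Map<String, ViewExpressionSketch> getColumnMasks(String user, String tableName, List<String> columns)
    {
        Map<String, ViewExpressionSketch> masks = new LinkedHashMap<>();
        if (!user.equals("admin") && tableName.equals(PROTECTED_TABLE) && columns.contains("email")) {
            // The mask is a varchar literal, so it is coercible to a varchar column.
            masks.put("email", new ViewExpressionSketch(user, Optional.of("hive"), Optional.of("sales"), "'***'"));
        }
        return masks;
    }
}
```

In this sketch, `DemoPolicy.getRowFilters("alice", "sales.orders")` returns a single filter, while the same call for `admin` returns an empty list, matching the "empty list if not applicable" default in the SPI.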

Changes to Presto main

The major changes to presto-main are in StatementAnalyzer.java, which retrieves filters/masks from access control during the analysis phase and translates each filter or mask into an Expression.

Then in RelationPlanner.java the filter/mask Expressions are applied during a rewrite of the plan.
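
The effect of applying filters and masks can be illustrated at the SQL level: a reference to a protected table behaves roughly like a subquery that projects masked columns and keeps only rows passing every filter. The sketch below builds that equivalent SQL text from strings. This is only an illustration under stated assumptions: the actual change works on analyzed Expressions and plan nodes rather than SQL strings, and the class name `MaskFilterRewriteSketch` and its `rewrite` method are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: the real implementation rewrites the plan,
// it does not concatenate SQL strings.
final class MaskFilterRewriteSketch
{
    // Returns a subquery equivalent to reading `table` with masks and filters applied.
    static String rewrite(String table, List<String> columns, Map<String, String> masks, List<String> filters)
    {
        // Replace each masked column with its mask expression, keeping the column name.
        List<String> projections = new ArrayList<>();
        for (String column : columns) {
            String mask = masks.get(column);
            projections.add(mask == null ? column : mask + " AS " + column);
        }

        StringBuilder sql = new StringBuilder("SELECT ")
                .append(String.join(", ", projections))
                .append(" FROM ")
                .append(table);

        // All row filters must hold, so they are combined with AND.
        if (!filters.isEmpty()) {
            List<String> wrapped = new ArrayList<>();
            for (String filter : filters) {
                wrapped.add("(" + filter + ")");
            }
            sql.append(" WHERE ").append(String.join(" AND ", wrapped));
        }
        return "(" + sql + ")";
    }
}
```

For example, calling `rewrite` with table `orders`, columns `id, email, region`, a mask `'***'` on `email` and the filter `region = 'EMEA'` yields `(SELECT id, '***' AS email, region FROM orders WHERE (region = 'EMEA'))`. In this sketch the filters are evaluated against the unmasked column values, because the WHERE clause is evaluated before the SELECT list.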

Draft PR with cherry-picked commits

#24277

Alternate design considered

An alternative way to apply column masks and row filters is to use the existing SPI for connector plan optimization, which allows the query plan to be rewritten during the optimization phase. This approach has been discussed before; its major downside is that each connector would need to implement row filtering and column masking itself, making it less centralized than the design above.

Context

Most production systems have governance requirements, which include the need to filter and mask sensitive data. Presto should have this functionality built in.

@BryanCutler

Adding related RFC at prestodb/rfcs#34
