Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update PolarisAuthorizerImpl to use PolarisGrantManager interface and add EntityCache based PolarisGrantManager impl #465

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

collado-mike
Copy link
Contributor

Description

Currently, the PolarisAuthorizer implementation relies on the ResolvedPolarisEntity to return a list of grants for each entity in order to determine if a Principal has authorization to perform an operation on a given target. The grant records are populated by the Resolver, utilizing the EntityCache for a given realm to avoid unnecessary roundtrips to the persistence store for, e.g., the grants held by a given PrincipalRole or CatalogRole.

Because the PolarisAuthorizer relies on the grant records in the ResolvedPolarisEntity, there's no opportunity to utilize custom grant logic (for example, to declare a target to be globally viewable or attach PrincipalRoles dynamically) without either updating the Resolver or writing custom grant records prior to the Resolver execution.

This change decouples the PolarisAuthorizer from the ResolvedPolarisEntity so that it depends entirely on the PolarisGrantManager to declare which grants exist on a securable. It implements a EntityCacheGrantManager that, under the hood, still works with the EntityCache so that the lookups from the Resolver populate the cache with both entities and grants. However, the PolarisAuthorizer doesn't need to know anything about the EntityCache or the Resolver types. For convenience, it still uses the PolarisResolvedPathWrapper, as it's still a useful container for passing around full paths for entities.

The logic in the PolarisAuthorizer is unchanged and all tests still pass. An additional test class is added to verify the EntityCacheGrantManager works directly with the cache and will delegate to the underlying PolarisGrantManager when necessary.

Note that the logic that implicitly grants the service_admin role SERVICE_MANAGE_ACCESS privileges on the implicit root container has been moved to the EntityCacheGrantManager. I ran the tests without this move and they all still passed, so it seems that, at some point, the root container is being persisted and the grants are recorded. However, to ensure backward compatibility, I ported the logic. We may want to get rid of that at some point.

Fixes # (issue)

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • Documentation update
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

  • EntityCacheGrantManagerTest

Test Configuration:

  • Hardware:
  • Toolchain:
  • SDK:

Checklist:

Please delete options that are not relevant.

  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • If adding new functionality, I have discussed my implementation with the community using the linked GitHub issue

* #getOrCreateMetaStoreManager(RealmContext)}
*/
@Override
default PolarisGrantManager getGrantManagerForRealm(RealmContext realm) {
Copy link
Contributor

@dimas-b dimas-b Nov 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feel like we have a clash between Encapsulation / Decoupling and Dependency Injection concerns here. My impression from #417 was that we wanted to separate different "managers" (which, I agree, is a good idea). However, it looks like call paths to those managers still lead to one MetaStoreManagerFactory. From my POV I do not see any other reason for this but dependency injection. MetaStoreManagerFactory is provided the the application on startup and all smaller "managers" stem from it.

I just had an idea that perhaps we could make a java services-based mechanism for discovering specific metastore managers and have them isolated from each other at the interface level. Would you be interested in a POC PR for that (I can give it a try)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Part of the reasoning for this PR is to move toward dependency injection without relying on it fully implemented. #417 was, in large part, to encourage callers to declare only the most specific dependency interface they need (e.g., PolarisGrantManager to manage grants, CredentialVendor to manage credentials) so that we could use dependency injection to support different manager implementations. Today, the PolarisMetaStoreManager implements all of the component interfaces, so it's easy to make this interface implement the PolarisGrantManager.Factory interface. In the future, I imagine we might break the interface inheritance so that a PolarisMetaStoreManager doesn't necessarily implement, e.g., CredentialVendor, since the persistence implementation might not want to know anything about credential vending. But for now, this factory method seems reasonable.

But we can still take advantage of the different interfaces even if all roads eventually lead to the PolarisMetaStoreManager. The caching implementation of the PolarisGrantManager is one such example. I'd also like to rewrite the StorageCredentialsCache as an implementation of CredentialVendor, even if the PolarisMetaStoreManager still implements the credential vending methods.

I think we should resolve the DI discussion on the mailing list before we introduce another service-locator approach.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Let's continue the DI discussion on the mailing list.


private void invalidate(@NotNull PolarisEntityCore role) {
EntityCacheEntry roleEntity = entityCache.getEntityById(role.getId());
if (roleEntity != null) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need a get before remove (next line)? Why not add remove-by-id method?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, that's a good idea. I was just using what was already there :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So ultimately, this does a lookup for the entity in order to generate a cache key anyway. I just moved this logic into the EntityCache, so the caller just invokes removeCacheEntry(PolarisEntityCore) and it does the lookup and cache key generation internally.

@@ -226,7 +230,17 @@ public void run(PolarisApplicationConfig configuration, Environment environment)
new PolarisCallContextCatalogFactory(
entityManagerFactory, metaStoreManagerFactory, taskExecutor, fileIOFactory);

PolarisAuthorizer authorizer = new PolarisAuthorizerImpl(configurationStore);
ConcurrentHashMap<RealmContext, EntityCache> realmEntityCache = new ConcurrentHashMap<>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Realms are never removed from this map... is that a concern (for large deployments)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The RealmEntityManagerFactory already stores the cache in a map that's never cleaned up. I'm hoping to address this with the @RealmScoped annotation in the HK2 branch - we could consolidate all the realm-specific caches in a single location with its own read/write-based TTL

PolarisAuthorizer authorizer = new PolarisAuthorizerImpl(configurationStore);
ConcurrentHashMap<RealmContext, EntityCache> realmEntityCache = new ConcurrentHashMap<>();
PolarisGrantManager.Factory factory =
new EntityCacheGrantManager.EntityCacheGrantManagerFactory(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PolarisGrantManager objects returned from this factory are very different from what would be returned from MetaStoreManagerFactory.getGrantManagerForRealm()... What's the reason for having getGrantManagerForRealm in MetaStoreManagerFactory? I do not really see any call paths leading to the latter factory method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The metastoreManagerFactory passed to the constructor on the next line implements the PolarisGrantManager.Factory. The factory method is invoked by the EntityCacheGrantManagerFactory constructor here: https://github.com/apache/polaris/pull/465/files/3b10cb2021cecfc85dc93b2ed51ea3a4489439e4#diff-dec23d1ca2299b42bd2926a823cad0da6f0a8334ae5093346467f410280e77a2R69

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean this factory (line 235) provides the getGrantManagerForRealm() method. This factory implementation is not the same as MetaStoreManagerFactory.getGrantManagerForRealm(). Why does MetaStoreManagerFactory extend PolarisGrantManager.Factory?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The getGrantManagerForRealm method implemented by this factory calls delegateGrantManagerFactory.getGrantManagerForRealm(realm), which is invoking the getGrantManagerForRealm method on the MetaStoreManagerFactory. The EntityCacheGrantManagerFactory constructor takes in a PolarisGrantManager.Factory argument - here, we're passing in the MetaStoreManagerFactory as the delegate implementation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this code got refactored since my first comment (commit b9ded77).

@collado-mike
Copy link
Contributor Author

@dennishuo

@@ -34,7 +35,7 @@
* configuration
*/
@JsonTypeInfo(use = JsonTypeInfo.Id.NAME, include = JsonTypeInfo.As.PROPERTY, property = "type")
public interface MetaStoreManagerFactory extends Discoverable {
public interface MetaStoreManagerFactory extends Discoverable, PolarisGrantManager.Factory {
Copy link
Contributor

@dimas-b dimas-b Dec 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this change conflicts with #470

return null;
}
return resolvedPath.subList(0, resolvedPath.size() - 1);
return resolvedPath;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we return the "resolved" path from getRawFullPath() but remove the getResolvedFullPath() method?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're eliminating the ResolvedPolarisEntity type and just using PolarisEntity types everywhere.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comment is specific to the internal code "logic" in PolarisResolvedPathWrapper. What does "resolved" mean in the context of that class?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Originally, it referred to the types generated by the Resolver. Now, it really means nothing. I'm happy to change it to get rid of the old reference

for (PolarisGrantRecord grantRecord : resolvedSecurableEntity.getGrantRecordsAsSecurable()) {
for (PolarisEntity resolvedSecurableEntity : resolvedPath.getRawFullPath()) {
PolarisGrantManager.LoadGrantsResult grantsResult =
grantManager.loadGrantsOnSecurable(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm generally fine with this change, but I'd like to clarify one point. My impression from current code on main is that we want to run authZ check on the exact same state of Polaris data as what the API services use for generating API responses.

Is that a correct assumption?

If I am mistaken, that is fine and this change is fine too. However, if my assumption is valid, how do we make sure the loaded grant records match the other data, which is loaded independently by API services?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is a reference to the same PolarisMetaStoreManagerImpl.loadCachedEntryById and PolarisMetaStoreManagerImpl.loadGrantsOnSecurable API difference you call out below, I do think there's more work to line the cache up with the metastore API calls. Ultimately, I think the PolarisServerImpl should be fetching the same cached grant records that are being returned to the Authorizer so that the return values and the authz assertions are the same.

Copy link
Contributor

@dimas-b dimas-b Dec 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This thread is about the overall assumptions on the consistency of data in storage in the presence of concurrent changes.

With this code, grants for different objects in the same logical API-level operation may be loaded from different states of the data (e.g. results of two of more RDBMS transactions). So my question is whether this is a concern. I'm not proposing any particular approach, just wondering for my own understanding.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code, like the existing code, assumes that all of the entities are in the cache once the Resolver runs. That process puts all of the entities and their grants into the cache so that by the time the Authorizer retrieves the grants, it fetches them from the cache. This approach is basically hiding that fact by wrapping the cache in a PolarisGrantManager. The impact on cache hits is the same, but we can make other changes to the persistence interface and the principal role resolution without affecting the Authorizer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait? Are you saying that cache eviction breaks requests?

import org.apache.polaris.core.entity.PolarisEntity;

/**
* Holds fully-resolved path of PolarisEntities representing the targetEntity with all its grants
* and grant records.
*/
public class PolarisResolvedPathWrapper {
private final List<ResolvedPolarisEntity> resolvedPath;
private final List<PolarisEntity> resolvedPath;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using plain PolarisEntity inside a "Resolved" entity path wrapper does not sound good to me purely from a code readability point of view.

I'm fine using PolarisEntity in Authrorizer inputs, but perhaps we should rename the wrapper class and change its javadoc? What does "resolved" mean in this context?

If we merely want a dedicate class to hold a path of nested entities, how about PolarisEntityPath?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds fine to me - the main thing for me is to remove the different entity types so that each interface is dealing directly with the entity types (albeit, with a convenience wrapper in this case).

PolarisAuthorizer authorizer = new PolarisAuthorizerImpl(configurationStore);
ConcurrentHashMap<RealmContext, EntityCache> realmEntityCache = new ConcurrentHashMap<>();
PolarisGrantManager.Factory factory =
new EntityCacheGrantManager.EntityCacheGrantManagerFactory(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this code got refactored since my first comment (commit b9ded77).

@NotNull PolarisCallContext callCtx, long securableCatalogId, long securableId) {
EntityCacheLookupResult lookupResult =
entityCache.getOrLoadEntityById(callCtx, securableCatalogId, securableId);
if (lookupResult == null || lookupResult.getCacheEntry() == null) {
Copy link
Contributor

@dimas-b dimas-b Dec 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this causes an "API skew". We never call loadGrantsOnSecurable on delegateGrantManager, therefore we assume that the entityCache is somehow connected to the same source of data as the delegateGrantManager but their APIs are totally different.

Essentially this means that PolarisMetaStoreManagerImpl.loadCachedEntryById and PolarisMetaStoreManagerImpl.loadGrantsOnSecurable() must produce coherent results.

It does not look like we achieve any clean API separation, which is kind of implied by having the multitude of interfaces that PolarisMetaStoreManager extends.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, the API separation still needs a lot of work. Originally, I wanted to break the cache methods out of the PolarisMetaStoreManager entirely, but without the DI support, it was too hard to cast underlying impls to the right interfaces.

But, in reality, this API skew has really always been there. It's just more clearly evident now that we're explicitly invoking loadCachedEntryById when loadGrantsOnSecurable is called. Previously, this was just hidden by the fact that the authorizer simply got passed the ResolvedPolarisEntity and the Resolver just loaded everything from the cache without any pretense of talking to the PolarisMetaStoreManager.

Eventually, I think the EntityCache and the PolarisRemoteCache types need to be addressed, but first is to encapsulate their usage into more generalized types so that they can be changed without impacting the rest of the calling code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

@snazy snazy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with you, @collado-mike that proper DI should be in place before tackling this effort. Let's put this PR into draft state for now.

I also agree with @dimas-b that a better API / separation of concerns is needed. Guess, CDI can indirectly help here as well.

@collado-mike collado-mike force-pushed the mcollado-authz-use-grantmanager branch from b9ded77 to 55e0d29 Compare December 3, 2024 19:31
@collado-mike collado-mike force-pushed the mcollado-authz-use-grantmanager branch 2 times, most recently from d7525ba to 2bbcdd7 Compare December 11, 2024 00:35
@collado-mike
Copy link
Contributor Author

@dimas-b @adutra now that the CDI stuff is in, can we move forward with this PR? This is part of the overall move to help support the federated identities (removing the dependency on the resolver to obtain principal role grants). There are a lot of refactoring PRs in flight and I'm worried that there are too many conflicting approaches to the overall problem.

Copy link
Contributor

@dimas-b dimas-b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@collado-mike : Thanks for updating this PR. It looks good to me overall.

I think we can leave all my API comments for later.

The only real concern I have is with special-casing of grants on entity name. TBH, I'm not sure whether it used to work the same way or not before this PR. If this is not a change in behaviour I think resolving it can be deferred too.

@@ -301,7 +303,12 @@ protected void configure() {
// manager
bindFactory(PolarisMetaStoreManagerFactory.class)
.in(RealmScoped.class)
.named("delegateGrantManager")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name looks too generic. It does not convey the purpose of the beam. Also, the system could potentially have multiple delegates. How about primaryGrantManager, persistingGrantManager?


@Inject
public PolarisAuthorizerImpl(PolarisConfigurationStore featureConfig) {
public PolarisAuthorizerImpl(
PolarisConfigurationStore featureConfig, Provider<PolarisGrantManager> grantManagerFactory) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we injecting a Provider instead of PolarisGrantManager directly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PolarisAuthorizer is in a different scope from the PolarisGrantManager. Since the authorizer isn't realm or request-scoped, I prefer fetching the realm-scoped bean when it's needed. I know I can rely on runtime proxies, but... I dislike runtime proxies when they're not needed :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PolarisAuthorizerImpl is a light-weight object. Does it hurt to make it request-scoped? Cross-scope references resolved at call time are indeed to be avoided, but, IMHO, the Provider falls into the same category.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@collado-mike : from my POV this is the last remaining concern on this PR :)

Comment on lines +67 to +71
return delegateGrantManager.grantUsageOnRoleToGrantee(callCtx, catalog, role, grantee);
} finally {
LOGGER.debug("Invalidating cache for role {} and grantee {}", role, grantee);
entityCache.removeCacheEntry(role);
entityCache.removeCacheEntry(grantee);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the internals of caches and persistence are still a mix intermixed, but overall, this pattern looks concerning to me. Having to invalidate the cache for role upon granting a privilege to it is non-intuitive. I know I made API skew comments elsewhere already, so please treat it as just another example... I do not think we should address this in the current PR, but I think we ought to address it at some point :)

.getCacheEntry()
.getEntity()
.getName()
.equals(PolarisEntityConstants.getRootContainerName())) {
Copy link
Contributor

@dimas-b dimas-b Dec 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Special cases by entity name look risky to me. The entity here is not restricted, right? So, it can be a namespace, I suppose, and might have the name of root_container, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good callout. This is intended to replicate the "simulated" root container entity logic that was in the PolarisEntityManager, but I'll add logic and a test to ensure we aren't adding privileges to random other entity types.

… add EntityCache based PolarisGrantManager impl
@collado-mike collado-mike force-pushed the mcollado-authz-use-grantmanager branch from 1dad2e0 to 2ec5473 Compare December 19, 2024 00:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Development

Successfully merging this pull request may close these issues.

4 participants