-
Notifications
You must be signed in to change notification settings - Fork 165
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update PolarisAuthorizerImpl to use PolarisGrantManager interface and add EntityCache based PolarisGrantManager impl #465
base: main
Are you sure you want to change the base?
Conversation
* #getOrCreateMetaStoreManager(RealmContext)} | ||
*/ | ||
@Override | ||
default PolarisGrantManager getGrantManagerForRealm(RealmContext realm) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feel like we have a clash between Encapsulation / Decoupling and Dependency Injection concerns here. My impression from #417 was that we wanted to separate different "managers" (which, I agree, is a good idea). However, it looks like call paths to those managers still lead to one MetaStoreManagerFactory
. From my POV I do not see any other reason for this but dependency injection. MetaStoreManagerFactory
is provided the the application on startup and all smaller "managers" stem from it.
I just had an idea that perhaps we could make a java services-based mechanism for discovering specific metastore managers and have them isolated from each other at the interface level. Would you be interested in a POC PR for that (I can give it a try)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Part of the reasoning for this PR is to move toward dependency injection without relying on it fully implemented. #417 was, in large part, to encourage callers to declare only the most specific dependency interface they need (e.g., PolarisGrantManager
to manage grants, CredentialVendor
to manage credentials) so that we could use dependency injection to support different manager implementations. Today, the PolarisMetaStoreManager
implements all of the component interfaces, so it's easy to make this interface implement the PolarisGrantManager.Factory
interface. In the future, I imagine we might break the interface inheritance so that a PolarisMetaStoreManager
doesn't necessarily implement, e.g., CredentialVendor
, since the persistence implementation might not want to know anything about credential vending. But for now, this factory method seems reasonable.
But we can still take advantage of the different interfaces even if all roads eventually lead to the PolarisMetaStoreManager
. The caching implementation of the PolarisGrantManager
is one such example. I'd also like to rewrite the StorageCredentialsCache
as an implementation of CredentialVendor
, even if the PolarisMetaStoreManager
still implements the credential vending methods.
I think we should resolve the DI discussion on the mailing list before we introduce another service-locator approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. Let's continue the DI discussion on the mailing list.
|
||
private void invalidate(@NotNull PolarisEntityCore role) { | ||
EntityCacheEntry roleEntity = entityCache.getEntityById(role.getId()); | ||
if (roleEntity != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need a get before remove (next line)? Why not add remove-by-id method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, that's a good idea. I was just using what was already there :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So ultimately, this does a lookup for the entity in order to generate a cache key anyway. I just moved this logic into the EntityCache
, so the caller just invokes removeCacheEntry(PolarisEntityCore)
and it does the lookup and cache key generation internally.
...is-core/src/main/java/org/apache/polaris/core/persistence/cache/EntityCacheGrantManager.java
Outdated
Show resolved
Hide resolved
...is-core/src/main/java/org/apache/polaris/core/persistence/cache/EntityCacheGrantManager.java
Outdated
Show resolved
Hide resolved
@@ -226,7 +230,17 @@ public void run(PolarisApplicationConfig configuration, Environment environment) | |||
new PolarisCallContextCatalogFactory( | |||
entityManagerFactory, metaStoreManagerFactory, taskExecutor, fileIOFactory); | |||
|
|||
PolarisAuthorizer authorizer = new PolarisAuthorizerImpl(configurationStore); | |||
ConcurrentHashMap<RealmContext, EntityCache> realmEntityCache = new ConcurrentHashMap<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Realms are never removed from this map... is that a concern (for large deployments)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The RealmEntityManagerFactory
already stores the cache in a map that's never cleaned up. I'm hoping to address this with the @RealmScoped
annotation in the HK2 branch - we could consolidate all the realm-specific caches in a single location with its own read/write-based TTL
PolarisAuthorizer authorizer = new PolarisAuthorizerImpl(configurationStore); | ||
ConcurrentHashMap<RealmContext, EntityCache> realmEntityCache = new ConcurrentHashMap<>(); | ||
PolarisGrantManager.Factory factory = | ||
new EntityCacheGrantManager.EntityCacheGrantManagerFactory( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PolarisGrantManager
objects returned from this factory are very different from what would be returned from MetaStoreManagerFactory.getGrantManagerForRealm()
... What's the reason for having getGrantManagerForRealm
in MetaStoreManagerFactory
? I do not really see any call paths leading to the latter factory method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The metastoreManagerFactory
passed to the constructor on the next line implements the PolarisGrantManager.Factory
. The factory method is invoked by the EntityCacheGrantManagerFactory
constructor here: https://github.com/apache/polaris/pull/465/files/3b10cb2021cecfc85dc93b2ed51ea3a4489439e4#diff-dec23d1ca2299b42bd2926a823cad0da6f0a8334ae5093346467f410280e77a2R69
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean this factory (line 235) provides the getGrantManagerForRealm()
method. This factory implementation is not the same as MetaStoreManagerFactory.getGrantManagerForRealm()
. Why does MetaStoreManagerFactory
extend PolarisGrantManager.Factory
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The getGrantManagerForRealm
method implemented by this factory calls delegateGrantManagerFactory.getGrantManagerForRealm(realm)
, which is invoking the getGrantManagerForRealm
method on the MetaStoreManagerFactory
. The EntityCacheGrantManagerFactory
constructor takes in a PolarisGrantManager.Factory
argument - here, we're passing in the MetaStoreManagerFactory
as the delegate implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this code got refactored since my first comment (commit b9ded77).
8bc43ab
to
b9ded77
Compare
@@ -34,7 +35,7 @@ | |||
* configuration | |||
*/ | |||
@JsonTypeInfo(use = JsonTypeInfo.Id.NAME, include = JsonTypeInfo.As.PROPERTY, property = "type") | |||
public interface MetaStoreManagerFactory extends Discoverable { | |||
public interface MetaStoreManagerFactory extends Discoverable, PolarisGrantManager.Factory { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this change conflicts with #470
return null; | ||
} | ||
return resolvedPath.subList(0, resolvedPath.size() - 1); | ||
return resolvedPath; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we return the "resolved" path from getRawFullPath()
but remove the getResolvedFullPath()
method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're eliminating the ResolvedPolarisEntity
type and just using PolarisEntity
types everywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My comment is specific to the internal code "logic" in PolarisResolvedPathWrapper
. What does "resolved" mean in the context of that class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Originally, it referred to the types generated by the Resolver. Now, it really means nothing. I'm happy to change it to get rid of the old reference
for (PolarisGrantRecord grantRecord : resolvedSecurableEntity.getGrantRecordsAsSecurable()) { | ||
for (PolarisEntity resolvedSecurableEntity : resolvedPath.getRawFullPath()) { | ||
PolarisGrantManager.LoadGrantsResult grantsResult = | ||
grantManager.loadGrantsOnSecurable( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm generally fine with this change, but I'd like to clarify one point. My impression from current code on main
is that we want to run authZ check on the exact same state of Polaris data as what the API services use for generating API responses.
Is that a correct assumption?
If I am mistaken, that is fine and this change is fine too. However, if my assumption is valid, how do we make sure the loaded grant records match the other data, which is loaded independently by API services?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is a reference to the same PolarisMetaStoreManagerImpl.loadCachedEntryById
and PolarisMetaStoreManagerImpl.loadGrantsOnSecurable
API difference you call out below, I do think there's more work to line the cache up with the metastore API calls. Ultimately, I think the PolarisServerImpl
should be fetching the same cached grant records that are being returned to the Authorizer so that the return values and the authz assertions are the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This thread is about the overall assumptions on the consistency of data in storage in the presence of concurrent changes.
With this code, grants for different objects in the same logical API-level operation may be loaded from different states of the data (e.g. results of two of more RDBMS transactions). So my question is whether this is a concern. I'm not proposing any particular approach, just wondering for my own understanding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code, like the existing code, assumes that all of the entities are in the cache once the Resolver runs. That process puts all of the entities and their grants into the cache so that by the time the Authorizer retrieves the grants, it fetches them from the cache. This approach is basically hiding that fact by wrapping the cache in a PolarisGrantManager
. The impact on cache hits is the same, but we can make other changes to the persistence interface and the principal role resolution without affecting the Authorizer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait? Are you saying that cache eviction breaks requests?
import org.apache.polaris.core.entity.PolarisEntity; | ||
|
||
/** | ||
* Holds fully-resolved path of PolarisEntities representing the targetEntity with all its grants | ||
* and grant records. | ||
*/ | ||
public class PolarisResolvedPathWrapper { | ||
private final List<ResolvedPolarisEntity> resolvedPath; | ||
private final List<PolarisEntity> resolvedPath; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using plain PolarisEntity
inside a "Resolved" entity path wrapper does not sound good to me purely from a code readability point of view.
I'm fine using PolarisEntity
in Authrorizer inputs, but perhaps we should rename the wrapper class and change its javadoc? What does "resolved" mean in this context?
If we merely want a dedicate class to hold a path of nested entities, how about PolarisEntityPath
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sounds fine to me - the main thing for me is to remove the different entity types so that each interface is dealing directly with the entity types (albeit, with a convenience wrapper in this case).
PolarisAuthorizer authorizer = new PolarisAuthorizerImpl(configurationStore); | ||
ConcurrentHashMap<RealmContext, EntityCache> realmEntityCache = new ConcurrentHashMap<>(); | ||
PolarisGrantManager.Factory factory = | ||
new EntityCacheGrantManager.EntityCacheGrantManagerFactory( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this code got refactored since my first comment (commit b9ded77).
@NotNull PolarisCallContext callCtx, long securableCatalogId, long securableId) { | ||
EntityCacheLookupResult lookupResult = | ||
entityCache.getOrLoadEntityById(callCtx, securableCatalogId, securableId); | ||
if (lookupResult == null || lookupResult.getCacheEntry() == null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this causes an "API skew". We never call loadGrantsOnSecurable
on delegateGrantManager
, therefore we assume that the entityCache
is somehow connected to the same source of data as the delegateGrantManager
but their APIs are totally different.
Essentially this means that PolarisMetaStoreManagerImpl.loadCachedEntryById
and PolarisMetaStoreManagerImpl.loadGrantsOnSecurable()
must produce coherent results.
It does not look like we achieve any clean API separation, which is kind of implied by having the multitude of interfaces that PolarisMetaStoreManager
extends.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, the API separation still needs a lot of work. Originally, I wanted to break the cache methods out of the PolarisMetaStoreManager
entirely, but without the DI support, it was too hard to cast underlying impls to the right interfaces.
But, in reality, this API skew has really always been there. It's just more clearly evident now that we're explicitly invoking loadCachedEntryById
when loadGrantsOnSecurable
is called. Previously, this was just hidden by the fact that the authorizer simply got passed the ResolvedPolarisEntity
and the Resolver just loaded everything from the cache without any pretense of talking to the PolarisMetaStoreManager
.
Eventually, I think the EntityCache
and the PolarisRemoteCache
types need to be addressed, but first is to encapsulate their usage into more generalized types so that they can be changed without impacting the rest of the calling code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to leave this concern to the proposal in https://docs.google.com/document/d/1MNCdW-uKVZaR5Ua91FDwxhreBeGjPOoV-CyF0WOWwIg/edit?tab=t.0#heading=h.e7gcs110cqli
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with you, @collado-mike that proper DI should be in place before tackling this effort. Let's put this PR into draft state for now.
I also agree with @dimas-b that a better API / separation of concerns is needed. Guess, CDI can indirectly help here as well.
b9ded77
to
55e0d29
Compare
d7525ba
to
2bbcdd7
Compare
@dimas-b @adutra now that the CDI stuff is in, can we move forward with this PR? This is part of the overall move to help support the federated identities (removing the dependency on the resolver to obtain principal role grants). There are a lot of refactoring PRs in flight and I'm worried that there are too many conflicting approaches to the overall problem. |
2bbcdd7
to
1dad2e0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@collado-mike : Thanks for updating this PR. It looks good to me overall.
I think we can leave all my API comments for later.
The only real concern I have is with special-casing of grants on entity name. TBH, I'm not sure whether it used to work the same way or not before this PR. If this is not a change in behaviour I think resolving it can be deferred too.
@@ -301,7 +303,12 @@ protected void configure() { | |||
// manager | |||
bindFactory(PolarisMetaStoreManagerFactory.class) | |||
.in(RealmScoped.class) | |||
.named("delegateGrantManager") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name looks too generic. It does not convey the purpose of the beam. Also, the system could potentially have multiple delegates. How about primaryGrantManager
, persistingGrantManager
?
|
||
@Inject | ||
public PolarisAuthorizerImpl(PolarisConfigurationStore featureConfig) { | ||
public PolarisAuthorizerImpl( | ||
PolarisConfigurationStore featureConfig, Provider<PolarisGrantManager> grantManagerFactory) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we injecting a Provider
instead of PolarisGrantManager
directly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PolarisAuthorizer
is in a different scope from the PolarisGrantManager
. Since the authorizer isn't realm or request-scoped, I prefer fetching the realm-scoped bean when it's needed. I know I can rely on runtime proxies, but... I dislike runtime proxies when they're not needed :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PolarisAuthorizerImpl
is a light-weight object. Does it hurt to make it request-scoped? Cross-scope references resolved at call time are indeed to be avoided, but, IMHO, the Provider
falls into the same category.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@collado-mike : from my POV this is the last remaining concern on this PR :)
return delegateGrantManager.grantUsageOnRoleToGrantee(callCtx, catalog, role, grantee); | ||
} finally { | ||
LOGGER.debug("Invalidating cache for role {} and grantee {}", role, grantee); | ||
entityCache.removeCacheEntry(role); | ||
entityCache.removeCacheEntry(grantee); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand the internals of caches and persistence are still a mix intermixed, but overall, this pattern looks concerning to me. Having to invalidate the cache for role
upon granting a privilege to it is non-intuitive. I know I made API skew comments elsewhere already, so please treat it as just another example... I do not think we should address this in the current PR, but I think we ought to address it at some point :)
.getCacheEntry() | ||
.getEntity() | ||
.getName() | ||
.equals(PolarisEntityConstants.getRootContainerName())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Special cases by entity name look risky to me. The entity here is not restricted, right? So, it can be a namespace, I suppose, and might have the name of root_container
, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good callout. This is intended to replicate the "simulated" root container entity logic that was in the PolarisEntityManager
, but I'll add logic and a test to ensure we aren't adding privileges to random other entity types.
… add EntityCache based PolarisGrantManager impl
Fix rebase on main
1dad2e0
to
2ec5473
Compare
Description
Currently, the
PolarisAuthorizer
implementation relies on theResolvedPolarisEntity
to return a list of grants for each entity in order to determine if a Principal has authorization to perform an operation on a given target. The grant records are populated by theResolver
, utilizing theEntityCache
for a given realm to avoid unnecessary roundtrips to the persistence store for, e.g., the grants held by a givenPrincipalRole
orCatalogRole
.Because the
PolarisAuthorizer
relies on the grant records in theResolvedPolarisEntity
, there's no opportunity to utilize custom grant logic (for example, to declare a target to be globally viewable or attach PrincipalRoles dynamically) without either updating the Resolver or writing custom grant records prior to the Resolver execution.This change decouples the
PolarisAuthorizer
from theResolvedPolarisEntity
so that it depends entirely on thePolarisGrantManager
to declare which grants exist on a securable. It implements aEntityCacheGrantManager
that, under the hood, still works with theEntityCache
so that the lookups from theResolver
populate the cache with both entities and grants. However, thePolarisAuthorizer
doesn't need to know anything about theEntityCache
or theResolver
types. For convenience, it still uses thePolarisResolvedPathWrapper
, as it's still a useful container for passing around full paths for entities.The logic in the
PolarisAuthorizer
is unchanged and all tests still pass. An additional test class is added to verify theEntityCacheGrantManager
works directly with the cache and will delegate to the underlyingPolarisGrantManager
when necessary.Note that the logic that implicitly grants the
service_admin
roleSERVICE_MANAGE_ACCESS
privileges on the implicit root container has been moved to theEntityCacheGrantManager
. I ran the tests without this move and they all still passed, so it seems that, at some point, the root container is being persisted and the grants are recorded. However, to ensure backward compatibility, I ported the logic. We may want to get rid of that at some point.Fixes # (issue)
Type of change
Please delete options that are not relevant.
How Has This Been Tested?
Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration
Test Configuration:
Checklist:
Please delete options that are not relevant.