Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable resource caching per context to fix context aware expressions #3844

Closed
wants to merge 7 commits into from

Conversation

petschki
Copy link
Member

fixes #3789

@mister-roboto
Copy link

@petschki thanks for creating this Pull Request and helping to improve Plone!

TL;DR: Finish pushing changes, pass all other checks, then paste a comment:

@jenkins-plone-org please run jobs

To ensure that these changes do not break other parts of Plone, the Plone test suite matrix needs to pass, but it takes 30-60 min. Other CI checks are usually much faster and the Plone Jenkins resources are limited, so when done pushing changes and all other checks pass either start all Jenkins PR jobs yourself, or simply add the comment above in this PR to start all the jobs automatically.

Happy hacking!

@petschki
Copy link
Member Author

@jenkins-plone-org please run jobs

@tlotze
Copy link
Member

tlotze commented Sep 21, 2023

@petschki Thanks for looking into this so quickly!

Your change looks good to me as far as it concerns fixing the case that the expression depends on context identity, which is probably a very common case and is what the expression that gave rise to the ticket is about.

I don't think, though, that it addresses the issue in the more fundamental way I tried to describe in my ticket comment. The expression could do more complex things than just look at basically immutable properties of the context (like portal type). Suppose, as an example, that the expression uses the context's workflow state which could change between requests; the new cache key wouldn't account for that.

What I'm trying to express is that the cache key shouldn't refer to parameters that might go into the expression but rather to the result of evaluating the expression. I'm not really sure whether evaluating the expression at this point might defeat caching to any notable extent, so working on this with someone who knows these parts of the code better would probably be a good idea. But to be correct, the code should indeed build the cache key from exactly those bundles that will be added to or removed from the theme and request after evaluating the expression, not just those that are statically enabled or disabled.

@petschki
Copy link
Member Author

so working on this with someone who knows these parts of the code better would probably be a good idea.

👉🏼 @jensens

@jensens
Copy link
Member

jensens commented Sep 22, 2023

Yay.
I will have a look, but it will take a while.

Copy link
Contributor

@1letter 1letter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have test this local. It looks good for me. The notes from @tlotze are a good points.
Side note: If the condition is changed, a restart of the instance is still required. The cache dont invalidate after the save action.

@petschki
Copy link
Member Author

Yes, there is and was no invalidation at all. We could take the registry modification time into the cachekey https://github.com/plone/Products.CMFPlone/blob/master/Products/CMFPlone/resources/browser/resource.py#L323 ... and maybe the modification time of the context too, so that changes during the request would be updated too.

@tlotze
Copy link
Member

tlotze commented Sep 22, 2023

But this is still a lot of guesswork. The expression can depend on basically anything, all it takes is a custom method on context that looks up the moon phase or whatever. I suggest I'll try to come up with a PR that uses the expression result and we discuss the performance implications of evaluating the expression at this point with @jensens.

@petschki
Copy link
Member Author

I'm sorry, but I don't understand completely ... the expression result is either True or False depending on its logic.

I wonder how this was solved before the RR rewrite. Maybe there's some piece of code which we can reimplement here?

@tlotze
Copy link
Member

tlotze commented Sep 22, 2023

Of course a single expression will tell whether a particular bundle should be added. What we need is the set of bundles whose expression evaluates to True. (And we need to make sure not to mix bundles to be added with those to be removed as the set union in line 61 currently does.)

@petschki
Copy link
Member Author

Note: registry modification time is already in cachekey generator. https://github.com/plone/Products.CMFPlone/blob/master/Products/CMFPlone/resources/browser/resource.py#L66 ... this needs obviously be tested if it works correctly.

@tlotze
Copy link
Member

tlotze commented Sep 22, 2023

Registry modification time looks strange as it is in the cache key only under a condition. Maybe that's not a problem, though. But in fact, I also consider including registry (or expression) modification time to be guesswork because neither registry nor context nor any other variable that goes in the evaluation context needs to be modified at all for the involved expressions to evaluate in unpredictable ways for any given request.

@yurj
Copy link
Contributor

yurj commented Sep 22, 2023

If the possible values are True or False, a couple of caches must be created. If there are 2 conditions, 4 caches must be saved. So just add the condition value to the cache key, it will get the cached value. The problem is that you also need 2 or 4 or whatever different bundles, depending on the result of the condition.

But I don't think the problem is cache related.

I find condition on bundles quite useless, it is application logic to load additional javascript or css on the fly. Bundle everything and live happy. Or add code in the page to load the additional css/js. Or tell webpack to use a different chunk for code with condition, and then webpack will load it when the condition is true (the result of the condition on the fly must be available to webpack).

Basically, we're mixing 2 different problems. Caching is for not rendering again the bundle, not for conditions.

What is the purpose to use caching when you've to calculate an expensive condition at every request? I don't get the picture, sorry.

@yurj
Copy link
Contributor

yurj commented Sep 22, 2023

https://github.com/plone/Products.CMFPlone/blob/3129403f3524a31a15dae152edefb9380486f1f8/Products/CMFPlone/resources/browser/resource.py#L288C1-L290C10

        self.renderer["js"] = webresource.ResourceRenderer(
            resolver_js, base_url=self.portal_state.portal_url()
        )

this does not depend on the condition, is this the problem?

@yurj
Copy link
Contributor

yurj commented Sep 22, 2023

I'm sorry, but I don't understand completely ... the expression result is either True or False depending on its logic.

I wonder how this was solved before the RR rewrite. Maybe there's some piece of code which we can reimplement here?

Before a different bundle was returned if the condition was True. But with webpack, if you want code in different file to talk (and module federation?) you've to use different chucks to be loaded conditionally based on the condition.

@petschki
Copy link
Member Author

https://github.com/plone/Products.CMFPlone/blob/3129403f3524a31a15dae152edefb9380486f1f8/Products/CMFPlone/resources/browser/resource.py#L288C1-L290C10

        self.renderer["js"] = webresource.ResourceRenderer(
            resolver_js, base_url=self.portal_state.portal_url()
        )

this does not depend on the condition, is this the problem?

The expression is inside the PloneScriptResources which are registered to a ResourceGroup and then resolved with the ResourceResolver and rendered with the ResourceRenderer ... this is webresource logic. Not very explicit though IMHO.

@yurj
Copy link
Contributor

yurj commented Sep 22, 2023

https://github.com/plone/Products.CMFPlone/blob/3129403f3524a31a15dae152edefb9380486f1f8/Products/CMFPlone/resources/browser/resource.py#L288C1-L290C10

        self.renderer["js"] = webresource.ResourceRenderer(
            resolver_js, base_url=self.portal_state.portal_url()
        )

this does not depend on the condition, is this the problem?

The expression is inside the PloneScriptResources which are registered to a ResourceGroup and then resolved with the ResourceResolver and rendered with the ResourceRenderer ... this is webresource logic. Not very explicit though IMHO.

And how you can evaluate the expression? This result is cached, so you need the result of the condition in the cache, thus having 2^n (n=number of conditions) bundles.

@tlotze
Copy link
Member

tlotze commented Sep 22, 2023

Yes, at least in theory. Maybe not all combinations actually occur, but if they do, caching would at least work correctly. But what's the problem with many combinations being cached? The only problem I see is that evaluating might be expensive. If that is really a reason not to cache with respect to the actual expression values, expressions don't work in all the generality they allow.

Based on that observation, we might as well merge this PR to solve the 80% case now and reconsider the whole thing afterwards.

@petschki
Copy link
Member Author

@yurj thats the problem @tlotze adresses. The cache is calculated once per context depending on registry mtime and its expression conditions at that time. There are many more expression examples which could be discussed in order to deliver a sane OOTB solution for this. If there are special cases (and I really mean moonphases c'mon ;) we should deliver a solution to hook/adapt into this cache mechanism to make it easy for integrators to customize.

@tlotze
Copy link
Member

tlotze commented Sep 22, 2023

Just for the record: The moon phases example was intentionally absurd to illustrate that you just cannot know what the expression might do, and that the result doesn't have to depend on any input that you know before calling the expression, i.e. expression text, context object, request properties, these things. The expression might call code that does things which the caching code cannot anticipate, and those things might make perfect sense for the application.

@MrTango
Copy link
Contributor

MrTango commented Sep 22, 2023

I'm with @tlotze here. The right solution should eval the expression. But more important right now, would be the 80% fix, as this is biting us in many cases. So if we don't find a quick solution for the 100%, let's just merge what we have. And improve it later.
I'm even fine with, resources with expressions are being ignored by caching. Just not cache them at all.
On the other hand, the resource it self could be cached. The problem is merging them with other resources. That was avoided in the old registry. When ever you had a condition different to others the resource was delivered separate, but could be cached bu the browser.

Copy link
Contributor

@MrTango MrTango left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good for now, thx

@tlotze
Copy link
Member

tlotze commented Sep 22, 2023

Given that we're going to merge the strategy of making the cache key depend on context, also adding context mtime would be easy and helpful IMO to really make this an 80% improvement.

@petschki
Copy link
Member Author

petschki commented Sep 23, 2023

Checking mtime in the cachekey was a simple indentation error. I've added a test and fixed it.
From my point of view good to merge now.

@jenkins-plone-org please run jobs

@petschki
Copy link
Member Author

@mauritsvanrees jenkins tests break because of the new 6019 version number https://jenkins.plone.org/job/pull-request-6.0-3.8/2908/ ... why?

@petschki
Copy link
Member Author

@jenkins-plone-org please run jobs

@petschki
Copy link
Member Author

Ah ... rebased the branch ... should be green now.

@jensens
Copy link
Member

jensens commented Sep 25, 2023

I think this is technically fine. Just for really large sites, will RAM-usage explode?

@MrTango
Copy link
Contributor

MrTango commented Dec 15, 2023

@mauritsvanrees can we merge this and have it in a release soon?


def update(self):
# cache on request
REQUEST_CACHE_KEY = self._cache_attr_name()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this actually needed? It seems to me that the previous hardcoded _WEBRESOURCE_CACHE_ key is fine for using on the request: the request is always for one context.

In fact, with this PR the cache on the request is not actually used. I have added some debugging code:

     def update(self):
         # cache on request
         REQUEST_CACHE_KEY = self._cache_attr_name()
+        # REQUEST_CACHE_KEY = "_WEBRESOURCE_CACHE_"
 
         cached = getattr(self.request, REQUEST_CACHE_KEY, None)
+        print(
+            f"Checking {REQUEST_CACHE_KEY} on "
+            f"request {self.request.URL} "
+            f"for resource type {self.resource_type}."
+        )
         if cached is not None:
+            print(f"Found: {cached}")
             self.renderer = cached
             return
+        print(f"Cache not found.")

When I visit a document, the output is for example:

Checking _v_rendered_cache_ab7307c68d8577d0dd04ba4522a5ad2bae88085477671c53f6bbcd312af902e4 on request http://localhost:8080/Plone/folder/world/document_view for resource type js.
Cache not found.
Checking _v_rendered_cache_c5469a242752e29e839a31cb0e0ac0005263e07e36b783af1109ad02e3c2a2a7 on request http://localhost:8080/Plone/folder/world/document_view for resource type css.
Cache not found.

This is when generating contents of the Script and Styles viewlets. (The contents will still come from the volatile attribute on the site root, once set, but that is not the point here.)

When I use REQUEST_CACHE_KEY = "_WEBRESOURCE_CACHE_", the output becomes as follows, showing that the request cache is working:

Checking _WEBRESOURCE_CACHE_ on request http://localhost:8080/Plone/folder/world/document_view for resource type js.
Cache not found.
Checking _WEBRESOURCE_CACHE_ on request http://localhost:8080/Plone/folder/world/document_view for resource type css.
Found: {'js': <webresource._api.ResourceRenderer object at 0x108dea1d0>, 'css': <webresource._api.ResourceRenderer object at 0x1093ac8d0>}

Alternatively, I suppose the part of update method that builds self.renderer could be changed so that it only handles javascript when self.resource_type is js, and only styles when self.resource_type is css. But that may be too big a change.
Just using the old static request cache key should be fine.

Products/CMFPlone/resources/browser/resource.py Outdated Show resolved Hide resolved
@mauritsvanrees
Copy link
Member

I made two change requests. But it does seem to work as intended, and fixes the problem. So in general +1.

You do get multiple cache keys per context now, instead of per site: 2 for js and css, times the number of different combinations of roles. The value that is cached is not too much, I saw 400 and 700 bytes locally. Let's say that you get combinations that result in total in 10 KB per context. If you then have a site with 10 thousand pages, and you visit them all, in all combinations, this gives a 100 MB increase in memory. Times the number of threads and zeoclients. So this change does have an impact on memory, but it seems reasonable to me.

I can imagine that in the future we want to be restrictive with the allowed expressions, allowing only anonymous versus authenticated, and allowing 1 (or maybe more) portal_types, and not anything else. That would make this code simpler, and would need less cache keys. For more advanced use cases, like indeed the phases of the moon, people can register an own viewlet apart from the resource registries, and do whatever they please. But that is not for the near future.

@yurj
Copy link
Contributor

yurj commented Dec 16, 2023

@mauritsvanrees or anyone involved

while you're on this patch, can you add
**{"data-bundle": "diazo"},

at line 245 and 267?

This is for back compatibility with backend,xml (https://github.com/plone/plonetheme.barceloneta/blob/030535a1e14a321cf46d2280a51338a9a8cd99b8/plonetheme/barceloneta/theme/backend.xml#L90) in Barceloneta theme. This would help in removing the main theme if we want to use Barceloneta to style the backend (content editing, control panel).

This change just add the attribute data-bundle="diazo" like in the other scripts/css which already have the data-bundle attribute set to the bundle name and this can be used to eventually drop them when we want to use Barceloneta instead of the current theme.

See also plone/documentation#1592 (comment)

@jensens
Copy link
Member

jensens commented Dec 18, 2023

For very large sites, which already have some varnish or alike in front, it would be a must have to turn off caching of it at all even in prod. If it is 100MB for 10k pages, then some sites I now with >500k pages would explode in RAM usage.

@petschki
Copy link
Member Author

@mauritsvanrees or anyone involved

while you're on this patch, can you add **{"data-bundle": "diazo"},

at line 245 and 267?

@yurj I think this change would get merged faster if you open a separate PR. We're not ready here yet.

Co-authored-by: Maurits van Rees <[email protected]>
@tlotze
Copy link
Member

tlotze commented Jan 2, 2024

For very large sites, which already have some varnish or alike in front, it would be a must have to turn off caching of it at all even in prod. If it is 100MB for 10k pages, then some sites I now with >500k pages would explode in RAM usage.

Sorry to reiterate, but RAM usage wouldn't be a function of the number of possible context objects if the cache key involved only the outcomes of evaluating conditions rather than a selection of possible inputs. It would depend exponentially on the number of conditions but even if they are many, that number only depends on code and configuration but is fixed with respect to the amount of content in the site.

@jensens
Copy link
Member

jensens commented Jan 23, 2024

Currently the hashtool.update(self.context.absolute_url().encode("utf8")) adds a cache item per context.

@mauritsvanrees
Copy link
Member

I think the current state is: this fixes an irritating bug, but there are doubts beause it increases memory usage, potentially a lot for large sites.

Idea: put the new code behind a switch. Enable it when an environment variable is set. Then it can more easily be tested in practice.

@yurj
Copy link
Contributor

yurj commented Jan 24, 2024

What about dropping conditions at all? They cannot be cached and their purpose is not useful anyway, put the script or css code on an html head viewlet if you need to load it when there's a condition. If you've varnish or a cache in front of the site, the problem of condition evaluation will rise again because Varnish will not know about your application logic. Cache invalidation and conditions on context should be run on the client side if possible.

@MrTango
Copy link
Contributor

MrTango commented Feb 14, 2024

@mauritsvanrees if we just merge it, it only is a problem when you have expressions. If this really becomes a problem in huge sites, one way would be to remove the expressions in those sites.

@mauritsvanrees
Copy link
Member

@mauritsvanrees if we just merge it, it only is a problem when you have expressions. If this really becomes a problem in huge sites, one way would be to remove the expressions in those sites.

In standard Classic UI Plone, we have the eventedit bundle from plone.app.event, which has a condition: python: member is not None. So this will have an effect on all Plone sites.
The javascript from this bundle is small, so if there are no side effects from including it for everyone, we could remove the expression in the plone.app.event code.

I checked a customer site, and it had the same expression for two other bundles: datagridfield-bundle and mosaic.

But still: even if there are no expressions, after merging this PR all Plone sites will use more memory and be a bit slower, right? This is because the hash will differ per context, so the resource cache is far less effective: there will be at least two resource caches per context, instead of per site.

This could be improved by checking if there are any expressions other than python: member is not None (or similar). If not, then the context does not need to be added to the hash.

Still, I think putting this behind an environment variable so we can opt in to try it out in practice, would be better.

@petschki
Copy link
Member Author

petschki commented Feb 15, 2024

After discussing this in yesterdays Classic-UI Meeting and sleeping over it I'm against my current implementation of generating the cachekey because it adds way too much attributes with the same stored value to the portal.

After some investigations and timing logs I came to the conclusion, that there's nearly no difference in calculating the cachekey vs. rendering the webresources uncached.

So I will come up with a new approach in a separate PR (#3900) which removes the volatile cachekey attributes completely and only caches the rendered viewlet content on the request in order to not calculate it twice (because we have 2 viewlets with the same update() method ... could also be cleaned up in another PR to render only one ResourcesViewlet)

@jensens
Copy link
Member

jensens commented Feb 19, 2024

I think you are right, calculating a complex cache key makes no sense. Lets close this one and go on with #3900

@jensens jensens closed this Feb 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Resource Registry is missing resources in production mode when expressions are used
8 participants