Surrogate caching #682

AdrianSosic · 2025-10-29T08:51:10Z

Fixes two problems:

- General surrogate caching bug for cases where multi-model surrogates are used. The issue was that get_surrogate trained an ephemeral (😬) surrogate object in this case, i.e. it did not store the trained model as an attribute. Therefore, the cache could never be reused in a second call. The solution is simple: instead of creating the replicated surrogate on the fly from the stored template, we store the replicated surrogate itself.
- A bug where IndependentGaussianSurrogates would still allow batch recommendation requests when they are wrapped as part of other surrogates (e.g. CompositeSurrogate).
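The second fix can be pictured as a batching check that looks through wrappers instead of inspecting only the outermost surrogate. The following is a minimal sketch under stated assumptions, not the code from this PR: `ToySurrogate`, `ToyComposite`, `supports_batching`, `allows_batch_recommendation`, and `check_batch_request` are hypothetical names, and `ValueError` stands in for whatever exception the library actually raises.

```python
from dataclasses import dataclass, field


@dataclass
class ToySurrogate:
    """Hypothetical stand-in for a single-model surrogate (not a BayBE class)."""

    supports_batching: bool = True


@dataclass
class ToyComposite:
    """Hypothetical wrapper holding one inner surrogate per target name."""

    surrogates: dict = field(default_factory=dict)


def allows_batch_recommendation(surrogate) -> bool:
    """Return True only if the surrogate and everything it wraps support batching."""
    inner = getattr(surrogate, "surrogates", None)
    if inner is not None:
        # A wrapper is batch-capable only if all wrapped surrogates are.
        return all(allows_batch_recommendation(s) for s in inner.values())
    return surrogate.supports_batching


def check_batch_request(surrogate, batch_size: int) -> None:
    """Reject batch requests that the (possibly wrapped) surrogate cannot serve."""
    if batch_size > 1 and not allows_batch_recommendation(surrogate):
        # ValueError is an assumption made for this sketch.
        raise ValueError("Batch recommendation is not supported by this surrogate.")
```

A check that only looked at the outer `ToyComposite` would let the request through; recursing into `surrogates` catches the non-batchable inner model.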

Note: Does not address #568. This will be fixed once #662 is merged and/or the discrete search spaces have been refactored.

Copilot

Pull Request Overview

This PR fixes a surrogate caching bug where multi-model surrogates (used with objectives requiring multiple models) were not being properly cached. Previously, get_surrogate created ephemeral surrogate objects that weren't stored as attributes, preventing cache reuse on subsequent calls. The fix ensures that replicated surrogates are stored directly instead of recreating them from templates.
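To make the difference concrete, here is a toy model of the two behaviors described above. It is a sketch under stated assumptions rather than the actual implementation: all class names are hypothetical, `get_surrogate` takes a simplified `data` argument, and the "cache" is reduced to skipping a refit when the training data is unchanged.

```python
import copy


class ToySurrogate:
    """Hypothetical surrogate that skips refitting when the data is unchanged."""

    total_fits = 0  # class-level counter shared by all copies, for illustration

    def __init__(self):
        self._cached_data = None

    def fit(self, data):
        if data == self._cached_data:
            return  # cache hit: nothing to do
        self._cached_data = data
        ToySurrogate.total_fits += 1


class ToyComposite:
    """Hypothetical replicated surrogate: one independent copy per target."""

    def __init__(self, template, targets):
        self.surrogates = {t: copy.deepcopy(template) for t in targets}

    def fit(self, data):
        for surrogate in self.surrogates.values():
            surrogate.fit(data)


class EphemeralRecommender:
    """Old behavior as described: replicate from the template on every call."""

    def __init__(self, template, targets):
        self._template, self._targets = template, targets

    def get_surrogate(self, data):
        surrogate = ToyComposite(self._template, self._targets)  # thrown away later
        surrogate.fit(data)
        return surrogate


class StoringRecommender:
    """New behavior as described: replicate once and store the result."""

    def __init__(self, template, targets):
        self._surrogate = ToyComposite(template, targets)

    def get_surrogate(self, data):
        self._surrogate.fit(data)  # cached copies are reused across calls
        return self._surrogate
```

With two targets and two identical calls to `get_surrogate`, the ephemeral variant performs four fits and returns a different object each time, whereas the storing variant performs two fits and returns the same object.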

Key Changes:

- Modified surrogate storage to use replicated surrogates directly via converter
- Simplified get_surrogate to use the stored surrogate instance
- Added comprehensive test coverage for the caching behavior

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| baybe/recommenders/pure/bayesian/base.py | Applied `_autoreplicate` converter to `_surrogate_model` field and simplified `get_surrogate` to use stored instance directly |
| tests/test_surrogate.py | Added tests for surrogate caching with different objective types and a smoke test for replicated surrogates |
| CHANGELOG.md | Documented the bug fix in the Fixed section |



AdrianSosic · 2025-11-03T15:20:30Z

tests/test_iterations.py

-        run_iterations(campaign, n_iterations, batch_size)
+        try:
+            run_iterations(campaign, n_iterations, batch_size)
+        except OptionalImportError:


@Scienfitz: do we have an immediate nicer solution to this? I could see that we potentially add some is_available classproperty to the surrogates or something that would simplify their collection in tests, but right now that doesn't exist. The previous skip at the top of the file actually never had an effect due to lazy imports.
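One possible shape for such an availability flag is sketched below. Everything here is hypothetical and not part of the current code base: the `classproperty` helper, the `required_modules` attribute, and the toy classes exist only for illustration. The check relies on `importlib.util.find_spec`, which returns `None` for a top-level module that cannot be located.

```python
import importlib.util


class classproperty:
    """Minimal read-only class-level property (hypothetical helper)."""

    def __init__(self, getter):
        self._getter = getter

    def __get__(self, instance, owner):
        return self._getter(owner)


class ToySurrogateBase:
    """Hypothetical base class; not part of BayBE."""

    # Top-level module names the surrogate needs at fit time (assumption).
    required_modules: tuple = ()

    @classproperty
    def is_available(cls) -> bool:
        # Only top-level names are checked here; dotted names would need
        # extra care because find_spec imports parent packages.
        return all(
            importlib.util.find_spec(name) is not None
            for name in cls.required_modules
        )


class StdlibSurrogate(ToySurrogateBase):
    required_modules = ("json",)


class OptionalSurrogate(ToySurrogateBase):
    required_modules = ("some_missing_optional_dependency",)


def available_surrogates(candidates):
    """Filter surrogate classes down to those whose dependencies are installed."""
    return [cls for cls in candidates if cls.is_available]
```

A test module could then parametrize over `available_surrogates(...)` instead of wrapping each run in `try`/`except OptionalImportError`.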

AdrianSosic · 2025-11-04T13:49:34Z

baybe/recommenders/pure/bayesian/base.py

-        alias="surrogate_model", factory=GaussianProcessSurrogate
+        alias="surrogate_model",
+        factory=GaussianProcessSurrogate,
+        converter=_autoreplicate,


@AVHopp @Scienfitz
As discussed in the meeting, the PR is pretty much ready in terms of fixing the targeted problem. However, it uncovered that second (related) problem I explained to you, which is best demonstrated in the bandit example, where a surrogate passed by the user is only used as a "template" and is not mutated (i.e. the user won't be able to access fitted parameters on that object). Note that the problem is not introduced by the PR but already existed before, e.g. if you passed a single-output surrogate but were in a multi-output context, where the same replication mechanism would happen.

To "fix" this problem, we should first clearly define what our expectations are in terms of resulting behavior. That is, what do we think a user would expect to happen:

- When they provide a 1-D surrogate to a 1-D use case
- ... a 1-D surrogate to an N-D use case
- ... an N-D surrogate to an N-D use case, and
- ... an N-D surrogate to a 1-D use case

Also, there is a difference whether you provide an N-D surrogate with explicit dimensions or one that "automagically" broadcasts to N dimensions using replication.

And related to the above, how do we expect people to access the fitted surrogate? Specifically, the bandit example would work even with the current PR code but would require some rather unexpected / inelegant access like:

(
    recommender.get_surrogate()  # <-- this is a composite one, even though a regular one was provided
    .surrogates[target.name]
)
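The "template" effect and the resulting access pattern can be reproduced with a small toy model. This is a sketch under stated assumptions: every class below is a hypothetical stand-in, `get_surrogate` takes a simplified `measurements` dict, and replication is modeled as a deep copy per target.

```python
import copy
from dataclasses import dataclass


@dataclass
class ToyTarget:
    name: str


class ToySurrogate:
    """Hypothetical single-output surrogate that stores fitted parameters."""

    def __init__(self):
        self.fitted_params = None

    def fit(self, values):
        self.fitted_params = {"mean": sum(values) / len(values)}


class ToyComposite:
    """Hypothetical composite: one deep copy of the template per target."""

    def __init__(self, template, targets):
        self.surrogates = {t.name: copy.deepcopy(template) for t in targets}

    def fit(self, measurements):
        for name, surrogate in self.surrogates.items():
            surrogate.fit(measurements[name])


class ToyRecommender:
    def __init__(self, surrogate_model, targets):
        # The user-provided object is only copied, never fitted itself.
        self._surrogate_model = ToyComposite(surrogate_model, targets)

    def get_surrogate(self, measurements):
        self._surrogate_model.fit(measurements)
        return self._surrogate_model


target = ToyTarget("clicks")
user_surrogate = ToySurrogate()
recommender = ToyRecommender(user_surrogate, [target])
fitted = recommender.get_surrogate({"clicks": [1.0, 3.0]}).surrogates[target.name]
```

After this runs, `user_surrogate.fitted_params` is still `None`, while `fitted.fitted_params` holds the fitted values, which is the surprising part from a user's perspective.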

AdrianSosic self-assigned this Oct 29, 2025

Copilot AI review requested due to automatic review settings October 29, 2025 08:51

AdrianSosic requested a review from Scienfitz as a code owner October 29, 2025 08:51

AdrianSosic added the bug Something isn't working label Oct 29, 2025

AdrianSosic requested a review from AVHopp as a code owner October 29, 2025 08:51

Copilot AI reviewed Oct 29, 2025


AdrianSosic force-pushed the fix/surrogate_caching branch from c9bb85c to 594ce02 on October 29, 2025 09:42

AdrianSosic marked this pull request as draft October 31, 2025 13:01

AdrianSosic force-pushed the fix/surrogate_caching branch from 3437d90 to 7ab3ad5 on November 3, 2025 14:21

AdrianSosic added 9 commits November 3, 2025 16:15

- 8b3df95: Add test for surrogate caching
- b828b6f: Ensure fitted surrogate model is stored as attribute
  Previously, a temporary object was created on the fly in the multi-model case, which meant that the cache could not be used.
- d8b6246: Update CHANGELOG.md
- 65b5f79: Explicitly test non-batchable composite surrogate
- 3c20dab: Improve check for incompatible batch recommendation request
- d206cff: Increase test dataset
  The RandomForestSurrogate otherwise outsources operations to a MeanPredictionModel, which does not support batching. Was mistakenly ignored previously due to suboptimal batching check.
- 8a9db35: Fix collection of valid surrogate models for testing
- 5217916: Add missing default value for CompositeSurrogate._target_names
- ca18ae6: Add missing garbage collection call

AdrianSosic force-pushed the fix/surrogate_caching branch from 0bd9482 to ca18ae6 on November 3, 2025 15:15

AdrianSosic commented Nov 3, 2025


AdrianSosic marked this pull request as ready for review November 4, 2025 13:41

AdrianSosic commented Nov 4, 2025

