H2O3 Mojo model scoring fails in python when offset column is used #16590

Open
arunaryasomayajula opened this issue Mar 12, 2025 · 0 comments
Labels
bug cust-statefarm Mojo reporter-support Reported as a support issue by customer


H2O version, Operating System and Environment
3.46.0.6
Actual behavior
We have an XGBoost model, trained on an older version with an offset column, that has been through extensive evaluation, and we are planning to deploy it. However, we are unable to predict with the MOJO model after we zero out the offset column; we get the error below.
Prediction works when we use the binary model. Can you please take a look and let us know of any alternative way to use the MOJO model? I don’t think we can retrain the model at this point.

OSError: Job with key $03017f00000132d4ffffffff$_aa0f1f0d307fc3704b9cb49444844cd7 failed with an exception: DistributedException from /127.0.0.1:54321: 'Model was trained with offset, use score0 with offset', caused by java.lang.IllegalStateException: Model was trained with offset, use score0 with offset
stacktrace:
DistributedException from /127.0.0.1:54321: 'Model was trained with offset, use score0 with offset', caused by java.lang.IllegalStateException: Model was trained with offset, use score0 with offset
at water.MRTask.getResult(MRTask.java:660)
at water.MRTask.getResult(MRTask.java:670)
at water.MRTask.doAll(MRTask.java:530)
at water.MRTask.doAll(MRTask.java:549)
at hex.Model.predictScoreImpl(Model.java:2161)
at hex.generic.GenericModel.predictScoreImpl(GenericModel.java:161)
at hex.Model.score(Model.java:2002)
at water.api.ModelMetricsHandler$1.compute2(ModelMetricsHandler.java:555)
at water.H2O$H2OCountedCompleter.compute(H2O.java:1704)
at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:976)
at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1479)
at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)
Caused by: java.lang.IllegalStateException: Model was trained with offset, use score0 with offset
at hex.genmodel.algos.xgboost.XGBoostMojoModel.score0(XGBoostMojoModel.java:88)
at hex.generic.GenericModel.score0(GenericModel.java:311)
at hex.generic.GenericModel.score0(GenericModel.java:317)
at hex.Model.score0(Model.java:2378)
at hex.Model$BigScore.score0(Model.java:2320)
at hex.Model$BigScore.map(Model.java:2295)
at water.MRTask.compute2(MRTask.java:836)
at water.H2O$H2OCountedCompleter.compute1(H2O.java:1707)
at hex.Model$BigScore$Icer.compute1(Model$BigScore$Icer.java)
at water.H2O$H2OCountedCompleter.compute(H2O.java:1703)
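
For context on why zeroing the offset matters: in the usual offset formulation, the offset is a per-row bias added to the model's raw score on the link scale before the inverse link is applied, so an offset of 0 leaves the raw score unchanged. The sketch below is plain Python, not H2O code; `predict_with_offset`, `raw_score`, and the identity default link are illustrative assumptions, not H2O internals.

```python
import math


def predict_with_offset(raw_score, offset, inverse_link=lambda z: z):
    """Add the offset to the raw score on the link scale, then map back.

    Hypothetical helper: `raw_score` stands in for the summed tree outputs,
    and the identity inverse link corresponds to plain Gaussian regression.
    """
    return inverse_link(raw_score + offset)


raw = 2.0
with_offset = predict_with_offset(raw, 0.3)   # raw score shifted by the offset
zeroed = predict_with_offset(raw, 0.0)        # offset of 0 returns the raw score
log_link = predict_with_offset(raw, 0.3, inverse_link=math.exp)  # log-link case
```

Under these assumptions, scoring with a zeroed offset column is still a well-defined request, which is why the binary model can answer it.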

Expected behavior
A MOJO model trained with an offset column should score without this error when it is loaded into a Python environment and given a frame that contains the required offset column.

Steps to reproduce
import h2o
import pandas as pd
from h2o.estimators import H2OXGBoostEstimator

h2o.init()

# Create the sample data
data = {
    "numeric1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "numeric2": [5.0, 4.0, 3.0, 2.0, 1.0],
    "categorical": ["A", "B", "A", "B", "A"],
    "offset": [0.1, 0.2, 0.3, 0.4, 0.5],
    "target": [10.0, 20.0, 30.0, 40.0, 50.0],
}

# Build the DataFrame and convert it to an H2OFrame
df = pd.DataFrame(data)
h2o_frame = h2o.H2OFrame(df)

# Define the predictors and response
predictors = ["numeric1", "numeric2", "categorical", "offset"]
response = "target"

# Convert the categorical column to a factor
h2o_frame["categorical"] = h2o_frame["categorical"].asfactor()

# Specify the offset column
offset_column = "offset"

# Initialize the H2O XGBoost model
xgb_model = H2OXGBoostEstimator(
    ntrees=50,
    max_depth=5,
    learn_rate=0.1,
    offset_column=offset_column,
)

# Train the model
xgb_model.train(x=predictors, y=response, training_frame=h2o_frame)

# Print the model performance
print(xgb_model.model_performance())

# Save the model as a binary model
binary_model_path = h2o.save_model(model=xgb_model, path="binary_model", force=True)
print(f"Binary model saved to: {binary_model_path}")

# Save the model as a MOJO
mojo_model_path = xgb_model.save_mojo(path="mojo_model", force=True)
print(f"MOJO model saved to: {mojo_model_path}")

# Load the model from a binary file
binary_model = h2o.load_model(binary_model_path)
print("Loaded binary model")

# Load the model from a MOJO file
mojo_model = h2o.upload_mojo(mojo_model_path)
print(f"Loaded MOJO model from: {mojo_model_path}")

# Zero out the offset column, then score with the binary model (this works)
h2o_frame["offset"] = 0
binary_predict = binary_model.predict(h2o_frame)
binary_predict

########
# The following predict will throw an exception
# 'Model was trained with offset, use score0 with offset', caused by java.lang.IllegalStateException: Model was trained with offset, use score0 with offset
########
mojo_predict = mojo_model.predict(h2o_frame)
#mojo_predict.shape
generic Model Build progress: |██████████████████████████████████████████████████| (done) 100%
Loaded MOJO model from: /Users/arun/mojo_model/XGBoost_model_python_1740518954413_8.zip
xgboost prediction progress: |███████████████████████████████████████████████████| (done) 100%
generic prediction progress: | (failed)

OSError Traceback (most recent call last)
Cell In[29], line 13
7 binary_predict
9 ########
10 # The following predict will throw an exception
11 # 'Model was trained with offset, use score0 with offset', caused by java.lang.IllegalStateException: Model was trained with offset, use score0 with offset
12 ########
---> 13 mojo_predict = mojo_model.predict(h2o_frame)

File /opt/anaconda3/lib/python3.12/site-packages/h2o/model/model_base.py:334, in ModelBase.predict(self, test_data, custom_metric, custom_metric_func)
331 if not isinstance(test_data, h2o.H2OFrame): raise ValueError("test_data must be an instance of H2OFrame")
332 j = H2OJob(h2o.api("POST /4/Predictions/models/%s/frames/%s" % (self.model_id, test_data.frame_id), data = {'custom_metric_func': custom_metric_func}),
333 self._model_json["algo"] + " prediction")
--> 334 j.poll()
335 return h2o.get_frame(j.dest_key)

File /opt/anaconda3/lib/python3.12/site-packages/h2o/job.py:88, in H2OJob.poll(self, poll_updates)
86 if self.status == "FAILED":
87 if (isinstance(self.job, dict)) and ("stacktrace" in list(self.job)):
---> 88 raise EnvironmentError("Job with key {} failed with an exception: {}\nstacktrace: "
89 "\n{}".format(self.job_key, self.exception, self.job["stacktrace"]))
90 else:
91 raise EnvironmentError("Job with key %s failed with an exception: %s" % (self.job_key, self.exception))

OSError: Job with key $03017f00000132d4ffffffff$_a42e026e7fb346acd0a318c6996a485f failed with an exception: DistributedException from /127.0.0.1:54321: 'Model was trained with offset, use score0 with offset', caused by java.lang.IllegalStateException: Model was trained with offset, use score0 with offset
stacktrace:
DistributedException from /127.0.0.1:54321: 'Model was trained with offset, use score0 with offset', caused by java.lang.IllegalStateException: Model was trained with offset, use score0 with offset
at water.MRTask.getResult(MRTask.java:660)
at water.MRTask.getResult(MRTask.java:670)
at water.MRTask.doAll(MRTask.java:530)
at water.MRTask.doAll(MRTask.java:549)
at hex.Model.predictScoreImpl(Model.java:2161)
at hex.generic.GenericModel.predictScoreImpl(GenericModel.java:161)
at hex.Model.score(Model.java:2002)
at water.api.ModelMetricsHandler$1.compute2(ModelMetricsHandler.java:555)
at water.H2O$H2OCountedCompleter.compute(H2O.java:1704)
at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:976)
at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1479)
at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)
Caused by: java.lang.IllegalStateException: Model was trained with offset, use score0 with offset
at hex.genmodel.algos.xgboost.XGBoostMojoModel.score0(XGBoostMojoModel.java:88)
at hex.generic.GenericModel.score0(GenericModel.java:311)
at hex.generic.GenericModel.score0(GenericModel.java:317)
at hex.Model.score0(Model.java:2378)
at hex.Model$BigScore.score0(Model.java:2320)
at hex.Model$BigScore.map(Model.java:2295)
at water.MRTask.compute2(MRTask.java:836)
at water.H2O$H2OCountedCompleter.compute1(H2O.java:1707)
at hex.Model$BigScore$Icer.compute1(Model$BigScore$Icer.java)
at water.H2O$H2OCountedCompleter.compute(H2O.java:1703)
... 5 more
