[GH-3018] Add UpliftDRF #5698
base: master
Just one suggestion; otherwise, LGTM.
@@ -85,6 +85,11 @@ def testExtendedIsolationForestParameters(prostateDataset):
    model = algorithm.fit(prostateDataset)
    compareParameterValues(algorithm, model)

def testExtendedIsolationForestParameters(prostateDataset):
This seems unrelated to UpliftDRF, so I suggest moving it to another task. :)
@@ -97,6 +100,10 @@ class AlgorithmConfigurations extends MultipleAlgorithmsConfiguration {
  val gamFields = Seq(ignoredCols, betaConstraints, gamCols)
  val gbmFields = Seq(monotonicity, calibrationDataFrame, ignoredCols)
  val drfFields = Seq(calibrationDataFrame, ignoredCols)
  val upliftDrfFields = Seq(
    ExplicitField("treatment_column", "HasTreatmentCol", "treatment"),
    ExplicitField("response_column", "HasLabelCol", "label"),
Why do we need to specify the label column explicitly? Isn't uplift DRF just another supervised algorithm?
The response column is used the same way as in the supervised algorithm.
@@ -97,6 +100,10 @@ class AlgorithmConfigurations extends MultipleAlgorithmsConfiguration {
  val gamFields = Seq(ignoredCols, betaConstraints, gamCols)
  val gbmFields = Seq(monotonicity, calibrationDataFrame, ignoredCols)
  val drfFields = Seq(calibrationDataFrame, ignoredCols)
  val upliftDrfFields = Seq(
    ExplicitField("treatment_column", "HasTreatmentCol", "treatment"),
Why do we need an explicit field for treatment? Could we just add another rule to ParameterNameConverter?
Yes, for the uplift algorithm, the new treatment column is crucial. However, I am not sure how this algorithm configuration works, so I am not sure whether this is the correct way to add the treatment column.
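To make the trade-off in this thread concrete, here is a minimal sketch of the two options: a per-parameter explicit mapping versus a general naming rule. It is written in Python for brevity and is not the Scala code generator or the real ParameterNameConverter, whose rules are not shown in this conversation. The suffix rules, the `EXPLICIT` table, and all function names below are hypothetical and chosen only for illustration.

```python
# Hypothetical suffix rules (an assumption, not the project's actual rules):
# an H2O-style snake_case suffix is shortened before camel-casing.
SUFFIX_RULES = {
    "_columns": "_cols",
    "_column": "_col",
}

# Hypothetical explicit overrides, analogous in spirit to an explicit field:
# they name the target directly instead of deriving it.
EXPLICIT = {
    "response_column": "labelCol",
}


def to_camel_case(name: str) -> str:
    """Convert snake_case to camelCase, e.g. 'treatment_col' -> 'treatmentCol'."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


def convert_by_rule(h2o_name: str) -> str:
    """Derive the target name purely from the suffix rules."""
    for suffix, replacement in SUFFIX_RULES.items():
        if h2o_name.endswith(suffix):
            h2o_name = h2o_name[: -len(suffix)] + replacement
            break
    return to_camel_case(h2o_name)


def convert(h2o_name: str) -> str:
    """Prefer an explicit override; fall back to the general rule."""
    return EXPLICIT.get(h2o_name, convert_by_rule(h2o_name))
```

Under these assumed rules, `convert("treatment_column")` needs no explicit entry because the suffix rule already yields `treatmentCol`, while `response_column` needs one because the rule alone would yield `responseCol` rather than `labelCol`. Whether the real converter behaves this way is exactly the open question raised above.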
LGTM, thank you!
No description provided.