[GH-3018] Add UpliftDRF #5698

Open
wants to merge 2 commits into
base: master

@krasinski (Contributor) commented Dec 4, 2023

No description provided.

@maurever left a comment

Just one suggestion; otherwise, LGTM.

@@ -85,6 +85,11 @@ def testExtendedIsolationForestParameters(prostateDataset):
    model = algorithm.fit(prostateDataset)
    compareParameterValues(algorithm, model)

def testExtendedIsolationForestParameters(prostateDataset):

This seems unrelated to UpliftDRF, so I suggest moving it to another task. :)

@@ -97,6 +100,10 @@ class AlgorithmConfigurations extends MultipleAlgorithmsConfiguration {
  val gamFields = Seq(ignoredCols, betaConstraints, gamCols)
  val gbmFields = Seq(monotonicity, calibrationDataFrame, ignoredCols)
  val drfFields = Seq(calibrationDataFrame, ignoredCols)
  val upliftDrfFields = Seq(
    ExplicitField("treatment_column", "HasTreatmentCol", "treatment"),
    ExplicitField("response_column", "HasLabelCol", "label"),
Collaborator

Why do we need to specify label column explicitly? Isn't uplift DRF just another supervised algorithm?

The response column is used the same way as in a supervised algorithm.

@@ -97,6 +100,10 @@ class AlgorithmConfigurations extends MultipleAlgorithmsConfiguration {
  val gamFields = Seq(ignoredCols, betaConstraints, gamCols)
  val gbmFields = Seq(monotonicity, calibrationDataFrame, ignoredCols)
  val drfFields = Seq(calibrationDataFrame, ignoredCols)
  val upliftDrfFields = Seq(
    ExplicitField("treatment_column", "HasTreatmentCol", "treatment"),
Collaborator

Why do we need an explicit field for treatment? Could we just add another rule to ParameterNameConverter?

Yes, for the uplift algorithm, the new treatment column is crucial. However, I am not sure how this algorithm configuration works, so I am not sure whether this is the correct way to add the treatment column.
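
To make the "another rule" idea from the question above concrete, here is a minimal Python sketch of a name-conversion rule table. It is not the project's actual ParameterNameConverter: the `OVERRIDES` table, the function name `convert_parameter_name`, and the target names `labelCol` and `treatmentCol` are all assumptions made for illustration only.

```python
# Hypothetical sketch, not the real ParameterNameConverter.
# Explicit rules take priority over the generic snake_case -> camelCase fallback.
OVERRIDES = {
    "response_column": "labelCol",        # assumed existing rule
    "treatment_column": "treatmentCol",   # assumed new rule for uplift models
}


def convert_parameter_name(h2o_name: str) -> str:
    """Map a snake_case backend parameter name to a camelCase name."""
    if h2o_name in OVERRIDES:
        return OVERRIDES[h2o_name]
    head, *rest = h2o_name.split("_")
    return head + "".join(part.capitalize() for part in rest)


if __name__ == "__main__":
    for name in ("treatment_column", "response_column", "ignored_columns"):
        print(name, "->", convert_parameter_name(name))
```

Under these assumptions, a single extra entry in the rule table would cover the treatment column without a per-algorithm explicit field; whether the real converter can express this is exactly what the question asks.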

@valenad1 (Collaborator) left a comment

LGTM, thank you!
