-
-
Notifications
You must be signed in to change notification settings - Fork 78
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create error estimator for SUM #528
Conversation
ratio_dropped_linf = self.get_ratio_dropped_linf(linf_bound) | ||
ratio_dropped = 1 - (1 - ratio_dropped_l0) * (1 - ratio_dropped_linf) | ||
stddev = self._get_stddev(l0_bound, linf_bound) | ||
if to_print: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it for debugging only? should we remove it? just double checking :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, thanks. Removed
@@ -49,9 +49,18 @@ def _get_estimator( | |||
epsilon: float = 2**0.5 / 2, | |||
delta: Optional[float] = None, | |||
): | |||
return histogram_error_estimator.create_error_estimator( | |||
return histogram_error_estimator.create_estimator_for_count_privacy_id_count( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rename from _get_estimator
to _get_estimator_for_count_and_privacy_id_count
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thx, done
@@ -121,6 +139,17 @@ def test_get_ratio_dropped_linf(self, linf_bound, expected): | |||
self.assertAlmostEqual(estimator.get_ratio_dropped_linf(linf_bound), | |||
expected) | |||
|
|||
@parameterized.parameters((0, 1), (0.5, 0.89), (1, 0.78), (2, 0.76), | |||
(40, 0)) | |||
# there 1 is contribution 40 and 10 contribution 1. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is 1 contribution of 40 and 10 contributions of 1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thx
(40, 0)) | ||
# there 1 is contribution 40 and 10 contribution 1. | ||
# total contribution = 1*40+10*1 = 50 | ||
# when linf_bound = 0.5, left after contribution bounding 11*0.5=5.5, i.e. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how linf can be a double and not an integer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
linf_bound is max contribution per partition, which means
max_contributions_per_partition
for COUNT
max_sum_per_partition
for SUM (which can be double)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for review!
ratio_dropped_linf = self.get_ratio_dropped_linf(linf_bound) | ||
ratio_dropped = 1 - (1 - ratio_dropped_l0) * (1 - ratio_dropped_linf) | ||
stddev = self._get_stddev(l0_bound, linf_bound) | ||
if to_print: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, thanks. Removed
@@ -49,9 +49,18 @@ def _get_estimator( | |||
epsilon: float = 2**0.5 / 2, | |||
delta: Optional[float] = None, | |||
): | |||
return histogram_error_estimator.create_error_estimator( | |||
return histogram_error_estimator.create_estimator_for_count_privacy_id_count( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thx, done
@@ -121,6 +139,17 @@ def test_get_ratio_dropped_linf(self, linf_bound, expected): | |||
self.assertAlmostEqual(estimator.get_ratio_dropped_linf(linf_bound), | |||
expected) | |||
|
|||
@parameterized.parameters((0, 1), (0.5, 0.89), (1, 0.78), (2, 0.76), | |||
(40, 0)) | |||
# there 1 is contribution 40 and 10 contribution 1. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thx
(40, 0)) | ||
# there 1 is contribution 40 and 10 contribution 1. | ||
# total contribution = 1*40+10*1 = 50 | ||
# when linf_bound = 0.5, left after contribution bounding 11*0.5=5.5, i.e. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
linf_bound is max contribution per partition, which means
max_contributions_per_partition
for COUNT
max_sum_per_partition
for SUM (which can be double)
ErrorEstimator
usesContributionHistograms
to estimate approximately RMSE error for different L0/Linf bounds. This PR allows to createErrorEstimator
for SUM (the different with COUNT, only different set of histograms)