Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add numpy like percentile and quantile #942

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

Dhruvanshu-Joshi
Copy link
Member

Description

This pr aims to add support for numpy like percentile and quantile functions.

Related Issue

Checklist

Type of change

  • New feature / enhancement
  • Bug fix
  • Documentation
  • Maintenance
  • Other (please specify):

Copy link

codecov bot commented Jul 19, 2024

Codecov Report

Attention: Patch coverage is 95.91837% with 2 lines in your changes missing coverage. Please review.

Project coverage is 81.61%. Comparing base (c6d85d1) to head (77d8604).
Report is 9 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #942      +/-   ##
==========================================
+ Coverage   81.38%   81.61%   +0.22%     
==========================================
  Files         172      178       +6     
  Lines       46868    47285     +417     
  Branches    11423    11497      +74     
==========================================
+ Hits        38145    38592     +447     
+ Misses       6542     6504      -38     
- Partials     2181     2189       +8     
Files Coverage Δ
pytensor/tensor/math.py 91.47% <95.91%> (+0.19%) ⬆️

... and 25 files with indirect coverage changes


percentiles = [100.0 * x for x in q]

return percentile(input, percentiles, axis)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a but irrelevant, but does it make sense to define quantile first and then percentile from quantile. So the other way around?

Quantile seems more fundamental

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes makes sense. I have pushed this.

@Dhruvanshu-Joshi Dhruvanshu-Joshi force-pushed the quantile_percentile branch 2 times, most recently from 14967fe to c3bda8a Compare July 31, 2024 21:07
@ricardoV94 ricardoV94 added enhancement New feature or request NumPy compatibility labels Aug 1, 2024
Comment on lines 2914 to 2932
for quantile in q:
if quantile < 0 or quantile > 1:
raise ValueError("Quantiles must be in the range [0, 1]")

result = []
for quantile in q:
k = (quantile) * (input_shape - 1)
k_floor = floor(k).astype("int64")
k_ceil = ceil(k).astype("int64")

slices1[-1] = slice(k_floor, k_floor + 1)
slices2[-1] = slice(k_ceil, k_ceil + 1)
val1 = sorted_input[tuple(slices1)]
val2 = sorted_input[tuple(slices2)]

d = k - k_floor
quantile_val = val1 + d * (val2 - val1)

result.append(quantile_val.squeeze(axis=-1))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won't work with symbolic q. Can we have a vectorized implementation?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi. Sorry this completely skipped me. I'll get back to it and see how to vectorise this and also look into the failing tests.

@ricardoV94
Copy link
Member

ricardoV94 commented Oct 11, 2024

This code could look a lot like the median one, with the exception that the k in x_sorted[..., k] will be an array if q is an array as well. And we can't use ifelse if q is an array, because then oddness/eveness may depend on the quantile (whether size / k is an integer). We have to use switch in that case.

We should be able to re-implement the median as partial(quantile, q=0.5) once this is done.

@ricardoV94 ricardoV94 mentioned this pull request Oct 11, 2024
11 tasks
@Dhruvanshu-Joshi
Copy link
Member Author

Hi @ricardoV94

I have added some changes. However I need some help in some parts.

For "ndim, axis, q" equal to (3, (1, 2), [0.1, 0.9]), the tests fail due to some shapes issues:

For percentile, numpy output:
[[0.14843302]
 [0.65699253]]

Our output:
[[0.14843302 0.65699253]]

Also, I am confused on how to introduce symbolic checks to ensure the all of the q is between 0 to 1 in case of quantile and 0 to 100 in case of percentile.

@ricardoV94
Copy link
Member

Also, I am confused on how to introduce symbolic checks to ensure the all of the q is between 0 to 1 in case of quantile and 0 to 100 in case of percentile.

We can be a bit permissive and consider it undefined behavior if it makes our code simpler. Otherwise there's a RaiseAndCheck Op that can be used for symbolic checks

@ricardoV94
Copy link
Member

For "ndim, axis, q" equal to (3, (1, 2), [0.1, 0.9]), the tests fail due to some shapes issues:

I'll try to take a look when I get some free time

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement equivalent numpy median and quantile / percentile
2 participants