Add Generalized Pareto (GPD) and Extended GPD distributions by maresb · Pull Request #638 · pymc-devs/pymc-extras

maresb · 2026-02-04T23:07:49Z

Note

AI Assistance Disclosure: This PR was developed with significant assistance from an AI coding assistant (Claude) via Cursor. The code has not yet been fully reviewed by the author.

Summary

This PR adds two new probability distributions to PyMC-Extras:

- Generalized Pareto Distribution (GPD) - A fundamental distribution in extreme value theory for modeling exceedances over a threshold
- Extended Generalized Pareto Distribution (ExtGPD) - A transformation of GPD that provides smoother behavior in the lower tail, useful for modeling entire distributions without explicit threshold selection

Features

- Full implementation with logp, logcdf, icdf, and support_point methods
- Random sampling via scipy's genpareto
- Comprehensive test coverage using PyMC's testing utilities
- The icdf method enables efficient inverse CDF sampling when used with pm.Truncated, avoiding rejection sampling failures

Mathematical Background

The ExtGPD CDF is defined as:
$$G(x \mid \mu, \sigma, \xi, \kappa) = \left[H\left(\frac{x - \mu}{\sigma}\right)\right]^\kappa$$

where $H$ is the standard GPD CDF. The parameter $\kappa > 0$ controls lower tail behavior, and when $\kappa = 1$, ExtGPD reduces to standard GPD.
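As a quick numerical illustration of this definition, here is a minimal NumPy/SciPy sketch. It is not the PR's PyTensor implementation: the function names `gpd_cdf`, `extgpd_cdf`, and `extgpd_logcdf` are hypothetical, and it assumes that `scipy.stats.genpareto` with shape `c = xi`, `loc = mu`, and `scale = sigma` matches the GPD parameterization used here.

```python
import numpy as np
from scipy import stats


def gpd_cdf(x, mu, sigma, xi):
    """GPD CDF H((x - mu) / sigma), delegated to SciPy (shape c = xi)."""
    return stats.genpareto.cdf(x, c=xi, loc=mu, scale=sigma)


def extgpd_cdf(x, mu, sigma, xi, kappa):
    """ExtGPD CDF: G(x) = H((x - mu) / sigma) ** kappa."""
    return gpd_cdf(x, mu, sigma, xi) ** kappa


def extgpd_logcdf(x, mu, sigma, xi, kappa):
    """log G(x) = kappa * log H(x); avoids forming the power explicitly."""
    return kappa * stats.genpareto.logcdf(x, c=xi, loc=mu, scale=sigma)


x = np.linspace(0.1, 5.0, 50)
h = gpd_cdf(x, mu=0.0, sigma=1.0, xi=0.2)
g_kappa_one = extgpd_cdf(x, mu=0.0, sigma=1.0, xi=0.2, kappa=1.0)
g_kappa_half = extgpd_cdf(x, mu=0.0, sigma=1.0, xi=0.2, kappa=0.5)
```

With `kappa = 1` the two CDFs coincide. Because $0 < H < 1$ on the interior of the support, `kappa < 1` raises the CDF (more mass near the lower bound) and `kappa > 1` lowers it.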

Test plan

- pytest tests/distributions/test_continuous.py::TestGenParetoClass - GPD tests
- pytest tests/distributions/test_continuous.py::TestExtGenParetoClass - ExtGPD tests
- pytest tests/distributions/test_continuous.py::TestGenPareto - GPD random sampling
- pytest tests/distributions/test_continuous.py::TestExtGenPareto - ExtGPD random sampling

- Add GenPareto distribution for peaks-over-threshold extreme value modeling
- Add ExtGenPareto distribution following Naveau et al. (2016) for modeling entire distributions without threshold selection
- ExtGPD uses CDF G(x) = H(x)^kappa where H is the GPD CDF and kappa > 0 controls lower tail behavior (reduces to GPD when kappa = 1)
- Include comprehensive tests for logp, logcdf, support_point, and random sampling

Adding the icdf method enables PyMC's Truncated distribution to use inverse CDF sampling instead of rejection sampling, which is critical when the truncation probability is high (e.g., when kappa is small and most mass is below the truncation threshold).

The inverse CDF for ExtGPD is G^{-1}(p) = H^{-1}(p^{1/kappa}), where H^{-1} is the GPD quantile function.
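To make the motivation concrete, the following sketch samples a lower-truncated ExtGPD by drawing a uniform on [G(lower), 1) and pushing it through the quantile function. It is an illustration in plain NumPy/SciPy, not the code path that pm.Truncated takes internally. The helper names and the parameter values are assumptions chosen so that almost all of the mass lies below the truncation point.

```python
import numpy as np
from scipy import stats


def extgpd_cdf(x, mu, sigma, xi, kappa):
    """G(x) = H(x) ** kappa, with H the GPD CDF from SciPy."""
    return stats.genpareto.cdf(x, c=xi, loc=mu, scale=sigma) ** kappa


def extgpd_icdf(p, mu, sigma, xi, kappa):
    """G^{-1}(p) = H^{-1}(p ** (1 / kappa))."""
    return stats.genpareto.ppf(p ** (1.0 / kappa), c=xi, loc=mu, scale=sigma)


def sample_lower_truncated(rng, size, lower, mu, sigma, xi, kappa):
    """Draw from ExtGPD conditioned on x >= lower via the inverse CDF."""
    p_lower = extgpd_cdf(lower, mu, sigma, xi, kappa)
    u = rng.uniform(p_lower, 1.0, size=size)  # uniform on [G(lower), 1)
    return extgpd_icdf(u, mu, sigma, xi, kappa)


rng = np.random.default_rng(0)
params = dict(mu=0.0, sigma=1.0, xi=0.1, kappa=0.05)  # illustrative values
lower = 3.0

p_below = extgpd_cdf(lower, **params)  # mass a rejection sampler would discard
draws = sample_lower_truncated(rng, 1000, lower, **params)
```

With these values more than 99% of the untruncated mass lies below `lower`, so a rejection sampler would throw away nearly every proposal, while the inverse CDF route returns exactly `size` valid draws.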

ricardoV94 · 2026-02-05T07:43:07Z

pymc_extras/distributions/continuous.py

 from scipy import stats


+class GenParetoRV(RandomVariable):


How tricky is the rvs method to implement symbolically? If scipy is just doing inverse cdf sampling we could also do it.

In that case you would implement the RV as a SymbolicRV. The advantage of that is that the random methods work out of the box on the different backends.

Same for the other distribution

…iable

Switch from scipy-based RandomVariable to symbolic inverse CDF sampling. This enables backend-agnostic random sampling (JAX, NumPy, etc.) by implementing the inverse CDF transformation directly in PyTensor.

The inverse CDF for GPD is:
- mu + sigma * [(1-u)^(-xi) - 1] / xi for xi != 0
- mu - sigma * log(1-u) for xi = 0

For ExtGPD, we use the transformation u^{1/kappa} before applying the GPD inverse CDF.

Co-authored-by: Claude AI
Co-authored-by: Cursor <cursoragent@cursor.com>
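The two-branch quantile formula in this commit message can be sketched in NumPy as follows. This is a stand-in for the PyTensor graph described above, not a copy of it: the names `gpd_icdf` and `extgpd_rvs` are hypothetical, and `xi` is treated as a Python scalar to keep the branching simple.

```python
import numpy as np
from scipy import stats


def gpd_icdf(u, mu, sigma, xi):
    """GPD quantile function; xi is a scalar in this sketch."""
    u = np.asarray(u, dtype=float)
    if xi == 0:
        # Exponential limit of the GPD
        return mu - sigma * np.log(1.0 - u)
    return mu + sigma * ((1.0 - u) ** (-xi) - 1.0) / xi


def extgpd_rvs(rng, size, mu, sigma, xi, kappa):
    """Inverse CDF sampling for ExtGPD: apply u ** (1 / kappa) first."""
    u = rng.uniform(size=size)
    return gpd_icdf(u ** (1.0 / kappa), mu, sigma, xi)


u = np.linspace(0.01, 0.99, 25)
ours = gpd_icdf(u, mu=1.0, sigma=2.0, xi=0.3)
ref = stats.genpareto.ppf(u, c=0.3, loc=1.0, scale=2.0)  # SciPy reference

draws = extgpd_rvs(np.random.default_rng(1), 500, 1.0, 2.0, 0.3, kappa=0.5)
```

Comparing `ours` with `ref` is a cheap sanity check that the closed form agrees with SciPy's `genpareto.ppf` under the assumed parameter mapping (`c = xi`, `loc = mu`, `scale = sigma`).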

- Add _exprel helper for (exp(t)-1)/t with correct gradient at t=0
- Use survival probability directly to avoid 1-u cancellation in upper tail
- For ExtGPD, compute GPD survival probability via -expm1(log(u)/kappa)

Co-authored-by: Cursor <cursoragent@cursor.com>
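A NumPy sketch of the idea behind these three changes: writing the survival probability as s = 1 - u, the GPD quantile becomes mu + sigma * (-log s) * exprel(-xi * log s), which covers xi = 0 without a separate branch because exprel(0) = 1. The function names below are hypothetical and the series cutoff of 1e-5 is an arbitrary choice for this sketch. The PR's `_exprel` is a PyTensor helper whose details are not shown in this thread.

```python
import numpy as np
from scipy import stats


def exprel(t):
    """(exp(t) - 1) / t, continuous at t = 0 where the value is 1."""
    t = np.asarray(t, dtype=float)
    small = np.abs(t) < 1e-5
    safe_t = np.where(small, 1.0, t)           # avoid 0 / 0 in the unused branch
    series = 1.0 + t / 2.0 + t * t / 6.0        # Taylor expansion near 0
    return np.where(small, series, np.expm1(safe_t) / safe_t)


def gpd_icdf_from_survival(s, mu, sigma, xi):
    """GPD quantile expressed through the survival probability s = 1 - u."""
    neg_log_s = -np.log(s)
    return mu + sigma * neg_log_s * exprel(xi * neg_log_s)


def extgpd_rvs(rng, size, mu, sigma, xi, kappa):
    """ExtGPD draws; the GPD survival probability is 1 - u ** (1 / kappa)."""
    u = rng.uniform(size=size)
    s = -np.expm1(np.log(u) / kappa)            # stable form of 1 - u**(1/kappa)
    return gpd_icdf_from_survival(s, mu, sigma, xi)


s = np.array([1e-3, 0.5, 0.9])
ours = gpd_icdf_from_survival(s, mu=1.0, sigma=2.0, xi=0.3)
ref = stats.genpareto.isf(s, c=0.3, loc=1.0, scale=2.0)  # SciPy reference
```

Passing `s` directly keeps full relative precision for very small survival probabilities, whereas forming `1 - u` first rounds away the low-order digits when `u` is close to 1.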

maresb · 2026-02-05T13:41:22Z

Thanks for the tip @ricardoV94!!!

aloctavodia · 2026-02-05T21:11:01Z

Should we try to implement these distributions here https://github.com/pymc-devs/distributions?

maresb · 2026-02-05T22:18:10Z

Oh cool, thanks @aloctavodia! I wasn't aware of that.

I'd be happy to move it over, but it may be a while before I can get to it.

maresb added 2 commits February 4, 2026 19:16

ricardoV94 reviewed Feb 5, 2026

maresb and others added 2 commits February 5, 2026 14:07

maresb wants to merge 4 commits into pymc-devs:main from maresb:extgpd
