title | tags | authors | affiliations | date | bibliography | ||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
EMRI_MC: A GPU-based code for Bayesian inference of EMRI waveforms |
|
|
|
18 December 2023 |
paper.bib |
We describe a simple and efficient Python code to perform Bayesian forecasting for gravitational waves (GW) produced by Extreme-Mass-Ratio-Inspiral systems (EMRIs). The code runs on GPUs for an efficient parallelised computation of thousands of waveforms and sampling of the posterior through a Markov-Chain-Monte-Carlo (MCMC) algorithm. EMRI_MC
generates EMRI waveforms based on the so--called kludge scheme, and propagates it to the observer accounting for cosmological effects in the observed waveform due to modified gravity/dark energy. Extending the code to more accurate schemes for the generation of the waveform is straightforward. Despite the known limitations of the kludge formalism, we believe that the code can provide a helpful resource for the community working on forecasts for interferometry missions in the milli-Hz scale, predominantly, the satellite-mission LISA.
In view of the burst of GW astronomy and the need for parameter estimation
[@Christensen:2022bxb; @Iacovelli:2022mbg; @Iacovelli:2022bbs; @Mastrogiovanni:2021wsd],
including possible effects of modified gravity, EMRI_MC
provides a simple, yet efficient code for the GW community of astrophysics and cosmology towards parameter estimation and forecasts for the future LISA detector
[@LISA:2022yao; @LISA:2022kgy; @LISACosmologyWorkingGroup:2022jok; @Iacovelli:2022mbg].
Parameter forecasting for EMRI signals is not an easy task, because of the challenge to model their waveforms and the high-dimensional parameter space that needs be explored. Most of the attempts in the literature to date are based on the kludge scheme for the waveform generation, as well as on the Fisher information matrix approach for the parameter forecast; see, e.g., [@Barack:2003fp; @Barack:2006pq; @Vallisneri:2007ev; @Chua:2017ujo; @Moore:2017lxy; @Babak:2017tow]. An early attempt at parameter estimation using Bayesian inference was performed in [@Ali:2012zz] and, more recently, in [@Burke:2023lno; @Liu:2023onj].
EMRI_MC
relies on four main elements:
i) the waveform generator;
ii) the inclusion of the amplitude damping and modified speed of GWs;
iii) the posterior sampling through MCMC methods;
iv) the GPU-based vectorisation of quantities such as the likelihood, in order to accelerate computations. Our code aims to provide a simple and efficient tool that could be of help for the community working on the interface of GWs and modified gravity. We emphasise that the structure of the code provides for enough flexibility to allow extension and improvements in each of its elements, according to the need of the specific task.
-
The waveform generator. Our choice for the waveform generator relies on the popular Analytic Kludge (AK) model for the generation of inspiraling EMRI waveforms [@Barack:2003fp; @Barack:2006pq]. Though it is not the most accurate waveform model to date, this choice is justified as follows. AK waveforms provide a sufficiently good approximation of the binary's dynamics, as long as one remains sufficiently far away from the merger. Additionally, AK waveforms allows for an analytic handle on the physics. The equations can be consistently extended with new post-Newtonian and self-force corrections, as well as the inclusion of new physics such as dark matter effects. In this regard, they provide an excellent proxy to perform parameter estimation for future missions, as well as investigate the significance of effects related to new physics. The AK model can be replaced with a more accurate waveform model generator. Examples are the Augmented Analytic Kludge[@Chua:2015mua; @Chua:2017ujo], the Numerical Kludge [@PhysRevD.66.064005; @PhysRevD.66.044002; @PhysRevD.73.064037; @PhysRevD.75.024005], the Fast EMRI Waveforms [@Chua:2020stf; @Katz:2021yft], and the Effective One-Body approach [@PhysRevLett.104.091102; @PhysRevD.83.044044; @PhysRevD.104.024067; @Albertini:2023xcn].
-
The inclusion of modified gravity effects. We include effects beyond General Relativity during the propagation of GWs on the cosmological background. Specifically, we include the effects of damping of the amplitude and modification of the GW speed [@Saltas:2014dha; @Nishizawa:2017nef; @LISACosmologyWorkingGroup:2022wjo].
-
The posterior sampling. Most of the codes currently available for posterior samplings for EMRIs parameter space make use of the Fisher information matrix. We adopt a Bayesian approach using Markov-Chain-Monte-Carlo (MCMC)methods. These are implemented via the MCMC package
emcee
[@Foreman_Mackey_2013], which employs an affine-invariant ensemble sampler [@{2010CAMCS...5...65G}]. -
The GPU-based vectorisation and parallelisation features. Bayesian inferences through MCMC methods are rather expensive for CPUs, especially when posteriors evaluations involve high-dimensional parameter space. To overcome this computational limitation, the code adopts GPU-based vectorisation and parallelisation features.
Waveform generation: The generation of accurate GW waveforms for binary systems and efficient parameter estimation is key for current and future GW missions such as LISA. For EMRIs, the accuracy of the waveform in the inspiral phase requires adiabatic, post-adiabatic and self-force approximations [@Poisson:2011nh; @vandeMeent:2017bcc; @Barack:2018yvs; @Chua:2020stf; @Pound:2021qin; @Hughes:2021exa; @Warburton:2021kwk; @Wardell:2021fyy; @Katz:2021yft].
The AK model we adopt in our code relies on the Peters-Mathews formalism [@Peters:1963ux; @Peters:1964zz], where adiabatic and post-Newtonian approximations are adopted. The main advantage thereof, for our purposes, is the analytic command over the waveform generation and its parameter space. AK model can also be easily extended and modified in the presence of new physics. For technical details on the theoretical framework and equations we will be using, we refer to [@Barack:2003fp] and references therein.
The system of equations consists of two main parts:
i) the equations describing the orbital dynamics of the small body with mass
\begin{equation}\label{eq:kludge-ODEs} \frac{d {\bf Y}}{dt} = f({\bf Y}(t); \boldsymbol{\theta}), \end{equation}
where the vector
The solution of the orbital equations under the quadrupole approximation allows for the computation of the waveform as
\begin{equation}\label{eq:waveform} h_{ij}(t) = \sum_{n = 1}^{n_{max}} h_{ij}^{n}(t) = \sum_{n = 1}^{n_{max}} A_{(n)}^+(t, \boldsymbol{\theta})e_{ij}^{+}(t) + A_{(n)}^{\times}(t, \boldsymbol{\theta})e_{ij}^{\times}(t), \end{equation}
where it is understood that
Waveform propagation: Assuming a plane GW travelling far away from the source through the cosmological medium, we can write
Eq. \autoref{eq:waveform} as
\begin{equation} \label{eq:GWpropagation} \ddot{h} + 3H(\tau)(2 + \alpha_{\text{M}}) \dot{h} + k^2 (1 + \alpha_{\text{T}}) h = 0. \end{equation}
\begin{equation}\label{eq:GWpropagation-solution} h(z) = h_{\text{MG}} \times h_{\text{GR}} \equiv \frac{1}{\Xi} \times e^{-ik \Delta T} \times h_{\text{GR}}, \end{equation}
with
\begin{equation} \Xi(z) \equiv \frac{d^{\text{GW}}(z)}{d^{\text{EM}}(z)} \exp \left( \frac{1}{2}\int_{0}^{z} d\tilde{z} \frac{\alpha_{\text{M}}(\tilde{z})}{1+\tilde{z}} \right), ; ; ; ; \Delta T \equiv \exp \left(- i k \int_{0}^{z} d\tilde{z} \frac{\alpha_{\text{T}}(\tilde{z})}{1+\tilde{z}}\right), \end{equation}
\begin{equation} \label{eq:Xi} \Xi(z) = \Xi_0 + \frac{1- \Xi_0}{(1+z)^n}, \end{equation}
with
\begin{equation} \alpha_{\text{M}} = F(z, f), ; ; \alpha_{\text{T}} = G(z, f) , \end{equation}
with
Numerically, such frequency-dependent terms need to act upon a tabulated waveform in frequency space.
Overview: Our goal is to streamline an efficient parameter estimation pipeline to get joint constraints on the free parameters of the model.
As a first step, we define a fiducial model with an associated set of fiducial parameters
\begin{equation}\label{eq:likelihood} \log \mathcal{L} \propto \frac{1}{2} \frac{\left(h\boldsymbol{ \theta} - h\boldsymbol{ \theta_0} \right)^2}{S_{\rm noise}(f)}, \end{equation}
with
Waveform computation: The orbital equations \autoref{eq:kludge-ODEs} are solved as an initial-value problem on a time grid using standard ODE methods, such as the 7th-order Runge-Kutta scheme. Initial conditions are set at the Last Stable Orbit (LSO) and the equations are integrated backwards for a given time window, typically ranging between few months to one year. The resolution of the time grid is set at
Posterior sampling: The posterior sampling is performed through the Python MCMC package emcee
, which allows for various sampling techniques and parallelisation features.
Functionality overview: Vectorisation is achieved mainly through appropriate use of the ElementwiseKernel
functionality. The code is optimised for running on GPUs, making use of Python's cupy
library. Computations such as the waveform and the likelihood are GPU-vectorised reducing significantly the evaluation time. Moreover, we have parallelised the MCMC run by allowing each walker to run in parallel. In overall, for
The code's architecture consists of the following main files:
1. global_parameters.py
: This module defines the values of physical constants in cgs units, the parameters of the fiducial model, geometrical parameters and initial conditions of the binary system, parameters for the ODE solver (e.g., integration time window and grid resolution), and MCMC-related definitions. It also defines the maximum number of orbital overtones n_max
in the computation of the waveform. A change in the number of the parameters in the MCMC requires adjusting the parameter vector in this module.
2. waveform.py
: This module defines the set of kludge ODE equations (see
Eq. \autoref{eq:kludge-ODEs}), the waveform generator according to Eq. \autoref{eq:waveform}, and some GPU-related functionality. Its main functions reads as follows:
-- eqs():
Defines the set of kludge ODE equations and returns the right-hand-side of them in the sense of
Eq. \autoref{eq:kludge-ODEs}. Notice that in the case of the use of an ODE solver other than the native ones in Python, currently used, the return statement of this function might need to be changed.
-- compute_orbit():
Computes the solutions of the kludge ODE equations defined in eqs()
. To solve the system of ODEs, we use the Pythonic framework of solve_ivp
(). This choice allowed to switch between the different native solvers in this library.
-- waveform():
It calls eqs()
and compute_orbit()
, computes the time-domain waveform including the LISA response function
\autoref{eq:waveform} and then performs its FFT, via the function FFT_gpu()
. The computation of waveforms is implemented fully on cuda. GPU vectorisation and acceleration is implemented with appropriate use of ElementwiseKernel
. For computational convenience, the ouputted waveform does not include the overall factor of the GW luminosity distance. This is included in the function iterate_mcmc()
below.
-- compute_fiducial():
Computes the fiducial model based on the fiducual values defined in global_parameters.py
. The parameter vector defined in this function needs adjustment when adding/removing parameters in the MCMC run.
3. mcmc.py:
This module defines the MCMC-related functions and the MCMC iterator. Its main functions reads as follows:
-- lnprior(), lnprob():
These functions define the log-prior and the log-probability, respectively. They need be adjusted when the set of parameters in the MCMC run is modified.
-- iterate_mcmc():
It calls waveform()
to compute the waveform, and the likelihood in frequency domain for a given choice of parameters around the fiducial model using GPU vectorisation.
-- get_noise():
This defines the LISA noise function for GPU parallelisation through an ElementwiseKernel
.
-- get_Likelihood():
This function computes the likelihood according to Eq. \autoref{eq:likelihood} in GPU vectorised form through an ElementwiseKernel
. It is used in iterate_mcmc
() to compute the likelihood at each MCMC step. Modifications to the GW luminosity distance enter here. This function needs to be adjusted according to any change of parameters in the MCMC run.
4. propagation.py:
This module defines the functions needed for the propagation of the GW wave through the cosmological background in the presence of any modified gravity effects.
-- get_damping(), get_modified_speed():
This defines the possible frequency-dependent damping of the waveform's amplitude, or the respective change in its propagation speed, due to modified gravity. It is defined through an ElementwiseKernel
for an efficient evaluation on the frequency grid. After defining the functional form of the frequency-dependent damping and/or GW speed, one should modify appropriately the computation of the likelihood in the function iterate_mcmc
(). Detailed comments are provided in the code.
-- dL():
This function defines the electromagnetic luminosity.
-- dGW_Xi():
This function defines a redshift-dependent parametrisation of the GW luminosity distance due to modified gravity
according to Eq. \autoref{eq:Xi}. It should appear as an overall multiplicative factor of the waveform in the likelihood computation in iterate_mcmc
(). We remind that the function waveform
() does not include in the outputted waveform the luminosity distance factor.
5. main.ipynb:
Assuming all parameters and fiducial are properly defined as explained earlier, this Jupyter notebook calls the main functions to initiate the MCMC run, using the package emcee
. As a simple choice, we have currently set throughout the numerical computation the source location waveform.py
.
Extending the parameters in the MCMC run: First we notice that, in the vector p
defining the parameters to be varied in the MCMC, the first three values should be by default the central mass (args =
-- global_parameters.py: Edit the sections on parameters for the MCMC run and parameter values for the fiducial model.
-- mcmc.py: Edit the vector p
in the functions lnprior
() and lnprob
(). Edit the parameters to be varied in the MCMC in the function iterate_mcmc
(), including possible modifications in the GW luminosity distance which enters in the computation of the likelihood in iterate_mcmc
().
-- main.ipynb: Extend the vector p_init_MC
which initialises the walkers with the new parameters. Ensure that values for the initialisation of the walkers is meaningful given the problem at hand, otherwise the MCMC will not converge as expected.
-- propagation.py: Any new parameters which affect the GW luminosity distance must be also reflected in this module where the definitions of the luminosity distances are placed.
Computational overhead, ODE solver, overtones summation: An important computational overhead in the evaluation of each MCMC step comes from the choice of the ODE solver. We currently use the native ODE solvers provided by Python's numerical libraries. However, this can be improved using different, external solvers, such as the ones provided by the Fast EMRI Waveforms scheme [@Chua:2020stf; @Katz:2021yft]. We notice that the choice of the ODE solver enters in the functions waveform()
and compute_fiducial()
. We should also notice that, the choice of different solvers (e.g. RK23 or LSODA) leads to different computation times. Contrary to the ODE solver, the for-loop which sums over overtones in the function waveform()
, does not seem to cause any sizable computational overhead for reasonable choices of the maximum overtone (
Running the code is particularly simple. Placing all files in the same folder, and setting up all parameters as explained above, one starts the notebook main.ipynb, and executes the cells. The first cell computes the fiducial model, and the following cells start the MCMC run around the chosen fiducial. The MCMC results are stored in a .txt
file. Currently, for the sake of an example we consider a 4-parameter case: 3 source parameters (2 masses and 1 spin), and 1 propagation parameter (
As an illustration, in Figure \autoref{fig:waveforms} we plot the computed waveforms for characteristic values of the eccentricity and spin, as computed by the function plot_waveform()
. The orbital angles have been fixed according to the conventions mentioned earlier. What is more, Figure \autoref{fig:corner} shows an example corner plot from a MCMC run with 4 free parameters - 3 for the generation (masses+spin) and 1 for the propagation of the waveform respectively (
Surely the current implementation of this code can be expanded in different interesting and more accurate ways, for example: i) The inclusion of environmental effects in the production of the waveform, such as dark matter or baryonic effects due to accretion of the central black hole. Such effects would introduce amongst other things, a new dissipating channel due to the force of the dynamical friction encountered by the orbiting mass. ii) Our current use of the standard kludge equations has been based on a trade-off between simplicity and accuracy, combined with the popularity of this formalism for parameter estimation in the literature. However, its waveforms are known to suffer from certain inaccuracies. Improvement can be achieved by implementing the so--called Augmented Kludge Formalism, or the waveforms of Fast EMRI Waveforms discussed earlier, which would need a more involved implementation. iii) Finally, the implementation of a more efficient ODE solver for the orbital equations could allow us to achieve even faster iterations in the MCMC sampling run.
We are indebted to Stéphane Ilić for collaboration on an early stage of the project, for his critical advice on MCMC simulations, and for his comments on the final version of the draft. We also thank Leor Barack, George Loukes-Gerakopoulos, Simone Mastrogiovanni, Adam Pound and Nick Stergioulas for discussions. The work of R.O. is supported by the Région Île-de-France within the DIM ACAV+ project SYMONGRAV (Symétries asymptotiques et ondes gravitationnelles). I.D.S. acknowledges funding by the Czech Grant Agency (GA^CR) under the grant number 21-16583M. He also acknowledges the warm hospitality of R.O at the Observatory of Paris in November 2022, when the project was initiated. We acknowledge the use of the computer cluster at CEICO/FZU where the code was tested, and warmly thank its administrator Josef Dvořáček for his helpful input on GPU optimisation, and the use of the "Phoebe" computer cluster.
The kludge waveform computed for the last hour before the plunge at the Last Stable Orbit (LSO). Parameters: $M = 10^6 M_{\odot}$ (central mass), $\mu=10 M_{\odot}$ (orbiting mass), $(\theta_S, \phi_S) = (\pi/4, 0)$, $(\theta_K, \phi_K) = (\pi/8, 0)$, $\lambda = \pi/6$ (frame angles), $D= 1$ Gpc (distance to the source). First row: $S/M^2 = 0$ (dimensionless spin of the central black hole), $e_{\text{LSO}}= 0, 0.3, 0.6$ (eccentricity). Second row: $S/M^2 = 0.4$, $e_{\text{LSO}}= 0, 0.3, 0.6$. Third row: $S/M^2 = 0.8$, $e_{\text{LSO}}= 0, 0.3, 0.6$. \label{fig:waveforms}