Simultaneous fitting (#1)

* WIP: simultaneous fitting * wip2 * simultaneous works; ruff * clean up pyproject * quality of life for fitting * plotting: comments and defaults * update readme * put data into examples (sorry GitHub) * ruff * add chi2 factory (oops)
settwi · Nov 7, 2024 · e1bd39d · e1bd39d
1 parent f3e234b
commit e1bd39d
Show file tree

Hide file tree

Showing 13 changed files with 1,905 additions and 592 deletions.
diff --git a/.gitignore b/.gitignore
@@ -160,3 +160,5 @@ cython_debug/
 #  and can be added to the global gitignore or merged into this file.  For a more nuclear
 #  option (not recommended) you can uncomment the following to ignore the entire idea folder.
 #.idea/
+
+uv.lock
diff --git a/examples/.gitignore b/examples/.gitignore
@@ -1,3 +1,5 @@
 rhessi-data
 rhessi-data*
-stix\ data
+stix\ data
+nustar-data
+nustar_example.asdf
diff --git a/examples/line_fit.ipynb b/examples/line_fit.ipynb
diff --git a/examples/rhessi_fit.ipynb b/examples/rhessi_fit.ipynb
diff --git a/examples/simultaneous_nustar_fit.ipynb b/examples/simultaneous_nustar_fit.ipynb
diff --git a/examples/stix_fit.ipynb b/examples/stix_fit.ipynb
diff --git a/pyproject.toml b/pyproject.toml
@@ -13,5 +13,25 @@ dependencies = [
     "corner",
     "dill",
     # For pickling locals; uses dill instead of pickle
-    "multiprocess"
+    "multiprocess",
+    "ruff",
+    "tqdm",
+]
+
+[project.optional-dependencies]
+examples = [
+    "asdf",
+    "asdf-astropy",
+    "ipywidgets",
+    "ipykernel",
+    "ipynb",
+    "jupyter",
+    "pyqt6",
+    "pyqt5",
+    "pyside2",
+    "pip",
+
+    # Required for model functions and data loading
+    "sunkit_spex @ git+https://github.com/sunpy/sunkit-spex.git",
+    "sunpy[all]",
 ]
diff --git a/readme.md b/readme.md
@@ -11,25 +11,37 @@ which incorporates
 
 ## Table of contents
 - [Installation](#installation)
+- [List of examples](#examples)
 - [Spectroscopy fundamentals](#x-ray-spectroscopy-basics)
 - [Package overview](#package-overview)
 
 ## Installation
 `yaff` can be installed from GitHub.
+It's recommended to use `uv` to manage your Python versions and virtual environments: [uv project GitHub page, with installation instructions](https://github.com/astral-sh/uv).
+It is very fast and works well.
+
 Clone the repository and run
 ```bash
 cd /path/to/where/you/cloned/yaff
-pip install -e .
-```
-The package will install its dependencies and itself.
+# Optionally--but preferably--make a venv to work in
+# here is an example of how to do that using the `uv` tool
+uv venv
+source .venv/bin/activate
 
-You also need to manually install `sunkit-spex` to run
-the examples as the release version is rather outdated.
-```bash
-pip install git+https://github.com/sunpy/sunkit-spex.git
-# Or, clone the repo and do a `pip install -e .`
+uv pip install -e .
+# or omit the `uv` if you are just using pip
+
+# If you want to install the packages required by the examples as well:
+uv pip install -e .[examples]
 ```
 
+## Examples
+All of these can be found in `examples` directory. (TODO: make them a gallery)
+- `line_fit`: Fit some synthetic counts data to a line.
+- `rhessi_fit`: Fit a (thermal + cold thick target) model to a RHESSI spectrum near the peak of an M9 GOES class flare.
+- `stix_fit`: Fit a (thermal + cold thick target) model to a STIX spectrum during part of the impulsive phase of an M1 GOES class flare.
+- `simulataneous_nustar_fit`: Fit a thermal bremsstrahlung model to both NuSTAR focal plane modules (FPMA and FPMB) simultaneously to a quiet sun brightening.
+
 ## X-ray spectroscopy basics
 ### Spectroscopy: background
 Analyzing X-rays from the Sun and other stars is straightforward
@@ -83,4 +95,4 @@ X-ray spectroscopy consists of just a few steps:
 That's it. Once you are "happy" in a rigorous sense, you are done performing spectroscopy.
 
 ## Package overview
-TBD
+TBD
diff --git a/yaff/common_likelihoods.py b/yaff/common_likelihoods.py
@@ -0,0 +1,70 @@
+"""Common likelihoods used when processing X-ray data"""
+
+import scipy.stats as st
+import numpy as np
+from numpy.typing import ArrayLike
+from typing import Callable
+
+from yaff import fitting
+
+
+def poisson_factory(
+    restriction: np.ndarray[bool],
+) -> Callable[[ArrayLike, ArrayLike], float]:
+    """
+    Construct a Poisson log likelihood function which restricts evaluation
+    using the given `restrict` variable.
+    The `restrict` variable should be the same shape as the counts you plan to fit.
+    This can be used to, e.g., change the energy range you wish to fit,
+    or exclude known bad bins.
+
+    **For those familiar with "cstat" or the "cash" statistic: just use this.**
+    It's the exact version of those two.
+
+    The background data is assumed to be a constant with no error for evaluation
+    of this likelihood. If background error important, a Negative Binomial likelihood
+    should be used, in the sense that it is a Poisson distribution with "extra" variance.
+    Wikipedia describes how to map one to the other:
+    https://en.wikipedia.org/wiki/Negative_binomial_distribution#Poisson_distribution
+    The background error can be added on to the Negative Binomial's
+    extra variance.
+    """
+
+    def log_likelihood(data: fitting.DataPacket, model: np.ndarray):
+        # Set energy bounds to restrict where we care about the likelihood
+        # For Poisson likelihood, the model must comprise
+        # of integers, otherwise the `logpmf` is sad
+        discrete_model = model.astype(int)
+
+        # Any zero-count bins cannot contribute to the log-likelihood for two reasons:
+        # 1. a "Poisson distribution" with expected value zero has variance zero, so
+        #    pmf(x) = (1 if x == 0 else 0),
+        #    meaning ANY model value other than zero will screw up the log likelihood
+        # 2. even if the model IS exactly zero, it doesn't affect the log likelihood as
+        #    ln(p(0)) = ln(1) = 0.
+        restrict = (data.counts > 0) & restriction
+        rv = st.poisson(data.counts)
+        return rv.logpmf(discrete_model + data.background_counts)[restrict].sum()
+
+    return log_likelihood
+
+
+def chi_squared_factory(
+    restriction: np.ndarray[bool],
+) -> Callable[[ArrayLike, ArrayLike], float]:
+    """Construct a chi2 likelihood that is weighted by
+    errors on the background counts and counts.
+
+    The `restriction` array may be used to restrict the energy
+    range which is considered when fitting.
+    """
+
+    def chi2_likelihood(data: fitting.DataPacket, model: np.ndarray):
+        total_sq_err = data.counts_error**2 + data.background_counts_error**2
+        numerator = (data.counts - data.background_counts - model) ** 2
+
+        # Some count bins might be negative, or have zero error,
+        # so use nan_to_num
+        return -np.nan_to_num((numerator / total_sq_err)[restriction]).sum()
+
+    return chi2_likelihood
-Original file line number
+Diff line change
@@ Expand Up / @@ -160,3 +160,5 @@ cython_debug/ @@
     #  and can be added to the global gitignore or merged into this file.  For a more nuclear
     #  option (not recommended) you can uncomment the following to ignore the entire idea folder.
     #.idea/
+    uv.lock