Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

V0.1.11 #10

Merged
merged 3 commits into from
Feb 19, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
16 changes: 10 additions & 6 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,10 +10,10 @@

__Important: this package assumes the following about input hyperspectral data:__
- Data must be in NetCDF (.nc) or ENVI (.hdr)
- Data in radiance and not georeferenced
- if using RTM must be in microW/cm2/nm/sr (for now)
- Pushbroom imaging spectrometer, such as:
- AVIRIS-Classic, AVIRIS-NG, AVIRIS-3, DESIS, EnMAP, EMIT, GaoFen-5, HISUI, Hyperion EO-1, HySIS, PRISMA, SEBASS, Tanager
- Data not georeferenced (typically referred to as L1B before ortho)
- Data in radiance (assumed microW/cm2/nm/sr (for now))
- Pushbroom imaging spectrometer, such as, but not limited to:
- AVIRIS-NG, AVIRIS-3, DESIS, EnMAP, EMIT, GaoFen-5, HISUI, Hyperion EO-1, HySIS, PRISMA, Tanager-1

## Installation Instructions

Expand All @@ -31,15 +31,17 @@ pip install hyperquest
| **SNR** | `hrdsdc()` | Homogeneous regions division and spectral de-correlation (Gao et al., 2008) |
| | `rlsd()` | Residual-scaled local standard deviation (Gao et al., 2007) |
| | `ssdc()` | Spectral and spatial de-correlation (Roger & Arnold, 1996) |
| **Intrinsic Dimensionality (ID)**| `random_matrix_theory()` | Determining the Intrinsic Dimension (ID) of a Hyperspectral Image Using Random Matrix Theory (Cawse-Nicholson et al., 2012, Cawse-Nicholson et al., 2022) |
| **Co-Registration** | `sub_pixel_shift()` | Computes sub pixel co-registration between the VNIR & VSWIR imagers using skimage phase_cross_correlation |
| **Striping (not destriping)**| `sigma_theshold()` | As presented in Yokoya 2010, Preprocessing of hyperspectral imagery with consideration of smile and keystone properties. |
| **Smile** | `smile_metric()` | Similar to MATLAB "smileMetric". Computes derivatives of O2 and CO2 absorption features across-track (Dadon et al., 2010). |
| | `nodd_o2a()` | Similar to method in Felde et al. (2003) to solve for nm shift at O2-A across-track. Requires radiative transfer model run. |
| **Radiative Transfer** | `run_libradtran()` | Runs libRadtran based on user input geometry and atmosphere at 1.0 cm-1. Saves to a .csv file for use in methods requiring radiative transfer.|
| **Radiative Transfer** | `run_libradtran()` | Runs libRadtran based on user input geometry and atmosphere at 0.1 nm spectral resolution. Saves to a .csv file for use in methods requiring radiative transfer.|


## Usage example

- see [SNR example](tutorials/example_using_EMIT.ipynb) and [Smile example](tutorials/testing_smile_methods.ipynb) which has different methods computed over Libya-4.
- see [EMIT example](tutorials/example_using_EMIT.ipynb) which has different methods computed over Libya-4.

## libRadtran install instructions
- Can be installed on Unix type system using the following link:
Expand All @@ -48,6 +50,8 @@ pip install hyperquest

## References:

- Cawse-Nicholson, K., Damelin, S. B., Robin, A., & Sears, M. (2012). Determining the intrinsic dimension of a hyperspectral image using random matrix theory. IEEE Transactions on Image Processing, 22(4), 1301-1310.
- Cawse‐Nicholson, K., Raiho, A. M., Thompson, D. R., Hulley, G. C., Miller, C. E., Miner, K. R., ... & Zareh, S. K. (2022). Intrinsic dimensionality as a metric for the impact of mission design parameters. Journal of Geophysical Research: Biogeosciences, 127(8), e2022JG006876.
- Cogliati, S., Sarti, F., Chiarantini, L., Cosi, M., Lorusso, R., Lopinto, E., ... & Colombo, R. (2021). The PRISMA imaging spectroscopy mission: overview and first performance analysis. Remote sensing of environment, 262, 112499.
- Curran, P. J., & Dungan, J. L. (1989). Estimation of signal-to-noise: a new procedure applied to AVIRIS data. IEEE Transactions on Geoscience and Remote sensing, 27(5), 620-628.
- Dadon, A., Ben-Dor, E., & Karnieli, A. (2010). Use of derivative calculations and minimum noise fraction transform for detecting and correcting the spectral curvature effect (smile) in Hyperion images. IEEE Transactions on Geoscience and Remote Sensing, 48(6), 2603-2612.
Expand Down
3 changes: 2 additions & 1 deletion hyperquest/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -5,4 +5,5 @@
from .coregistration import *
from .libradtran import *
from .optimization import *
from .radiative_transfer import *
from .radiative_transfer import *
from .intrinsic_dimensionality import *
105 changes: 105 additions & 0 deletions hyperquest/intrinsic_dimensionality.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,105 @@
import numpy as np

from .utils import *



def random_matrix_theory(path_to_data, noise_variance, mask_waterbodies=True, alpha=0.5, no_data_value=-9999):
'''
A line-by-line python adaption of K. Cawse-Nicholson algorithm presented in,

Cawse-Nicholson, K., Raiho, A. M., Thompson, D. R., Hulley,
G. C., Miller, C. E., Miner, K. R., ... & Zareh, S. K. (2022).
Intrinsic dimensionality as a metric for the impact of mission design parameters.
Journal of Geophysical Research: Biogeosciences, 127(8), e2022JG006876.

Parameters:
path_to_data (str): Path to the .hdr or .nc
noise_variance (ndarray): noise variance for each band. Used to compute N (noise covariance matrix of size bands x bands).
mask_waterbodies (bool, optional): Whether to mask water bodies based on NDWI threshold of 0.25. Default is True.
alpha (float): Significance level. 0.5 was found to be optimal in a study using hyperspectral data (Cawse-Nicholson et al., 2012).
no_data_value (int or float): Value used to describe no data regions.

Returns:
K_est (int): Intrinsic Dimensionality (ID) of image.

'''

# Identify data type
if path_to_data.lower().endswith('.nc'):
img, _, w, _ = retrieve_data_from_nc(path_to_data)
else:
# Load raster
img_path = get_img_path_from_hdr(path_to_data)
img = np.array(envi.open(path_to_data, img_path).load(), dtype=np.float64)
# get wavelengths
w, _, _ = read_hdr_metadata(path_to_data)

# ensure data is 3d
if len(img.shape) != 3:
raise Exception('Data needs to be a 3D array.')

# mask waterbodies
if mask_waterbodies is True:
img = mask_water_using_ndwi(img, w)

# Mask no data values
img[img <= no_data_value] = np.nan

# reshape R based on n,bands
rows, cols, bands = img.shape
n = rows * cols
R = np.reshape(img, (n,bands))

# mean pixel value, size [1,bands]
m = np.nanmean(R, axis=0)

# create a matrix of mean values the same size as R
M = np.tile(m, (n, 1))

# set nan to zero
R[np.isnan(R)] = 0

# image covariance matrix
S = (R-M).T @ (R-M) / n
S[np.isnan(S)] = 0

# compute sigma, used for threshold at end
a = 0.5
b = 0.5
mu = 1/n * (np.sqrt(n-a) + np.sqrt(bands-b))**2
sig = 1/n * (np.sqrt(n-a) + np.sqrt(bands-b))*(1/np.sqrt(n-a)+1/np.sqrt(bands-b))**(1/3)
s = (-3/2 * np.log(4*np.sqrt(np.pi) * alpha/100))**(2/3)
sigma = mu + s*sig

# Get the noise covariance matrix of size bands x bands
noise_variance[np.isnan(noise_variance)] = 0
N = np.diag(noise_variance).copy()
N[np.isnan(N)] = 0

# Eigenvectors and eigenvalues of S
# Note these are opposite of MATLAB [evec_S,eval_S]
eval_S, evec_S = np.linalg.eig(S)
sortind = np.argsort(eval_S)[::-1]
eval_S = eval_S[sortind]
evec_S = evec_S[:,sortind]

# Eigenvectors and eigenvalues of Pi = S-N
eval_Pi, evec_Pi = np.linalg.eig(S-N)
sortind2 = np.argsort(eval_Pi)[::-1]
eval_Pi = eval_Pi[sortind2]
evec_Pi = evec_Pi[:,sortind2]

# there is a different threshold for each band to represent noise conditions
X = []
for i in range(bands):
X.append((evec_Pi[:,i].T @ N @ evec_S[:,i]) / (evec_Pi[:,i].T @ evec_S[:,i]))
X = np.array(X)

# The ID is the number of eigenvalues exceeding the threshold X.T*sigma
thresholded = np.maximum(0, eval_S - X * sigma)
intersect_i = np.argwhere(thresholded > 0)
K_est = len(intersect_i)


return K_est
4 changes: 2 additions & 2 deletions hyperquest/smile.py
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,7 @@ def smile_metric(path_to_data, rotate, mask_waterbodies=True):
Parameters:
path_to_data (str): Path to the .hdr or .nc
rotate (int): rotate counter clockwise, either 0, 90, 180, or 270.
mask_waterbodies (bool, optional): Whether to mask water bodies based on NDWI threshold of 0. Default is True.
mask_waterbodies (bool, optional): Whether to mask water bodies based on NDWI threshold of 0.25. Default is True.

Returns:
o2_mean, co2_mean, o2_std, co2_std: 1d array of cross-track mean do2, mean dco2, std do2, std dco2
Expand Down Expand Up @@ -133,7 +133,7 @@ def nodd_o2a(path_to_data, rotate, path_to_rtm_output_csv, ncpus=1,rho_s=0.15, m
path_to_rtm_output_csv (str): Path to output from radiative transfer.
ncpus (int, optional): Number of CPUs for parallel processing. Default is 1.
rho_s (float): value from 0-1. As stated, this does not influence nodd method very much and 0.15 is common in literature.
mask_waterbodies (bool, optional): Whether to mask water bodies based on NDWI threshold of 0. Default is True.
mask_waterbodies (bool, optional): Whether to mask water bodies based on NDWI threshold of 0.25. Default is True.

Returns:
cwl_opt, fwhm_opt, sensor_band_near_760, fwhm_near_760: 1d array of cross-track CWL, 1d array of cross-track FWHM, band near 760, fwhm near 760
Expand Down
72 changes: 33 additions & 39 deletions hyperquest/snr.py
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@
from .mlr import *


def rlsd(path_to_data, block_size, nbins=150, ncpus=1, output_all=False, snr_in_db = False, mask_waterbodies=True, no_data_value=-9999):
def rlsd(path_to_data, block_size, nbins=150, ncpus=1, snr_in_db = False, mask_waterbodies=True, no_data_value=-9999):
'''
Residual-scaled local standard deviation (Gao et al., 2007)

Expand All @@ -17,13 +17,12 @@ def rlsd(path_to_data, block_size, nbins=150, ncpus=1, output_all=False, snr_in_
block_size (int): Block size for partitioning (for example 5 would be 5x5 pixels).
nbins (int, optional): Number of bins for histogram analysis. Default is 150.
ncpus (int, optional): Number of CPUs for parallel processing. Default is 1.
output_all (bool, optional): Whether to return all outputs. Default is False returing SNR, True returns mu and sigma.
snr_in_db (bool, optional): Whether SNR is in dB. Default is False.
mask_waterbodies (bool, optional): Whether to mask water bodies based on NDWI threshold of 0. Default is True.
mask_waterbodies (bool, optional): Whether to mask water bodies based on NDWI threshold of 0.25. Default is True.
no_data_value (int or float): Value used to describe no data regions.

Returns:
out: Either an ndarray of SNR, or a tuple containing (mu, sigma, SNR) with respect to wavelength.
SNR, noise_variance : tuple containing SNR and noise variance with respect to wavelength.

'''

Expand Down Expand Up @@ -72,22 +71,21 @@ def rlsd(path_to_data, block_size, nbins=150, ncpus=1, output_all=False, snr_in_
mu = mask_atmos_windows(mu, w)
sigma = mask_atmos_windows(sigma, w)

# division (watching out for zero in denominator)
out = np.divide(mu, sigma, out=np.zeros_like(mu), where=(sigma != 0))
out[sigma == 0] = np.nan
# Compute SNR
snr = np.divide(mu, sigma, out=np.zeros_like(mu), where=(sigma != 0))
snr[sigma == 0] = np.nan

# check to convert to db
if snr_in_db is True:
out = linear_to_db(out)

# check to have full output
if output_all is True:
out = (mu, sigma, out)
snr = linear_to_db(snr)

return out
# convert noise to variance
noise_variance = np.square(sigma, dtype=np.float64)

return snr, noise_variance

def ssdc(path_to_data, block_size, nbins=150, ncpus=1, output_all=False, snr_in_db = False, mask_waterbodies=True, no_data_value=-9999):

def ssdc(path_to_data, block_size, nbins=150, ncpus=1, snr_in_db = False, mask_waterbodies=True, no_data_value=-9999):
'''
Spectral and spatial de-correlation (Roger & Arnold, 1996)

Expand All @@ -96,13 +94,12 @@ def ssdc(path_to_data, block_size, nbins=150, ncpus=1, output_all=False, snr_in_
block_size (int): Block size for partitioning (for example 5 would be 5x5 pixels).
nbins (int, optional): Number of bins for histogram analysis. Default is 150.
ncpus (int, optional): Number of CPUs for parallel processing. Default is 1.
output_all (bool, optional): Whether to return all outputs. Default is False returing SNR, True returns mu and sigma.
snr_in_db (bool, optional): Whether SNR is in dB. Default is False.
mask_waterbodies (bool, optional): Whether to mask water bodies based on NDWI threshold of 0. Default is True.
mask_waterbodies (bool, optional): Whether to mask water bodies based on NDWI threshold of 0.25. Default is True.
no_data_value (int or float): Value used to describe no data regions.

Returns:
out: either an ndarray of SNR, or a tuple containing (mu, sigma, SNR) with respect to wavelength.
SNR, noise_variance : tuple containing SNR and noise variance with respect to wavelength.

'''

Expand Down Expand Up @@ -151,23 +148,22 @@ def ssdc(path_to_data, block_size, nbins=150, ncpus=1, output_all=False, snr_in_
mu = mask_atmos_windows(mu, w)
sigma = mask_atmos_windows(sigma, w)

# division (watching out for zero in denominator)
out = np.divide(mu, sigma, out=np.zeros_like(mu), where=(sigma != 0))
out[sigma == 0] = np.nan
# Compute SNR
snr = np.divide(mu, sigma, out=np.zeros_like(mu), where=(sigma != 0))
snr[sigma == 0] = np.nan

# check to convert to db
if snr_in_db is True:
out = linear_to_db(out)

# check to have full output
if output_all is True:
out = (mu, sigma, out)
snr = linear_to_db(snr)

# convert noise to variance
noise_variance = np.square(sigma, dtype=np.float64)

return out
return snr, noise_variance


def hrdsdc(path_to_data, n_segments=200, compactness=0.1, n_pca=3, ncpus=1, include_neighbor_pixel_in_mlr=True,
output_all=False, snr_in_db=False, mask_waterbodies=True, no_data_value=-9999):
snr_in_db=False, mask_waterbodies=True, no_data_value=-9999):
'''
Homogeneous regions division and spectral de-correlation (Gao et al., 2008)

Expand All @@ -178,13 +174,12 @@ def hrdsdc(path_to_data, n_segments=200, compactness=0.1, n_pca=3, ncpus=1, incl
n_pca (int): Number of PCAs to compute and provide to SLIC segmentation.
ncpus (int, optional): Number of CPUs for parallel processing. Default is 1.
include_neighbor_pixel_in_mlr (bool, optional): If True, neighbor pixel is used in MLR (for k`). Else, MLR only contains spectral data (k+1, k-1).
output_all (bool, optional): Whether to return all outputs. Default is False returing SNR, True returns mu and sigma.
snr_in_db (bool, optional): Whether SNR is in dB. Default is False.
mask_waterbodies (bool, optional): Whether to mask water bodies based on NDWI threshold of 0. Default is True.
mask_waterbodies (bool, optional): Whether to mask water bodies based on NDWI threshold of 0.25. Default is True.
no_data_value (int or float): Value used to describe no data regions.

Returns:
out: either an ndarray of SNR, or a tuple containing (mu, sigma, SNR) with respect to wavelength.
SNR, noise_variance : tuple containing SNR and noise variance with respect to wavelength.

'''

Expand All @@ -203,8 +198,8 @@ def hrdsdc(path_to_data, n_segments=200, compactness=0.1, n_pca=3, ncpus=1, incl
if mask_waterbodies is True:
array = mask_water_using_ndwi(array, w)

# Mask no data values
array[array <= no_data_value] = np.nan
# Mask no data values (to negative 9999 for PCA and SLIC to work)
array[array <= no_data_value] = -9999

# Apply PCA
pca = PCA(n_components=n_pca)
Expand Down Expand Up @@ -258,15 +253,14 @@ def process_segment(u):
sigma = mask_atmos_windows(sigma, w)

# Compute SNR
out = np.divide(mu, sigma, out=np.zeros_like(mu), where=(sigma != 0))
out[sigma == 0] = np.nan
snr = np.divide(mu, sigma, out=np.zeros_like(mu), where=(sigma != 0))
snr[sigma == 0] = np.nan

# check to convert to db
if snr_in_db is True:
out = linear_to_db(out)
snr = linear_to_db(snr)

# Output full results if requested
if output_all:
out = (mu, sigma, out)
# convert noise to variance
noise_variance = np.square(sigma, dtype=np.float64)

return out
return snr, noise_variance
Loading