entropy_multiscale() generates incorrect results with the CMSE and RCMSE methods. #1071

Joaq-Mar · 2025-02-11T16:21:33Z

For white and pink (1/f) noises generated with neurokit2, I compared the MSEn, CMSEn, and RCMSEn methods against the algorithms presented in the original papers (Costa et al. 2002, Wu et al. 2013, and Wu et al. 2014, respectively) implemented by me.

And these were the results:

For N=1000 points (fs=500Hz, t=2s):

MSE:
-White noise...
nk -> 0.024704933166503906 seconds
Costa -> 0.26683616638183594 seconds
-Pink noise...
nk -> 0.02707219123840332 seconds
Costa -> 0.30927252769470215 seconds

CMSE:
-White noise...
nk -> 0.2005162239074707 seconds
Wu -> 0.7186903953552246 seconds
-Pink noise...
nk -> 0.17482471466064453 seconds
Wu -> 0.6616179943084717 seconds

RCMSE:
-White noise...
nk -> 0.15340662002563477 seconds
Wu -> 0.7407207489013672 seconds
-Pink noise...
nk -> 0.16478848457336426 seconds
Wu -> 0.6370437145233154 seconds

For N=10000 points (fs=1000Hz, t=10s):

MSE:
-White noise...
nk -> 0.42394423484802246 seconds
Costa -> 24.500361442565918 seconds
-Pink noise...
nk -> 0.37540411949157715 seconds
Costa -> 24.14061141014099 seconds

CMSE:
-White noise...
nk -> 1.3335025310516357 seconds
Wu -> 61.81176233291626 seconds
-Pink noise...
nk -> 1.4410855770111084 seconds
Wu -> 56.11741495132446 seconds

RCMSE:
-White noise...
nk -> 1.297663927078247 seconds
Wu -> 64.44741725921631 seconds
-Pink noise...
nk -> 1.4528734683990479 seconds
Wu -> 58.56749939918518 seconds

As you can see, CMSE and RCMSE computed by nk are different from my implementations.
With my implementations, I got the expected shape as shown in the original articles, but at a high computational time cost.

I hope you can correct them.

And here's the code:

import neurokit2 as nk

"""
nk.version()
- OS: Windows (WindowsPE 64bit) 
- Python: 3.9.5     
- NeuroKit2: 0.2.10 

- NumPy: 2.0.2      
- Pandas: 2.2.3     
- SciPy: 1.13.1     
- sklearn: 1.5.2    
- matplotlib: 3.9.2 
"""
import time

import matplotlib.pyplot as plt
import numpy as np

def MSE(data, scale, r):
    E = np.zeros(scale)
    
    for i in range(1, scale + 1):  # i: scale index
        buf = coarse_grain(data[0:], i)
        E[i-1] = SampEn(buf, r)
    
    return E

def CMSE(data, scale, r):
    E = np.zeros(scale)
    
    for i in range(1, scale + 1):  # i: scale index
        for j in range(1, i + 1):  # j: coarsegrain series index
            buf = coarse_grain(data[j-1:], i)
            E[i-1] += SampEn(buf, r) / i
    
    return E

def RCMSE(data, scale, r):
    E = np.zeros(scale)

    for i in range(1, scale + 1):  # i: scale index
        sum_Nn = 0
        sum_Nd = 0
        for j in range(1, i + 1):  # j: coarsegrain series index
            buf = coarse_grain(data[j-1:], i)
            Nn, Nd = pseudoSampEn(buf, r)
            sum_Nn += Nn
            sum_Nd += Nd

        if sum_Nn > 0 and sum_Nd > 0:
            E[i-1] = -np.log(sum_Nd / sum_Nn)
        else:
            E[i-1] = np.inf

    return E

def coarse_grain(iSig, s):
    N = len(iSig)
    oSig = []
    
    for i in range(1, N // s + 1):
        oSig.append(np.mean(iSig[(i-1)*s:i*s]))
    return np.array(oSig)

def SampEn(data, r):
    l = len(data)
    Nn = 0
    Nd = 0
    
    for i in range(l - 2):
        for j in range(i + 1, l - 2):
            # m=2, [x1, x2]
            if abs(data[i] - data[j]) < r and abs(data[i + 1] - data[j + 1]) < r:
                Nn += 1
                # m=m+1, [x1, x2, x3]
                if abs(data[i + 2] - data[j + 2]) < r:
                    Nd += 1
    
    return -np.log(Nd / Nn) if Nn > 0 and Nd > 0 else np.inf

def pseudoSampEn(data, r):
    l = len(data)
    Nn = 0
    Nd = 0

    for i in range(l - 2):
        for j in range(i + 1, l - 2):
            # m=2, [x1, x2]
            if abs(data[i] - data[j]) < r and abs(data[i + 1] - data[j + 1]) < r:
                Nn += 1
                # m=m+1, [x1, x2, x3]
                if abs(data[i + 2] - data[j + 2]) < r:
                    Nd += 1

    return Nn, Nd

noise_duration = 2  # seconds
fs = 500  # Sampling frequency in Hz

print(f'{noise_duration} s at fs={fs} Hz => N = {noise_duration * fs} points\n')

white = nk.signal_noise(duration=noise_duration, sampling_rate=fs, beta=0, random_state=None)
pink = nk.signal_noise(duration=noise_duration, sampling_rate=fs, beta=1, random_state=None)
pink = pink - np.mean(pink)  # Center it (mean=0)

noises = {
    "White": white,
    "Pink": pink
}

colors = {
    "White": 'gray',
    "Pink": 'pink'
}

fig, ax = plt.subplots(2, 3, figsize=(13, 6), sharey='row')

print('* MSE:')
for name, noise in noises.items():
    print(f'-{name} noise...')
    color = colors[name]

    tolerance = 0.15 * np.std(noise)

    t_start = time.time()
    _, cmse_nk = nk.entropy_multiscale(noise, method="MSEn", dimension=2, tolerance=tolerance, show=False, scale=range(1, 21))
    t_end = time.time()
    delay = t_end - t_start
    print(f'    nk -> {delay} seconds')
    t_start = time.time()
    cmse_article = MSE(noise, 20, tolerance)
    t_end = time.time()
    delay = t_end - t_start
    print(f'    Wu -> {delay} seconds')

    ax[0, 0].plot(cmse_nk['Scale'], cmse_nk['Value'], marker='*', color=color, label=f'{name} noise')
    ax[1, 0].plot(range(1, 21), cmse_article, marker='*', color=color)

ax[0, 0].set_title(f'neurokit2')
ax[0, 0].set_xlabel('Scale factor')
ax[0, 0].set_ylabel('MSE')

ax[1, 0].set_title(f'Costa et al (2002)')
ax[1, 0].set_xlabel('Scale factor')
ax[1, 0].set_ylabel('MSE')


print('* CMSE:')
for name, noise in noises.items():
    print(f'-{name} noise...')
    color = colors[name]
    sd = np.std(noise)

    tolerance = 0.15 * np.std(noise)

    t_start = time.time()
    _, cmse_nk = nk.entropy_multiscale(noise, method="CMSEn", dimension=2, tolerance=tolerance, show=False, scale=range(1, 21))
    t_end = time.time()
    delay = t_end - t_start
    print(f'    nk -> {delay} seconds')
    t_start = time.time()
    cmse_article = CMSE(noise, 20, tolerance)
    t_end = time.time()
    delay = t_end - t_start
    print(f'    Wu -> {delay} seconds')

    ax[0, 1].plot(cmse_nk['Scale'], cmse_nk['Value'], marker='*', color=color)
    ax[1, 1].plot(range(1, 21), cmse_article, marker='*', color=color)

ax[0, 1].set_title(f'neurokit2')
ax[0, 1].set_xlabel('Scale factor')
ax[0, 1].set_ylabel('CMSE')

ax[1, 1].set_title(f'Wu et al (2013)')
ax[1, 1].set_xlabel('Scale factor')
ax[1, 1].set_ylabel('CMSE')


print('* RCMSE:')
for name, noise in noises.items():
    print(f'-{name} noise...')
    color = colors[name]
    sd = np.std(noise)

    tolerance = 0.15 * np.std(noise)

    t_start = time.time()
    _, cmse_nk = nk.entropy_multiscale(noise, method="RCMSEn", dimension=2, tolerance=tolerance, show=False, scale=range(1, 21))
    t_end = time.time()
    delay = t_end - t_start
    print(f'    nk -> {delay} seconds')
    t_start = time.time()
    cmse_article = RCMSE(noise, 20, tolerance)
    t_end = time.time()
    delay = t_end - t_start
    print(f'    Wu -> {delay} seconds')

    ax[0, 2].plot(cmse_nk['Scale'], cmse_nk['Value'], marker='*', color=color)
    ax[1, 2].plot(range(1, 21), cmse_article, marker='*', color=color)

ax[0, 2].set_title(f'neurokit2')
ax[0, 2].set_xlabel('Scale factor')
ax[0, 2].set_ylabel('RCMSE')

ax[1, 2].set_title(f'Wu et al (2014)')
ax[1, 2].set_xlabel('Scale factor')
ax[1, 2].set_ylabel('RCMSE')

fig.legend(loc='lower center', ncol=2)

fig.suptitle('Comparing three MSE methods (m=2, r=0.15*sd)')

plt.tight_layout()
plt.subplots_adjust(left=0.055, bottom=0.13, right=0.98, top=0.9, wspace=0.18, hspace=0.355)

plt.show()

welcome · 2025-02-11T16:21:36Z

Hi 👋 Thanks for reaching out and opening your first issue here! We'll try to come back to you as soon as possible. ❤️

DerAndereJohannes · 2025-02-15T21:13:19Z

Hi! Thank you for the very detailed issue along with the sample code. I will take a closer look at this and come back to you as soon as I can.

DerAndereJohannes self-assigned this Feb 15, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

entropy_multiscale() generates incorrect results with the CMSE and RCMSE methods. #1071

entropy_multiscale() generates incorrect results with the CMSE and RCMSE methods. #1071

Joaq-Mar commented Feb 11, 2025

welcome bot commented Feb 11, 2025

DerAndereJohannes commented Feb 15, 2025

entropy_multiscale() generates incorrect results with the CMSE and RCMSE methods. #1071

entropy_multiscale() generates incorrect results with the CMSE and RCMSE methods. #1071

Comments

Joaq-Mar commented Feb 11, 2025

welcome bot commented Feb 11, 2025

DerAndereJohannes commented Feb 15, 2025