Spectral Fitting¶

This notebook measures aperiodic activity using various different spectral fitting measures.

There have been many variations for fitting power spectra. This notebook compares different strategies.

This notebooks covers:

Linear fits to power spectra in log-log spacing, using a simple (OLS) linear fit
Linear fits to power spectra in log-log spacing, using a robust (RLM) linear fit
Linear fits to power spectra in log-log spacing, using the RANSAC robust fitting algorithm
Exponential fits to power spectra in semi-log spacing, using a simple exponential fit
The spectral parameterization method

In addition, various studies have used simple heuristics to try and fit aperiodic activity while avoiding being biased by peaks.

In this comparison, we include the following variations of the aforementioned methods:

Fitting the entire frequency range
Using a predefined exclusion zone of the alpha oscillation (7-14 Hz)
Using exclusing zones for all oscillatory peaks*

*Note that in the oscillation exclusion approach, the spectral parameterization method is initially used to detect and exclude oscillatory regions. Therefore, these are not fully independent methods, but are instead used to evaluate if there could be any improved fitting after oscillation exclusion.

%matplotlib inline

import numpy as np
import pandas as pd
import statsmodels.api as sm
import matplotlib.pyplot as plt
import seaborn as sns
from scipy.optimize import curve_fit
from sklearn.linear_model import RANSACRegressor
from scipy.stats.mstats import normaltest

from fooof import FOOOF
from fooof.core.funcs import expo_nk_function as expf
from fooof.sim import gen_power_spectrum, gen_group_power_spectra

# Import custom code
import sys; from pathlib import Path
sys.path.append(str(Path('..').resolve()))

from apm.methods import SpectralFits
from apm.methods.fit import *
from apm.plts import plot_psds, plot_psds_two, plot_psds_shades
from apm.plts.errors import plot_violin_errors, plot_boxplot_errors, plot_corr_matrix
from apm.plts.utils import color_red_or_green

from apm.utils import exclude_spectrum, print_results

from apm.core.db import APMDB
from apm.core.io import save_pickle, load_pickle

from apm.sim.peaks import *

Settings¶

# Set up project database object
db = APMDB()

# Simulation settings
f_range = [1, 50]
off_val = 0
noise = 0.0

# Notebook Settings
SAVE_FIG = False
SAVE_DATA = False
RERUN = False  # If False, reloads previously computed measures

Spectral Fitting Methods¶

Here we demonstrate spectral fitting methods for measuring the aperiodic exponent by directly fitting power spectra.

The implementations of these methods are available in apm/fit.py.

Example PSD¶

Load an example PSD, from eyes closed, resting state EEG data, extracted to the range of 3-40 Hz.

# Load an example power spectrum to check fitting with
freqs = np.load('data/freqs.npy')
psd = np.load('data/psd.npy')

# Check a plot of the loaded power spectrum
plot_psds(freqs, psd)

Methods for Fitting Power Spectra¶

Here, we introduce the various methods for fitting the aperiodic component of the power spectrum.

# Initialize for collecting example fit results
example_results = {}

# Add constant for fitting with statsmodels
fx = sm.add_constant(np.log10(freqs))

OLS Linear Fit¶

Fit a line in log-log with Ordinary Least Squares (OLS).

# Spectral fit: linear fit
ex_fit_ols = sm.OLS(np.log10(psd), fx).fit()
example_results['ols'] = ex_fit_ols.params[1]

# Get the predicted values of the aperiodic fit
pred_vals = np.power(10, ex_fit_ols.fittedvalues)

# Plot the aperiodic fit, from the OLS linear fit
plot_psds(freqs, [psd, pred_vals], log_freqs=True)

Robust Fit with RLM¶

# Spectral fit: robust linear fit to log-log PSD
ex_fit_rlm = sm.RLM(np.log10(psd), fx).fit()
example_results['rlm'] = ex_fit_rlm.params[1]

# Get the predicted values of the aperiodic fit
pred_vals = np.power(10, ex_fit_rlm.fittedvalues)

# Plot the aperiodic fit, from the robust linear fit
plot_psds(freqs, [psd, pred_vals], log_freqs=True)

Robust Fit with RANSAC¶

# Reshape freqs for RANSAC fit
freqs_ran = freqs.reshape([len(freqs), 1])

# Initialize and fit with RANSAC
ex_ransac_model = RANSACRegressor()
ex_ransac_model.fit(np.log10(freqs_ran), np.log10(psd))
example_results['ran'] = ex_ransac_model.estimator_.coef_[0]

# Get the predicted values of the aperiodic fit
pred_vals = np.power(10, ex_ransac_model.predict(np.log10(freqs_ran)))

# Plot the aperiodic fit, from RANSAC
plot_psds(freqs_ran.flatten(), [psd, pred_vals], log_freqs=True)

Spectral Fitting Excluding Alpha¶

Use a specific exclusion zone for the alpha oscillation.

# Exclude predefined alpha range
freqs_alph, psd_alph = exclude_spectrum(np.squeeze(freqs), psd, [7, 14])#, False)

# Reset freqs, with constant
fx_alph = sm.add_constant(np.log10(freqs_alph))

Simple OLS Linear Fit Exlucing Generic Alpha Band¶

# Linear fit without alpha range
ex_fit_ols_alph = sm.OLS(np.log10(psd_alph), fx_alph).fit()
example_results['ols_alph'] = ex_fit_ols_alph.params[1]

# Get the predicted values of the aperiodic fit
pred_vals = np.power(10, ex_fit_ols_alph.fittedvalues)

# Plot the aperiodic fit, from an OLS linear fit with an alpha exclusion zone
plot_psds([freqs, freqs_alph], [psd, pred_vals], log_freqs=True)

Robust Fit with RLM Excluding Generic Alpha Band¶

# Robust fit without alpha range
ex_fit_rlm_alph = sm.RLM(np.log10(psd_alph), fx_alph).fit()
example_results['rlm_alph'] = ex_fit_rlm_alph.params[1]

# Get the predicted values of the aperiodic fit
pred_vals = np.power(10, ex_fit_rlm_alph.fittedvalues)

# Plot the aperiodic fit, from the robust linear fit with an alpha exclusion zone
plot_psds([freqs, freqs_alph], [psd, pred_vals], log_freqs=True)

Robust Fit with RANSAC Excluding Generic Alpha Band¶

# Fit alpha-exclusion with RANSAC
ex_ransac_model_alph = RANSACRegressor()
ex_ransac_model_alph.fit(np.log10(freqs_alph), np.log10(psd_alph))
example_results['ran_alph'] = ex_ransac_model_alph.estimator_.coef_[0][0]

# Get the predicted values of the aperiodic fit
pred_vals = np.power(10, ex_ransac_model_alph.predict(np.log10(freqs_alph)))

# Plot the aperiodic fit, from the RANSAC fit with alpha exclusion zone
plot_psds([freqs, freqs_alph], [psd, pred_vals], log_freqs=True)

Spectral Fit Excluding Oscillations¶

Use a exclusion zones for oscillations.

# Fit spectral model for finding peaks
fm = FOOOF(peak_width_limits=[1, 8])
fm.fit(freqs, psd, [3, 40])

# Get oscillation definitions from FOOOF
cens = fm.gaussian_params_[:, 0]
bws = fm.gaussian_params_[:, 2]

# Define oscillation ranges, as a range around each center frequency
m = 2
osc_ranges = [[cen-m*bw, cen+m*bw] for cen, bw in zip(cens, bws)]

# Plot PSD with oscillatory regions shaded
plot_psds_shades(freqs, psd, osc_ranges)

# Exclude oscillation bands, as measured from FOOOF
psd_excl = psd
freqs_excl = np.squeeze(freqs)

for cen, bw in zip(cens, bws):
    freqs_excl, psd_excl = exclude_spectrum(freqs_excl, psd_excl, [cen-m*bw, cen+m*bw])

# Create a frequency vector with exlcusion zones for model fitting
fx_excl = sm.add_constant(np.log10(freqs_excl))

Simple Linear Fit Excluding Oscillatory Bands¶

# Linear fit without alpha range
ex_fit_ols_excl = sm.OLS(np.log10(psd_excl), fx_excl).fit()
example_results['ols_excl'] = ex_fit_ols_excl.params[1]

# Get the predicted values of the aperiodic fit
pred_vals = np.power(10, ex_fit_ols_excl.fittedvalues)

# Plot the aperiodic fit, from the OLS linear fit with oscillation exclusion zones
plot_psds([freqs, freqs_excl], [psd, pred_vals], log_freqs=True)

Robust Fit with RLM Excluding Oscillatory Bands¶

# Robust fit without oscillation bands
ex_fit_rlm_excl = sm.RLM(np.log10(psd_excl), fx_excl).fit()
example_results['rlm_excl'] = ex_fit_rlm_excl.params[1]

# Get the predicted values of the aperiodic fit
pred_vals = np.power(10, ex_fit_rlm_excl.fittedvalues)

# Plot the aperiodic fit, from the robust linear fit with oscillation exclusion zones
plot_psds([freqs, freqs_excl], [psd, pred_vals], log_freqs=True)

Robust Fit with RANSAC Excluding Oscillatory Bands¶

# Fit alpha-exclusion with RANSAC
ex_ransac_model_excl = RANSACRegressor()
ex_ransac_model_excl.fit(np.log10(freqs_excl), np.log10(psd_excl))
example_results['ran_excl'] = ex_ransac_model_excl.estimator_.coef_[0][0]

# Get the predicted values of the aperiodic fit
pred_vals = np.power(10, ex_ransac_model_excl.predict(np.log10(freqs_excl)))

# Plot the aperiodic fit, from the RANSAC fit with oscillation exclusion zones
plot_psds([freqs, freqs_excl], [psd, pred_vals], log_freqs=True)

Exponential Fits¶

Exponential Fit¶

# Fit exponential 1/f, with scipy curve_fit
ex_fit_exp, _ = curve_fit(expf, np.squeeze(freqs), np.squeeze(np.log10(psd)), p0=[1, 1])
example_results['exp'] = -ex_fit_exp[1]

# Get the predicted values of the aperiodic fit
pred_vals = np.power(10, expf(freqs, *ex_fit_exp))

# Plot the aperiodic fit, from the exponential fit
plot_psds(freqs.flatten(), [psd, pred_vals], log_freqs=True)

Exponential Fit Excluding Alpha Region¶

# Fit exponential 1/f, with scipy curve_fit, excluding alpha range
ex_fit_exp_alph, _ = curve_fit(expf, np.squeeze(freqs_alph), np.squeeze(np.log10(psd_alph)), p0=[1, 1])
example_results['exp_alph'] = -ex_fit_exp_alph[1]

# Get the predicted values of the aperiodic fit
pred_vals = np.power(10, expf(freqs_alph, *ex_fit_exp_alph))

# Plot the aperiodic fit, from the exponential fit, with an alpha exclusion zone
plot_psds([freqs, freqs_alph], [psd, pred_vals], log_freqs=True)

Exponential Fit Excluding Oscillation Regions¶

# Fit exponential 1/f, with scipy curve_fit, excluding oscillatory regions
ex_fit_exp_excl, _ = curve_fit(expf, np.squeeze(freqs_excl), np.squeeze(np.log10(psd_excl)), p0=[1, 1])
example_results['exp_excl'] = -ex_fit_exp_excl[1]

# Get the predicted values of the aperiodic fit
pred_vals = np.power(10, expf(freqs_excl, *ex_fit_exp_excl))

# Plot the aperiodic fit, from the exponential fit with oscillation exclusion zones
plot_psds([freqs, freqs_excl], [psd, pred_vals], log_freqs=True)

Spectral Parameterization¶

# Fit the SpecParam model
fm = FOOOF(verbose=False)
fm.report(freqs, psd, [3, 40])
example_results['fooof'] = -fm.aperiodic_params_[1]

==================================================================================================
                                                                                                  
                                   FOOOF - POWER SPECTRUM MODEL                                   
                                                                                                  
                        The model was run on the frequency range 3 - 40 Hz                        
                                 Frequency Resolution is 0.50 Hz                                  
                                                                                                  
                            Aperiodic Parameters (offset, exponent):                              
                                          0.5498, 0.6997                                          
                                                                                                  
                                       2 peaks were found:                                        
                                CF:  10.07, PW:  0.837, BW:  2.67                                 
                                CF:  18.31, PW:  0.259, BW:  3.10                                 
                                                                                                  
                                     Goodness of fit metrics:                                     
                                    R^2 of model fit is 0.9655                                    
                                    Error of the fit is 0.0510                                    
                                                                                                  
==================================================================================================

Compare Example Fits¶

# Print out results across all the different approaches
print('FIT RESULTS')
for key, val in example_results.items():
    print('  {:10s} \t {:1.3f}'.format(key, val))

FIT RESULTS
  ols        	 -0.933
  rlm        	 -0.813
  ran        	 -0.746
  ols_alph   	 -0.726
  rlm_alph   	 -0.723
  ran_alph   	 -1.020
  ols_excl   	 -0.710
  rlm_excl   	 -0.709
  ran_excl   	 -0.612
  exp        	 -0.933
  exp_alph   	 -0.726
  exp_excl   	 -0.710
  fooof      	 -0.700

# Check fits ordered by magnitude
for key, val  in dict(sorted(example_results.items(), key=lambda it: it[1])).items():
    print('{:10s} \t {:1.5f}'.format(key, val))

ran_alph   	 -1.02043
ols        	 -0.93306
exp        	 -0.93306
rlm        	 -0.81338
ran        	 -0.74555
exp_alph   	 -0.72646
ols_alph   	 -0.72646
rlm_alph   	 -0.72290
exp_excl   	 -0.70999
ols_excl   	 -0.70999
rlm_excl   	 -0.70878
fooof      	 -0.69973
ran_excl   	 -0.61167

Simulation Tests¶

Now that we have introduced the different methods, we will systematically evaluatate them across simulated data.

# Initialize and set up for simulated data testing
fits = SpectralFits()

# Check the fitting functions being used
print(fits.labels)

['OLS', 'OLS-EA', 'OLS-EO', 'RLM', 'RLM-EA', 'RLM-EO', 'RAN', 'RAN-EA', 'RAN-EO', 'EXP', 'EXP-EA', 'EXP-EO', 'SPECPARAM']

Check out PSD generation and test fitting¶

# Generate a single simulated test power spectrum
exp_val = 1
peak_gen = gen_peak_def()
peaks = next(peak_gen)

# Simulate an example power spectrum
freqs, psd = gen_power_spectrum(f_range, [off_val, exp_val], peaks, noise)

# Plot and check a simulated spectrum, plotted in semi-log and log space
plot_psds_two(freqs, psd, np.log10(freqs), np.log10(psd))

# Check example fits
print('Example fits for true exponent value of {:d}:'.format(exp_val))
for name, func in fits.fit_funcs.items():
    print('  {:5s} \t {:1.3f}'.format(name, func(freqs, psd)))

Example fits for true exponent value of 1:
  OLS   	 -1.138
  OLS-EA 	 -1.141
  OLS-EO 	 -1.039
  RLM   	 -1.089
  RLM-EA 	 -1.086
  RLM-EO 	 -1.002
  RAN   	 -1.089
  RAN-EA 	 -1.077
  RAN-EO 	 -1.039
  EXP   	 -1.138
  EXP-EA 	 -1.141
  EXP-EO 	 -1.039
  SPECPARAM 	 -1.054

Run Spectral Fits on Simulated Power Spectra¶

Run spectral fitting across multiple exponents and noise levels.

# Spectrum Settings
N_SPECTRA = 50
F_RANGE = [1, 50]

# Simulation Settings
NOISE_VALS = [0.0, 0.01, 0.05, 0.1, 0.15, 0.2, 0.25]
EXP_VALS = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]

# Set verbose setting
VERBOSE = False

# Run all fits
if RERUN:
    all_fits = SpectralFits()
    for exp in EXP_VALS:
        for noise in NOISE_VALS:

            if VERBOSE:
                print('Running sims for exponent val {}, noise-val {}'.format(exp, noise))

            # Simulate set of power spectra
            freqs, psds, sim_params = gen_group_power_spectra(\
                N_SPECTRA, F_RANGE, [off_val, exp], gen_peak_def(), noise, return_params=True)

            # Initialize SimFit object, and run fits
            fits = SpectralFits()
            fits.fit_spectra(exp, freqs, psds)

            # Collect results together
            all_fits = all_fits + fits

            # Save out fit data
            if SAVE_DATA:
                save_name = 'SpectralFits_exp' + str(exp) + '_N' + str(noise) + '.p'
                save_pickle(fits.errors, save_name, db.sims_path)
else:
    
    # Get list of simulated fit results files
    sim_files = db.check_files('sims')
    
    # Reload and combine all fit files
    all_fits = SpectralFits()

    for f_name in sim_files:
        temp = SpectralFits()
        temp.errors = load_pickle(f_name, db.sims_path)
        all_fits = all_fits + temp

# Check list of available fit files
#sim_files = db.check_files('sims')
#sim_files

# # Load specific spectral fits
# f_ind = 0
# fits = SpectralFits()
# fits.errors = load_pickle(sim_files[f_ind], db.sims_path)

Compare Spectral Fitting¶

# Check how many PSDs are included in simulated data model fits
print(len(all_fits))

# Check the average errors per method
print('Average error for each method:')
avg_errors = all_fits.compute_avg_errors(avg='mean')
print_results(avg_errors)

Average error for each method:
   RLM-EO   		 0.04816
   EXP-EO   		 0.05024
   OLS-EO   		 0.05024
   SPECPARAM 		 0.05396
   RLM-EA   		 0.06848
   RAN-EO   		 0.08225
   EXP-EA   		 0.08927
   OLS-EA   		 0.08927
   RAN-EA   		 0.09850
   RLM      		 0.10139
   RAN      		 0.11743
   OLS      		 0.13121
   EXP      		 0.13121

# Check the standard deviation of errors per method
print('Standard deviation of the errors for each method:')
std_errors = all_fits.compute_std_errors()
print_results(std_errors)

Standard deviation of the errors for each method:
   SPECPARAM 		 0.07266
   EXP-EO   		 0.07879
   OLS-EO   		 0.07879
   RLM-EO   		 0.07940
   RLM-EA   		 0.12418
   OLS-EA   		 0.13829
   EXP-EA   		 0.13829
   RAN-EO   		 0.15229
   RLM      		 0.16105
   OLS      		 0.17543
   EXP      		 0.17543
   RAN      		 0.21396
   RAN-EA   		 0.26758

# Check the number of fits that pass a threshold, per method
print('Percentage of fits below an error threshold:')
print_results(all_fits.compute_threshold(thresh=0.05))

Percentage of fits below an error threshold:
   RLM-EO   		 69.28571
   OLS-EO   		 68.09524
   EXP-EO   		 68.09524
   RLM-EA   		 64.52381
   SPECPARAM 		 64.42857
   RAN-EO   		 60.00000
   RAN-EA   		 59.04762
   OLS-EA   		 54.19048
   EXP-EA   		 54.19048
   RLM      		 53.85714
   RAN      		 49.14286
   OLS      		 43.57143
   EXP      		 43.57143

# Check the number of fits that pass a threshold, per method
print('Percentage of fits above an error threshold:')
print_results(all_fits.compute_threshold(thresh=0.5, direction='above'))

Percentage of fits above an error threshold:
   OLS      		 5.00000
   EXP      		 5.00000
   RAN      		 4.19048
   RLM      		 3.66667
   RAN-EA   		 3.09524
   OLS-EA   		 2.90476
   EXP-EA   		 2.90476
   RLM-EA   		 2.09524
   RAN-EO   		 1.90476
   RLM-EO   		 0.28571
   OLS-EO   		 0.23810
   EXP-EO   		 0.23810
   SPECPARAM 		 0.19048

Visualizations¶

# Define labels
mains = ['OLS', 'RLM', 'RAN', 'EXP', 'SPECPARAM']
alphas = ['OLS-EA', 'RLM-EA', 'RAN-EA', 'EXP-EA', 'SPECPARAM']
oscs = ['OLS-EO', 'RLM-EO', 'RAN-EO', 'EXP-EO', 'SPECPARAM']

# Split up methods by approach
main_dicts = {label : all_fits.errors[label] for label in mains}
alphas_dicts = {label : all_fits.errors[label] for label in alphas}
oscs_dicts = {label : all_fits.errors[label] for label in oscs}

Violin Plots¶

# Plot settings
ylim = [0, 0.75]

plot_violin_errors(main_dicts, ylim=ylim)

plot_violin_errors(alphas_dicts, ylim=ylim)

plot_violin_errors(oscs_dicts, ylim=ylim)

# Violin plot of error distributions
plot_violin_errors(all_fits.errors, ylim=[0, 1.5], figsize=[15, 4])

# Plot violin plots zoomed in
plot_violin_errors(all_fits.errors, ylim=[0, 0.2], figsize=[15, 4])
plt.gca().axhline(np.nanmedian(all_fits.errors['SPECPARAM']), color='red', alpha=0.5)

<matplotlib.lines.Line2D at 0x7fa1288ec640>

Statistically Compare Methods¶

# Create a dataframe of the simulation errors
errors_df = pd.DataFrame(all_fits.errors)

# Check the correlation structure between fit errors
plot_corr_matrix(errors_df.corr())

# Plot data distributions and inter-relations
pd.plotting.scatter_matrix(errors_df, figsize=[16, 12]);

# # Apply a normal test across fit-error distributions
# normalities = errors_df.apply(normaltest)

# # Check which results of normal test
# nt_df = pd.DataFrame([data[1] for data in normalities.values], index=normalities.index)
# nt_df.style.applymap(color_red_or_green)

# Run comparisons between methods
comps = all_fits.compare_errors()

# Print out color-coded dataframe of comparison results
comps_df = pd.DataFrame(comps, index=all_fits.errors.keys(), columns=all_fits.errors.keys())
comps_df.style.applymap(color_red_or_green)

	OLS	OLS-EA	OLS-EO	RLM	RLM-EA	RLM-EO	RAN	RAN-EA	RAN-EO	EXP	EXP-EA	EXP-EO	SPECPARAM
OLS	1.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000003	0.000000	0.000000	0.939383	0.000000	0.000000	0.000000
OLS-EA	0.000000	1.000000	0.000000	0.941894	0.000000	0.000000	0.000315	0.000591	0.000048	0.000000	0.939828	0.000000	0.000000
OLS-EO	0.000000	0.000000	1.000000	0.000000	0.205338	0.004267	0.000000	0.000000	0.000000	0.000000	0.000000	0.938208	0.070900
RLM	0.000000	0.941894	0.000000	1.000000	0.000000	0.000000	0.000673	0.002725	0.000293	0.000000	0.853769	0.000000	0.000000
RLM-EA	0.000000	0.000000	0.205338	0.000000	1.000000	0.000088	0.000000	0.000002	0.000050	0.000000	0.000000	0.250478	0.593834
RLM-EO	0.000000	0.000000	0.004267	0.000000	0.000088	1.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.002891	0.000004
RAN	0.000003	0.000315	0.000000	0.000673	0.000000	0.000000	1.000000	0.000000	0.000000	0.000003	0.000336	0.000000	0.000000
RAN-EA	0.000000	0.000591	0.000000	0.002725	0.000002	0.000000	0.000000	1.000000	0.460585	0.000000	0.000538	0.000000	0.000007
RAN-EO	0.000000	0.000048	0.000000	0.000293	0.000050	0.000000	0.000000	0.460585	1.000000	0.000000	0.000044	0.000000	0.000135
EXP	0.939383	0.000000	0.000000	0.000000	0.000000	0.000000	0.000003	0.000000	0.000000	1.000000	0.000000	0.000000	0.000000
EXP-EA	0.000000	0.939828	0.000000	0.853769	0.000000	0.000000	0.000336	0.000538	0.000044	0.000000	1.000000	0.000000	0.000000
EXP-EO	0.000000	0.000000	0.938208	0.000000	0.250478	0.002891	0.000000	0.000000	0.000000	0.000000	0.000000	1.000000	0.082048
SPECPARAM	0.000000	0.000000	0.070900	0.000000	0.593834	0.000004	0.000000	0.000007	0.000135	0.000000	0.000000	0.082048	0.999401

Conclusions¶

Overall, we can see the following patterns in these simulations:

Line fitting:

In general, the methods perform quite similarly in terms of average error, but have different variances of the errors
Standard robust fitting measures are not necessarily better for measuring aperiodic exponent
Overall, the spectral parameterization approach offers low average error and few outliers

Aperiodic Methods

Spectral Fitting

Contents