Simulation Comparisons

This notebook introduces how this prokect compares the different methods using simulations.

Overview

A key question for interpreting prior work is to consider how the different employed methods relate to each other. To investigate this, this project uses simulated data and applies different methods to compare their results.

In this section, the main question is the evaluate the relationship between different methods across parameters variations, to evaluate which methods are highly correlated (suggesting they reflect the same properties in the data), and which appear to be more independent (suggesting they reflect different properties in the data).

Note that due to large number of possible comparisons across different methods and different simulated parameters, in this section we necessarily restrict the comparison to a selected subset of methods, that are compared pairwise.

# Setup notebook state
from nbutils import setup_notebook; setup_notebook()
import numpy as np
from fooof.plts import plot_spectra
from neurodsp.spectral import compute_spectrum
from neurodsp.spectral.utils import trim_spectrum
from neurodsp.sim.multi import sig_sampler
from neurodsp.sim import sim_powerlaw, sim_combined
from neurodsp.sim.update import ParamSampler, create_updater, create_sampler
from neurodsp.utils import create_times, set_random_seed
from neurodsp.plts.combined import plot_timeseries_and_spectra
# Import custom project code
from apm.io import APMDB
from apm.run import run_comparisons
from apm.analysis import compute_all_corrs
from apm.plts import plot_dots
from apm.plts.utils import figsaver
from apm.plts.settings import COLORS
from apm.plts.style import custom_psd_style_no_ticks
from apm.sim.defs import SIM_SAMPLERS

Settings

# Settings for saving figures
SAVE_FIG = True
FIGPATH = APMDB().figs_path / '14_sim_comps'

# Create helper function to manage figsaver settings
fsaver = figsaver(SAVE_FIG, FIGPATH)
# Set random seed
set_random_seed(111)

Simulations

In this section, we will use simulated data to compare methods.

# Create a times vector for time series
times = create_times(**SIM_SAMPLERS.base)
# Define general settings for creating simulations
N_SIMS = 3
RETURN_PARAMS = True

Sample Simulations

# Define ranges to sample parameters from
samplers_ap = {
    create_updater('exponent') : create_sampler(np.round(np.arange(-2.5, 0.1, 0.1), 1)),
}
# Define a parameter sampler
#sampler = param_sampler(SIM_SAMPLERS.params['ap'], samplers_ap, N_SIMS)
sampler = ParamSampler(SIM_SAMPLERS.params['ap'], samplers_ap, N_SIMS)
# Check example sampled parameters
for params in sampler:
    print(params)
{'n_seconds': 30, 'fs': 250, 'exponent': -0.5, 'f_range': (0.5, None)}
{'n_seconds': 30, 'fs': 250, 'exponent': -1.3, 'f_range': (0.5, None)}
{'n_seconds': 30, 'fs': 250, 'exponent': -0.5, 'f_range': (0.5, None)}

Example signals - aperiodic

# Initialize signal generator for aperiodic signals
sig_gen_ap = sig_sampler(sim_powerlaw, sampler, RETURN_PARAMS)
# Plot example generated signals
for cur_sig_ap, cur_params_ap in sig_gen_ap:
    plot_timeseries_and_spectra(cur_sig_ap, SIM_SAMPLERS.fs, ts_range=[0, 10],
                                title=str('Exponent: {:1.2f}'.format(cur_params_ap['exponent'])))
../_images/14-SimulationComparisons_18_0.png ../_images/14-SimulationComparisons_18_1.png ../_images/14-SimulationComparisons_18_2.png

SimSamplers

# Check pre-defined samplers available in imported object
SIM_SAMPLERS.labels
['exp_sampler',
 'tscale_sampler',
 'knee_sampler',
 'comb_sampler',
 'peak_sampler']
# Reset number of samples
SIM_SAMPLERS.n_samples = 3

Example signals - combined

# Initialize signal generator for aperiodic signals
sig_gen_comb = sig_sampler(sim_combined, SIM_SAMPLERS['comb_sampler'], RETURN_PARAMS)
# Plot example generated signals
for cur_sig_comb, cur_params_comb in sig_gen_comb:
    plot_timeseries_and_spectra(cur_sig_comb, SIM_SAMPLERS.fs, ts_range=[0, 10],
                                title='Exp : {:1.2f},  CF: {:2.0f},  PW: {:1.1f}'.format(\
                                    cur_params_comb['components']['sim_powerlaw']['exponent'], 
                                    cur_params_comb['components']['sim_oscillation']['freq'], 
                                    cur_params_comb['component_variances'][1]))
../_images/14-SimulationComparisons_24_0.png ../_images/14-SimulationComparisons_24_1.png ../_images/14-SimulationComparisons_24_2.png

Plot Example Power Spectra

Here we compute and plot power spectra of example sampled simulations.

# Reset random seed
set_random_seed(88)
# Reinitialize signal generator for aperiodic signals
sig_gen_comb = sig_sampler(sim_combined, SIM_SAMPLERS['comb_sampler'], RETURN_PARAMS, N_SIMS)
# Compute and collect power spectra of some sampled simulations
psds = []
for tsig, _ in sig_gen_comb:
    freqs, psd = trim_spectrum(*compute_spectrum(tsig, fs=SIM_SAMPLERS.fs, nperseg=500), [3, 50])
    psds.append(psd)
# Plot the collection of computed power spectra of the simulations
plot_spectra(freqs, psds, log_freqs=True, log_powers=True,
             colors='black', alpha=0.5, figsize=(8.5, 6.5),
             custom_styler=custom_psd_style_no_ticks, **fsaver('psds'))
../_images/14-SimulationComparisons_29_0.png

Code Approach: run_comparisons

Here, we will briefly introduce the general strategy and code used to run the method comparisons on simulations.

The overarching function used to run simulation comparisons is the run_comparisons function.

This approach allows for:

  • defining a procedure to simulate time series

  • defining a set of measures to apply to the simulated time series

  • applying the set of measures across simulated instances, sampling from parameter ranges

# Check the documentation for `run_comparisons`
print(run_comparisons.__doc__)
Compute multiple measures of interest across the same set of simulations.

    Parameters
    ----------
    sim_func : callable
        A function to create simulated time series.
    sim_params : iterable or list of dict
        Simulation parameters for `sim_func`.
    measures : dict
        Functions to apply to the simulated data.
        The keys should be functions to apply to the data.
        The values should be a dictionary of parameters to use for the method.
    n_sims : int, optional
        The number of simulations to run.
    return_params : bool, default: False
        Whether to collect and return the parameters of all the generated simulations.
    verbose : bool, optional, default: False
        Whether to print out simulation parameters.
        Used for checking simulations / debugging.
    warnings_action : {'ignore', 'error', 'always', 'default', 'module, 'once'}
        Filter action for warnings.

    Returns
    -------
    results : dict
        Computed results for each measure across the set of simulated data.
    all_sim_params : pd.DataFrame
        Collected simulation parameters across all the simulations.
        Only returned if `return_params` is True.
    

Next, we can run an example of using run_comparisons.

To do so, we will define an example analysis to apply some measures of interest (here, computing the mean and the variance) across samples of simulations of powerlaw data.

# Define the measures to apply to the simulated signals
measures = {np.var : {}, np.mean : {}}

Run simulations across variations in exponent

# Run comparisons across samples of aperiodic noise
results1, all_sim_params1 = run_comparisons(\
    sim_powerlaw, SIM_SAMPLERS['comb_sampler'], measures, return_params=True, verbose=True) 
{'n_seconds': 30, 'fs': 250, 'components': {'sim_powerlaw': {'exponent': 0.0, 'f_range': (0.5, None)}, 'sim_oscillation': {'freq': 17}}, 'component_variances': [1, 0.2]}
{'n_seconds': 30, 'fs': 250, 'components': {'sim_powerlaw': {'exponent': -0.4, 'f_range': (0.5, None)}, 'sim_oscillation': {'freq': 35}}, 'component_variances': [1, 1.0]}
{'n_seconds': 30, 'fs': 250, 'components': {'sim_powerlaw': {'exponent': 0.0, 'f_range': (0.5, None)}, 'sim_oscillation': {'freq': 13}}, 'component_variances': [1, 0.3]}
# Check output simulation parameters
all_sim_params1
n_seconds fs exponent f_range freq var_ap var_pe has_osc
0 30 250 0.0 (0.5, None) 17 1 0.2 True
1 30 250 -0.4 (0.5, None) 35 1 1.0 True
2 30 250 0.0 (0.5, None) 13 1 0.3 True
# Check output values of computed measures
results1
{'var': array([1., 1., 1.]),
 'mean': array([-1.5158245e-17, -1.5158245e-17,  0.0000000e+00])}
# Compute the correlations between output measures
compute_all_corrs(results1)
/Users/tom/opt/anaconda3/envs/apm/lib/python3.9/site-packages/scipy/stats/_stats_py.py:5445: ConstantInputWarning: An input array is constant; the correlation coefficient is not defined.
  warnings.warn(stats.ConstantInputWarning(warn_msg))
{'var': {'mean': (-1.0, 0.0, (nan, nan))},
 'mean': {'var': (-1.0, 0.0, (nan, nan))}}

Run simulations across variations in multiple parameters

# Set sampler to run indefinitely
SIM_SAMPLERS.n_samples = None
# Run comparisons across samples of aperiodic noise
results2, all_sim_params2 = run_comparisons(\
    sim_combined, SIM_SAMPLERS['comb_sampler'], measures, n_sims=25, return_params=True) 
# Check output simulation parameters
all_sim_params2.head(5)
n_seconds fs exponent f_range freq var_ap var_pe has_osc
0 30 250 -0.3 (0.5, None) 5 1 0.9 True
1 30 250 -1.2 (0.5, None) 34 1 0.8 True
2 30 250 -1.2 (0.5, None) 13 1 0.3 True
3 30 250 -1.8 (0.5, None) 6 1 0.0 False
4 30 250 -0.5 (0.5, None) 17 1 0.0 False
# Check output values of computed measures
results2
{'var': array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
        1., 1., 1., 1., 1., 1., 1., 1.]),
 'mean': array([-1.18423789e-18,  1.89478063e-17, -6.63173220e-18, -2.36847579e-18,
         1.87109587e-17,  1.55135164e-17, -1.37371596e-17,  1.24344979e-17,
         1.89478063e-17, -1.04212935e-17, -2.94875235e-17,  1.89478063e-18,
         2.32110627e-17,  1.04212935e-17, -6.15803704e-18, -1.04212935e-17,
         1.70530257e-17, -2.51058433e-17,  3.41060513e-17, -5.56591810e-18,
        -1.13686838e-17, -1.13686838e-17, -9.47390314e-18,  1.13686838e-17,
        -1.20792265e-17])}
# Compute the correlations between output measures
compute_all_corrs(results2)
{'var': {'mean': (-0.20234118532764883,
   0.33204969861238676,
   (-0.5257729651646598, 0.15932259740004787))},
 'mean': {'var': (-0.20234118532764883,
   0.33204969861238676,
   (-0.5257729651646598, 0.15932259740004787))}}

Evaluating Results

After computing the measures, we can examine the results, comparing between different measurements.

# Collect colors for each value based on presence of an oscillation
colors = [COLORS['COMB'] if osc else COLORS['AP'] for osc in all_sim_params2.has_osc]
# Plot the computed measures against each other
plot_dots(results2['var'], results2['mean'], c=colors, alpha=0.8)
../_images/14-SimulationComparisons_47_0.png