eelbrain.boosting

eelbrain.boosting(y, x, tstart, tstop, scale_data=True, delta=0.005, mindelta=None, error='l2', basis=0, basis_window='hamming', partitions=None, model=None, ds=None, selective_stopping=0, prefit=None, debug=False)

Estimate a linear filter with coordinate descent

Parameters:
y : NDVar

Signal to predict.

x : NDVar | sequence of NDVar

Signal to use to predict y. Can be a sequence of NDVars to include multiple predictors. The time dimension must correspond to y.

tstart : float

Start of the TRF in seconds.

tstop : float

Stop of the TRF in seconds.

scale_data : bool | 'inplace'

Scale y and x before boosting: subtract the mean and divide by the standard deviation (when error='l2') or the mean absolute value (when error='l1'). Use 'inplace' to save memory by scaling the original objects specified as y and x instead of making a copy.
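
As an illustration, the scaling corresponds to the following (a minimal numpy sketch with a hypothetical array y_data, assuming the mean absolute value is taken after centering; this is not the library's internal code):

>>> import numpy
>>> y_centered = y_data - y_data.mean()
>>> y_scaled = y_centered / y_centered.std()              # error='l2'
>>> y_scaled = y_centered / numpy.abs(y_centered).mean()  # error='l1'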

delta : scalar

Step for changes in the kernel.

mindelta : scalar

If the error for the training data can’t be reduced, divide delta in half until delta < mindelta. The default is mindelta = delta, i.e. delta is constant.
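
Schematically, the step-size adaptation works as follows (an illustrative sketch; can_reduce_training_error is a hypothetical stand-in for the algorithm's internal test):

>>> while not can_reduce_training_error(delta):
...     delta /= 2
...     if delta < mindelta:
...         break  # no further reduction possible; boosting stops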

error : 'l2' | 'l1'

Error function to use (default 'l2'):

  • error='l1': the sum of the absolute differences between y and h * x.
  • error='l2': the sum of the squared differences between y and h * x.

For vector y, the error is defined based on the distance in space for each data point.
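
Expressed with hypothetical numpy arrays y and y_pred (standing in for y and h * x), the two measures are:

>>> import numpy
>>> residuals = y - y_pred                   # y_pred standing in for h * x
>>> error_l1 = numpy.abs(residuals).sum()    # error='l1'
>>> error_l2 = (residuals ** 2).sum()        # error='l2'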

basis : float

Use a basis of windows of this length, in seconds, for the kernel (by default, impulses are used).

basis_window : str | float | tuple

Basis window (see scipy.signal.get_window() for options; default is 'hamming').
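
For example, to smooth the kernel with 50 ms Hamming windows instead of impulses (a usage sketch with hypothetical NDVars y and x):

>>> res = boosting(y, x, 0, 0.5, basis=0.050, basis_window='hamming')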

partitions : int

Divide the data into this many partitions for cross-validation-based early stopping. In each partition, n - 1 segments are used for training, and the remaining segment is used for validation. If the data is continuous, it is divided into contiguous segments of equal length (default 10). If the data has cases, the cases are divided with [::partitions] slices (default min(n_cases, 10); if model is specified, n_cases is the smallest number of cases in any cell of the model), as illustrated below.
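
The case-based division follows Python's extended slicing; for example, with 10 cases and partitions=3, the three partitions would be:

>>> cases = list(range(10))
>>> [cases[i::3] for i in range(3)]
[[0, 3, 6, 9], [1, 4, 7], [2, 5, 8]]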

model : Categorial

If the data has cases, divide the cases into different categories (the division for cross-validation is done separately for each cell).

ds : Dataset

If provided, other parameters can be specified as strings naming items in ds (see the example under Notes).

selective_stopping : int

By default, the boosting algorithm stops when the testing error stops decreasing. With nonzero selective_stopping, boosting instead continues but excludes the predictor (one time series in x) that caused the increase in testing error, until all predictors are stopped. The integer value of selective_stopping determines after how many steps with error increases each predictor is excluded.
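
For example, to exclude each predictor only after its error has increased in two steps (hypothetical y and x):

>>> res = boosting(y, x, 0, 0.5, selective_stopping=2)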

prefit : BoostingResult

Apply predictions from these estimated response functions before fitting the remaining predictors. This requires that x contain all the predictors that were used to fit prefit, and that they be identical to the x used originally.
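
A minimal sketch of two-stage fitting, assuming hypothetical predictors x1 and x2:

>>> res_1 = boosting(y, [x1], 0, 0.5, partitions=10)
>>> res_2 = boosting(y, [x1, x2], 0, 0.5, partitions=10, prefit=res_1)  # x1's prediction is applied before x2 is fit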

debug : bool

Store additional properties in the result object (increases memory consumption).

Returns:
result : BoostingResult

Results (see BoostingResult).

Notes

In order to predict data, use the convolve() function:

>>> from eelbrain import *
>>> ds = datasets.get_uts()
>>> ds['a1'] = epoch_impulse_predictor('uts', 'A=="a1"', ds=ds)
>>> ds['a0'] = epoch_impulse_predictor('uts', 'A=="a0"', ds=ds)
>>> res = boosting('uts', ['a0', 'a1'], 0, 0.5, partitions=10, model='A', ds=ds)
>>> y_pred = convolve(res.h_scaled, ['a0', 'a1'], ds=ds)
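
To gauge the fit, the prediction can be compared to the measured signal, for example by correlating the underlying data arrays (an illustrative check using numpy, not part of the boosting API):

>>> import numpy
>>> r = numpy.corrcoef(ds['uts'].x.ravel(), y_pred.x.ravel())[0, 1]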

The boosting algorithm is described in [1].

References

[1] David, S. V., Mesgarani, N., & Shamma, S. A. (2007). Estimating sparse spectro-temporal receptive fields with natural stimuli. Network: Computation in Neural Systems, 18(3), 191-212. https://doi.org/10.1080/09548980701609235