eelbrain.boosting(y, x, tstart, tstop, scale_data=True, delta=0.005, mindelta=None, error='l2', basis=0, basis_window='hamming', partitions=None, model=None, ds=None, selective_stopping=0, prefit=None, debug=False)

Estimate a linear filter with coordinate descent

y : NDVar

Signal to predict.

x : NDVar | sequence of NDVar

Signal to use to predict y. Can be sequence of NDVars to include multiple predictors. Time dimension must correspond to y.

tstart : float

Start of the TRF in seconds.

tstop : float

Stop of the TRF in seconds.

scale_data : bool | 'inplace'

Scale y and x before boosting: subtract the mean and divide by the standard deviation (when error='l2') or the mean absolute value (when error='l1'). Use 'inplace' to save memory by scaling the original objects specified as y and x instead of making a copy.
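As a sketch of what this scaling amounts to, here is a hypothetical `scale` helper in plain NumPy (not part of the eelbrain API):

```python
import numpy as np

def scale(a, error='l2'):
    # Hypothetical helper illustrating the scaling described above.
    a = a - a.mean()                  # subtract the mean
    if error == 'l2':
        return a / a.std()            # divide by the standard deviation
    return a / np.abs(a).mean()       # error='l1': divide by the mean absolute value

scaled = scale(np.array([1.0, 2.0, 3.0, 4.0]))
```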

delta : scalar

Step for changes in the kernel.

mindelta : scalar

If the error for the training data can’t be reduced, divide delta in half until delta < mindelta. The default is mindelta = delta, i.e. delta is constant.
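A minimal sketch of this halving rule, where `try_step` is a stand-in for one boosting step and the numbers are made up:

```python
delta = 0.005
mindelta = 0.005 / 16  # made-up threshold for illustration

def try_step(delta):
    # Stand-in for one boosting update; pretend no step of this
    # size reduces the training error anymore.
    return False

while not try_step(delta):
    delta /= 2
    if delta < mindelta:
        break  # no admissible step size left: training stops
```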

error : 'l2' | 'l1'

Error function to use (default is 'l2').

  • error='l1': the sum of the absolute differences between y and h * x.
  • error='l2': the sum of the squared differences between y and h * x.

For vector y, the error is defined based on the distance in space for each data point.
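For a concrete picture, both error functions on a small residual in plain NumPy, with h * x replaced by a made-up prediction y_pred:

```python
import numpy as np

y = np.array([1.0, 0.5, -0.2])       # signal to predict
y_pred = np.array([0.8, 0.7, 0.0])   # stand-in for h * x
resid = y - y_pred

l1 = np.abs(resid).sum()   # error='l1': sum of absolute differences
l2 = (resid ** 2).sum()    # error='l2': sum of squared differences
```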

basis : float

Use a basis of windows of this length (in seconds) for the kernel (by default, impulses are used).

basis_window : str | float | tuple

Basis window (see scipy.signal.get_window() for options; default is 'hamming').
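To illustrate the effect of a basis, here is a sketch that smooths an impulse kernel with a Hamming window obtained from scipy.signal.get_window (the sampling rate and lengths are made up):

```python
import numpy as np
from scipy.signal import get_window

sfreq = 100                     # assumed sampling rate in Hz
basis = 0.05                    # 50 ms basis length
n = int(round(basis * sfreq))
window = get_window('hamming', n)
window /= window.sum()          # normalize so smoothing preserves the kernel's sum

kernel = np.zeros(20)
kernel[5] = 1.0                 # a single impulse
smooth = np.convolve(kernel, window, mode='same')
```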

partitions : int

Divide the data into this many partitions for cross-validation-based early stopping. In each partition, n - 1 segments are used for training and the remaining segment for validation. If the data is continuous, it is divided into contiguous segments of equal length (default 10). If the data has cases, the cases are divided with [::partitions] slices (default min(n_cases, 10); if model is specified, n_cases is the smallest number of cases in any cell of the model).
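A sketch of the case-based division (with made-up numbers; eelbrain does this internally):

```python
n_cases, partitions = 25, 10   # made-up numbers
cases = list(range(n_cases))

# each validation set is a [i::partitions] slice; the rest is training data
folds = []
for i in range(partitions):
    validation = cases[i::partitions]
    training = [c for c in cases if c not in validation]
    folds.append((training, validation))
```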

model : Categorial

If the data has cases, divide the cases into separate categories (the division for cross-validation is done separately within each cell).

ds : Dataset

If provided, other parameters can be specified as string for items in ds.

selective_stopping : int

By default, the boosting algorithm stops when the testing error stops decreasing. With selective_stopping=1 (or higher), boosting instead excludes the predictor (one time series in x) that caused the increase in testing error and continues until all predictors have been stopped. The integer value of selective_stopping determines after how many steps with increasing error each predictor is excluded.
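A sketch of the bookkeeping this implies (simplified; the real algorithm tracks the testing error per predictor, and the predictor names here are made up):

```python
selective_stopping = 2          # exclude a predictor after 2 error increases
increases = {'a0': 0, 'a1': 0}  # made-up predictor names
active = set(increases)

def on_error_increase(name):
    # Called whenever a step for `name` increases the testing error.
    increases[name] += 1
    if increases[name] >= selective_stopping:
        active.discard(name)    # stop updating this predictor

on_error_increase('a0')
on_error_increase('a0')         # second increase: 'a0' is excluded
```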

prefit : BoostingResult

Apply predictions from these previously estimated response functions before fitting the remaining predictors. This requires that x contain all the predictors that were used to fit prefit, and that they are identical to the x originally used.

debug : bool

Store additional properties in the result object (increases memory consumption).

result : BoostingResult

Results (see BoostingResult).


In order to predict data, use the convolve() function:

>>> ds = datasets.get_uts()
>>> ds['a1'] = epoch_impulse_predictor('uts', 'A=="a1"', ds=ds)
>>> ds['a0'] = epoch_impulse_predictor('uts', 'A=="a0"', ds=ds)
>>> res = boosting('uts', ['a0', 'a1'], 0, 0.5, partitions=10, model='A', ds=ds)
>>> y_pred = convolve(res.h_scaled, ['a0', 'a1'], ds=ds)

The boosting algorithm is described in [1].


[1] David, S. V., Mesgarani, N., & Shamma, S. A. (2007). Estimating sparse spectro-temporal receptive fields with natural stimuli. Network: Computation in Neural Systems, 18(3), 191-212. doi:10.1080/09548980701609235