Dataset basics

# Author: Christian Brodbeck <christianbrodbeck@nyu.edu>
from eelbrain import *
import numpy as np

A dataset can be constructed column by column, by adding one variable after another:

# initialize an empty Dataset:
ds = Dataset()
# numeric values are added as Var objects:
ds['y'] = Var(np.random.normal(0, 1, 6))
# categorical data are represented as Factors:
ds['a'] = Factor(['a', 'b', 'c'], repeat=2)
# check the result:
print(ds)

Out:

y           a
-------------
-0.035148   a
-0.084978   a
0.26078     b
-0.76769    b
-1.4068     c
-0.68582    c
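
The order of the cells in column a above (a, a, b, b, c, c) follows from repeat=2: each label is repeated twice before the next one starts. As a rough illustration only, this is a plain-Python sketch of that ordering; it does not use Eelbrain and is not how Factor is implemented, and the names labels, repeat and cells are chosen here for the example:

```python
# Plain-Python sketch of the cell order seen in column 'a' (not Eelbrain code).
labels = ['a', 'b', 'c']
repeat = 2  # mirrors the repeat=2 argument used above

# Each label appears `repeat` times in a row before the next label starts.
cells = [label for label in labels for _ in range(repeat)]
print(cells)
```

The resulting list has six entries, which matches the six values in y, so both columns describe the same six cases.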

For larger datasets it can be more convenient to print only the first few cases (this dataset has only six cases, so all of them are shown) …

print(ds.head())

Out:

y           a
-------------
-0.035148   a
-0.084978   a
0.26078     b
-0.76769    b
-1.4068     c
-0.68582    c

… or a summary of variables:

print(ds.summary())

Out:

Key   Type     Values
-------------------------------------------------------------------------------
y     Var      -1.40682, -0.767689, -0.685822, -0.0849778, -0.0351483, 0.260784
a     Factor   a:2, b:2, c:2
-------------------------------------------------------------------------------
Dataset: 6 cases

An alternative way of constructing a dataset is case by case (i.e., row by row):

rows = []
for i in range(6):
    subject = f'S{i}'
    y = np.random.normal(0, 1)
    a = 'abc'[i % 3]
    rows.append([subject, y, a])
ds = Dataset.from_caselist(['subject', 'y', 'a'], rows, random='subject')
print(ds)

Out:

subject   y          a
----------------------
S0        1.3183     a
S1        -0.84995   b
S2        1.755      c
S3        -0.5812    a
S4        0.54568    b
S5        0.65884    c
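
To see how a case list relates to the column-wise view, the following plain-Python sketch transposes a list of rows into named columns. It is only an illustration under assumptions: it does not use Eelbrain, it is not a description of how Dataset.from_caselist works internally, and it uses random.gauss from the standard library in place of np.random.normal. The names names and columns are hypothetical.

```python
import random

# Column names, in the same order as the entries of each row.
names = ['subject', 'y', 'a']

# Build the data case by case, as in the example above.
rows = []
for i in range(6):
    rows.append([f'S{i}', random.gauss(0, 1), 'abc'[i % 3]])

# zip(*rows) transposes the rows into one tuple per column;
# pairing those tuples with the names gives a column-wise mapping.
columns = dict(zip(names, zip(*rows)))
print(columns['subject'])
print(columns['a'])
```

Each entry of columns holds one value per case, which is the same information a Dataset stores as a Var or Factor.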
