Align datasets
Shows how to combine information from two datasets describing the same cases, but not necessarily in the same order.
# Author: Christian Brodbeck <christianbrodbeck@nyu.edu>
import random
import string
from eelbrain import *
# Generate a dataset with known sequence
ds = Dataset()
ds['ascii'] = Factor(string.ascii_lowercase)
# Add an index variable to the dataset to later identify the cases
ds.index()
# Generate two shuffled subsets of the dataset, with 15 and 16 cases (and
# print the first rows to confirm that they are shuffled)
ds1 = ds[random.sample(range(ds.n_cases), 15)]
print(ds1.head())
ds2 = ds[random.sample(range(ds.n_cases), 16)]
print(ds2.head())
Out:
ascii index
-------------
k 10
f 5
g 6
l 11
i 8
e 4
u 20
j 9
a 0
r 17
ascii index
-------------
n 13
r 17
z 25
x 23
f 5
p 15
w 22
o 14
a 0
m 12
Align the datasets
Use the "index" variable added above to identify cases and align the two datasets. Only cases that occur in both datasets are retained, and they follow the order of ds1.
ds1_aligned, ds2_aligned = align(ds1, ds2, 'index')
# show the ascii sequences for the two datasets next to each other to
# demonstrate that they are aligned
ds1_aligned['ascii_ds2'] = ds2_aligned['ascii']
print(ds1_aligned)
Out:
ascii index ascii_ds2
-------------------------
f 5 f
g 6 g
i 8 i
a 0 a
r 17 r
p 15 p
d 3 d
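The matching logic behind this result can be sketched in plain Python. This is only an illustration of the idea, not Eelbrain's implementation: the helper `align_by_index` and the toy rows are hypothetical, and the sketch assumes that index values are unique within each list.

```python
def align_by_index(rows1, rows2, key="index"):
    """Return the rows of both lists that share a key value, in matching order.

    The output follows the order of rows1; rows whose key value occurs in
    only one of the lists are dropped. Assumes keys are unique per list.
    """
    # Map each key value in rows2 to its row for fast lookup
    lookup = {row[key]: row for row in rows2}
    # Keep the rows of rows1 that have a partner in rows2
    aligned1 = [row for row in rows1 if row[key] in lookup]
    # Fetch the partner rows in the same order
    aligned2 = [lookup[row[key]] for row in aligned1]
    return aligned1, aligned2


# Two toy "datasets" describing overlapping cases in different orders
rows1 = [{"ascii": c, "index": i} for c, i in [("k", 10), ("f", 5), ("a", 0)]]
rows2 = [{"ascii": c, "index": i} for c, i in [("a", 0), ("z", 25), ("f", 5)]]

aligned1, aligned2 = align_by_index(rows1, rows2)
for r1, r2 in zip(aligned1, aligned2):
    print(r1["ascii"], r1["index"], r2["ascii"])
```

As in the aligned datasets above, cases without a partner (here `k` and `z`) are dropped, and the remaining cases appear in the order of the first input.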