Align datasets

Shows how to combine information from two datasets describing the same cases, but not necessarily in the same order.

# Author: Christian Brodbeck <christianbrodbeck@nyu.edu>
import random
import string

from eelbrain import *


# Generate a dataset with known sequence
ds = Dataset()
ds['ascii'] = Factor(string.ascii_lowercase)
# Add an index variable to the dataset to later identify the cases
ds.index()

# Generate two randomly ordered subsets of the dataset (and show the first
# rows of each to confirm that they are shuffled)
ds1 = ds[random.sample(range(ds.n_cases), 15)]
ds1.head()
#   ascii   index
0   g       6
1   k       10
2   e       4
3   b       1
4   f       5
5   y       24
6   z       25
7   n       13
8   h       7
9   w       22

# Second shuffled subset, drawn the same way as ds1 (the sample size of 15
# is an assumption; it is not shown in this copy of the example)
ds2 = ds[random.sample(range(ds.n_cases), 15)]
ds2.head()


#   ascii   index
0   n       13
1   s       18
2   z       25
3   p       15
4   b       1
5   e       4
6   g       6
7   o       14
8   k       10
9   d       3
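
The two tables above overlap only partly because `random.sample` draws without replacement: each subset holds unique cases in random order, and two independent draws share some cases but not others. The following stdlib-only sketch illustrates this without eelbrain. The seed, the names `idx1`, `idx2` and `shared`, and the choice of 15 cases per draw are assumptions made for illustration.

```python
import random
import string

random.seed(0)  # hypothetical seed, only to make the sketch repeatable

letters = string.ascii_lowercase

# Two independent draws of 15 distinct positions out of 26
idx1 = random.sample(range(len(letters)), 15)
idx2 = random.sample(range(len(letters)), 15)

# Cases present in both draws; only these can be matched up later
shared = sorted(set(idx1) & set(idx2))
print([letters[i] for i in shared])
```

With 15 cases drawn twice from 26, at least 15 + 15 - 26 = 4 cases must appear in both subsets, so there is always something to align.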


Align the datasets

Use the "index" variable added above to identify cases and align the two datasets. The aligned copies contain only the cases that are present in both datasets, in matching order.

ds1_aligned, ds2_aligned = align(ds1, ds2, 'index')

# show the ascii sequences for the two datasets next to each other to
# demonstrate that they are aligned
ds1_aligned['ascii_ds2'] = ds2_aligned['ascii']
ds1_aligned
#    ascii   index   ascii_ds2
0    g       6       g
1    k       10      k
2    e       4       e
3    b       1       b
4    y       24      y
5    z       25      z
6    n       13      n
7    h       7       h
8    u       20      u
9    l       11      l
10   x       23      x
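
What aligning on an index variable amounts to can be sketched in plain Python. This is not eelbrain's implementation: `align_by_index` and the row dictionaries are hypothetical stand-ins, and the sketch assumes that index values are unique within each dataset.

```python
def align_by_index(rows1, rows2, key="index"):
    """Return copies of rows1 and rows2 restricted to shared keys, in rows1 order."""
    # Map each index value in the second dataset to its row
    lookup = {row[key]: row for row in rows2}
    # Keep only rows of the first dataset that also occur in the second
    aligned1 = [row for row in rows1 if row[key] in lookup]
    # Reorder the second dataset to follow the first
    aligned2 = [lookup[row[key]] for row in aligned1]
    return aligned1, aligned2


rows1 = [
    {"ascii": "g", "index": 6},
    {"ascii": "f", "index": 5},   # only in rows1, dropped by the alignment
    {"ascii": "b", "index": 1},
]
rows2 = [
    {"ascii": "b", "index": 1},
    {"ascii": "s", "index": 18},  # only in rows2, dropped by the alignment
    {"ascii": "g", "index": 6},
]

a1, a2 = align_by_index(rows1, rows2)
print([row["ascii"] for row in a1])
print([row["ascii"] for row in a2])
```

As in the table above, unmatched cases disappear from both results and the remaining rows follow the order of the first dataset.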

