This module provides Polygram, a container for biological time series such as Signal and Annotation. In this respect, it is inspired by pandas TimeSeries and DataFrame. You can think of it as a dataframe where each column is a signal or an annotation, and each row is a time point.
The originality of pyrem.polygram.Polygram is its ability to handle sampling rates that differ between signals. It contains time series with approximately the same duration but different numbers of points. This is typical of physiological time series, because different variables are recorded at different sampling rates (see, for instance, the [EDF] data format). Another situation in which this is useful is the wavelet decomposition of a signal: one obtains a set of time series (coefficients) of the same duration but with different sampling rates (i.e. \(fs_{D_N} = 2fs_{D_{N+1}}\)).
Systematically resampling signals and annotations to the highest sampling rate is not trivial and would significantly reduce computational efficiency.
[EDF] B. Kemp and J. Olivan, “European data format ‘plus’ (EDF+), an EDF alike standard format for the exchange of physiological data,” Clinical Neurophysiology, vol. 114, no. 9, pp. 1755-1761, Sep. 2003.
First, let us create a couple of BiologicalTimeSeries:
>>> import numpy as np
>>> from pyrem.time_series import Annotation, Signal
>>> from pyrem.polygram import Polygram
>>>
>>> # create an Annotation with 1000 random values, sampled at 1.0Hz
>>> probs = np.random.random_sample(1000)
>>> vals = (np.random.random_sample(1000) * 4 + 1).astype(int)
>>> annot = Annotation(vals, fs=1.0, observation_probabilities=probs, type="vigilance_state", name="state")
>>>
>>> # now a random walk signal of 100000 points at 100.0Hz
>>> rw = np.cumsum(np.random.normal(0,1,100000))
>>> sig = Signal(rw, fs=100.0, type="eeg", name="eeg1")
>>>
>>> # Once we have our time series, we can just do:
>>> pol = Polygram([annot, sig])
>>> # printing the object shows the characteristics of each channel
>>> pol
Polygram
-----
Duration: 0:16:40 (HH:mm:ss)
N signals: 1
N annotations: 1
Metadata:
None
----
Channel information:
Name Type fs(Hz) Duration
0 eeg1 eeg 100.0 0:16:40
1 state vigilance_state 1.0 0:16:40
Note
Slightly different durations are allowed
The constructor will raise an error if the provided channels do not have the same duration:
>>> Polygram([annot[:"11m"], sig[:"10m"]])
ValueError
'Channels must have approximately the same length.
The durations of the input channels are:['0:10:00', '0:11:00']'
However, in practice, it is almost impossible to obtain discrete signals of exactly the same duration. Imagine, for instance, that you have a first signal of 14 points at 3Hz (~ 4.667s), and a second signal of 5 points at 1Hz (5.0s). In this case, it is impossible to have exactly 14/3s of signal from a 1Hz signal. This could be represented by:
>>> 0123456789abcd- # 3Hz => one symbol/point
>>> AAABBBCCCDDDEEE # 1Hz => one LETTER/point
>>> AAABBBCCCDDD--- # 1Hz => one LETTER/point
Here, neither the second nor the third signal exactly matches the duration of the first, but both have approximately the same duration as the first.
A Polygram will tolerate this sort of mismatch if and only if the durations of every pair of channels are within one period of the channel with the longest period (that is, the lowest sampling rate).
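The tolerance rule above can be illustrated with a small, self-contained sketch. It is not the library's implementation: the helper name `durations_compatible` is hypothetical, and treating "within one period" as an inclusive bound is an assumption, since the text does not say whether the boundary case is accepted.

```python
from fractions import Fraction
from itertools import combinations


def durations_compatible(channels):
    """Check the pairwise duration rule for a list of (n_points, fs_hz) pairs.

    Hypothetical helper, not part of pyrem. Exact fractions are used so that
    durations such as 14/3 s are compared without rounding error.
    """
    durations = [Fraction(n) / Fraction(fs) for n, fs in channels]
    # The longest period belongs to the channel with the lowest sampling rate.
    longest_period = max(1 / Fraction(fs) for _, fs in channels)
    # Assumption: a difference of exactly one period is still tolerated.
    return all(abs(a - b) <= longest_period
               for a, b in combinations(durations, 2))


# 14 points at 3 Hz against 5 points at 1 Hz (4.667 s vs 5.0 s)
ok_longer = durations_compatible([(14, 3), (5, 1)])
# 14 points at 3 Hz against 4 points at 1 Hz (4.667 s vs 4.0 s)
ok_shorter = durations_compatible([(14, 3), (4, 1)])
# 11 min at 1 Hz against 10 min at 100 Hz, as in the ValueError example
rejected = durations_compatible([(660, 1), (60000, 100)])
```

Both 1Hz candidates from the diagram pass the check, while the 11 minute and 10 minute channels do not, because they differ by far more than the 1s period of the slowest channel.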
Often, you will want to extract a channel by name:
>>> pol.channel_names
['eeg1', 'state']
>>> pol['eeg1']
Signal
----
Name: eeg1
Duration: 0:16:40 (HH:mm:ss)
Sampling freq: 100.000000 Hz
Type: eeg
N points: 100000
Metadata:
None
>>> # this is equivalent to
>>> pol[0]
You can also iterate through channels:
>>> [c.size for c in pol.channels]
[100000, 1000]
Because the time series may have different sampling rates, it makes no sense to index a polygram by a range of integers:
>>> # does NOT work
>>> # pol[10:20]
Instead, time strings and datetime.timedelta objects can be used to extract a sub-polygram:
>>> pol["1m":"2m"]
Indexing rules are similar to those of time_series.
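To see why a time interval is the natural index, the sketch below converts one interval into sample indices for two sampling rates. It only illustrates the arithmetic. `time_slice_to_indices` is a hypothetical helper, and rounding to the nearest sample is an assumption rather than documented pyrem behaviour.

```python
from datetime import timedelta


def time_slice_to_indices(start, stop, fs):
    """Map a [start, stop) time interval to (first, last_exclusive) indices.

    Hypothetical helper, not part of pyrem; `start` and `stop` are
    datetime.timedelta objects and `fs` is the sampling rate in Hz.
    """
    first = round(start.total_seconds() * fs)
    last = round(stop.total_seconds() * fs)
    return first, last


start, stop = timedelta(minutes=1), timedelta(minutes=2)
eeg_range = time_slice_to_indices(start, stop, fs=100.0)   # (6000, 12000)
state_range = time_slice_to_indices(start, stop, fs=1.0)   # (60, 120)
```

The same one-minute epoch covers 6000 points of the 100Hz signal but only 60 points of the 1Hz annotation, so no single integer range could address both channels.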
Note
Indexing does NOT deep copy
When getting an epoch (a temporal slice) of a polygram, the channels in the new polygram are views on the underlying data of the original channels. As with numpy arrays, modifying the data in a sub-polygram will modify the parent polygram. To avoid this behaviour, one can call copy().
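The numpy behaviour referred to here can be reproduced with plain arrays. This sketch uses only numpy, not pyrem, and the variable names are illustrative.

```python
import numpy as np

parent = np.arange(10, dtype=float)

# Basic slicing returns a view: no data is copied.
epoch = parent[2:5]
epoch[0] = -1.0            # also changes parent[2]

# An explicit copy owns its own data.
independent = parent[2:5].copy()
independent[1] = 99.0      # parent[3] is left untouched

view_shares = np.shares_memory(parent, epoch)
copy_shares = np.shares_memory(parent, independent)
```

After running this, `parent[2]` holds -1.0 while `parent[3]` still holds 3.0, which is the distinction between slicing a polygram and calling copy() on the result.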
If you want to extract features for each epoch and each channel, you may want to use the iter_window() iterator. It works like the iter_window() method of individual time series.
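As a rough picture of epoch-wise feature extraction, the following sketch walks over consecutive, non-overlapping epochs of two channels and computes one mean per channel and per epoch. It does not call iter_window() and makes no claim about its signature. `iter_epochs` and the toy channel dictionary are hypothetical.

```python
from statistics import mean


def iter_epochs(channels, length_s):
    """Yield (start_s, {name: samples}) for consecutive, non-overlapping epochs.

    Hypothetical helper, not part of pyrem. `channels` maps a channel name
    to a (samples, fs_hz) pair.
    """
    # Only full epochs are produced, so stop at the shortest channel.
    shortest = min(len(samples) / fs for samples, fs in channels.values())
    n_epochs = int(shortest // length_s)
    for k in range(n_epochs):
        start = k * length_s
        yield start, {
            name: samples[round(start * fs):round((start + length_s) * fs)]
            for name, (samples, fs) in channels.items()
        }


channels = {
    "eeg1": ([float(i) for i in range(40)], 4),     # 40 points at 4 Hz
    "state": ([1, 1, 2, 2, 3, 3, 4, 4, 1, 1], 1),   # 10 points at 1 Hz
}

# One feature (the mean) per channel and per 5 s epoch.
features = [
    (start, {name: mean(samples) for name, samples in epoch.items()})
    for start, epoch in iter_epochs(channels, length_s=5)
]
```

Each epoch holds 20 points of the 4Hz channel and 5 points of the 1Hz channel, yet both slices describe the same five seconds.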
Bases: object
Deep copy of a Polygram.
Returns: a new Polygram with the same values
Return type: Polygram
Returns: the total duration of the polygram, that is, the duration of the longest channel
Return type: datetime.timedelta
Applies a function to all signal channels and returns a new Polygram with the modified channels.
An example of how to normalise all signal channels:
>>> pol_norm = pol.map_signal_channels(
...     lambda x: (x - np.mean(x)) / np.std(x))
>>> np.mean(pol[0])
>>> np.mean(pol_norm[0])
Parameters: fun (callable) – a function to be applied
Returns: a new polygram
Return type: Polygram
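The effect of the normalising function used in the example can be checked on a plain numpy array, independently of pyrem. In this sketch the name `zscore` and the seeded random walk are illustrative choices, not part of the library.

```python
import numpy as np


def zscore(x):
    """Centre an array on zero and scale it to unit standard deviation."""
    return (x - np.mean(x)) / np.std(x)


# A random walk similar to the one used to build the "eeg1" signal.
rng = np.random.default_rng(0)
walk = np.cumsum(rng.normal(0, 1, 1000))

normalised = zscore(walk)
mean_after = np.mean(normalised)   # numerically close to 0
std_after = np.std(normalised)     # numerically close to 1
```

A function of this shape, taking one array-like channel and returning an array of the same length, is the kind of callable the normalisation example passes to map_signal_channels.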
Adds channels from one polygram to another polygram, or appends a time series to a polygram.