Time Series Measures¶
The Empirical Distributions and Shannon Information Measures come together to make information measures on time series almost trivial to implement. Every such measure amounts to constructing distributions and applying an information measure.
Notation¶
Throughout this section we will denote random variables as \(X, Y, \ldots\), and let \(x_i, y_i, \ldots\) represent the \(i\)-th time step of a time series drawn from a random variable. Many of the measures consider \(k\)-histories (a.k.a. \(k\)-blocks) of the time series, e.g. \(x^{(k)}_i = \{x_{i-k+1}, x_{i-k+2}, \ldots, x_i\}\).
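As a rough illustration of this notation, the \(k\)-histories of a series can be collected with a sliding window. This is a plain-Python sketch and not part of PyInform; the helper name `k_histories` is made up for the example.

```python
def k_histories(series, k):
    """Return the k-history ending at each time step i >= k - 1."""
    # Each history holds the k most recent states up to and including step i.
    return [tuple(series[i - k + 1:i + 1]) for i in range(k - 1, len(series))]


histories = k_histories([0, 0, 1, 1, 1, 1, 0, 0, 0], 2)
print(histories[:3])  # the first three 2-histories of the series
```

A series of length 9 yields eight 2-histories, one for every time step that has a full history behind it.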
For the sake of conciseness, when denoting probability distributions, we will only make the random variable explicit in situations where the notation is ambiguous. Generally, we will write \(p(x_i)\), \(p(x^{(k)}_i)\) and \(p(x^{(k)}_i, x_{i+1})\) to denote, respectively, the empirical probability of observing the state \(x_i\), the empirical probability of observing the \(k\)-history \(x^{(k)}_i\), and the joint probability of observing \((x^{(k)}_i, x_{i+1})\).
Please report any notational ambiguities as an issue.
Subtle Details¶
The library takes several liberties in the way in which the time series measures are implemented.
The Base: States and Logarithms¶
The word “base” has two different meanings in the context of the information measures on time series. It could refer to the base of the time series itself, that is, the number of unique states in the time series. For example, the time series \(\{0,2,1,0,0\}\) has a base of 3. On the other hand, it could refer to the base of the logarithm used in computing the information content of the empirical distributions. The problem is that these two meanings clash. The base of the time series affects the range of values the measure can produce, and the base of the logarithm represents a rescaling of those values.
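To make the rescaling concrete, here is a small plain-Python sketch (not PyInform code; the `entropy` helper is hypothetical) that computes the Shannon entropy of the example series with two different logarithm bases. It takes the base of the series to be the number of distinct states, as described above.

```python
import math
from collections import Counter


def entropy(series, b=2.0):
    """Shannon entropy of the empirical distribution, with logarithm base b."""
    n = len(series)
    return -sum(c / n * math.log(c / n, b) for c in Counter(series).values())


series = [0, 2, 1, 0, 0]
state_base = len(set(series))            # number of distinct states
h_bits = entropy(series, b=2)            # may exceed 1 with more than two states
h_unit = entropy(series, b=state_base)   # the same quantity, rescaled

# Changing the logarithm base only divides the value by a constant.
print(h_bits, h_unit, h_bits / math.log2(state_base))
```

Using the state base as the logarithm base keeps the entropy of a single state in the unit range, which is the motivation for the first convention described in this section.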
The following measures use one of two conventions. The measures of information dynamics (e.g. Active Information, Entropy Rate and Transfer Entropy) take as an argument the base of the state and use that as the base of the logarithm. The result is that the time-averaged values of those measures are in the unit range. An exception to this rule is the block entropy. It too uses this convention, but its value will not be in the unit range unless the block size \(k\) is 1 or the specified base is \(2^k\) (or you could just divide by \(k\)). The second convention is to take both the base of the time series and the base of the logarithm as arguments. This is about as unambiguous as it gets. This approach is used for the measures that do not make explicit use of a history length (or block size), e.g. Mutual Information, Conditional Entropy, etc.
Coming releases may revise the handling of the bases, but until then each function’s documentation will specify how the base is used.
Multiple Initial Conditions¶
PyInform tries to provide handling of multiple initial conditions. The “proper” way to handle initial conditions is a bit contested. One completely reasonable approach is to apply the information measures to each initial condition’s time series independently and then average. One can think of this approach as conditioning the measure on the initial condition. The second approach is to independently use all of the initial conditions to construct the various probability distributions. You can think of this approach as rolling the uncertainty of the initial condition into the measure. [1]
The current implementation takes the second approach. The accepted time series can be up to 2D, with each row representing the time series for a different initial condition. We chose to take the second approach because the “measure then average” method can still be done with the current implementation. For an example of this, see the example section of Active Information.
Subsequent releases may provide a mechanism for specifying how the user prefers the initial conditions to be handled, but at the moment the user has to make it happen manually.
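The difference between the two approaches can be seen with nothing more than a Shannon entropy. The following is a plain-Python sketch, not PyInform code, and the `entropy` helper is hypothetical.

```python
import math
from collections import Counter


def entropy(states, b=2.0):
    """Shannon entropy of the empirical distribution of the observed states."""
    n = len(states)
    return -sum(c / n * math.log(c / n, b) for c in Counter(states).values())


series = [[0, 0, 1, 1, 1, 1, 0, 0, 0], [1, 0, 0, 1, 0, 0, 1, 0, 0]]

# First approach: measure each initial condition separately, then average.
averaged = sum(entropy(row) for row in series) / len(series)

# Second approach: pool the observations from every initial condition
# into a single distribution before applying the measure.
pooled = entropy([state for row in series for state in row])

print(averaged, pooled)
```

The two numbers differ because pooling mixes the per-row distributions before the logarithm is applied.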
[1]  There are actually at least three ways to handle multiple initial conditions, but the third method is related to the first described in the text by the addition of the entropy of the distribution over initial conditions. In this approach, the initial condition is considered as a random variable.
Active Information¶
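Active information (AI) was introduced in [Lizier2012] to quantify information storage in distributed computation. Active information is defined in terms of a temporally local variant
\[a_{X,i}(k,b) = \log_b \frac{p(x^{(k)}_i, x_{i+1})}{p(x^{(k)}_i)\,p(x_{i+1})},\]
where the probabilities are constructed empirically from the entire time series. From the local variant, the temporally global active information is defined as
\[A_X(k,b) = \langle a_{X,i}(k,b) \rangle_i = \sum_{x^{(k)}_i, x_{i+1}} p(x^{(k)}_i, x_{i+1}) \log_b \frac{p(x^{(k)}_i, x_{i+1})}{p(x^{(k)}_i)\,p(x_{i+1})}.\]
Strictly speaking, the local and average active information are defined as
\[a_{X,i}(b) = \lim_{k \rightarrow \infty} a_{X,i}(k,b) \quad \textrm{and} \quad A_X(b) = \lim_{k \rightarrow \infty} A_X(k,b),\]
but we do not provide limiting functionality in this library (yet!).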
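For readers who want to see the finite-\(k\) definition spelled out, the following plain-Python sketch estimates the local active information as the logarithm of \(p(x^{(k)}_i, x_{i+1}) / (p(x^{(k)}_i)\,p(x_{i+1}))\) using simple counts, and averages it over time. It is an illustration only, not how PyInform computes the measure (PyInform calls into the inform C library), and the name `active_info_sketch` is hypothetical.

```python
import math
from collections import Counter


def active_info_sketch(series, k, b=2.0):
    """Return (average, local values) of the k-history active information."""
    n = len(series) - k                      # number of (history, future) pairs
    histories = [tuple(series[i:i + k]) for i in range(n)]
    futures = [series[i + k] for i in range(n)]

    joint = Counter(zip(histories, futures))
    hist = Counter(histories)
    fut = Counter(futures)

    # p(h, f) / (p(h) p(f)) written in terms of raw counts.
    local = [math.log(joint[h, f] * n / (hist[h] * fut[f]), b)
             for h, f in zip(histories, futures)]
    return sum(local) / n, local


average, local = active_info_sketch([0, 0, 1, 1, 1, 1, 0, 0, 0], k=2)
print(average)
```

Note that individual local values can be negative even though their average is not.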
Examples¶
A Single Initial Condition¶
The typical usage is to provide the time series as a sequence (or numpy.ndarray) and the history length as an integer and let active_info() sort out the rest:
>>> active_info([0,0,1,1,1,1,0,0,0], k=2)
0.3059584928680419
>>> active_info([0,0,1,1,1,1,0,0,0], k=2, local=True)
array([[-0.19264508,  0.80735492,  0.22239242,  0.22239242, -0.36257008,
         1.22239242,  0.22239242]])
You can always override the base, but be careful:
>>> active_info([0,0,1,1,2,2], k=2)
0.6309297535714575
>>> active_info([0,0,1,1,2,2], k=2, b=3)
0.6309297535714575
>>> active_info([0,0,1,1,2,2], k=2, b=4)
0.5
>>> active_info([0,0,1,1,2,2], k=2, b=2)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "pyinform/activeinfo.py", line 126, in active_info
File "pyinform/error.py", line 57, in error_guard
raise InformError(e,func)
pyinform.error.InformError: an inform error occurred - "unexpected state in timeseries"
Multiple Initial Conditions¶
What about multiple initial conditions? We’ve got that covered!
>>> active_info([[0,0,1,1,1,1,0,0,0], [1,0,0,1,0,0,1,0,0]], k=2)
0.35987902873686073
>>> active_info([[0,0,1,1,1,1,0,0,0], [1,0,0,1,0,0,1,0,0]], k=2, local=True)
array([[ 0.80735492, -0.36257008,  0.63742992,  0.63742992, -0.77760758,
         0.80735492, -1.19264508],
       [ 0.80735492,  0.80735492,  0.22239242,  0.80735492,  0.80735492,
         0.22239242,  0.80735492]])
As mentioned in Subtle Details, averaging the AI over the initial conditions does not give the same result as constructing the distributions using all of the initial conditions together.
>>> import numpy as np
>>> series = np.asarray([[0,0,1,1,1,1,0,0,0], [1,0,0,1,0,0,1,0,0]])
>>> np.apply_along_axis(active_info, 1, series, 2).mean()
0.58453953071733644
Or if you are feeling verbose:
>>> ai = np.empty(len(series))
>>> for i, xs in enumerate(series):
...     ai[i] = active_info(xs, k=2)
...
>>> ai
array([ 0.30595849, 0.86312057])
>>> ai.mean()
0.58453953071733644
API Documentation¶

pyinform.activeinfo.active_info(series, k, b=0, local=False)¶
Compute the average or local active information of a time series with history length k.
If the base b is not specified (or is 0), then it is inferred from the time series with 2 as a minimum. b must be at least the base of the time series and is used as the base of the logarithm.
Parameters:
 series (sequence or numpy.ndarray) – the time series
 k (int) – the history length
 b (int) – the base of the time series and logarithm
 local (bool) – compute the local active information
Returns: the average or local active information
Return type: float or numpy.ndarray
Raises:
 ValueError – if the time series has no initial conditions
 ValueError – if the time series is greater than 2D
 InformError – if an error occurs within the inform C call
Block Entropy¶
Block entropy, also known as N-gram entropy [Shannon1948], is the standard Shannon entropy applied to the time series (or sequence) of \(k\)-histories of a time series (or sequence):
\[H_b(X^{(k)}) = -\sum_{x^{(k)}_i} p(x^{(k)}_i) \log_b p(x^{(k)}_i),\]
which of course reduces to the traditional Shannon entropy for k == 1. Much as with Active Information, the ideal usage is to take \(k \rightarrow \infty\).
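As a hedged illustration, block entropy amounts to counting overlapping blocks of length \(k\) and taking the Shannon entropy of those counts. The sketch below is plain Python rather than PyInform's implementation, and `block_entropy_sketch` is a made-up name.

```python
import math
from collections import Counter


def block_entropy_sketch(series, k, b=2.0):
    """Shannon entropy (log base b) of the overlapping k-blocks of a series."""
    blocks = [tuple(series[i:i + k]) for i in range(len(series) - k + 1)]
    n = len(blocks)
    return -sum(c / n * math.log(c / n, b) for c in Counter(blocks).values())


series = [0, 0, 1, 1, 1, 1, 0, 0, 0]
h1 = block_entropy_sketch(series, k=1)           # ordinary Shannon entropy
h2 = block_entropy_sketch(series, k=2)           # can exceed 1 in base 2
h2_unit = block_entropy_sketch(series, k=2, b=4) # base 2**k rescales to [0, 1]
print(h1, h2, h2_unit)
```

For a binary series, choosing the logarithm base \(2^k\) is the same as dividing the base-2 value by \(k\).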
Examples¶
A Single Initial Condition¶
The typical usage is to provide the time series as a sequence (or numpy.ndarray) and the block size as an integer and let block_entropy() sort out the rest:
>>> block_entropy([0,0,1,1,1,1,0,0,0], k=1)
0.9910760598382222
>>> block_entropy([0,0,1,1,1,1,0,0,0], k=1, local=True)
array([[ 0.84799691, 0.84799691, 1.169925 , 1.169925 , 1.169925 ,
1.169925 , 0.84799691, 0.84799691, 0.84799691]])
>>> block_entropy([0,0,1,1,1,1,0,0,0], k=2)
1.811278124459133
>>> block_entropy([0,0,1,1,1,1,0,0,0], k=2, local=True)
array([[ 1.4150375, 3. , 1.4150375, 1.4150375, 1.4150375,
3. , 1.4150375, 1.4150375]])
You can override the base so that the entropy is in the unit interval:
>>> block_entropy([0,0,1,1,1,1,0,0,0], k=2, b=4)
0.9056390622295665
>>> block_entropy([0,0,1,1,1,1,0,0,0], k=2, b=4, local=True)
array([[ 0.70751875, 1.5 , 0.70751875, 0.70751875, 0.70751875,
1.5 , 0.70751875, 0.70751875]])
Multiple Initial Conditions¶
Do we support multiple initial conditions? Of course we do!
>>> series = [[0,0,1,1,1,1,0,0,0], [1,0,0,1,0,0,1,0,0]]
>>> block_entropy(series, k=2)
1.936278124459133
>>> block_entropy(series, k=2, local=True)
array([[ 1.4150375, 2.4150375, 2.4150375, 2.4150375, 2.4150375,
2. , 1.4150375, 1.4150375],
[ 2. , 1.4150375, 2.4150375, 2. , 1.4150375,
2.4150375, 2. , 1.4150375]])
Or you can compute the block entropy on each initial condition and average:
>>> np.apply_along_axis(block_entropy, 1, series, 2).mean()
1.6862781244591329
API Documentation¶

pyinform.blockentropy.block_entropy(series, k, b=0, local=False)¶
Compute the (local) block entropy of a time series with block size k.
If b is 0, then the base is inferred from the time series with a minimum value of 2. The base b must be at least the base of the time series and is used as the base of the logarithm.
Parameters:
 series (sequence or numpy.ndarray) – the time series
 k (int) – the block size
 b (int) – the base of the logarithm
 local (bool) – compute the local block entropy
Returns: the average or local block entropy
Return type: float or numpy.ndarray
Raises:
 ValueError – if the time series has no initial conditions
 ValueError – if the time series is greater than 2D
 InformError – if an error occurs within the inform C call
Conditional Entropy¶
Conditional entropy is a measure of the amount of information required to describe a random variable \(Y\) given knowledge of another random variable \(X\). When applied to time series, two time series are used to construct the empirical distributions and then conditional_entropy() can be applied to yield
\[H_b(Y|X) = -\sum_{x_i, y_i} p(x_i, y_i) \log_b \frac{p(x_i, y_i)}{p(x_i)}.\]
This can be viewed as the time-average of the local conditional entropy
\[h_{b,i}(Y|X) = -\log_b \frac{p(x_i, y_i)}{p(x_i)}.\]
See [Cover1991] for more information.
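The following plain-Python sketch estimates the conditional entropy of the second series given the first from simple counts. It is only an illustration of the definition, not PyInform's implementation; `conditional_entropy_sketch` is a hypothetical name, and its first argument is the condition.

```python
import math
from collections import Counter


def conditional_entropy_sketch(xs, ys, b=2.0):
    """Estimate H(Y|X) from paired observations, with xs as the condition."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    # p(x, y) * log( p(x, y) / p(x) ), written with raw counts.
    return -sum(c / n * math.log(c / px[x], b) for (x, _), c in joint.items())


xs = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
ys = [0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1]
h_y_given_x = conditional_entropy_sketch(xs, ys)
h_x_given_y = conditional_entropy_sketch(ys, xs)
print(h_y_given_x, h_x_given_y)
```

Swapping the arguments swaps the roles of condition and target, so the two values generally differ.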
Examples¶
>>> xs = [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1]
>>> ys = [0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,1]
>>> conditional_entropy(xs,ys) # H(Y|X)
0.5971071794515037
>>> conditional_entropy(ys,xs) # H(X|Y)
0.5077571498797332
>>> conditional_entropy(xs, ys, local=True)
array([ 3. , 3. , 0.19264508, 0.19264508, 0.19264508,
0.19264508, 0.19264508, 0.19264508, 0.19264508, 0.19264508,
0.19264508, 0.19264508, 0.19264508, 0.19264508, 0.19264508,
0.19264508, 0.4150375 , 0.4150375 , 0.4150375 , 2. ])
>>> conditional_entropy(ys, xs, local=True)
array([ 1.32192809, 1.32192809, 0.09953567, 0.09953567, 0.09953567,
0.09953567, 0.09953567, 0.09953567, 0.09953567, 0.09953567,
0.09953567, 0.09953567, 0.09953567, 0.09953567, 0.09953567,
0.09953567, 0.73696559, 0.73696559, 0.73696559, 3.9068906 ])
API Documentation¶

pyinform.conditionalentropy.conditional_entropy(xs, ys, bx=0, by=0, b=2.0, local=False)¶
Compute the (local) conditional entropy between two time series.
This function expects the condition to be the first argument.
The bases bx and by are inferred from their respective time series if they are not provided (or are 0). The minimum value in both cases is 2.
This function explicitly takes the logarithmic base b as an argument.
Parameters:
 xs (a sequence or numpy.ndarray) – the time series drawn from the conditional distribution
 ys (a sequence or numpy.ndarray) – the time series drawn from the target distribution
 bx (int) – the base of the conditional time series
 by (int) – the base of the target time series
 b (double) – the logarithmic base
 local (bool) – compute the local conditional entropy
Returns: the local or average conditional entropy
Return type: float or numpy.ndarray
Raises:
 ValueError – if the time series have different shapes
 InformError – if an error occurs within the inform C call
Entropy Rate¶
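Entropy rate (ER) quantifies the amount of information needed to describe \(X\) given observations of \(X^{(k)}\). In other words, it is the entropy of the time series conditioned on the \(k\)-histories. The local entropy rate
\[h_{X,i}(k,b) = -\log_b \frac{p(x^{(k)}_i, x_{i+1})}{p(x^{(k)}_i)}\]
can be averaged to obtain the global entropy rate
\[H_X(k,b) = \langle h_{X,i}(k,b) \rangle_i = -\sum_{x^{(k)}_i, x_{i+1}} p(x^{(k)}_i, x_{i+1}) \log_b \frac{p(x^{(k)}_i, x_{i+1})}{p(x^{(k)}_i)}.\]
Much as with Active Information, the local and average entropy rates are formally obtained in the limit
\[h_{X,i}(b) = \lim_{k \rightarrow \infty} h_{X,i}(k,b) \quad \textrm{and} \quad H_X(b) = \lim_{k \rightarrow \infty} H_X(k,b),\]
but we do not provide limiting functionality in this library (yet!).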
See [Cover1991] for more details.
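A minimal plain-Python sketch of the finite-\(k\) entropy rate follows: it estimates the probability of the next state given its \(k\)-history from counts and averages the negative logarithm. It is not PyInform's implementation, and the name `entropy_rate_sketch` is hypothetical.

```python
import math
from collections import Counter


def entropy_rate_sketch(series, k, b=2.0):
    """Return (average, local values) of the k-history entropy rate."""
    n = len(series) - k
    histories = [tuple(series[i:i + k]) for i in range(n)]
    futures = [series[i + k] for i in range(n)]

    joint = Counter(zip(histories, futures))
    hist = Counter(histories)

    # -log p(next state | k-history), estimated from counts.
    local = [-math.log(joint[h, f] / hist[h], b)
             for h, f in zip(histories, futures)]
    return sum(local) / n, local


binary_rate, binary_local = entropy_rate_sketch([0, 0, 1, 1, 1, 1, 0, 0, 0], k=2)
ternary_rate, _ = entropy_rate_sketch([0, 0, 1, 1, 1, 1, 2, 2, 2], k=2, b=3)
print(binary_rate, ternary_rate)
```

The second call passes b=3 by hand, mirroring the convention of using the base of the states as the base of the logarithm.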
Examples¶
A Single Initial Condition¶
Let’s apply the entropy rate to a single initial condition. Typically, you will just provide the time series and the history length, and let entropy_rate() take care of the rest:
>>> entropy_rate([0,0,1,1,1,1,0,0,0], k=2)
0.6792696431662095
>>> entropy_rate([0,0,1,1,1,1,0,0,0], k=2, local=True)
array([[ 1. , 0. , 0.5849625, 0.5849625, 1.5849625,
0. , 1. ]])
As with all of the time series measures, you can override the default base.
>>> entropy_rate([0,0,1,1,1,1,2,2,2], k=2)
0.24830578469386944
>>> entropy_rate([0,0,1,1,1,1,2,2,2], k=2, b=4)
0.19677767872596208
>>> entropy_rate([0,0,1,1,1,1,2,2,2], k=2, b=2)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/ubuntu/workspace/pyinform/entropyrate.py", line 79, in entropy_rate
error_guard(e)
File "/home/ubuntu/workspace/pyinform/error.py", line 57, in error_guard
raise InformError(e,func)
pyinform.error.InformError: an inform error occurred - "unexpected state in timeseries"
Multiple Initial Conditions¶
Of course multiple initial conditions are handled.
>>> series = [[0,0,1,1,1,1,0,0,0], [1,0,0,1,0,0,1,0,0]]
>>> entropy_rate(series, k=2)
0.6253491072973907
>>> entropy_rate(series, k=2, local=True)
array([[ 0.4150375, 1.5849625, 0.5849625, 0.5849625, 1.5849625,
0. , 2. ],
[ 0. , 0.4150375, 0.5849625, 0. , 0.4150375,
0.5849625, 0. ]])
API Documentation¶

pyinform.entropyrate.entropy_rate(series, k, b=0, local=False)¶
Compute the average or local entropy rate of a time series with history length k.
If the base b is not specified (or is 0), then it is inferred from the time series with 2 as a minimum. b must be at least the base of the time series and is used as the base of the logarithm.
Parameters:
 series (sequence or numpy.ndarray) – the time series
 k (int) – the history length
 b (int) – the base of the time series and logarithm
 local (bool) – compute the local entropy rate
Returns: the average or local entropy rate
Return type: float or numpy.ndarray
Raises:
 ValueError – if the time series has no initial conditions
 ValueError – if the time series is greater than 2D
 InformError – if an error occurs within the inform C call
Mutual Information¶
Mutual information (MI) is a measure of the amount of mutual dependence between two random variables. When applied to time series, two time series are used to construct the empirical distributions and then mutual_info() can be applied. Locally MI is defined as
\[i_{b,i}(X,Y) = \log_b \frac{p(x_i, y_i)}{p(x_i)\,p(y_i)}.\]
The mutual information is then just the time average of \(i_{b,i}(X,Y)\).
See [Cover1991] for more details.
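The local and average mutual information can be sketched in a few lines of plain Python by counting joint and marginal occurrences. This is an illustration under the usual definition of pointwise mutual information, not PyInform's implementation; `mutual_info_sketch` is a hypothetical name.

```python
import math
from collections import Counter


def mutual_info_sketch(xs, ys, b=2.0):
    """Return (average, local values) of the mutual information of two series."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)

    # log( p(x, y) / (p(x) p(y)) ), written with raw counts.
    local = [math.log(joint[x, y] * n / (px[x] * py[y]), b)
             for x, y in zip(xs, ys)]
    return sum(local) / n, local


xs = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
ys = [0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1]
mi, mi_local = mutual_info_sketch(xs, ys)
print(mi)
```

Local values are negative wherever a pair of states occurs less often than independence would predict.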
Examples¶
>>> xs = [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1]
>>> ys = [0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,1]
>>> mutual_info(xs, ys)
0.214170945007629
>>> mutual_info(xs, ys, local=True)
array([-1.        , -1.        ,  0.22239242,  0.22239242,  0.22239242,
        0.22239242,  0.22239242,  0.22239242,  0.22239242,  0.22239242,
        0.22239242,  0.22239242,  0.22239242,  0.22239242,  0.22239242,
        0.22239242,  1.5849625 ,  1.5849625 ,  1.5849625 , -1.5849625 ])
API Documentation¶

pyinform.mutualinfo.mutual_info(xs, ys, bx=0, by=0, b=2.0, local=False)¶
Compute the (local) mutual information between two time series.
The bases bx and by are inferred from their respective time series if they are not provided (or are 0). The minimum value in both cases is 2.
This function explicitly takes the logarithmic base b as an argument.
Parameters:
 xs (a sequence or numpy.ndarray) – a time series
 ys (a sequence or numpy.ndarray) – a time series
 bx (int) – the base of the first time series
 by (int) – the base of the second time series
 b (double) – the logarithmic base
 local (bool) – compute the local mutual information
Returns: the local or average mutual information
Return type: float or numpy.ndarray
Raises:
 ValueError – if the time series have different shapes
 InformError – if an error occurs within the inform C call
Relative Entropy¶
Relative entropy, also known as the Kullback-Leibler divergence, measures the amount of information gained in switching from a prior distribution \(q_X\) to a posterior distribution \(p_X\) over the same support. That is, \(q_X\) and \(p_X\) represent hypotheses about the distribution of some random variable \(X\). Time series data sampled from the posterior and prior can be used to estimate those distributions, and the relative entropy can then be computed via a call to relative_entropy(). The result is
\[D_{KL}(p_X \| q_X) = \sum_{x_i} p(x_i) \log_b \frac{p(x_i)}{q(x_i)},\]
which has as its local counterpart
\[d_{KL,i}(p_X \| q_X) = \log_b \frac{p(x_i)}{q(x_i)}.\]
Note that the average in moving from the local to the non-local relative entropy is taken over the posterior distribution.
See [Kullback1951] and [Cover1991] for more information.
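The sketch below estimates the relative entropy from two sampled series in plain Python, treating the first series as the posterior and the second as the prior. It is not PyInform's implementation; `relative_entropy_sketch` is a hypothetical name, and returning nan when the prior assigns zero probability to an observed state is a choice made here to mirror the example output in this section (mathematically the divergence is infinite in that case).

```python
import math
from collections import Counter


def relative_entropy_sketch(xs, ys, b=2.0):
    """Estimate D_KL(p || q) with p sampled by xs and q sampled by ys."""
    p, q = Counter(xs), Counter(ys)
    n_p, n_q = len(xs), len(ys)
    total = 0.0
    for state, count in p.items():
        if q[state] == 0:
            # The prior never produced this state: the ratio is undefined.
            return float("nan")
        total += count / n_p * math.log((count / n_p) / (q[state] / n_q), b)
    return total


xs = [0, 1, 0, 0, 0, 0, 0, 0, 0, 1]
ys = [0, 1, 1, 1, 1, 0, 0, 1, 0, 0]
forward = relative_entropy_sketch(xs, ys)
backward = relative_entropy_sketch(ys, xs)
print(forward, backward)
```

The two directions differ, which reflects the fact that relative entropy is not symmetric in its arguments.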
Examples¶
>>> xs = [0,1,0,0,0,0,0,0,0,1]
>>> ys = [0,1,1,1,1,0,0,1,0,0]
>>> relative_entropy(xs, ys)
0.27807190511263774
>>> relative_entropy(ys, xs)
0.3219280948873624
>>> xs = [0,0,0,0]
>>> ys = [0,1,1,0]
>>> relative_entropy(xs, ys)
1.0
>>> relative_entropy(ys, xs)
nan
API Documentation¶

pyinform.relativeentropy.relative_entropy(xs, ys, b=0, base=2.0, local=False)¶
Compute the local or global relative entropy between two time series treating each as observations from a distribution.
The base b is inferred from the time series if it is not provided (or is 0). The minimum value is 2.
This function explicitly takes the logarithmic base base as an argument.
Parameters:
 xs (a sequence or numpy.ndarray) – the time series sampled from the posterior distribution
 ys (a sequence or numpy.ndarray) – the time series sampled from the prior distribution
 b (int) – the base of the time series
 base (double) – the logarithmic base
 local (bool) – compute the local relative entropy
Returns: the local or global relative entropy
Return type: float or numpy.ndarray
Raises:
 ValueError – if the time series have different shapes
 InformError – if an error occurs within the inform C call
Transfer Entropy¶
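Transfer entropy (TE) measures the amount of directed transfer of information between two random processes. The local variant of TE is defined as
\[t_{Y \rightarrow X,i}(k,b) = \log_b \frac{p(x_{i+1}, y_i \mid x^{(k)}_i)}{p(x_{i+1} \mid x^{(k)}_i)\,p(y_i \mid x^{(k)}_i)}.\]
Averaging in time we have
\[T_{Y \rightarrow X}(k,b) = \langle t_{Y \rightarrow X,i}(k,b) \rangle_i = \sum_{x^{(k)}_i, x_{i+1}, y_i} p(x_{i+1}, y_i, x^{(k)}_i) \log_b \frac{p(x_{i+1}, y_i \mid x^{(k)}_i)}{p(x_{i+1} \mid x^{(k)}_i)\,p(y_i \mid x^{(k)}_i)}.\]
As in the case of Active Information and Entropy Rate, the transfer entropy is formally defined as the limit of the \(k\)-history transfer entropy as \(k \rightarrow \infty\):
\[t_{Y \rightarrow X,i}(b) = \lim_{k \rightarrow \infty} t_{Y \rightarrow X,i}(k,b) \quad \textrm{and} \quad T_{Y \rightarrow X}(b) = \lim_{k \rightarrow \infty} T_{Y \rightarrow X}(k,b),\]
but we do not provide limiting functionality in this library (yet!).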
Note
What we call “transfer entropy” is referred to as “apparent transfer entropy” in the parlance of [Lizier2008]. A related quantity, complete transfer entropy, also considers the semi-infinite histories of all other random processes associated with the system. An implementation of complete transfer entropy is planned for a future release of Inform/PyInform.
See [Schreiber2000], [Kraiser2002] and [Lizier2008] for more details.
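As a hedged, plain-Python illustration of the finite-\(k\) definition, the sketch below counts how often each (target history, source state, next target state) triple occurs and compares the prediction of the next target state with and without the source. It is not PyInform's implementation; the name `transfer_entropy_sketch` is hypothetical, and the argument order (source first, then target) follows the API signature documented in this section.

```python
import math
from collections import Counter


def transfer_entropy_sketch(source, target, k, b=2.0):
    """Return (average, local values) of the k-history transfer entropy."""
    n = len(target) - k
    hist = [tuple(target[i:i + k]) for i in range(n)]   # target k-histories
    src = [source[i + k - 1] for i in range(n)]         # source state y_i
    fut = [target[i + k] for i in range(n)]             # next target state

    c_hsf = Counter(zip(hist, src, fut))
    c_hs = Counter(zip(hist, src))
    c_hf = Counter(zip(hist, fut))
    c_h = Counter(hist)

    # p(f | h, s) / p(f | h), written with raw counts.
    local = [math.log(c_hsf[h, s, f] * c_h[h] / (c_hs[h, s] * c_hf[h, f]), b)
             for h, s, f in zip(hist, src, fut)]
    return sum(local) / n, local


xs = [0, 0, 1, 1, 1, 1, 0, 0, 0]
ys = [0, 1, 1, 1, 1, 0, 0, 0, 1]
te_ys_to_xs, _ = transfer_entropy_sketch(ys, xs, k=1)
te_xs_to_ys, local_xs_to_ys = transfer_entropy_sketch(xs, ys, k=1)
print(te_ys_to_xs, te_xs_to_ys)
```

In this example ys is xs shifted by one step, so ys predicts the next state of xs perfectly while xs adds comparatively little about ys.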
Examples¶
A Single Initial Condition¶
Just give us a couple of time series and tell us the history length and we’ll give you a number:
>>> xs = [0,0,1,1,1,1,0,0,0]
>>> ys = [0,1,1,1,1,0,0,0,1]
>>> transfer_entropy(ys, xs, k=1)
0.8112781244591329
>>> transfer_entropy(ys, xs, k=2)
0.6792696431662095
>>> transfer_entropy(xs, ys, k=1)
0.21691718668869964
>>> transfer_entropy(xs, ys, k=2) # pesky floating-point math
2.220446049250313e-16
or an array if you ask for it:
>>> transfer_entropy(ys, xs, k=1, local=True)
array([[ 0.4150375, 2. , 0.4150375, 0.4150375, 0.4150375,
2. , 0.4150375, 0.4150375]])
>>> transfer_entropy(ys, xs, k=2, local=True)
array([[ 1. , 0. , 0.5849625, 0.5849625, 1.5849625,
0. , 1. ]])
>>> transfer_entropy(xs, ys, k=1, local=True)
array([[ 0.4150375,  0.4150375, -0.169925 , -0.169925 ,  0.4150375,
         1.       , -0.5849625,  0.4150375]])
>>> transfer_entropy(xs, ys, k=2, local=True)
array([[ 0., 0., 0., 0., 0., 0., 0.]])
Multiple Initial Conditions¶
Uhm, yes we can! (Did you really expect anything less?)
>>> xs = [[0,0,1,1,1,1,0,0,0], [1,0,0,0,0,1,1,1,0]]
>>> ys = [[1,0,0,0,0,1,1,1,1], [1,1,1,1,0,0,0,1,1]]
>>> transfer_entropy(ys, xs, k=1)
0.8828560636920486
>>> transfer_entropy(ys, xs, k=2)
0.6935361388961918
>>> transfer_entropy(xs, ys, k=1)
0.15969728512148262
>>> transfer_entropy(xs, ys, k=2)
0.0
And local too:
>>> transfer_entropy(ys, xs, k=1, local=True)
array([[ 0.4150375 , 2. , 0.67807191, 0.67807191, 0.67807191,
1.4150375 , 0.4150375 , 0.4150375 ],
[ 1.4150375 , 0.4150375 , 0.4150375 , 0.4150375 , 2. ,
0.67807191, 0.67807191, 1.4150375 ]])
>>> transfer_entropy(ys, xs, k=2, local=True)
array([[ 1.32192809, 0. , 0.73696559, 0.73696559, 1.32192809,
0. , 0.73696559],
[ 0. , 0.73696559, 0.73696559, 1.32192809, 0. ,
0.73696559, 1.32192809]])
>>> transfer_entropy(xs, ys, k=1, local=True)
array([[ 0.5849625 ,  0.48542683, -0.25153877, -0.25153877,  0.48542683,
         0.36257008, -0.22239242, -0.22239242],
       [ 0.36257008, -0.22239242, -0.22239242,  0.5849625 ,  0.48542683,
        -0.25153877,  0.48542683,  0.36257008]])
>>> transfer_entropy(xs, ys, k=2, local=True)
array([[ 0.00000000e+00, 2.22044605e-16, 2.22044605e-16,
2.22044605e-16, 0.00000000e+00, 2.22044605e-16,
2.22044605e-16],
[ 2.22044605e-16, 2.22044605e-16, 2.22044605e-16,
0.00000000e+00, 2.22044605e-16, 2.22044605e-16,
0.00000000e+00]])
API Documentation¶

pyinform.transferentropy.transfer_entropy(source, target, k, b=0, local=False)¶
Compute the local or average transfer entropy from one time series to another with target history length k.
If the base b is not specified (or is 0), then it is inferred from the time series with 2 as a minimum. b must be at least the base of the time series and is used as the base of the logarithm.
Parameters:
 source (sequence or numpy.ndarray) – the source time series
 target (sequence or numpy.ndarray) – the target time series
 k (int) – the history length
 b (int) – the base of the time series and logarithm
 local (bool) – compute the local transfer entropy
Returns: the average or local transfer entropy
Return type: float or numpy.ndarray
Raises:
 ValueError – if the time series have different shapes
 ValueError – if either time series has no initial conditions
 ValueError – if either time series is greater than 2D
 InformError – if an error occurs within the inform C call
References¶
[Cover1991]  T.M. Cover and J.A. Thomas (1991). “Elements of information theory” (1st ed.). New York: Wiley. ISBN 0-471-06259-6.
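[Kraiser2002]  A. Kaiser and T. Schreiber, “Information transfer in continuous processes”, Physica D, vol. 166, pp. 43-62, 2002.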
[Kullback1951]  Kullback, S.; Leibler, R.A. (1951). “On information and sufficiency”. Annals of Mathematical Statistics. 22 (1): 79-86. doi:10.1214/aoms/1177729694. MR 39968.
[Lizier2008]  J.T. Lizier, M. Prokopenko and A. Zomaya, “Local information transfer as a spatiotemporal filter for complex systems”, Phys. Rev. E 77, 026110, 2008.
[Lizier2012]  J.T. Lizier, M. Prokopenko and A.Y. Zomaya, “Local measures of information storage in complex distributed computation”, Information Sciences, vol. 208, pp. 39-54, 2012.
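[Schreiber2000]  T. Schreiber, “Measuring information transfer”, Phys. Rev. Lett., vol. 85, no. 2, pp. 461-464, 2000.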
[Shannon1948]  Shannon, Claude E. (July-October 1948). “A Mathematical Theory of Communication”. Bell System Technical Journal. 27 (3): 379-423. doi:10.1002/j.1538-7305.1948.tb01448.x.