Calculating the Lyapunov Exponent of a Time Series (with python code)

(In a later post I discuss a cleaner way to calculate the Lyapunov exponent for maps and particularly the logistic map, along with Mathematica code.)
I found this method during my Masters while recreating the results of an interesting paper on how some standard tests for chaos fail to distinguish chaos from stochasticity (Stochastic neural network model for spontaneous bursting in hippocampal slices, Biswal and Dasgupta, 2002).
Procedure
The Lyapunov exponent is a measure of sensitive dependence on initial conditions, i.e. how quickly two nearby states diverge.
Now consider two points in the time-series, ti and tj, whose values are very close. That means the system reached near the same state at the ith and jth iterations. Now consider the two sequences  ti,  ti+1,  ti+2 … and  tj,  tj+1,  tj+2 … We wish to know how these two sequences diverge from each other. For this, consider the distance between the two sequences after k steps: d(k) = | ti+ktj+k |. (This is for a 1D time series. For higher dimensions, you can define this to be the Euclidean distance and modify the code accordingly.) If the system is chaotic, d(k) will initially rise exponentially with k. For this, one can plot ln d(k) vs k and apply a linear fit. The slope will be an estimate for the Lyapunov exponent.
(Since the system is bounded, the two nearby states will not diverge indefinitely though. Their exponential divergence will stop after some length. We must fit the straight line only within this region.)
Now, this was for a single pair of initial states. The Lyapunov exponent is an average of this divergence exponent over all nearby initial pairs. So for this, define d(k)>, where is averaging over all starting pairs  ti,  tj, such that the initial distance d(0) = | t– tj | is less than some fixed small value. The program finds all such initial pairs, calculates d(k)>, plots it against k, and the slope of the initial linear part gives us the Lyapunov exponent.
Python Code
The following code takes a text file with the time series, ‘timeseries.txt’, as the argument. The text file must contain only the time series values in a single column, no serial numbers or any other text before or after. It asks for the starting diameter within which to limit the initial pairs. It displays how many such pairs it is finding in the time series, so you can vary the diameter based on this.
It outputs a text file, ‘lyapunov.txt’ with two columns, k and d(k)>, which you can then plot and fit in the correct region by visual inspection.
from math import log

def d(series,i,j):
    return abs(series[i]-series[j])

f=open('timeseries.txt', 'r')
series=[float(i) for i in f.read().split()]
f.close()
N=len(series)
eps=input('Initial diameter bound: ')
dlist=[[] for i in range(N)]
n=0 #number of nearby pairs found
for i in range(N):
    for j in range(i+1,N):
        if d(series,i,j) < eps:
            n+=1
            print n
            for k in range(min(N-i,N-j)):
                dlist[k].append(log(d(series,i+k,j+k)))
f=open('lyapunov.txt','w')
for i in range(len(dlist)):
    if len(dlist[i]):
        print>>f, i, sum(dlist[i])/len(dlist[i])
f.close()

The following is the plot and fit of the resulting data from a logistic map series with an appropriately chosen initial diameter.

Lyapunov Exponent of Logistic Map

I deliberately did not automate the plotting and fitting part, because a. it’s tedious and hard to write the code in a way that runs on most installations, and b. human eyes will do a much more reliable job of identifying where the linear portion ends.

Python Script to Generate Frequency Counts of Words in a Text

The following python script takes a text file as input and produces an unsorted list of frequency counts of words in the text as an output text file. It’s pretty simple and short, and uses only the regular expressions module re of python, which is a standard library, so this script will run in any system with a standard python installation.

from re import compile
l=compile("([\w,.'\x92]*\w)").findall(open(raw_input('Input file: '),'r').read().lower())
f=open(raw_input('Output file: '),'w')
for word in set(l):
	print>>f, word, '\t', l.count(word)
f.close()

Note that ‘words’ here doesn’t mean dictionary words (with such a small script it’s not possible to check against dictionary words). Instead, ‘words’ are what you get when you split the text at regular expression word boundaries. So if you have a word like “365b1”, that’ll also be listed in the output.

Here’s an example.

Input file contains:

This is text. This is written with intentionally repeated words. This is repeated, intentionally, to produce short output. This text — with intentionally repeated words — is written to produce short text output.
This output is text. Intentionally short text.

Output file will contain:

short 			3
this 			5
text 			5
is 				5
repeated 		3
intentionally 	4
to 				2
written 		2
produce 		2
words 			2
output 			3
with 			2

The columns are tab-separated, so you can copy this into spreadsheets which should detect this as two columns and paste accordingly. Then you can draw frequency plots or whatever.
If you want, you can write a bit more code to sort the output alphabetically or by frequency count. I didn’t bother.

Locating Numbers inside Bisected Interval Sequences

I think in a real analysis course in the second semester of my first year, the teacher was discussing the nested interval theorem, when one of his examples or something he was saying struck me, and I thought of this interesting problem. Well, interesting to me.

We pick any fraction, say. Now we look at the interval [0,1]. We divide it into two halves [0,0.5] and [0.5,1] and say, ‘the fraction belongs to this half.’ Say the right half. Then we divide the right half into two halves, check again, and say ‘now it’s in the left half’. We continue like this until we hit the number bang in the middle of an interval.

Now that’s not really a problem, but I thought it would be an interesting thing to look at this sequence of ‘left’ and ‘right’ for a chosen fraction. So I wrote a python program for that. Nothing very amusing came out of that. Then I thought of something else. I took evenly spaced fractions in that interval along the horizontal axis, and plotted the fraction of rights in their respective left-right sequences, on the vertical axis, using matplotlib. Here is the python source code:

#! usr/bin/python
import matplotlib.pyplot as plt
c = 0.
x=[]
y=[]
while c<=1.:
    a = 0.
    b = 1.
    dc = c - a
    d = (b-a)/2
    R=1
    L=1
    while True:
        if dc > d:
            R+=1
            a = a + d
        elif dc < d:
            L+=1
            b = b - d
        elif dc==d:
            break
        d = float(b-a)/2
        dc = c - a
    x.append(c)
    y.append(float(R)/(R+L))
    c+=1e-4
plt.xlabel('Fraction')
plt.ylabel('''Fraction of 'Right's in sequence''')
plt.plot(x, y, marker='.', markerfacecolor='blue', linestyle='None')
plt.show()

This is what I got:

graph1

Now, for example, 0.375 = 0.5 – 0.25 + 0.125. A minus sign means an L, a plus sign is an R. So 0.375 is LR. 0.625, which is the fraction the same distance from the right as 0.375 is from the left, is 0.5 + 0.25 – 0.125. So it’s RL. So as you look at fractions equidistant from 0.5 on either side of it, all the R’s and L’s in their sequence get switched. Therefore, the fraction of R’s in one should be the fraction of L’s in the other, or 1 – fraction of R’s. Thus, you expect the graph to be symmetric about the point (0.5, 0.5). (Think about this, no hurry.) What miffed me at this point, therefore, was that this graph didn’t appear to be symmetric with respect to its center point. There’s some fuzzy mess to the left and some scattered points isolated from the main band that are not symmetric at all.

Then I ran some tests with fractions whose sequences aren’t supposed to end at all. Like what? Like 0 and 1, say. If you’ve followed the algorithm, you can tell that we can never arrive at a cleaving of an interval where the separating number is either 0 or 1, because there’s nothing on one side of these numbers. So 0 should just give me LLLLL… and 1 should give me RRRRR…, never ending. However, guess what I found when I looked at the number of L’s and R’s in their sequence.

0    L: 1074, R: 0.

1    L: 0, R: 54.

So why do the sequences end? That’s fairly simple. It’s because of the limitation of storing and computing floating point numbers in a computer. Notice that with each step of the sequence we are squeezing our number tighter and tighter, into an interval that is halving its length with each iteration. Very soon, our computer (or the interpreter) arrives at a point that numbers so infinitesimally separated in that tiny interval are no longer separate numbers to it, and so it cannot differentiate between our fraction and the mid-point of the interval, and stops.

Exactly how big is this error? It is difficult to tell from looking at these numbers above. One tells you it should be 1/2 1074, the other tells you it’s 1/2 54 (which is closer to where I’d put it, owing to other checks I did and don’t want to discuss here). The final result has to do with all the calculations it is doing at every step, and so all the floating point errors that accumulate at every step. However, I think the only way the answer could still be different for a fraction and its ‘mirror-image’ is if different floating point errors are associated with addition and subtraction, because these two operations have been switched for them.

Notice, though, that the fraction of R’s for 0 is 0, and that for 1 is 1. The symmetry is preserved. So where is the final problem in the plot? Well, we’ve been lucky with these two numbers because one of the counts is 0 for both cases. I’ll give you an example of another case:

0.1    L: 28, R: 26.

0.9    L: 25, R: 27.

In this case, obviously, the symmetry will not be maintained, because the second pair is 25,27 instead of being 26,28. Thus, the graph is no longer symmetric about the center point.

Since I was stubborn about getting a symmetric graph, I decided that I’ll cut off the process before it gets to the ambiguous stage, that is stop with a wide enough interval length, and plot a graph with the truncated sequences. I finally got a symmetric one when I set the interval length at the order of 1e-13. For this, in line 20 (highlighted), instead of elif dc==d you need to write elif abs(dc-d)<=1e-13. Here is the resulting graph:

graph2

Note, however, that this error tolerance is not something fixed. It depends on the resolution (spacing) of fractions you want to do this computation for. In the images you have so far seen, the fractions were multiples of 10-4. You get a better image with one order lower, but for that the error tolerance had to be jacked up to 1e-11:

graph3

Do you see something really interesting in this graph now, in the way that it organizes itself into parallelograms within parallelograms? It’s a highly ordered fractal. I’ve marked them out for clarity:

graph

In other words, the point symmetry is repeated on increasingly smaller scales, as it should. The whole bisected nature of the nested intervals is responsible for this. More parallelograms would be revealed if we kept making our resolution finer, and the horizontal extent of these parallelograms are only exhibiting those nested intervals.

The fraction of rights, however, doesn’t reveal a lot of information. More interesting could be to see how many such bisection steps are required before we converge onto a number. For this you need to modify the source code just a bit. On line 25 (highlighted), substitute float(R)/(R+L) with R+L, and you get this:

graph5

The black dots are the data points, joined by blue lines for clarity. Again, this should have been symmetric about x=0.5 (about a line this time, not about a point), but it isn’t. Notice that the low sequence lengths for numbers such as 0.125 or 0.375 like we discussed don’t even appear. The lowest sequence length we see here is about 35. That’s because these fractions never even arrived in the incrementing loop, although they should have. This is computational error again. I can tell because I have poked around a bit. Try out this python snippet for example:

c=0.
while c<=1.:
	c+=1e-2
	if c==.12:
		print c

By the way, one data point, corresponding to the fraction 0, had to be removed from this graph, because its sequence length was very big, 1074, as we saw before.

If you zoom into the middle of this graph a bit, however, you’ll see the kind of symmetry I had been looking for:

graph6

Do you see why we should have a picture like this? Think about it, it’s not very hard. Meanwhile you can download a wallpaper I made in Photoshop out of the above graph, because I liked it so much.

graphwallpaper

That’ll be it for now. Let me know if you have any ideas or questions about all this.