Locating Numbers inside Bisected Interval Sequences

I think in a real analysis course in the second semester of my first year, the teacher was discussing the nested interval theorem, when one of his examples or something he was saying struck me, and I thought of this interesting problem. Well, interesting to me.

We pick any fraction, say. Now we look at the interval [0,1]. We divide it into two halves [0,0.5] and [0.5,1] and say, ‘the fraction belongs to this half.’ Say the right half. Then we divide the right half into two halves, check again, and say ‘now it’s in the left half’. We continue like this until we hit the number bang in the middle of an interval.

Now that’s not really a problem, but I thought it would be an interesting thing to look at this sequence of ‘left’ and ‘right’ for a chosen fraction. So I wrote a python program for that. Nothing very amusing came out of that. Then I thought of something else. I took evenly spaced fractions in that interval along the horizontal axis, and plotted the fraction of rights in their respective left-right sequences, on the vertical axis, using matplotlib. Here is the python source code:

#! usr/bin/python
import matplotlib.pyplot as plt
c = 0.
x=[]
y=[]
while c<=1.:
    a = 0.
    b = 1.
    dc = c - a
    d = (b-a)/2
    R=1
    L=1
    while True:
        if dc > d:
            R+=1
            a = a + d
        elif dc < d:
            L+=1
            b = b - d
        elif dc==d:
            break
        d = float(b-a)/2
        dc = c - a
    x.append(c)
    y.append(float(R)/(R+L))
    c+=1e-4
plt.xlabel('Fraction')
plt.ylabel('''Fraction of 'Right's in sequence''')
plt.plot(x, y, marker='.', markerfacecolor='blue', linestyle='None')
plt.show()

This is what I got:

graph1

Now, for example, 0.375 = 0.5 – 0.25 + 0.125. A minus sign means an L, a plus sign is an R. So 0.375 is LR. 0.625, which is the fraction the same distance from the right as 0.375 is from the left, is 0.5 + 0.25 – 0.125. So it’s RL. So as you look at fractions equidistant from 0.5 on either side of it, all the R’s and L’s in their sequence get switched. Therefore, the fraction of R’s in one should be the fraction of L’s in the other, or 1 – fraction of R’s. Thus, you expect the graph to be symmetric about the point (0.5, 0.5). (Think about this, no hurry.) What miffed me at this point, therefore, was that this graph didn’t appear to be symmetric with respect to its center point. There’s some fuzzy mess to the left and some scattered points isolated from the main band that are not symmetric at all.

Then I ran some tests with fractions whose sequences aren’t supposed to end at all. Like what? Like 0 and 1, say. If you’ve followed the algorithm, you can tell that we can never arrive at a cleaving of an interval where the separating number is either 0 or 1, because there’s nothing on one side of these numbers. So 0 should just give me LLLLL… and 1 should give me RRRRR…, never ending. However, guess what I found when I looked at the number of L’s and R’s in their sequence.

0    L: 1074, R: 0.

1    L: 0, R: 54.

So why do the sequences end? That’s fairly simple. It’s because of the limitation of storing and computing floating point numbers in a computer. Notice that with each step of the sequence we are squeezing our number tighter and tighter, into an interval that is halving its length with each iteration. Very soon, our computer (or the interpreter) arrives at a point that numbers so infinitesimally separated in that tiny interval are no longer separate numbers to it, and so it cannot differentiate between our fraction and the mid-point of the interval, and stops.

Exactly how big is this error? It is difficult to tell from looking at these numbers above. One tells you it should be 1/2 1074, the other tells you it’s 1/2 54 (which is closer to where I’d put it, owing to other checks I did and don’t want to discuss here). The final result has to do with all the calculations it is doing at every step, and so all the floating point errors that accumulate at every step. However, I think the only way the answer could still be different for a fraction and its ‘mirror-image’ is if different floating point errors are associated with addition and subtraction, because these two operations have been switched for them.

Notice, though, that the fraction of R’s for 0 is 0, and that for 1 is 1. The symmetry is preserved. So where is the final problem in the plot? Well, we’ve been lucky with these two numbers because one of the counts is 0 for both cases. I’ll give you an example of another case:

0.1    L: 28, R: 26.

0.9    L: 25, R: 27.

In this case, obviously, the symmetry will not be maintained, because the second pair is 25,27 instead of being 26,28. Thus, the graph is no longer symmetric about the center point.

Since I was stubborn about getting a symmetric graph, I decided that I’ll cut off the process before it gets to the ambiguous stage, that is stop with a wide enough interval length, and plot a graph with the truncated sequences. I finally got a symmetric one when I set the interval length at the order of 1e-13. For this, in line 20 (highlighted), instead of elif dc==d you need to write elif abs(dc-d)<=1e-13. Here is the resulting graph:

graph2

Note, however, that this error tolerance is not something fixed. It depends on the resolution (spacing) of fractions you want to do this computation for. In the images you have so far seen, the fractions were multiples of 10-4. You get a better image with one order lower, but for that the error tolerance had to be jacked up to 1e-11:

graph3

Do you see something really interesting in this graph now, in the way that it organizes itself into parallelograms within parallelograms? It’s a highly ordered fractal. I’ve marked them out for clarity:

graph

In other words, the point symmetry is repeated on increasingly smaller scales, as it should. The whole bisected nature of the nested intervals is responsible for this. More parallelograms would be revealed if we kept making our resolution finer, and the horizontal extent of these parallelograms are only exhibiting those nested intervals.

The fraction of rights, however, doesn’t reveal a lot of information. More interesting could be to see how many such bisection steps are required before we converge onto a number. For this you need to modify the source code just a bit. On line 25 (highlighted), substitute float(R)/(R+L) with R+L, and you get this:

graph5

The black dots are the data points, joined by blue lines for clarity. Again, this should have been symmetric about x=0.5 (about a line this time, not about a point), but it isn’t. Notice that the low sequence lengths for numbers such as 0.125 or 0.375 like we discussed don’t even appear. The lowest sequence length we see here is about 35. That’s because these fractions never even arrived in the incrementing loop, although they should have. This is computational error again. I can tell because I have poked around a bit. Try out this python snippet for example:

c=0.
while c<=1.:
	c+=1e-2
	if c==.12:
		print c

By the way, one data point, corresponding to the fraction 0, had to be removed from this graph, because its sequence length was very big, 1074, as we saw before.

If you zoom into the middle of this graph a bit, however, you’ll see the kind of symmetry I had been looking for:

graph6

Do you see why we should have a picture like this? Think about it, it’s not very hard. Meanwhile you can download a wallpaper I made in Photoshop out of the above graph, because I liked it so much.

graphwallpaper

That’ll be it for now. Let me know if you have any ideas or questions about all this.

2 thoughts on “Locating Numbers inside Bisected Interval Sequences

  1. Hi. This is interesting! I had this doubt though – if one end point of every blue vertical line (black dot) represents sequence length, what does the other do? Also, if I’m thinking correctly, the sequence length for 0.5 should be 1, was it?

    Like

    1. The sequence lengths are marked by black dots. These black dots are joined sequentially by blue lines. If the next number has a large sequence length, the next black dot will be way up, and the blue line will jump up to meet it. For most numbers in the last few plots, the sequence length was long (around 54 or so, like I mentioned), and was only ended by the error cutoff I specified. So the blue line keeps jumping very frequently up to that height.
      And yes, ideally it should have been 1 for 0.5. However, due to weirdnesses in floating point representations, the algorithm wasn’t able to latch on to 0.5 in just 1 step.

      Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s