Saturday, May 29, 2010

Constructing Intermediate Values

Introduction
The Intermediate Value Theorem, as it is usually stated, tells us that if we have a continuous real-valued function f on the closed interval [0,1], with f(0) < 0 and f(1) > 0, then there is a point x in (0,1) such that f(x) = 0. Here's a sketch of a proof:

Define three sequences l_i, u_i, x_i recursively as follows:

x_{i+1} = (l_i + u_i)/2
(l_{i+1}, u_{i+1}) = (l_i, x_{i+1}) if f(x_{i+1}) > 0
(l_{i+1}, u_{i+1}) = (x_{i+1}, u_i) otherwise

Set l_0 = 0, u_0 = 1.
The x_{i+1} are bracketed by the l_i and u_i, and |u_i - l_i| = 2^{-i}, so the x_i form a Cauchy sequence with some limit x. Because f is continuous, we get f(x) = 0.
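
For example, with f(x) = 3x - 1 we get x_1 = 1/2 (f(1/2) > 0, so the bracket shrinks to [0, 1/2]), then x_2 = 1/4 (f(1/4) < 0, so [1/4, 1/2]), then x_3 = 3/8, and so on, with the x_i converging to the root 1/3.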

This proof not only shows that we have an intermediate value x where f crosses the x-axis, it also seems to describe a procedure for computing x. Let's use this strategy to write a numerical method.

A Numerical Method

> import Data.Ratio
> infixl 6 .*

> ivt0 :: (Float -> Float) -> Float
> ivt0 f = ivt' 0 1 where
>    -- Bisect [l,u] until the midpoint collides with an endpoint
>    -- (no Floats remain in between) or lands exactly on a zero.
>    ivt' l u =
>          let z = 0.5*(l+u)
>          in if z==l || z==u || f z == 0
>               then z
>               else if f z < 0
>                   then ivt' z u
>                   else ivt' l z


Here are some simple functions we can use to test it:

> f0, g0 :: Float -> Float
> f0 x = 3*x-1
> g0 x = 2*x-1


ivt0 f0 and ivt0 g0 give the expected results.
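
For example, in GHCi (the exact low-order digits of the first result depend on Float rounding):

ghci> ivt0 f0
0.33333334
ghci> ivt0 g0
0.5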

Of course, it might not actually be the case that f (ivt0 f) == 0. We're dealing with the type Float here, and there might not actually be a solution to f x == 0 to the precision of a Float. And it's not clear that the word continuous means anything here anyway, because the limited precision of a Float precludes us from constructing the deltas and epsilons the definition of continuity demands.

Switching to Exact Real Arithmetic
So can we do better? Can we implement a function to compute intermediate values for exact real arithmetic? I recently talked about computable real numbers, and they are what I intend to use here. I'll represent the real number x using a sequence of rationals, x_i, such that |x_i - x| < 2^{-i}. Here's a suitable type:

> data R = R { runR :: Integer -> Rational }


(I'll only be calling it for non-negative arguments, but alas, Haskell has no natural number type.)

We'll need to display our numbers. Here's something cheap and cheerful that's good enough for this article:

> instance Show R where
>    show (R f) = show (fromRational (f 30) :: Float)


It displays a Float that is formed from a rational approximation that is within 2^{-30} of our value.

We can inject the rationals into the computable reals by using constant sequences:

> instance Fractional R where
>    fromRational x = R $ const $ fromRational x


An important point here is that two different sequences might define the same real. So when we construct functions mapping reals to reals we must be sure that they take different representations of the same real to representations of the same real. I'll call these 'well-defined'.
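
For instance, here are two representations of 1/3 (the names third1 and third2 are mine, for illustration only): one a constant sequence, the other built from truncated binary expansions. A well-defined function must send both to representations of the same real:

> third1, third2 :: R
> third1 = R $ const (1 % 3)
> third2 = R $ \i -> let n = 2^i :: Integer in floor (n % 3) % n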

Now our problems begin. Haskell wants us to implement Eq before Num, but equality of computable reals is not decidable. Still, it is semi-decidable in that if two reals differ, we can detect this. Here's a partial equality function:

> instance Eq R where
>    R f == R g = eq' 2 0 f g where
>        eq' delta i f g = 
>               let fi = f i
>                   gi = g i
>               in if fi<gi-delta
>                   then False
>                   else if fi>gi+delta
>                       then False
>                       else eq' (delta/2) (i+1) f g


The idea is that if f i and g i can be separated by more than 2*2^{-i} (the value delta takes at step i) then the reals that f and g represent can't be equal.
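
Note that this (==) can only ever return False or diverge; no finite number of samples can confirm that two reals are equal. For example, third1 == fromRational (1%2) returns False after a few refinements, but third1 == third2 loops forever.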

We have to implement Ord too. If one computable real is bigger than another then they must differ by more than some power of 1/2. This means we can eventually distinguish between them. As a result, compare will always terminate if we compare distinct numbers:

> instance Ord R where
>    compare (R f) (R g) = compare' 2 0 f g where
>        compare' delta i f g =
>               let fi = f i
>                   gi = g i
>               in if fi<gi-delta
>                   then LT
>                   else if fi>gi+delta
>                       then GT
>                       else compare' (delta/2) (i+1) f g


We can define min and max straightforwardly:

>    min (R f) (R g) = R $ \i -> min (f i) (g i)
>    max (R f) (R g) = R $ \i -> max (f i) (g i)


And now we can define addition. We use the fact that if |x-x'|<ε and |y-y'|<ε then |(x+y)-(x'+y')|<2ε:

> instance Num R where
>    fromInteger n = R $ const $ fromInteger n
>    R f + R g = R $ \i -> let j = i+1 in f j + g j
>    negate (R f) = R $ negate . f


I'm going to skip multiplication of two reals. But it's easy to define multiplication of a rational with a real. Here I'm using the fact that if |x-x'|<ε, then |ax-ax'|<|a|ε:

> (.*) :: Rational -> R -> R
> x .* R f = R $ mul x x f where
>    -- Halve a copy of x until it drops below 1 in absolute value;
>    -- each halving demands one extra bit of precision from f.
>    mul x x' f i | abs x' < 1 = x * f i
>                 | otherwise  = mul x (x'/2) f (i+1)

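As a quick sanity check (testSum is a name of my own, not from the original development), the following combines .* with addition and should display 1.0:

> testSum :: R
> testSum = 2 .* fromRational (1%3) + fromRational (1%3)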

Now we're ready! It's straightforward to translate our previous algorithm. Note that we don't need a function to take the limit of a sequence. By definition we use sequences of rational approximations that are accurate to within 2^{-i}, so our ivt1 function merely needs to compute an approximation to our intermediate value that is accurate to within 2^{-i}.

> ivt1 :: (R -> R) -> R
> ivt1 f = R $ \i -> ivt' 0 1 i where
>    ivt' :: Rational -> Rational -> Integer -> Rational
>    ivt' x y i =
>           let z = (1%2)*(x+y)
>           in if i==0
>                then z
>                else if f (fromRational z) < 0
>                    then ivt' z y (i-1)
>                    else ivt' x z (i-1)


Some new test functions:

> f, g :: R -> R
> f x = 3.*x-1
> g x = 2.*x-1


Note that (amazingly!) any well-defined function R -> R we implement is continuous, because of the argument I sketched previously.

And you'll find that ivt1 f gives the result you expect. But sadly, ivt1 g never returns. What went wrong?

Clutching Defeat from the Jaws of Victory
After one iteration, ivt1 sets z to the value we want. This seems like a particularly easy case. But the algorithm immediately fails because it then proceeds to compare f z with zero, and f z is exactly zero: we already know that comparing two equal reals can never terminate. How do we fix our algorithm?

Consider the actions of any higher order function that operates on a continuous function f of type R -> R. It will evaluate f at various real values. The return values from f will then be sampled at various values of i. Each sample corresponds to a bracket around a value of f. Consider a higher order function acting on the function below:
[Figure: the graph of a continuous function on [0,1] that starts below the x-axis, ends above it, and is flat, sitting exactly on the axis, between x = 0.4 and x = 0.6.]

If our higher order function terminates, it can only compute a finite number of these brackets. So I can choose an ε small enough that I could nudge the graph up or down by ε without invalidating the brackets. I could provide the higher order function with a different function R -> R that responds in exactly the same way to the finite number of samples.

Now consider the case of an ivt function. It will respond the same way to two functions differing only by ε. The function in the graph above has been designed so that it is flat from 0.4 to 0.6. Nudging the function above by ε>0 up and down causes the crossing of the x-axis to move from x<0.4 to x>0.6. In other words, no algorithm could possibly return a valid intermediate value for all possible inputs because I can always contrive a pair of functions, to which ivt would respond identically, and yet which have widely separated points where they cross the x-axis. This shows that the problem of determining a crossing point of a continuous function that starts off less than zero, and ends up greater than zero, is uncomputable. We can do it for some functions. But no function can solve this problem for all continuous inputs. What's really weird about this is that it fails precisely when the problem seems easiest - when there are lots of zeroes!

A Revised Approach
But we don't need to give up. If we could ensure that at each bisection we picked a point that wasn't exactly a zero, then we'd be fine. But how can we do this? Well, here's a cheating way: we don't bother. We make that problem the caller's responsibility. Instead of bisecting at each stage, we'll trisect. We'll pick a point in the middle third that isn't a zero of f, using a function passed in by the caller. Call this function g. g is of type Rational -> Rational -> Rational, and the specification is that g x y must be a rational number between x and y that isn't a zero of f. Here's our new ivt2 function:

> ivt2 :: (Rational -> Rational -> Rational) -> (R -> R) -> R
> ivt2 g f = R $ \i ->
>     let delta = (1%2)^(i+1)  -- i+1, not i: the returned z can be up to 2*delta from the limit
>         ivt' :: Rational -> Rational -> Rational
>         ivt' x y =
>               let l = (1%3)*(2*x+y)
>                   u = (1%3)*(x+2*y)
>                   z = g l u
>               in if u-l < delta
>                    then z
>                    else if f (fromRational z) < 0
>                        then ivt' z y
>                        else ivt' x z
>     in ivt' 0 1


How can we make use of this? Well, suppose we want to find where a polynomial, f, crosses the x-axis. If the polynomial has degree n, it can only cross at at most n points. So if we pick n+1 values, f must evaluate to a non-zero value at at least one of them. We can examine the n+1 values of f(x) in parallel to find one that is non-zero. This must terminate. Here's an implementation for the case n+1=2. If we pass two real values to pickNonZero, it will return True or False according to whether the first or the second argument is non-zero. It must only be called with a pair of numbers where at least one is non-zero:

> pickNonZero :: R -> R -> Bool
> pickNonZero (R f) (R g) = pickNonZero' 0 1 where
>     pickNonZero' i e = let fi = f i
>                            gi = g i
>                        in if fi > e || fi < -e
>                           then True
>                           else if gi > e || gi < -e
>                               then False
>                               else pickNonZero' (i+1) ((1%2)*e)


Note how it examines the two values in parallel. If it simply tested the two numbers in turn against zero it would fail if the first were zero. (Interesting side note: we could write this using par.)
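
For example, pickNonZero 1 0 evaluates to True and pickNonZero 0 1 to False (the literals become values of type R via fromInteger), while pickNonZero 0 0 violates the precondition and never returns.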

Suppose the function f is strictly monotonically increasing. Then we know that if y>x then f(y)>f(x), so f(x) and f(y) can't both be zero. We can use this to write a suitable helper function, and hence an intermediate value function that is guaranteed to work for all strictly monotonically increasing functions (assuming we don't run out of memory):

> monotonicPicker :: (R -> R) -> Rational -> Rational -> Rational
> monotonicPicker f x y = if pickNonZero
>                               (f (fromRational x)) (f (fromRational y))
>                             then x
>                             else y

> ivt3 :: (R -> R) -> R
> ivt3 f = ivt2 (monotonicPicker f) f


By varying monotonicPicker we can find crossing points for all manner of continuous functions. Try ivt3 g: because g is monotonically increasing it works, even though ivt1 g looped forever. But no matter how hard we try, we can never write one that works for all continuous functions.
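
For example, in GHCi:

ghci> ivt3 g
0.5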

What went wrong?
So why did the Intermediate Value Theorem, despite its appearance of constructing an intermediate value, fail to yield an algorithm? For an answer to that, you'll have to wait until my next article. The surprising thing is that it's not a problem of analysis, but a problem of logic.

Acknowledgements
I have borrowed liberally from Paul Taylor, especially from his paper "A lambda calculus for real analysis". However, I've chosen to phrase everything in terms of classical reasoning about Haskell programs. Any errors are, of course, mine. It was probably Andrej Bauer's awesome blog that got me interested in this stuff.

Afterword
I want to stress that I have made absolutely no mention of intuitionism or constructivism in what I've said above. I have simply used ordinary classical mathematics to reason about some Haskell programs. I'm leaving the intuitionism for the next article.

9 Comments:

Anonymous said...

The Eq and Ord instances seem off. They don't even use the delta, and they make strong assumptions about the representation of the numbers (two numbers are unequal when their ith approximations differ, even when the difference is less than delta).

Saturday, 29 May, 2010  
sigfpe said...

Oops. That's the non-working version of those functions from a draft when I wanted to talk about problems with implementing those functions! Trouble with rcs. Let me sort it out...

Saturday, 29 May, 2010  
sigfpe said...

Put in replacement code. This code hasn't been tested much so I may have to update it. Thanks for pointing out the problem.

Saturday, 29 May, 2010  
Unknown said...

Excellent! So are you going to implement Theorem 14.15 from Paul's paper for us? That would be cool.

Sunday, 30 May, 2010  
Paul Taylor said...

To amplify Andrej's comment, the paper has been revised a lot, and in particular now contains a version of the IVT with the same generality as the classical theorem (but a different formulation of the result) and a sketch proof of (essentially) the Brouwer fixed point theorem in R^n.

Sunday, 30 May, 2010  
sigfpe said...

@Andrej,

I tried implementing all the usual arithmetic operations for exact reals a while back. There were a lot of deltas and epsilons to keep track of. The main conclusion I came to was that I'd rather write this kind of code in cooperation with a theorem checker.

Sunday, 30 May, 2010  
Paul Taylor said...

My paper "A lambda calculus for real analysis" has at last appeared in the Journal of Logic and Analysis, vol 2, no 5.

Friday, 20 August, 2010  
Unknown said...

Um, ivt3 g produced 0.5 as expected when I ran your code just now. What problem did you actually have in mind?

Sunday, 01 May, 2011  
sigfpe said...

Alexey,

ivt3 g works as g is monotonically increasing. I should move that sentence earlier because, as you point out, it leads people to expect failure.

Sunday, 01 May, 2011  
