Sunday, February 05, 2017

Logarithms and exponentials of functions


A popular question in mathematics is this: given a function $f$, what is its "square root" $g$ in the sense that $g(g(x)) = f(x)$? There are many questions about this on mathoverflow but it's also a popular subject in mathematics forums for non-experts. This question seems to have a certain amount of notoriety because it's easy to ask but hard to answer fully. I want to look at an approach that works nicely for formal power series, following from the Haskell code I wrote here. There are some methods for directly finding "functional square roots" for formal power series that start as $z+a_2z^2+a_3z^3+\ldots$, but I want to approach the problem indirectly. When working with real numbers we can find square roots, say, by using $\sqrt{x} = \exp(\frac{1}{2}\log x)$. I want to use an analogue of this for functions. So my goal is to make sense of the idea of the logarithm and exponential of a formal power series as composable functions. Warning: the arguments are all going to be informal.


Notation

There's potential for a lot of ambiguous notation here, especially as the usual mathematical notation for $n$th powers of trig functions is so misleading. I'm going to use $\circ$ for composition of functions and power series, and I'm going to use the notation $f^{\circ n}$ to mean the $n$th iterate of $f$. So $f^{\circ 1} = f$ and $f^{\circ (n+1)} = f\circ f^{\circ n}$. As I'll be working mostly in the ring of formal power series $R[[z]]$ for some ring $R$, I'll reserve the variable $z$ to refer only to the corresponding element in this ring. I'll also use formal power series somewhat interchangeably with functions. So $z$ can be thought of as representing the identity function. To make sure we're on the same page, here are some small theorems in this notation:

  1. $(1+z)^{\circ n} = n+z$.
That last one simply says that adding one $n$ times is the same as adding $n$.
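To make the iterate notation concrete for ordinary functions (rather than power series), here is a minimal sketch. The name iterN is hypothetical, and treating the zeroth iterate as the identity is an assumption that extends the definition given above.

```haskell
-- Hypothetical helper: the n-th iterate of f, built by composing n copies of f.
-- Assumption: the 0th iterate is taken to be the identity function.
iterN :: Int -> (a -> a) -> a -> a
iterN n f = foldr (.) id (replicate n f)

-- Adding one n times is the same as adding n.
addOneTimes :: Int -> Int -> Int
addOneTimes n = iterN n (+ 1)
```

For instance, addOneTimes 5 10 and 10 + 5 agree, and iterN 2 sin is the twice iterated sin.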

As I'm going to have ordinary logarithms and exponentials sitting around, as well as functional logarithms and exponentials, I'm going to introduce the notation $\operatorname{flog}$ for functional logarithm and $\operatorname{fexp}$ for functional exponentiation.


Preliminaries

The first goal is to define a non-trivial function $\operatorname{flog}$ with the fundamental property that

$$\operatorname{flog}(f^{\circ n}) = n\operatorname{flog}(f)$$

First, let's note some basic algebraic facts. The formal power series $R[[z]]$ form a commutative ring with operations $+$ and $\cdot$ (ordinary multiplication) and with additive identity $0$ and multiplicative identity $1$. The formal power series also form a ring-like algebraic structure with operation $+$ and partial operation $\circ$, with additive identity $0$ and multiplicative identity $z$. But it's not actually a ring or even a near-ring. Composition isn't defined for all formal power series and even when it's defined, we don't have distributivity. For example, in general $f\circ(g+h)\ne f\circ g+f\circ h$; after all, there's no reason to expect $f(g(x)+h(x))$ to equal $f(g(x))+f(h(x))$. We do have right-distributivity however, i.e.

$$(f+g)\circ h = f\circ h+g\circ h$$

more or less by definition of $+$.
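A small worked instance may help here. The particular series are chosen purely for illustration: take $f = z^2$ and $g = h = z$ for the failure of left-distributivity, and $f = z^2$, $g = z$, $h = 2z$ for right-distributivity.

```latex
\begin{align*}
z^2 \circ (z + z) &= (2z)^2 = 4z^2, \\
z^2 \circ z + z^2 \circ z &= z^2 + z^2 = 2z^2, \\
(z^2 + z) \circ 2z &= 4z^2 + 2z = z^2 \circ 2z + z \circ 2z.
\end{align*}
```

The first two lines disagree, while the third holds simply because substituting into a sum substitutes into each summand.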

We can't use power series on our power series

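There's an obvious approach, just use power series of power series. So we might tentatively suggest that

$$\operatorname{flog}(f) = (f-z)-\frac{1}{2}(f-z)^{\circ 2}+\frac{1}{3}(f-z)^{\circ 3}-\ldots$$

Note that I consider $f-z$ rather than $f-1$ because $z$ is the multiplicative identity in our ring-like structure.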

Unfortunately this doesn't work. The reason is this: if we try to use standard reasoning to show that the resulting function has the fundamental property we seek we end up using distributivity. We don't have distributivity.

Sleight of hand

There's a beautiful trick I spotted on mathoverflow recently that allows us to bring back distributivity. (I can't find the trick again, but when I do I'll come back and add a link and credit here.) Consider the function $R_g$ defined by $R_g(f) = f\circ g$. In other words $R_g$ is right-composition by $g$. (Ambiguity alert, I'm using $R$ here to mean right. It has nothing to do with the ring underlying our formal power series.) Because we have right-distributivity, $R_g$ is a bona fide linear operator on the space of formal power series. If you think of formal power series as being infinitely long vectors of coefficients then $R_g$ can be thought of as an infinitely sized matrix. This means that as long as we have convergence, we can get away with using power series to compute $\log R_g$ with the property that $\log(R_g^n) = n\log R_g$. Define:

$$\operatorname{flog}(f) = (\log R_f)z$$

We have:
$$\log R_f = (R_f-1)-\frac{1}{2}(R_f-1)^2+\frac{1}{3}(R_f-1)^3-\ldots$$
where I'm using $1$ to mean the identity linear operator. And now we have:
$$\operatorname{flog}(f^{\circ n}) = (\log R_{f^{\circ n}})z = (\log R_f^n)z = n(\log R_f)z = n\operatorname{flog}(f).$$
But does it converge? Suppose $f$ is of the form $z+a_2z^2+a_3z^3+\ldots$. Then $(R_f-1)g = g\circ f-g$. The leading term in $g\circ f$ is the same as the leading term in $g$. So $R_f-1$ kills the first term of whatever it is applied to, which means that when we sum the terms in $(\log R_f)z$, we only need $n$ terms to get a power series correct to $n$ coefficients. Reusing my code from here, I call $\operatorname{flog}$ by the name flog. Here is its implementation:

> import Data.Ratio

> flog :: (Eq a, Fractional a) => [a] -> [a]
> flog f@(0 : 1 : _) =
>     flog' 1 (repeat 0) (0 : 1 : repeat 0)
>     where flog' n total term = take (n+1) total ++ (
>             drop (n+1) $
>             let pz = p term
>             in flog' (n+1) (total-map (((-1)^n / fromIntegral n) *) pz) pz)
>           p total = (total ○ f) - total

The take and drop are how I tell Haskell when the first $n+1$ coefficients have been exactly computed and so no more terms are necessary.
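For readers who prefer finite lists to lazy streams, the same sum can be sketched on series truncated to a fixed number of coefficients. This is a standalone illustration, not the code used in this post: the names Series, pad, mul, compose and flogTrunc are all hypothetical, and it assumes the input has zero constant term and linear coefficient 1.

```haskell
-- Coefficients of z^0, z^1, ..., truncated to a fixed length n (hypothetical type).
type Series = [Rational]

-- Pad or cut a coefficient list to exactly n entries.
pad :: Int -> Series -> Series
pad n xs = take n (xs ++ repeat 0)

-- Ordinary product keeping n coefficients; both inputs must have length >= n.
mul :: Int -> Series -> Series -> Series
mul n a b = [ sum [ a !! i * b !! (k - i) | i <- [0 .. k] ] | k <- [0 .. n - 1] ]

-- g composed with f by Horner's rule; exact to n coefficients
-- under the assumption that f has zero constant term.
compose :: Int -> Series -> Series -> Series
compose n g f = foldr step (pad n []) (pad n g)
  where step c acc = zipWith (+) (pad n [c]) (mul n (pad n f) acc)

-- Sum over k = 1 .. n of (-1)^(k+1)/k * (R_f - 1)^k applied to z.
flogTrunc :: Int -> Series -> Series
flogTrunc n f = foldr (zipWith (+)) (pad n []) (take n scaled)
  where
    apply g = zipWith (-) (compose n g f) g        -- (R_f - 1) g
    powers  = tail (iterate apply (pad n [0, 1]))  -- (R_f - 1)^k z for k = 1, 2, ...
    scaled  = zipWith scale [1 :: Integer ..] powers
    scale k = map (* ((-1) ^ (k + 1) / fromIntegral k))
```

With these definitions, flogTrunc 5 [0, 1, 1] gives the coefficients of $z^2 - z^3 + \frac{3}{2}z^4$ for the series $z+z^2$. Because each application of $R_f-1$ raises the leading power by one, only the first few terms of the sum can affect the coefficients that are kept.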

Does it work?

Here's an example using the twice iterated sin function:

> ex1 = do
>   let lhs = flog (sin (sin z))
>   let rhs = 2*flog (sin z)
>   mapM_ print $ take 20 (lhs-rhs)

Works to 20 coefficients. Dare we try an inverse function?

> ex2 = do
>   let lhs = flog (sin z)
>   let rhs = flog (asin z)
>   mapM_ print $ take 20 (lhs+rhs)

Seems to work!


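It's no good having logarithms if we can't invert them. One way to think about the exponential function is that

$$\exp(x) = \lim_{n\to\infty}\left(1+\frac{x}{n}\right)^n.$$

We get better and better approximations by writing the expression inside the limit as a product of more and more terms. We can derive the usual power series for $\exp$ from this, but only if full distributivity holds, and we only have right-distributivity. So let's try to use the above expression directly:
$$\operatorname{fexp}(f) = \lim_{n\to\infty}\left(z+\frac{f}{n}\right)^{\circ n}$$
and get
$$\operatorname{fexp}(f) = \lim_{n\to\infty}R_{z+f/n}^n z.$$
Unfortunately, even though $R_g$ is linear, $R$ itself isn't. So it's going to take some extra work to raise $R_{z+f/n}$ to the power of $n$.

The good news is that we're dealing with the special case $R_{z+\epsilon}$ where $\epsilon$ is something small. We have

$$R_{z+\epsilon}g = g\circ(z+\epsilon) = g+\epsilon\frac{dg}{dz}+\ldots$$

So $R_{z+\epsilon}$ is actually $1+\epsilon\frac{d}{dz}$ modulo higher order terms. This gives us
$$\operatorname{fexp}(f) = \lim_{n\to\infty}\left(1+\frac{f}{n}\frac{d}{dz}\right)^n z = \exp\left(f\frac{d}{dz}\right)z.$$
This is something we can implement using the power series for ordinary $\exp$:
$$\operatorname{fexp}(f) = z+f+\frac{1}{2!}f\frac{df}{dz}+\frac{1}{3!}f\frac{d}{dz}\left(f\frac{df}{dz}\right)+\ldots$$
In code that becomes: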

> fexp f@(0 : 0 : _) = fexp' f 0 z 1
> fexp' f total term n = take (n-1) total ++ drop (n-1)
>           (fexp' f (total+term) (map (/fromIntegral n) (f*d term)) (n+1))

Note how when we differentiate a power series we shift the coefficients down by one place. To counter the effect of that so as to ensure convergence we need $f$ to look like $a_2z^2+a_3z^3+\ldots$. Luckily this is exactly the kind of series $\operatorname{flog}$ gives us.
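The same recurrence (each new term is the series times the derivative of the previous term, divided by the term index) can be sketched on truncated series. This is a standalone illustration with hypothetical names (Series, pad, mul, deriv, fexpTrunc), assuming the input series starts at $z^2$ so that the truncation is exact.

```haskell
-- Coefficients of z^0, z^1, ..., truncated to a fixed length n (hypothetical type).
type Series = [Rational]

-- Pad or cut a coefficient list to exactly n entries.
pad :: Int -> Series -> Series
pad n xs = take n (xs ++ repeat 0)

-- Ordinary product keeping n coefficients; both inputs must have length >= n.
mul :: Int -> Series -> Series -> Series
mul n a b = [ sum [ a !! i * b !! (k - i) | i <- [0 .. k] ] | k <- [0 .. n - 1] ]

-- Formal derivative d/dz: shift coefficients down one place and scale.
-- The last padded entry is not meaningful, but it never reaches the output
-- when the series being exponentiated starts at z^2 (the assumption here).
deriv :: Int -> Series -> Series
deriv n xs = pad n (zipWith (*) (map fromInteger [1 ..]) (drop 1 xs))

-- Sum of the first n terms z, f, f f'/2!, ... of exp(f d/dz) applied to z.
fexpTrunc :: Int -> Series -> Series
fexpTrunc n f = foldr (zipWith (+)) (pad n []) (take n terms)
  where
    terms    = scanl next (pad n [0, 1]) [1 :: Integer ..]
    next t k = map (/ fromInteger k) (mul n (pad n f) (deriv n t))
```

For example, fexpTrunc 6 [0, 0, 1] gives the coefficients of $z+z^2+z^3+z^4+z^5$, the start of the series for $z/(1-z)$. Multiplying by a series that starts at $z^2$ more than undoes the downward shift from differentiation, so the $k$th term starts at $z^{k+1}$ and the sum stabilizes coefficient by coefficient.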

But does it successfully invert $\operatorname{flog}$? Let's try:

> ex3 = do
>   let lhs = sin z
>   let rhs = fexp (flog (sin z))
>   mapM_ print $ take 20 (lhs-rhs)

Now we can start computing fractional iterates. Square root first:

> ex4 = do
>   mapM_ print $ take 20 $ fexp (flog (sin z)/2)

That matches the results at A048602 and A048603.

Cube root:

> ex5 = do
>   mapM_ print $ take 20 $ fexp (flog (sin z)/3)

Matches A052132 and A052135.

And this gives an alternative to Lagrange inversion for computing power series for inverse functions:

> ex6 = do
>   let lhs = fexp (-flog (sin z))
>   let rhs = asin z
>   mapM_ print $ take 20 (lhs-rhs)

What's really going on with fexp?

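Let's approach $\operatorname{fexp}$ in a slightly different way. In effect, $\operatorname{fexp}(f)$ is the composition of lots of copies of $z+\epsilon f$ with $z$. So let's try composing these one at a time, with one composition every $\epsilon$ seconds. After one second we should have our final result. We can write this as:

$$g(0) = z$$
and $g(t+\epsilon) = (z+\epsilon f)\circ g(t) = g(t)+\epsilon f(g(t))$ to first order.
So we're solving the differential equation:
$$\frac{dg}{dt} = f(g(t))$$
with $g(0) = z$ and $\operatorname{fexp}(f) = g(1)$.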
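A simple case that can be solved by hand, chosen here only for illustration, is $f = z^2$. Reading fexp as the time-one solution of $dg/dt = f(g)$ with $g(0) = z$:

```latex
\begin{align*}
\frac{dg}{dt} &= g^2, \qquad g(0) = z
  \quad\Longrightarrow\quad g(t) = \frac{z}{1 - tz}, \\
\operatorname{fexp}(z^2) &= g(1) = \frac{z}{1 - z} = z + z^2 + z^3 + \ldots, \\
\frac{z}{1 - sz} \circ \frac{z}{1 - tz} &= \frac{z}{1 - (s + t)z}.
\end{align*}
```

The last line shows the flow property: running for time $t$ and then for time $s$ is the same as running for time $s+t$, which is what makes fractional iterates meaningful.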

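So $\operatorname{fexp}$ is the function that solves one of the most fundamental differential equations. This also means I can use Mathematica to solve symbolically and check my results. For example, Mathematica says that the solution to

$$\frac{dg}{dt} = \sin^2 g,\quad g(0) = z$$

at $t=1$ is
$$g = \tan^{-1}\left(\frac{\tan z}{1-\tan z}\right)$$
so let's check: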

> ex7 = do
>   let lhs = fexp ((sin z)^2)
>   let rhs = atan (tan z/(1-tan z))
>   mapM_ print $ take 20 (lhs-rhs)

I like this example because it leads to the generalized Catalan numbers A004148:

> ex8 = do
>     mapM_ print $ take 20 $ fexp (z^2/(1-z^2))

That suggests this question: what does $\operatorname{fexp}$ mean combinatorially? I don't have a straightforward answer but solving this class of differential equation motivated the original introduction, by Cayley, of the abstract notion of a tree. See here.

What is going on geometrically?

For those who know some differential geometry, the differential equation

$$\frac{dg}{dt} = f(g(t))$$

describes a flow on the real line (or complex plane). You can think of $f$ as being a one-dimensional vector field describing how points move from time $t$ to $t+dt$. When we solve the differential equation we get integral curves that these points follow and $\operatorname{fexp}(f)$ tells us where the points end up after one unit of time. So $\operatorname{fexp}$ is the exponential map. In fact, $\operatorname{fexp}(f)$ is essentially the exponential of the vector field $f\frac{d}{dz}$ where we're now using the differential geometer's notion of a vector field as a differential operator.

Final word

Unfortunately the power series you get from using $\operatorname{flog}$ and $\operatorname{fexp}$ don't always have good convergence properties. For example, I'm not sure but I think the series for $\sin^{\circ 1/2}$ has radius of convergence zero. If you truncate the series you get a half-decent approximation to a square root in the vicinity of the origin, but the approximation gets worse, not better, if you use more terms.

And the rest of the code

> (*!) _ 0 = 0
> (*!) a b = a*b
> (!*) 0 _ = 0
> (!*) a b = a*b
> (^+) a b = zipWith (+) a b
> (^-) a b = zipWith (-) a b

> ~(a:as) ⊗ (b:bs) = (a *! b):
>     ((map (a !*) bs) ^+ (as ⊗ (b:bs)))
> (○) (f:fs) (0:gs) = f:(gs ⊗ (fs ○ (0:gs)))
> inverse (0:f:fs) = x
>     where x     = map (recip f *) (0:1:g)
>           _:_:g = map negate ((0:0:fs) ○ x)
> invert x = r
>     where r     = map (/x0) ((1:repeat 0) ^- (r ⊗ (0:xs)))
>           x0:xs = x

> (^/) (0:a) (0:b) = a ^/ b
> (^/) a b = a ⊗ (invert b)

> z :: [Rational]
> z = 0:1:repeat 0

> d (_:x) = zipWith (*) (map fromInteger [1..]) x

> integrate x = 0 : zipWith (/) x (map fromInteger [1..])

> instance (Eq r, Num r) => Num [r] where
>     x+y = zipWith (+) x y
>     x-y = zipWith (-) x y
>     ~x*y = x ⊗ y
>     fromInteger x = fromInteger x:repeat 0
>     negate x = map negate x
>     signum (x:_) = signum x : repeat 0
>     abs (x:xs) = error "Can't form abs of a power series"

> instance (Eq r, Fractional r) => Fractional [r] where
>     x/y = x ^/ y
>     fromRational x = fromRational x:repeat 0

> sqrt' x = 1 : rs
>     where rs     = map (/2) (xs ^- (rs ⊗ (0:rs)))
>           _ : xs = x
> instance (Eq r, Fractional r) => Floating [r] where
>     sqrt (1 : x) = sqrt' (1 : x)
>     sqrt _ = error "Can only find sqrt when leading term is 1"
>     exp x = e where e = 1+integrate (e * d x)
>     log x = integrate (d x/x)
>     sin x = integrate ((cos x)*(d x))
>     cos x = [1] ... negate (integrate ((sin x)*(d x)))
>     asin x = integrate (d x/sqrt(1-x*x))
>     atan x = integrate (d x/(1+x*x))
>     acos x = error "Unable to form power series for acos"
>     sinh x = integrate ((cosh x)*(d x))
>     cosh x = [1] ... integrate ((sinh x)*(d x))
>     asinh x = integrate (d x/sqrt(1+x*x))
>     atanh x = integrate (d x/(1-x*x))
>     acosh x = error "Unable to form power series for acosh"
>     pi = error "There is no formal power series for pi"

> lead [] x = x
> lead (a:as) x = a : (lead as (tail x))
> a ... x = lead a x

> (//) :: Fractional a => [a] -> (Integer -> Bool) -> [a]
> (//) a c = zipWith (\a-> \b->(if (c a :: Bool) then b else 0)) [(0::Integer)..] a

A direct functional square root that doesn't use flog and fexp:

> fsqrt (0 : 1 : fs) =
>     let gs = (fs-(0 : gs*((0 : delta gs gs)+((2 : gs)*(gs*g)))))/2
>         g = 0 : 1 : gs
>         delta (g : gs) h = let g' = delta gs h
>                    in (0 : ((1 : h) * g')) + gs
>     in g


Blogger Unknown said...

I just spotted this on mathoverflow: which points out some of the connections between combinatorics and some of the things I've been computing.

Monday, 06 February, 2017  
Blogger Unknown said...

Wonderful post.

I think there's a typo in eq. 2 in the Notation section. Should be
z^{\circ m}\circ z^{\circ n} = z^{\circ m + n}

Monday, 06 February, 2017  
Blogger sigfpe said...

I did a search on formal power series on arxiv and found this:

It also computes the "functional" logarithm and uses a series like mine, but they ultimately do something different with it and I haven't read it properly yet. They also end by saying they're working on Haskell code.

Thursday, 09 February, 2017  
Blogger Blair A said...

Hi, I stumbled across your blog and it looked interesting. Although, I'm not sure I follow your theorem two, you write z^n o z^m = z^(nm); should the lhs really be (z^n)^m or perhaps the rhs should be z^(n+m)?

Tuesday, 14 February, 2017  
Blogger sigfpe said...

Blair S,

You are correct to point out that there is a problem with law 2. But it might not be the problem you think :-) As it stands, I think it's correct, but for trivial reasons.

z is the identity function so z^oa o z^ob = z^oc for any a, b and c.

That wasn't my intention so I'll delete it when I have a moment.

Wednesday, 15 February, 2017  
Blogger sigfpe said...

Added some updates to the "small theorems"

Thursday, 16 February, 2017  
Blogger Unknown said...

In the Preliminaries section I believe you meant the equation before the last to be (f+g) o h = foh + goh

Wednesday, 08 March, 2017  
