Sunday, January 30, 2022

Anonymous Recursion

Anonymous Recursion

I've been reading Denotational Semantics, a textbook describing a theory of programming languages. It was recommended to me years ago by a colleague. He said something along the lines of "this book taught me everything I know about programming." He seemed to know a lot about programming.

The first few sections introduce a formula rewriting system known as the lambda calculus, which was familiar to me due to my previous dabbling with Scheme, a programming language invented to study the lambda calculus.

In particular, one of the ideas introduced in the lambda calculus section of Denotational Semantics is illustrated differently in another book I read, The Little Schemer. The idea is how an anonymous procedure can nonetheless refer to itself by name.

This post is my attempt to explain that idea.

Scheme

Instead of using the lambda calculus, I'll use Scheme. I like Scheme.

Here's the definition of a procedure:

(define collatz-end
  (λ (n)
    (cond
      [(= n 1) 1]
      [(even? n) (collatz-end (/ n 2))]
      [(odd? n) (collatz-end (+ 1 (* 3 n)))])))

Here n is any nonzero natural number, so n could be any of 1, 2, 3, 4, ..., 44989933, etc., but not -4, and not 0.

The procedure collatz-end is equivalent to the following procedure, provided that the Collatz conjecture is true:

(λ (n) 1)

The Collatz conjecture is almost certainly true, but it might be impossible to prove.

Anyway, in Scheme, (λ (x) Expr) is a procedure of one parameter, x, that evaluates to whatever Expr would be if xs within it were substituted for whatever we supplied for x.

For example, (λ (x) (+ 3 x)), when applied to 7, is (+ 3 7), which is 10. The expression that says "apply (λ (x) (+ 3 x)) to 7" is the procedure and its argument next to each other in parentheses. So, ((λ (x) (+ 3 x)) 7) is (+ 3 7) is 10.

It's a bit trickier than that, though, because the body of the λ could contain other λs, and those might introduce a parameter called x, too. In that case, the inner x is really a different x from our x, and so we'd have to rename one of them to accommodate the other. Let's not worry about that here.

cond is a shorthand for nested if expressions. An if expression has the form

(if Predicate Consequent Alternative)

Each of Predicate, Consequent, and Alternative is an arbitrary expression. If the result of evaluating Predicate is not falsy (I won't bother describing falsiness in Scheme), then the value of the if expression is the result of evaluating Consequent. In that case, Alternative is not evaluated. On the other hand, if the result of evaluating Predicate is falsy, then the value of the if expression is the result of evaluating Alternative. In that case, Consequent is not evaluated.

cond, then, is just a macro that nests its [Predicate Consequent] arguments. I'll define it here using syntax-rules:

(define-syntax cond
  (syntax-rules (else)
    [(cond)
     (raise-user-error "Unmatched cond alternative")]

    [(cond [else expr])
     expr]

    [(cond [predicate expr] rest ...)
     (if predicate expr (cond rest ...))]))

Note that square brackets ([]) and parentheses (()) have the same meaning, and are varied as a matter of style.

Now let's look at the definition of collatz-end again:

(define collatz-end
  (λ (n)
    (cond
      [(= n 1) 1]
      [(even? n) (collatz-end (/ n 2))]
      [(odd? n) (collatz-end (+ 1 (* 3 n)))])))

If n is 1, then we finish at 1. If instead n is any even number, then we divide n in half and see what collatz-end gives for that number. So, collatz-end is defined in terms of itself. If n is an odd number, then we take triple n and add 1, and see what collatz-end gives for that number.

It's not at all obvious that this procedure always evaluates to 1, but the smart money says it does.

Recursion

I defined procedure application in terms of parameter rewriting, e.g. ((λ (x) (+ 3 x)) 9) means "use 9 instead of x in the body of the λ", i.e. (+ 3 9), which is 12.

Then what is define?

define introduces a name (collatz-end), and then associates a value with that name ((λ (n) (cond ...))), where the name is visible inside the definition of the value.

define is a facility provided by the execution environment. We cannot get that behavior using λs only. Or can we?

Consider this anonymous cousin of collatz-end:

(λ (f)
  (λ (n)
    (cond
      [(= n 1) 1]
      [(even? n) (f (/ n 2))]
      [(odd? n) (f (+ 1 (* 3 n)))])))

This is not the same as collatz-end. Instead, it's one "step" in the execution of collatz-end. Give me a "next" procedure (f), and I'll give you a procedure that takes an n, does one Collatz step on it, and then passes the resulting number to f.

Observe how this entire expression is nearly what we want for f.

Here's another cousin:

(λ (f)
  (λ (n)
    (cond
      [(= n 1) 1]
      [(even? n) ((f f) (/ n 2))]
      [(odd? n) ((f f) (+ 1 (* 3 n)))])))

You are forgiven if your head is starting to hurt. This is like the previous example, except that now the f that we accept is a procedure that takes a procedure and returns a procedure that takes a nonzero natural number. That is, f is something that looks like the most recent example.

If we apply this procedure to itself, do we get collatz-end?

((λ (f) (λ (n) (cond [(= n 1) 1] [(even? n) ((f f) (/ n 2))] [(odd? n) ((f f) (+ 1 (* 3 n)))])))
 (λ (f) (λ (n) (cond [(= n 1) 1] [(even? n) ((f f) (/ n 2))] [(odd? n) ((f f) (+ 1 (* 3 n)))]))))

This evaluates to a procedure that takes a nonzero natural number, and ...

does the same thing that collatz-end does.

Whether an execution environment will demonstrate this fact depends on the evaluation order used in the execution. Thanks to the "short-circuit" evaluation property of if, which underlies our cond, this procedure really is the same as collatz-end.

Combinators

The self-applicative monstrosity above is not yet satisfactory, though. Let's look at the original procedure again:

(define collatz-end
  (λ (n)
    (cond
      [(= n 1) 1]
      [(even? n) (collatz-end (/ n 2))]
      [(odd? n) (collatz-end (+ 1 (* 3 n)))])))

This entire snippet of code is the same no matter which name we choose for collatz-end, as long as our choice is not already taken (λ, cond, even?, ...).

So, collatz-end is reasonably thought of as a parameter. Let's call it f.

(λ (f)
  (λ (n)
    (cond
      [(= n 1) 1]
      [(even? n) (f (/ n 2))]
      [(odd? n) (f (+ 1 (* 3 n)))])))

We're now back to where we were at the beginning of the previous section. This procedure is not collatz-end, but it's the natural non-recursive analog to collatz-end. Let's call this procedure E.

Can we define a procedure that, given E, returns collatz-end?

Earlier, we modified E, replacing appearances of f with (f f). This allowed us to describe collatz-end as the modified procedure applied to itself.

It would be nicer if such modification weren't necessary. Maybe there is a way to encode that transformation in the operation that takes this anonymous procedure and gives us collatz-end.

Well, we have a procedure that looks like this:

(λ (f) (λ (n) ...))

where the ... uses f as a procedure taking a natural number, i.e. some (λ (n) ...). We want that inner f to be based on the overall procedure.

Consider this procedure:

(λ (f) (λ (n) ((f f) n)))

This is an attempt to encapsulate the transformation we performed in the previous section; namely, to take a procedure that returns a Collatz-like procedure, and transform it into a Collatz-like procedure whose inner "next" procedure is the original procedure applied to itself.

It's hard to say in English.

Let's call this procedure Sb, for "antimony." No, "self-bind."

Then, we can combine E and Sb to produce an anonymous procedure that might get us closer to collatz-end:

(E (Sb E))

Remember that E is

(λ (f)
  (λ (n)
    (cond
      [(= n 1) 1]
      [(even? n) (f (/ n 2))]
      [(odd? n) (f (+ 1 (* 3 n)))])))

So, we're looking at (expanding E and Sb)

((λ (f) (λ (n) (cond [(= n 1) 1] [(even? n) ((f f) (/ n 2))] [(odd? n) (f (+ 1 (* 3 n)))])))
 ((λ (f) (λ (n) ((f f) n)))
  (λ (f) (λ (n) (cond [(= n 1) 1] [(even? n) ((f f) (/ n 2))] [(odd? n) (f (+ 1 (* 3 n)))])))))

Is this collatz-end?

Not quite. I made a mistake. Close, but no cigar. Look at (E (Sb E)) again, expanding only Sb:

(E
  ((λ (f) (λ (n) ((f f) n))) E))

That is

(E (λ (n) (E E) n))

That λ (n) ... is passing E as the f argument to E. But E and f have incompatible types!

This construction "(E (Sb E))" only works "one layer deep," and then becomes invalid.

The thing that we pass as the f argument to E can't be E itself, it has to be the thing we transformed E into.

Mind bender!

Alright, let's go back to the thing that works, but is not ideal:

(λ (f)
  (λ (n)
    (cond
      [(= n 1) 1]
      [(even? n) ((f f) (/ n 2))]
      [(odd? n) ((f f) (+ 1 (* 3 n)))])))

Remember that this thing applied to itself is collatz-end. We need to get that inner (f f) working for us without having it in there.

Here's a transformation to consider:

(λ (f) (f f))

Well, we want to double up the fs and then pass that to E, so

(λ (f) (E (f f)))

Let's expand E in that. Do we get the same thing as "the thing that works," above?

(λ (f) (E (f f)))

(λ (f)
  ((λ (f)
     (λ (n)
       (cond
         [(= n 1) 1]
         [(even? n) (f (/ n 2))]
         [(odd? n) (f (+ 1 (* 3 n)))])))
    (f f)))

(λ (f)
  (λ (n)
    (cond
      [(= n 1) 1]
      [(even? n) ((f f) (/ n 2))]
      [(odd? n) ((f f) (+ 1 (* 3 n)))])))

Yes, we do.

Ok, so the operation (λ (f) (E (f f))) is the secret sauce. That gives us something that we can apply to itself, giving us collatz-end.

That is, collatz-end is the same as:

((λ (f) (E (f f))) (λ (f) (E (f f))))

A nicer way to look at it is to consider this operation as a procedure applied to E. Let's call the procedure R for "recursive."

(define R
  (λ (E) ((λ (f) (E (f f))) (λ (f) (E (f f))))))

Note how now E is a parameter.

Then collatz-end is the same as

(R
  (λ (f)
    (λ (n)
      (cond
        [(= n 1) 1]
        [(even? n) (f (/ n 2))]
        [(odd? n) (f (+ 1 (* 3 n)))]))))

A combinator is a procedure within which every variable has a matching λ.

R is a combinator. E and collatz-end are not, because they depend on the "free" variables collatz-end, even?, =, etc.

R is a particularly famous procedure known as the Y combinator.

Conclusion

Fun stuff, right? Don't forget to floss your teeth, pay your taxes, take off your shoes, and be well!