Modelling laziness with a little bit of strictness

Evaluation of the lambda calculus is a spectrum from strictness, or call by value (CBV), to laziness, or call by name (CBN). Paul Levy’s call by push value (CBPV) subsumes both CBV and CBN. A form of this sentence is in the introduction of every paper on CBPV.

CBPV also subsumes other points along the spectrum. A CBV language with explicit thunking constructs to delay computation lazily can be modelled in CBPV with thunks using the usual CBV translation. Can CBPV model a CBN language with constructs that explicitly evaluate eagerly?

In Levy’s book¹ on CBPV, he actually defines two different translations of CBN. The usual one does not model laziness with a little bit of strictness, while the lesser-known translation (which he calls the “lazy” translation) does. To see where they differ, we start with the syntax of the simply typed CBN lambda calculus with various common constructs, along with the syntax of CBPV. You know the rules; I won’t list them out for you. The only notable difference is that we only use value pairs in CBPV and no computation pairs.

              τ ::= τ → τ | τ × τ | τ + τ | ⊤

e ::= x | λx. e | e e           [functions]
  | (e, e) | prj₁ e | prj₂ e    [pairs]
  | injₗ e | injᵣ e |           [sums]
  | case e of {injₗ x ⇒ e; injᵣ x ⇒ e}
  | ()                          [unit]

            

              A ::= U B | A × A | A + A | ⊤    [value types]
B ::= F A | A → B                [computation types]

v ::= {m} | (v, v) | injₗ v | injᵣ v | ()    [thunks, pairs, sums, unit]
m ::= return v | let x ← m in m              [returners]
  | λx. m | m v                              [functions]
  | let (x, y) ≔ v in m                      [pair matching]
  | case v of {injₗ x ⇒ m; injᵣ x ⇒ m}       [sum matching]

            

Let’s add a strict let expression, which first evaluates its bound expression to a constructor form before evaluating the body. It doesn’t evaluate inside constructors, which remain lazy; this behaviour is similar to, say, pseq in Haskell. To distinguish the effects of evaluation order, we add a single error effect as well. To distinguish this augmented language from the CBN language above, we’ll call it CBN!.

              t ::= λx. e | (e, e) | injₗ e | injᵣ e | ()    [terminals]
e ::= ... | let! x ≔ e in e | error            [strict let, error]
ℰ ::= □ e | prj₁ □ | prj₂ □                    [evaluation contexts]
  | case □ of {injₗ x ⇒ e; injᵣ x ⇒ e}
  | let! x ≔ □ in e

               e₁ ⇝ e₁'
────────────────────────────────────── let!
let! x ≔ e₁ in e₂ ⇝ let! x ≔ e₁' in e₂

──────────────────────────── ζ
let! x ≔ t in e₂ ⇝ e₂[x ↦ t]

──────────────── ε
ℰ[error] ⇝ error

Γ ⊢ e₁ : τ₁
Γ, x : τ₁ ⊢ e₂ : τ₂
──────────────────────── T-let!
Γ ⊢ let! ≔ e₁ in e₂ : τ₂

───────────── T-error
Γ ⊢ error : τ

            

Observational equivalences: two of them

Two well-typed expressions e₁, e₂ : τ are deemed observationally equivalent if, given any well-typed observer context 𝒞 (an expression with a hole of type τ in it), 𝒞[e₁] and 𝒞[e₂] CBN evaluate to the same terminal, written e₁ ≃ e₂. Levy also defines a coarser observational equivalence, written e₁ ≃ᴳ e₂, which requires that 𝒞 have a ground type when plugged in (his ground types are the booleans; ours is only the unit type).

Observational ground equivalence is coarser because it equates more expressions, namely η-equivalence of functions, even in the presence of effects. Consider, for instance, the η-expansion of error, which isn’t equivalent to itself under the empty context.

error ≄ λx. error x

However, they are ground equivalent: informally, for a ground context to have a ground type, it must either apply the function, in which case both applications run to error, or not use the function at all.

              error ⇜ error e        ≃ᴳ (λx. error x) e        ⇝* error    [𝒞 = □ e]
()    ⇜ (λ_. ()) error ≃ᴳ (λ_. ()) (λx. error x) ⇝* ()       [𝒞 = (λ_. ()) □]

The corresponding notion of observational equivalence ≃ for CBPV is defined in terms observer contexts of ground returner types F T, where CBPV ground types are non-thunk values: T ::= T × T | T + T | ⊤. Levy gives two ways to translate CBN to CBPV, distinguished by which observational equivalence is modelled by CBPV’s observational equivalence. The usual CBN translation ⟦·⟧ᴺ models ≃ᴳ: if e₁ ≃ᴳ e₂, then ⟦e₁⟧ᴺ ≃ ⟦e₂⟧ᴺ, so it must preserve η-equivalence. The “lazy” translation ⟦·⟧ᴸ models ≃: if e₁ ≃ e₂, then ⟦e₁⟧ᴸ ≃ ⟦e₂⟧ᴸ, so it must not preserve η-equivalence. The former translation is more commonly used precisely because it justifies η-equivalence.

By these properties, without going into what the translations are, we already know that ⟦·⟧ᴺ couldn’t possibly model CBN!. This is because strict let violates η-equivalence! We demonstrate this with the ground context let x ≔ □ in ().

error ⇜ let x ≔ error in () ≄ᴳ let x ≔ λx. error x in () ⇝ ()

This violation is similar to how in CBV, observational equivalence never preserves η-equivalence in the presence of effects, because the context (λ_. ()) □ strictly evaluates what’s plugged in first. In fact, Levy defines the “lazy” translation from CBN into (fine-grained) CBV, separately from the translation from (fine-grained) CBV to CBPV. Rather than roundabout translations, we’ll instead translate CBN! directly into CBPV.

A little bit of strictness, as a treat

Since the word “lazy” is quite overloaded these days, we’ll simply call ⟦·⟧ᴸ the L translation, reserving the CBN translation to refer to the usual ⟦·⟧ᴺ one. The L translation is guided by two principles:

Like the CBN translation, all variables represent thunked computations.
Like the CBV translation, all terms translate to returners.

There is then really only one way to translate CBN! types, which produce computation types, with the second property as a lemma. Translating contexts then requires thunking translated types to satisfy the first property.

              ⟦τ₁ → τ₂⟧ᴸ ≜ F (U (U ⟦τ₁⟧ᴸ → ⟦τ₂⟧))
⟦τ₁ × τ₂⟧ᴸ ≜ F (U ⟦τ₁⟧ᴸ × U ⟦τ₂⟧ᴸ)
⟦τ₁ + τ₂⟧ᴸ ≜ F (U ⟦τ₁⟧ᴸ + U ⟦τ₂⟧ᴸ)
⟦⊤⟧ᴸ       ≜ F ⊤

⟦•⟧ᴸ        ≜ •
⟦Γ, x : A⟧ᴸ ≜ ⟦Γ⟧ᴸ, x : U ⟦A⟧ᴸ

            

Lemma: For all τ, there is some A such that ⟦τ⟧ᴸ = F A.

The translation of the terms falls out from following the types so that type preservation holds. When translating strict let, we need to return the result of evaluation in a thunk, which we can either inline at the cost of duplicating thunk allocations, or we can bind the thunk at the cost of adding an extra step of computation.

              ⟦x⟧ᴸ        ≜ x!
⟦()⟧ᴸ       ≜ return ()
⟦λx. e⟧ᴸ    ≜ return {λx. ⟦e⟧ᴸ}
⟦e₁ e₂⟧ᴸ    ≜ let f ← ⟦e₁⟧ᴸ in f! {⟦e₂⟧ᴸ}
⟦(e₁, e₂)⟧ᴸ ≜ return ({⟦e₁⟧ᴸ}, {⟦e₂⟧ᴸ})
⟦prjᵢ e⟧ᴸ   ≜ let p ← ⟦e⟧ᴸ in let (x₁, x₂) ≔ p in xᵢ
⟦injᵢ e⟧ᴸ   ≜ return (injᵢ {⟦e⟧ᴸ})
⟦case e of {injᵢ x ⇒ eᵢ}⟧ᴸ ≜ let q ← ⟦e⟧ᴸ in case q of {injᵢ x ⇒ ⟦eᵢ⟧ᴸ}
⟦let! x ≔ e₁ in e₂⟧ᴸ  ≜ let x ← ⟦e₁⟧ᴸ in ⟦e₂⟧ᴸ[x ↦ {return x}]
                     ᴏʀ let y ← ⟦e₁⟧ᴸ in let x ← return {return y} in ⟦e₂⟧ᴸ
⟦error⟧ᴸ              ≜ error

            

Lemma: If Γ ⊢ e : τ then ⟦Γ⟧ᴸ ⊢ ⟦e⟧ᴸ : ⟦τ⟧ᴸ.

As expected, CBPV observational equivalence doesn’t preserve CBN η-equivalence:

              ⟦error⟧ᴸ = error ≄ return {λx. let f ← error in f! x} = ⟦λx. error x⟧ᴸ

            

The context let g ← □ in return () distinguishes them by erroring on the left and returning unit on the right. Proving that the L translation preserves observational equivalence is a bit involved and requires a logical equivalence; instead, we’ll content ourselves with knowing that reduction is preserved.

Lemma: If e₁ ⇝ e₁ then ⟦e₁⟧ᴸ ⇝* ⟦e₂⟧ᴸ, where CBPV reduction may take arbitrary force–thunk expansion steps e ⇝ {e}!.

The lemmas above have been mechanized in Lean here.

Paul Levy. Call-By-Push-Value: A Functional/Imperative Synthesis. (Semantic Structures in Computation, volume 2, 2003). ᴅᴏɪ: 10.1007/978-94-007-0954-6. ↩

laziness strictness call by name call by push value