What's the point of eta-conversion in lambda calculus?

Beta conversion is the familiar $(\lambda x.M)N \rightsquigarrow M[x:=N]$.

Eta conversion is $\lambda x.M x \rightsquigarrow M$ when $M$ does not contain $x$ free.

Eta is not a special case of beta, because there is no beta redex anywhere yet. You could say that it anticipates the beta redex that will arise if and when the left-hand side is applied to something. At that point the reduction $$ (\lambda x.M x) N \rightsquigarrow M N $$ can be justified either as a beta or an eta reduction, but the eta reduction could happen before there's even an $N$ there. The eta conversion is intuitively justified by the fact that in the end the only thing you can really do with a lambda term is to apply it to something, so at that point it's not going to matter whether you've already eta reduced or not. So in that sense, eta equivalence does not add anything fundamentally new to the calculus. The availability of eta-reduction does not change whether a term has a normal form, for example.

Eta is, however, sometimes useful as an auxiliary reasoning concept. For example, the combinators $(\lambda y.\lambda x.y x)$ and $(\lambda y.y)$ are observationally equivalent -- but how can we show that? Simply noting that they are $\eta$-equivalent packs a somewhat subtle and long-winded argument into an easily reusable concept.


Yet another way to answer this question is to provide the following quotation from “Lambda Calculus. Its Syntax and Semantics“ (Barendregt, 1981). However, since I have access only to its Russian translation at the moment, I have to translate it back to English. The following is from Section 3.3.

The point of introducing of $\beta\eta$-reduction is to provide an axiom for provable statements in extensional $\lambda$-calculus [the $\lambda + \text{ext}$ theory, where $\text{ext}$ stands for the rule $M x = N x \Rightarrow M = N$] such that it would have Church–Rosser property.

Proposition. $M =_{\beta\eta} N \Leftrightarrow \lambda\eta \vdash M = N \Leftrightarrow \lambda + \text{ext} \vdash M = N$.

[Its proof is based on the following theorem.]

Theorem (by Curry). The theories $\lambda + \text{ext}$ and $\lambda\eta$ are equivalent… [Its proof consists of two parts: $(\text{ext}) \Rightarrow (\eta)$ and vice versa.]

One of the reasons to consider the system $\lambda\eta$ is that it has a certain property of completeness… [Namely, in the sense of the following theorem.]

Theorem. Let $M$ and $N$ both have a normal form. Then, either $\lambda\eta \vdash M = N$, or $\lambda\eta + M = N$ is inconsistent…

HP-complete [after Hilbert–Post] theories correspond to maximum consistent theories in theory of models for the first-order logic.