One example of this fact arises when considering CCS with the choice operator. In this language we can define a process \(P\) and a process \(Q\) as follows

\[P = \text{pay}.(\text{coffee}. 0 + \text{tea}. 0)\]
\[Q = (\text{pay}.\text{coffee}.0 + \text{pay}.\text{tea}. 0)\]

Now the *trace semantics* of the CCS processes can be defined by a function
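The type of this function did not survive here; judging from the semantics computed below for \(P\) and \(Q\), it is presumably along the lines of

\[[\![ \cdot ]\!] : \text{CCS} \to \mathcal{P}(\text{Str } L)\]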

where \(L\) is the finite set of actions and, for a generic set \(A\), the set \(\text{Str } A = 1 + A \times \text{Str }A\) is the set of finite and infinite streams over \(A\).

For the processes above we have that the semantics of \(P\) is \([\![P]\!] = \{\text{pay}.\text{coffee}, \text{pay}.\text{tea}\}\), the semantics of \(Q\) is \([\![ Q ]\!] = \{\text{pay}.\text{coffee}, \text{pay}.\text{tea}\}\), and thus the trace semantics of \(P\) and \(Q\) indicates that these processes should be equal.

However, consider the relation "\(P\) *simulates* \(Q\)", written \(Q \lesssim P\), which is stated as
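The definition itself is missing here; the standard coinductive one, matching the convention used in the proofs below, reads

\[Q \lesssim P \;\Leftrightarrow\; \text{for all } Q \xrightarrow{a} Q' \text{ there exists } P \xrightarrow{a} P' \text{ such that } Q' \lesssim P'\]

(more precisely, \(\lesssim\) is the greatest relation satisfying this property).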

Now the *bisimulation* relation can be defined as \(P \approx Q \Leftrightarrow P
\lesssim Q \text{ and } Q \lesssim P\).

The above is a standard example in concurrency theory showing that bisimulation can distinguish processes that equality on the trace semantics regards as equal, and that is why bisimulations turn out to be more useful relations for comparing processes.

Using the example above we can prove that \(Q \lesssim P\). Let’s define half-evaluated processes as

\[P' = \text{coffee}. 0 + \text{tea}. 0\] \[P'_{1} = \text{coffee}. 0\] \[P'_{2} = \text{tea}. 0\] \[Q_{1} = \text{pay}.\text{coffee}.0\] \[Q_{2} = \text{pay}.\text{tea}.0\] \[Q'_{1} = \text{coffee}.0\] \[Q'_{2} = \text{tea}.0\]

Now, for all transitions of \(Q\), we have to show that \(P\) simulates them. The first one is \(Q \xrightarrow{\text{pay}} Q'_{1}\). Obviously \(P \xrightarrow{\text{pay}} P'\), so now we have to show that \(Q'_{1} \lesssim P'\), which clearly holds: the only transition \(Q'_{1} \xrightarrow{\text{coffee}} 0\) is matched by \(P' \xrightarrow{\text{coffee}} 0\). This works similarly if \(Q\) decides to take the other route and produce tea in the end.

All right, but \(P \lesssim Q\) does not work. This is because, once \(P\) makes the transition \(P \xrightarrow{\text{pay}} P'\), we are forced to select which branch of \(Q\) simulates this behaviour, and no matter which one we choose we get stuck. Say \(Q \xrightarrow{\text{pay}} Q'_{1}\); then we have to show \(P' \lesssim Q'_{1}\), but this does not hold, because \(P'\) can make two different transitions while \(Q'_1\) can make only one.

Consider now the unfold function, which takes a seed function and produces a trace by *running* the seed at each step
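The definition is not shown here; a minimal Haskell sketch, assuming a stream type `Str` and a seed of the type described below, might look like:

```haskell
-- Infinite streams over a; the "possibly finite" variant from the text
-- would use Maybe (a, Str a) for the tail instead.
data Str a = Cons a (Str a)

-- unfold runs the seed at each step: it produces one label and a new
-- state, then continues from that state.
unfold :: (x -> (l, x)) -> x -> Str l
unfold seed x = let (a, x') = seed x in Cons a (unfold seed x')

-- Helper to observe a finite prefix of a stream.
takeS :: Int -> Str a -> [a]
takeS n _ | n <= 0   = []
takeS n (Cons a s)   = a : takeS (n - 1) s
```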

Notice that the seed function \(X \to L \times X\) can be viewed as a Labeled Transition System (LTS), where the set of states is \(X\) and the function itself implements the transitions.

It is a very well-known fact that `unfold` is a fully abstract map, in the sense that if we consider the notion of bisimilarity above and set \([\![ \cdot ]\!]\) to be `unfold seed`, then we have the following theorem.

**Full abstraction.** For all \(t_{1}, t_{2}\): \(t_{1} \approx t_{2} \Leftrightarrow [\![ t_{1} ]\!] = [\![ t_{2} ]\!]\).

This is also backed by the fact that, when programming in proof assistants such as Agda – where coinductive data types are not really final coalgebras – it is common practice to just add the following axiom to the type theory

**Axiom.** For all \((s_{1}, s_{2} : \text{Str } L)\): \(s_{1} \approx s_{2} \to s_{1} = s_{2}\).

Moreover, in some proof assistants, like Isabelle, coinductive data types are real final coalgebras, and so the above axiom is actually a theorem in the prover's logic.

Notice that the other direction is obvious, and thus the axiom implies that bisimilarity is *logically equivalent* to equality.

So why does bisimulation in the above example not correspond to equality?

The reason is that the shape of behaviours for CCS+choice is not \(BX = L \times X\) but rather \(BX = \mathcal{P}_\text{fin}(L \times X)\).

In fact, the *seed* function describing the LTS of CCS+choice has a type of the shape \(X \to [(L, X)]\), where we use lists `[-]` as a (rough) implementation of finite powersets.

At this point the LTS for CCS+choice can be defined roughly like this
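The code block is missing here; a rough Haskell sketch (the constructor and function names are my own) consistent with the description above might be:

```haskell
-- Finite CCS with prefixing and binary choice (a rough sketch).
data Proc
  = Nil                 -- the inert process 0
  | Act String Proc     -- action prefixing, e.g. pay.P
  | Choice Proc Proc    -- the choice operator P + Q
  deriving (Eq, Show)

-- The seed function of the LTS: each process maps to the finite set
-- (here: list) of its one-step transitions.
step :: Proc -> [(String, Proc)]
step Nil          = []
step (Act a p)    = [(a, p)]
step (Choice p q) = step p ++ step q

-- The two processes from the example at the top of the post.
p, q :: Proc
p = Act "pay" (Choice (Act "coffee" Nil) (Act "tea" Nil))
q = Choice (Act "pay" (Act "coffee" Nil)) (Act "pay" (Act "tea" Nil))
```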

And now the unfold on this LTS will yield a fully abstract semantics

where \(\text{Trees}\; L = \mathcal{P}_\text{fin} (L \times \text{Trees}\; L)\).
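The unfold for this behaviour functor is likewise missing; a self-contained Haskell sketch (names assumed), taking any seed of type `x -> [(l, x)]`:

```haskell
-- Finitely-branching trees: the final coalgebra of B x = P_fin (L x x),
-- with lists approximating finite powersets.
newtype Tree l = Node [(l, Tree l)]
  deriving (Eq, Show)

-- Unfolding an LTS seed into its tree of behaviours: run the seed on
-- the current state and recursively unfold each successor.
unfoldTree :: (x -> [(l, x)]) -> x -> Tree l
unfoldTree step x = Node [ (a, unfoldTree step x') | (a, x') <- step x ]
```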

Say that you want to do denotational semantics for a simply typed \(\lambda\)-calculus with a unary constructor \(\textsf{R}\) which has the following typing rule

\[\frac{\Gamma \vdash t : A}{\Gamma \vdash \textsf{R}(t) : B}\]

The task is to give a semantic interpretation \([\![ \cdot ]\!]\) for the language by induction on the typing judgment \(\Gamma \vdash t : A\) such that terms are interpreted as morphisms \([\![\Gamma ]\!] \xrightarrow{[\![ t ]\!]} [\![ A ]\!]\), assuming of course that \([\![ \cdot ]\!]\) is also defined separately for contexts and types.

To interpret the rule above we do induction on the typing judgment. Thus we assume there exists a morphism \([\![ \Gamma ]\!] \xrightarrow{[\![ t ]\!]} [\![ A ]\!]\) and we construct a morphism \([\![ \Gamma ]\!] \xrightarrow{[\![ \textsf{R}(t) ]\!]} [\![ B ]\!]\).

For simplicity we drop the semantic brackets, for example writing \(A\) for the interpretation \([\![ A ]\!]\), \(t : \Gamma \to A\) for the interpretation of \(t\), and so on.

Back to the problem we are trying to solve. It can sometimes be quite tricky to figure out what the semantics of \(\textsf{R}(t)\) is, since there is some plumbing needed to pass around the context. A particular instantiation of the Yoneda lemma states that given a morphism \(t : \Gamma \to A\) and a morphism \(R : A \to B\) there is a canonical way to construct a morphism \(\Gamma \xrightarrow{R(t)} B\).

To show this we instantiate the contravariant Yoneda lemma by setting \(F = \mathbb{C}(-, B)\). Then for all objects \(A : \mathbb{C}^{\text{op}}\) we have

\[\mathbb{C}(A, B) \cong \mathbb{C}(-, A) \xrightarrow{\cdot} \mathbb{C}(-, B)\]

Let \(R : A \to B\) be the interpretation of \(\textsf{R}\); then one side of the isomorphism is \(\phi (\textsf{R},t) = F(t)(\textsf{R}) = \mathbb{C}(t, B)(\textsf{R}) = \textsf{R} \circ t\). In other words, the interpretation of \(\textsf{R}(t)\) is simply \(\textsf{R} \circ t\).

Assume \(\Lambda_X\) is the set of closed well-typed STLC (Simply Typed \(\lambda\)-calculus) terms. Clearly, STLC can be interpreted into any Cartesian Closed category (CCC) by defining an interpretation function \([\![\cdot]\!] : \Lambda_X \to \mathcal{C}\) such that for any term \(t \in \Lambda_X\), \([\![t]\!] \in \mathcal{C}(1, [\![\sigma]\!])\) where \(\sigma\) is the type of \(t\). We will only consider well-typed interpretations here. Moreover, it can be proved that the interpretation function is sound and complete. The completeness statement reads as follows. For all terms \(t_1\) and \(t_2\),

\[t_1 \equiv_{\beta\eta} t_2 \text{ iff } [\![t_1]\!] = [\![t_2]\!]\]

where the \((\Rightarrow)\) direction is soundness whereas \((\Leftarrow)\) is *completeness of the interpretation*.

This statement is certainly true. If two terms are \(\beta\eta\)-equivalent they are equal in the model, i.e. the semantics is agnostic to \(\beta\eta\)-reduction steps. Conversely, any equation that holds between the denotations of two STLC terms also holds in the syntax.

However, completeness of a model is a slightly different statement:

\[t_1 =_{\beta\eta} t_2 \text{ iff for all } [\![ \cdot ]\!] : \Lambda_X \to \mathcal{C}, [\![t_1]\!] = [\![t_2]\!]\]

This one states that, for a fixed category \(\mathcal{C}\), \(\beta\eta\)-equivalence between two terms holds if and only if the terms are equal in **every** possible interpretation.

In this sense, CCCs are not *complete models*. The counterexample is given by a preorder category \(\mathcal{P}\) with CCC structure. A preorder category has at most one morphism (\(\sqsubseteq\)) between any two objects. If this category has a greatest element \(\top\), binary meets (\(\wedge\)), and Heyting implications (\(\to\)), then \(\mathcal{P}\) is a CCC.

Now the problem is that every (well-typed) interpretation sends two programs of the same type to morphisms with the same domain and codomain, and since the category is thin these two morphisms are always equal. For example, consider the projection maps \(x \wedge x \xrightarrow{\pi_1} x\) and \(x \wedge x \xrightarrow{\pi_2} x\) out of the product \(x \wedge x\), where the codomains of the two coincide. In \(\mathcal{P}\) these are the same map, i.e. \(\pi_1 = \pi_2\), since there is at most one morphism \(x \wedge x \sqsubseteq x\).

Now the right-hand side of the completeness theorem is satisfied, since for all well-typed interpretations \([\![\cdot]\!]\) we have \([\![\pi_1]\!] = [\![\pi_2]\!]\) (when the codomains of the two coincide). However, the projections \(\pi_1\) and \(\pi_2\) in the syntax are definitely not \(\beta\eta\)-equivalent.

I refer the reader to the original paper for more details.

First off, I do not consider myself an expert on set theory, but after having this kind of conversation with mathematicians and computer scientists I found *there are* some misconceptions around this axiom and the reasons why it is needed.

For example, as you will see, it is indeed true that the axiom of choice is connected with the existential quantifier; it is not true, however, that we cannot pick an element out of the existential because the logic is classical.

In my mind there are two problems: the first is that

the existential quantifier does not ensure there exists only one element with a particular property in the domain of discourse

and the second is that

we would need to create an infinite proof that uses Existential Instantiation for each element of the indexing set

However, in order to fully understand what is going on we need to be more precise. So let's first begin with what the axiom of choice is.

The original formulation of the AC is the following.

Given a set \(X\) and a family of non-empty sets \(\{A_x\}_{x \in X}\) over \(X\), the infinite product of these sets, namely \(\Pi_{x \in X}. A_{x}\), is non-empty.

For the record, the infinite product is defined as follows

\[\Pi_{x \in X}. A_{x} = \{ f : X \to \bigcup_{x \in X} A_{x} \mid f(x) \in A_{x} \}\]

However, this statement is a little more packed than we would like it to be. An equivalent statement is skolemization.

Skolemization is what allows one to turn an existentially quantified formula into a function. Formally, skolemization is the following statement

Given a relation \(R \subseteq X \times Y\), if \(\forall x \in X. \exists y \in Y. R(x,y)\) then \(\exists f \in X \to Y. \forall x \in X. R (x, f(x))\)

The AC is equivalent to Skolemization. A full discussion of this fact can be found here.

To prove that Sk \(\Rightarrow\) AC, for a family of sets \(\{A_{x}\}_{x \in X}\) we define the relation \(R(x,y) = y \in A_{x}\). For the other direction we assume a relation \(R \subseteq X \times Y\) and construct the family of sets \(\{A_{x}\}_{x \in X}\) with \(A_{x} = \{ y \mid y \in Y \text{ and } R(x,y)\}\).

Set theory is a first-order logic together with a set of axioms (nine of them, including the AC) postulating the existence of certain sets. Besides the propositional fragment of first-order logic there is also the predicate fragment, formed by *universal quantification* (\(\forall\)) and *existential quantification* (\(\exists\)).

The Existential Instantiation rule states that if we know there exists an \(x\) that satisfies the property \(P\), and from a fresh \(t\) that satisfies that property we can construct a proof of a proposition \(R\), then we can obtain \(R\)

\[\frac{\exists x. P \qquad t, P[t/x] \cdots R }{R}\]

where \(t\) is fresh, i.e. it does not occur free in \(P\), in \(R\), or in any other open assumption.

So here we have to treat \(t\) carefully in that it is a fresh \(t\) that satisfies \(P\), but “we do not know what it is!”.

The reason why I put this sentence in quotes is because this is the explanation
that many people would use. However, to me the real reason is that *we do not
know how many other elements in the universe exist with such a property*. There
is certainly one, but there may be more.

To prove Sk we have to assume \(\forall x \in X. \exists y \in Y. R(x, y)\) and then prove \(\exists f : X \to Y. \forall x \in X. R (x , f (x))\). Here \(f : X \to Y\) really means a relation \(f \subseteq X \times Y\) that is a function, *i.e.* such that for all \(x \in X\) there exists exactly one \(y \in Y\) with \((x,y) \in f\).

Now first we try to construct this relation \(f\). A first naive attempt is to use the axiom of comprehension as follows

\[f = \{(x, y) \mid x \in X \wedge y \in Y \wedge R(x, y)\}\]

The problem is that \(f\) is clearly not a function, since there may be more than one \(y\) per \(x\) in \(R\). Notice that the above statement is very similar to the one where we include the existential

\[f = \{(x, y) \mid x \in X \wedge \exists y'. y = y' \wedge R(x, y)\}\]

But this does not change much from before, since we know there exists at least one \(y\) for every \(x\) but we do not know how many. Clearly, we can prove that for all \(x \in X\) we have \(R(x, f(x))\); however, we cannot prove that \(f\) is a function, in particular that for each \(x \in X\) there is a **unique** \(y \in Y\) that \(x\) is mapped to.

Now the question is, couldn’t we just have picked one \(y\) for each \(x\)?

We could do this if we were able to use Existential Instantiation for each \(x \in X\). If \(X\) were finite then we could certainly do that, as we could pick an \(n \in \mathbb{N}\) and assume that \(X = \{x_0, x_1, \dots, x_n \}\).

Now we can construct a set of pairs \((x_i, y_i)_{i\in \{0,\dots,n\}}\) such that every \((x_i, y_i) \in R\) by repeatedly using Existential Instantiation. Once the set is created we can define \(f\) to be this set.
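The finite assignment, written out by hand, would presumably be

\[f = \{(x_0, y_0), (x_1, y_1), \dots, (x_n, y_n)\}\]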

However, when \(X\) is not finite, we cannot simply *write down* the set by hand. Instead we have to create a formula and then use set comprehension. However, there is no (open) formula of the form

\[(x_0,y_0) \in R \wedge (x_1,y_1) \in R \wedge \dots \wedge (x_n, y_n) \in R \wedge \dots\]

This is because formulas and proofs in set theory are finite, and the one above is an infinite formula, which would moreover need a (potentially) infinite number of applications of the Existential Instantiation rule.

Hopefully this untangles some confusion around the axiom of choice.

On the other hand, AC is derivable in Type Theory, simply because a proof of \(\forall x. \exists y. R(x,y)\) already gives us, for every \(x\), a witness \(y\) together with a proof of \(R(x,y)\): inhabitants of the dependent product \(\forall\) are functions already.

See the code below.
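The code block is missing here. As a sketch (not the author's original, which may well have been Agda), here is the statement in Lean 4, phrasing the premise with a \(\Sigma\)-type as type theory does:

```lean
-- Type-theoretic "axiom" of choice: derivable, because an inhabitant of
-- the dependent product ∀ x, Σ' y, R x y is already a function producing,
-- for each x, a witness together with its proof.
def typeTheoreticChoice {X Y : Type} (R : X → Y → Prop)
    (h : ∀ x, Σ' y, R x y) :
    Σ' f : X → Y, ∀ x, R x (f x) :=
  ⟨fun x => (h x).1, fun x => (h x).2⟩
```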

If you have any comment about this please feel free to drop me an email or something; I would be very happy to know more (especially if I said something wrong).


Any Cartesian Closed Category (CCC) with an initial object and a fixed-point operator is trivial.

Here the word *trivial* means that every object \(A\) in the category is isomorphic to the terminal object \(1\).

To do this proof we make use of the fixed-point operator, which exists at all types.

We know that for all endomaps \(f : A \to A\) in the category there exists a map \(\text{fix}_{f} : 1 \to A\) such that \(f \circ \text{fix}_{f} = \text{fix}_{f}\). Thus, we can use the unique endomap on the initial object, namely the identity map \(id_{0}: 0 \to 0\), to get a map \(\text{fix}_{id_{0}} : 1 \to 0\). But now, because \(0\) is initial (and \(1\) is terminal), we also have a unique map into the terminal object, namely \(! : 0 \to 1\). It is easy to see that \(\text{fix}_{id_{0}}\) and \(!\) are inverses to each other, hence they form an isomorphism \(0 \cong 1\). In particular, \(\text{fix}_{id_{0}} \circ \, ! : 0 \to 0\) is \(id_{0}\) by initiality and \(! \circ \text{fix}_{id_{0}} : 1 \to 1\) is \(id_{1}\) by terminality.

Now we compute as follows. For every object \(A\) in the category \(1 \cong 0 \cong 0 \times A \cong 1 \times A \cong A\) and the proof is concluded.

This result was shown to hold also in the case when, instead of the initial object, we postulate a natural numbers object \(\mathbb{N}\).

A natural question to ask now is:

is every model of PCF trivial?

To answer this question we take as a model of PCF the category of Scott domains. This category consists of pointed directed-complete partial orders (dcppos) as objects and continuous functions as arrows (just following Thomas Streicher's book to avoid any misunderstanding).

Now, we would like to prove that this category is cartesian closed (which we know it is), has a fixed-point map (which it has), and has an initial object. However,

there is no initial object in the category of Scott domains

This is because if this category had an initial object \(0\), it would at least have a bottom element \(\bot_0\). Notice that the subset \(\{\bot_0\}\) is indeed directed and its supremum \(\bigsqcup \{\bot_0\}\) is \(\bot_0\) itself. Now, if we take any other dcppo \(X\), a continuous function \(f : 0 \to X\) that maps \(\bot_{0}\) to any element \(x \in X\) will satisfy the equation

\[f (\bigsqcup \{\bot_0\}) = \bigsqcup f (\{\bot_0\})\]

because, for any \(x \in X\) we choose for \(f(\bot_0)\) (even the bottom element), \(\bigsqcup f(\{\bot_0\}) = \bigsqcup \{x\} = x\). Hence there are as many continuous maps \(0 \to X\) as there are elements of \(X\), which contradicts initiality.

The only way this category could have an initial object is if the arrows in the category were *strict*, namely if they preserved \(\bot\) elements; but, as we have seen, continuous functions do not necessarily preserve them.

Is it just a coincidence that Scott's model is not trivial?

Not really. If it were trivial, it would break computational adequacy, which is the statement that for every pair of well-typed terms \(\Gamma \vdash t : A\) and \(\Gamma \vdash t' : A\)

if \([\![ t ]\!] = [\![ t' ]\!]\) then \(t \approx t'\)

where \(\approx\) is contextual equivalence of programs.

But if the model were trivial then all pairs of PCF-denotable terms (pairs of maps into something isomorphic to \(1\)) would be equal (by terminality) and therefore operationally equivalent.

So what does this mean for Haskell? Well, nothing, because Haskell does not have a formal model.

But let’s say we make a big leap and take the fragment of Haskell consisting of “inductive data types” and recursion. Now I can craft a program that resembles what I just said above
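The program itself is not reproduced here; the following is a reconstruction (an assumption on my part, built from the fragments `y id`, `Empty`, and `One ()` quoted in the discussion):

```haskell
-- A candidate "initial object": a data type with no constructors.
data Empty

-- A candidate "terminal object".
data One = One () deriving (Eq, Show)

-- A fixed-point combinator: general recursion at every type.
y :: (a -> a) -> a
y f = f (y f)

-- The analogue of fix_{id_0} : 1 -> 0 from the proof above.
-- Evaluating it is the infinite computation, i.e. bottom.
toEmpty :: One -> Empty
toEmpty _ = y id

-- A non-strict map sending the bottom inhabitant of Empty to One ().
fromEmpty :: Empty -> One
fromEmpty _ = One ()
```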

Is this a problem? No, it is not, because `y id` is the infinite computation; in other words, it sends the unit element to \(\bot\). But since Haskell functions need not be strict, I can send the \(\bot\) element in `Empty` to `One ()`. So this map is not an isomorphism.

This is probably a very convoluted way of saying

There is no initial object (or natural numbers object) in PCF (or other “PCF-like” languages like Haskell)

this is because `Empty` actually contains the bottom element \(\bot\).

For the same reasons, if we now consider System F with a polymorphic fixed-point operator and define the \(0\) object by setting
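The definition itself is missing here; the usual System F encoding of the empty type, which is presumably what was meant, is

\[0 = \forall \alpha.\, \alpha\]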

This object actually has an inhabitant, namely the non-terminating computation. Thus, it is not an initial object.
