Natural Language and the Boundary of Ambiguity
What is lost when the same sort specification is written in natural language — the value of traceable definitions, and the power to localize ambiguity
Writing the same specification in natural language
So far we have descended from a VDM implicit specification into design, implementation, and tests (parts 1 and 2), and examined the premise of TDD without a specification (part 3). Finally, let us compare against the most common practice of all — a specification document in natural language. Here is the same sort specification, in plain words.
The input is a sequence of numbers, of size between 0 and 10^7. The output must be sorted.
At first glance there is nothing wrong with it. But what exactly does "sorted" mean? Ascending or descending? Are duplicate elements preserved? Must the output even be a rearrangement of the input? Look closer and "size" is ambiguous too — the length of the sequence, or the magnitude of the numbers? Nothing in this document lets you trace the answers. Just as with TDD, the reader has no choice but to infer the author's intent through common sense.
In VDM, definitions can be traced
In the VDM specification, "sorted" corresponds to the postcondition's IsPermutation(r, l) and IsOrdered(r). If either word feels ambiguous, you can follow its definition.
-- What does "sorted" mean? → look at the postcondition post IsPermutation(r, l) and IsOrdered(r); -- What is IsOrdered? → follow the definition IsOrdered : seq of int -> bool IsOrdered(l) == forall i, j in set inds l & i > j => l(i) >= l(j); -- What are forall / inds / l(i)? -- → their meaning is fixed by the VDM-SL standard (ISO/IEC 13817-1)
The forall, inds, and l(i) appearing inside IsOrdered are language constructs whose meaning is fixed by the VDM-SL language standard (ISO/IEC 13817-1). Every word in the specification, followed far enough, bottoms out in a rigorous definition. There is no dead-end word — no term used without ever being defined.
Being able to trace the specification's vocabulary all the way down, leaving no word that relies on the reader's common sense — this is the decisive difference between formal specification and natural language.
Hard to see with sorting — fatal with culture-dependent concepts
Sorting is nearly self-evident to everyone, so in this example the gap with natural language looks small. But once a concept's definition varies across cultures, the same ambiguity becomes a chasm. Consider a requirement like the following.
The organization names in the output must not be the names of organizations that are problematic by social convention.
What exactly does "problematic by social convention" refer to? The interpretation differs by country, by industry, and by individual. A human team might proceed on "common sense will cover it" — the implicit premise of shared culture we saw on the previous page. But the moment interpretations diverge, this specification endorses several mutually contradictory implementations as "correct" at the same time.
Natural language can approach rigor by piling up footnotes and glossaries. But there is no way to objectively verify where ambiguity still remains. The ambiguity stays scattered across the text — unauditable.
Formal specification localizes ambiguity
Let us write the same requirement in VDM, replacing the diffuse phrase "problematic by social convention" with a referenceable definition.
types OrgName = seq of char; functions -- blocked : the set of organization names published by -- organization A as classification list α IsAcceptable : OrgName * set of OrgName -> bool IsAcceptable(name, blocked) == not (name in set blocked);
The diffuse phrase has become the precise predicate not (name in set blocked), and what blocked refers to — the set of organization names published by organization A as classification list α — is recorded as an annotation in exactly one place. Within the formalization, blocked is treated as a given set. Of course, ambiguity has not dropped to zero. Whether list α is appropriate, and when updates propagate — those remain outside the formalization.
What matters is that the remaining ambiguity has been localized into a single name: blocked. Exactly how far the specification is rigorous, and where the external reference begins — anyone can point to that boundary objectively. The review question changes from the vague "is there ambiguity somewhere in this text?" to the concrete "are the definition and governance of list α sound?".
The value of formal specification is not the total elimination of ambiguity. It is localizing the unavoidable ambiguity into named symbols, so that the boundary with the rigorous part can be pointed to mechanically. Verification and proof are possible precisely because that boundary is explicit.
Summary: four perspectives
The specification fixed the what (part 1); it was traced into design, implementation, and tests (part 2); development without a specification stood on the premise of shared culture and common sense (part 3); and natural-language specifications could not be audited for where their ambiguity lives (part 4). A vocabulary whose definitions can be traced all the way down, and an explicit boundary of ambiguity — this is why contracts in multi-agent development, where heterogeneous agents cooperate, need a formal language.