Marc de Graauw, version 2, 12 September 2007

Previous version

Changes v2: added point 12, corrected typos

An attempt to define syntactical and semantical compatiblity of versions in a formal way. Much derives from the writings of David Orchard, especially the parts on syntactical forward and backward compatibility (though my terminology differs).

- Let U be a set of (specifications of) software processable languages {L1, L2, ... Ln}
- This axiomatization does not concern itself with natural language

- The
**extension**of a language Lx is a set of conforming document instances ELx = {Dx1, Dx2, ... Dxn}- Iff ELx = ELy then Lx and Ly are extensionally equivalent
- Lx and Ly may still be different: they may define different semantics for the same syntactical construct

- Iff ELx ⊂ ELy then Lx is an extensional sublanguage of Ly
- Iff ELx ⊃ ELy then Lx is an extensional superlanguage of Ly
- D is the set of all possible documents; or the union of all ELx where Lx ∈ U

- Iff ELx = ELy then Lx and Ly are extensionally equivalent
- For each Lx ∈ U there is a set of
**processors**Px = {Px1, Px2, ... Pxn} which implement Lx- Each Pxy exhibits behaviour as defined in Lx
- Processors can produce and consume documents
- Each Pxy produces only documents it can consume itself
- At consumption, Pxy may accept or reject documents

- The
**behaviour**of a processor Pxy of language Lx is a function Bxy- The function Bxy takes as argument a document, and it's value is a processor state
- We assume a single processor state before each function execution, alternatively we could assume a <state, document> tuple as function argument

- If for two processors Pxy and Pxz for language Lx for a document d Bxy(d) = Bxz(d) then the two processors behave similar for d
- Two processor states for language Lx are deemed equivalent if a human with thorough knowledge of language specification Lx considers the states equivalent. Details may vary insofar as the language specification allows it.
- Processor equivalence is not intended to be formally or computably decidable; though in some cases it could be.

- If ∀d ( d &isin ELx → Bxy(d) = Bxz(d) ) then Pxy and Pxz are behaviourally equivalent for Lx
- If two processors behave the same for every document which belongs to a language Lx, the processors are behaviourally equivalent for Lx.

- The function Bxy takes as argument a document, and it's value is a processor state
- An
**ideal language**specifies all aspects of desired processor behaviour completely and unambiguously; assume all languages in U are ideal - A processor Px is an
**exemplary processor**of a language Lx if it fully implements language specification Lx; assume all processors for all languages in U are exemplary- Because they are (defined to be) exemplary, every two processors for a language Lx are behaviourally equivalent
- ELx = { d is a document ∧ Px accepts d }
- The complement of ELx is the set of everything (normally, every document) which is rejected by Px
- The make set MLx = { d is a document ∧ Px can produce d }

- A language Lx is
**syntactically compatible**with Ly iff MLx ⊂ ELy and MLy ⊂ ELx- Two languages are syntactically compatible if they accept the documents produced by each other.

- A language Ln+1 is
**syntactically backward compatible**with Ln iff MLn ⊂ ELn+1 and Ln+1 is a successor of Ln- A language change is syntactically backward compatible if a new receiver accepts all documents produced by an older sender.

- A language Ln is
**syntactically forward compatible**with Ln+1 iff MLn+1 ⊂ ELn and Ln+1 is a successor of Ln- A language change is syntactically forward compatible if an old receiver accepts all documents produced by a new sender.

- A document d can be a member of the extension of any number of languages
- Px is an (exemplary) processor of Lx, Py is an (exemplary) processor of language
- Two languages Lx and Ly are
**semantically equivalent**iff ELx = ELy ∧ ∀d ( (d &isin ELx ) → Bx(d) = By(d) )- If two languages Lx and Ly take the same documents as input, and their exemplary processors behave the same for every document, the languages are semantically equivalent.
- Two languages can only be compared if their exemplary processors are similar enough to be compared.
- Not every two languages can be compared.
- "Semantic" should not be interpreted in the sense of "formal semantics".

- The
**semantical equivalence set**of a document d for Lx = { y ∈ ELx | Bx(d) = Bx(y) }- Or: SLx,d = { y ∈ ELx | Bx(d) = Bx(y) }
- The semantical equivalence set of a document d is the set of documents which make a processor behave the same as d
- Semantical equivalence occurs for expressions which are semantically equivalent, such as i = i + 1 and i += 1 for C, or different attribute order in XML etc.

- d ∈ SLx,d
- Any two semantical equivalence sets of Lx are necessarily disjunct
- If z ∈ SLx,e were also z ∈ SLx,d then every member of SLx,e would be in SLx,d and vice versa and thus SLx,d = SLx,e

- Or: SLx,d = { y ∈ ELx | Bx(d) = Bx(y) }
- A language Ly is a
**semantical superlanguage**of Lx iff ∀d ( d ∈ MLx → By(d) = Bx(d) )- For all documents produced by Px, Py behaves the same as Px
- Equivalence in this case should be decided based on Lx; if Ly makes behavioural distinctions which are not mentioned in Lx, behaviour is still the same as far as Lx is concerned

- It follows: ∀d ( d ∈ MLx → ∃SLy,d ( SLy,d ⊂ ELy ∧ ( SLx,d ∩ MLx ) ⊂ SLy,d ∧ By(d) = Bx(d) ) )
- For any document produced by Px, the part of its semantical equivalence set which Px can actually produce, is a subset of the semantical equivalence set of Py for this document

- For all d ∈ ELx ∧ d ∉ MLx there may be many equivalence sets in Ly for which By(d) ≠ Bx(d)
- In other words: for documents accepted but not produced by Px, Ly may define additional behaviours
- Lx is a semantical sublanguage of Ly iff Ly is a semantical superlanguage of Lx

- For all documents produced by Px, Py behaves the same as Px
- A language Ln+1 is
**semantically backward compatible**with Ln iff Ln+1 is a semantical superlanguage of Ln and Ln+1 is a successor of Ln- An old sender may expect a newer, but semantically backward compatible, receiver to behave as the sender intended
- A language Ln is
**semantically forward compatible**with Ln+1 iff Ln+1 is a semantical sublanguage of Ln and Ln+1 is a successor of Ln - Semantic forward compatibility is only possible if a language loses semantics; i.e. its processors exhibit less functionality, and produce less diverse documents
- A processor cannot understand what it does not know about yet

- A language Ln+1
**semantically extends**Ln iff ∃d ( d ∈ MLn+1 ∧ d ∈ ELn ∧ BLn+1 ≠ BLn )- Ln+1 introduces new behaviour for some documents accepted, but not produced by Ln

- Assume Ln is syntactically forward compatible with Ln+1, Ln+1 is semantically backward compatible with Ln, and Ln+1 semantically extends Ln
- Pn accepts all documents produced by Pn+1, or: MLn+1 ⊂ ELn
- For documents produced by Pn, Pn+1 behaves as Pn would expect, or: ∀d( d ∈ MLn → Bn+1(d) = Bn(d) )
- But for some documents accepted, but not produced by Pn, Pn does not behave as Pn+1 would, or: ∃d ( d ∈ MLn+1 ∧ d ∈ ELn ∧ d ∉ MLn ∧ BLn+1 ≠ BLn )
- New senders send documents to old receivers, which do not behave as new receivers would behave

- ∀d (d ∈ MLn+1 ∧ d ∈ ELn ∧ d ∉ MLn → ∃d'( d' ∈ MLn ∧ Bn(d') = Bn(d) )
- For every document d, accepted but not produced by Pn, there is a document d' for which Pn behaves the same as for d, or: d' ∈ SLn,d (the semantical equivalence set of d for Ln)

- We can assume Ln
**transforms document**d to document d'- If a language Ln accepts a document which it would not produce, it transforms it to a document which it could produce
- There has to be no literal transformation

- We now have: Bn(d') = Bn(d) ∧ Bn+1(d') ≠ Bn+1(d)
- Pn behaves the same for d and the transformed d'; but Pn+1 behaves differently

- Iff it is acceptable that Bn(d') = Bn(d) for transformed documents, then Pn
**may ignore**the semantical extension responsible for Bn+1(d') ≠ Bn+1(d) - The fact that Ln is syntactically forward compatible with Ln+1 enables Pn+1 to introduce new semantics without breaking existing Pn
- We could say Ln is
**partially semantically forward compatible**with Ln+1; it only ignores semantics it may ignore - New senders can interact with old receivers; old receivers may ignore what they do not understand
- Iff it is not acceptable that Bn(d') = Bn(d) for transformed documents, then Pn
**must understand**the semantical extension responsible for Bn+1(d') ≠ Bn+1(d)- If Pn must understand d, then Pn should not process d
- A Pn+1 must not send d to a Pn; or Pn should be adapted so that Pn rejects d (or: d ∉ ELn)
- If a Pn+1 can send d to a Pn, then Ln should not be (completely) syntactically forward compatible with Ln+1
- A processor should not accept what it must understand, yet cannot understand