Axioms of Versioning

Marc de Graauw, version 2, 12 September 2007
Previous version
Changes v2: added point 12, corrected typos

An attempt to define syntactical and semantical compatiblity of versions in a formal way. Much derives from the writings of David Orchard, especially the parts on syntactical forward and backward compatibility (though my terminology differs).

  1. Let U be a set of (specifications of) software processable languages {L1, L2, ... Ln}
    1. This axiomatization does not concern itself with natural language
  2. The extension of a language Lx is a set of conforming document instances ELx = {Dx1, Dx2, ... Dxn}
    1. Iff ELx = ELy then Lx and Ly are extensionally equivalent
      • Lx and Ly may still be different: they may define different semantics for the same syntactical construct
    2. Iff ELx ⊂ ELy then Lx is an extensional sublanguage of Ly
    3. Iff ELx ⊃ ELy then Lx is an extensional superlanguage of Ly
    4. D is the set of all possible documents; or the union of all ELx where Lx ∈ U
  3. For each Lx ∈ U there is a set of processors Px = {Px1, Px2, ... Pxn} which implement Lx
    1. Each Pxy exhibits behaviour as defined in Lx
    2. Processors can produce and consume documents
    3. Each Pxy produces only documents it can consume itself
    4. At consumption, Pxy may accept or reject documents
  4. The behaviour of a processor Pxy of language Lx is a function Bxy
    1. The function Bxy takes as argument a document, and it's value is a processor state
      • We assume a single processor state before each function execution, alternatively we could assume a <state, document> tuple as function argument
    2. If for two processors Pxy and Pxz for language Lx for a document d Bxy(d) = Bxz(d) then the two processors behave similar for d
      • Two processor states for language Lx are deemed equivalent if a human with thorough knowledge of language specification Lx considers the states equivalent. Details may vary insofar as the language specification allows it.
      • Processor equivalence is not intended to be formally or computably decidable; though in some cases it could be.
    3. If ∀d ( d &isin ELx → Bxy(d) = Bxz(d) ) then Pxy and Pxz are behaviourally equivalent for Lx
      • If two processors behave the same for every document which belongs to a language Lx, the processors are behaviourally equivalent for Lx.
  5. An ideal language specifies all aspects of desired processor behaviour completely and unambiguously; assume all languages in U are ideal
  6. A processor Px is an exemplary processor of a language Lx if it fully implements language specification Lx; assume all processors for all languages in U are exemplary
    1. Because they are (defined to be) exemplary, every two processors for a language Lx are behaviourally equivalent
    2. ELx = { d is a document ∧ Px accepts d }
    3. The complement of ELx is the set of everything (normally, every document) which is rejected by Px
    4. The make set MLx = { d is a document ∧ Px can produce d }
  7. A language Lx is syntactically compatible with Ly iff MLx ⊂ ELy and MLy ⊂ ELx
    1. A language Ln+1 is syntactically backward compatible with Ln iff MLn ⊂ ELn+1 and Ln+1 is a successor of Ln
      • A language change is syntactically backward compatible if a new receiver accepts all documents produced by an older sender.
    2. A language Ln is syntactically forward compatible with Ln+1 iff MLn+1 ⊂ ELn and Ln+1 is a successor of Ln
      • A language change is syntactically forward compatible if an old receiver accepts all documents produced by a new sender.
  8. A document d can be a member of the extension of any number of languages
    1. Px is an (exemplary) processor of Lx, Py is an (exemplary) processor of language
    2. Two languages Lx and Ly are semantically equivalent iff ELx = ELy ∧ ∀d ( (d &isin ELx ) → Bx(d) = By(d) )
      • If two languages Lx and Ly take the same documents as input, and their exemplary processors behave the same for every document, the languages are semantically equivalent.
      • Two languages can only be compared if their exemplary processors are similar enough to be compared.
      • Not every two languages can be compared.
      • "Semantic" should not be interpreted in the sense of "formal semantics".
  9. The semantical equivalence set of a document d for Lx = { y ∈ ELx | Bx(d) = Bx(y) }
    1. Or: SLx,d = { y ∈ ELx | Bx(d) = Bx(y) }
      • The semantical equivalence set of a document d is the set of documents which make a processor behave the same as d
      • Semantical equivalence occurs for expressions which are semantically equivalent, such as i = i + 1 and i += 1 for C, or different attribute order in XML etc.
    2. d ∈ SLx,d
    3. Any two semantical equivalence sets of Lx are necessarily disjunct
      • If z ∈ SLx,e were also z ∈ SLx,d then every member of SLx,e would be in SLx,d and vice versa and thus SLx,d = SLx,e
  10. A language Ly is a semantical superlanguage of Lx iff ∀d ( d ∈ MLx → By(d) = Bx(d) )
    1. For all documents produced by Px, Py behaves the same as Px
      • Equivalence in this case should be decided based on Lx; if Ly makes behavioural distinctions which are not mentioned in Lx, behaviour is still the same as far as Lx is concerned
    2. It follows: ∀d ( d ∈ MLx → ∃SLy,d ( SLy,d ⊂ ELy ∧ ( SLx,d ∩ MLx ) ⊂ SLy,d ∧ By(d) = Bx(d) ) )
      • For any document produced by Px, the part of its semantical equivalence set which Px can actually produce, is a subset of the semantical equivalence set of Py for this document
    3. For all d ∈ ELx ∧ d ∉ MLx there may be many equivalence sets in Ly for which By(d) ≠ Bx(d)
      • In other words: for documents accepted but not produced by Px, Ly may define additional behaviours
    4. Lx is a semantical sublanguage of Ly iff Ly is a semantical superlanguage of Lx
  11. A language Ln+1 is semantically backward compatible with Ln iff Ln+1 is a semantical superlanguage of Ln and Ln+1 is a successor of Ln
    1. An old sender may expect a newer, but semantically backward compatible, receiver to behave as the sender intended
    2. A language Ln is semantically forward compatible with Ln+1 iff Ln+1 is a semantical sublanguage of Ln and Ln+1 is a successor of Ln
    3. Semantic forward compatibility is only possible if a language loses semantics; i.e. its processors exhibit less functionality, and produce less diverse documents
    4. A processor cannot understand what it does not know about yet
  12. A language Ln+1 semantically extends Ln iff ∃d ( d ∈ MLn+1 ∧ d ∈ ELn ∧ BLn+1 ≠ BLn )
    1. Assume Ln is syntactically forward compatible with Ln+1, Ln+1 is semantically backward compatible with Ln, and Ln+1 semantically extends Ln
      • Pn accepts all documents produced by Pn+1, or: MLn+1 ⊂ ELn
      • For documents produced by Pn, Pn+1 behaves as Pn would expect, or: ∀d( d ∈ MLn → Bn+1(d) = Bn(d) )
      • But for some documents accepted, but not produced by Pn, Pn does not behave as Pn+1 would, or: ∃d ( d ∈ MLn+1 ∧ d ∈ ELn ∧ d ∉ MLn ∧ BLn+1 ≠ BLn )
      • New senders send documents to old receivers, which do not behave as new receivers would behave
    2. ∀d (d ∈ MLn+1 ∧ d ∈ ELn ∧ d ∉ MLn → ∃d'( d' ∈ MLn ∧ Bn(d') = Bn(d) )
      • For every document d, accepted but not produced by Pn, there is a document d' for which Pn behaves the same as for d, or: d' ∈ SLn,d (the semantical equivalence set of d for Ln)
    3. We can assume Ln transforms document d to document d'
      • If a language Ln accepts a document which it would not produce, it transforms it to a document which it could produce
      • There has to be no literal transformation
    4. We now have: Bn(d') = Bn(d) ∧ Bn+1(d') ≠ Bn+1(d)
      • Pn behaves the same for d and the transformed d'; but Pn+1 behaves differently
    5. Iff it is acceptable that Bn(d') = Bn(d) for transformed documents, then Pn may ignore the semantical extension responsible for Bn+1(d') ≠ Bn+1(d)
      • The fact that Ln is syntactically forward compatible with Ln+1 enables Pn+1 to introduce new semantics without breaking existing Pn
      • We could say Ln is partially semantically forward compatible with Ln+1; it only ignores semantics it may ignore
      • New senders can interact with old receivers; old receivers may ignore what they do not understand
    6. Iff it is not acceptable that Bn(d') = Bn(d) for transformed documents, then Pn must understand the semantical extension responsible for Bn+1(d') ≠ Bn+1(d)
      • If Pn must understand d, then Pn should not process d
      • A Pn+1 must not send d to a Pn; or Pn should be adapted so that Pn rejects d (or: d ∉ ELn)
      • If a Pn+1 can send d to a Pn, then Ln should not be (completely) syntactically forward compatible with Ln+1
      • A processor should not accept what it must understand, yet cannot understand