Survey of Actual Scope Use in Topic Maps


This is a small survey of how existing Topic Maps actually use scope. It is not rigorous science: the sample is small (17 Topic Maps) and probably not wholly representative, since Topic Maps made for demonstration purposes are over-represented. Nevertheless, the survey yields some (perhaps unsurprising) results which my gut feeling tells me are not too far from the truth.

Almost 30 % of the Topic Maps (5 of the 17 in the sample) do not use scope at all. Of the Topic Maps that do use scope (12 in the sample), about 60 % use it for the natural language of names, about 40 % use it for controlled vocabularies, and 33 % use it for association name direction. (Since a Topic Map may use scope in more than one way, the percentages do not add up to 100 %.) Other uses of scope are incidental.

Method and results

I have taken as a sample the Topic Maps listed by Jan Algermissen (Publicly Available Topic Maps). For each of those Topic Maps I have looked at the way scope is used. I skipped the first two, which are part of the standard. I also skipped those whose links were broken (or whose servers were down when I conducted the survey). When a Topic Map actually consists of multiple Topic Maps which are merged, I have regarded them as a single Topic Map. I may have missed some uses of scope, or failed to distinguish between different uses of scope which look similar. Geir Ove Grønmo and Steve Pepper have done similar research in a qualitative way; for a solid theoretical background, see their article Towards a General Theory of Scope.

Scope uses

I first distinguished the different uses of scope encountered. The classification is deliberately low-level: if two scope uses could be meaningfully distinguished, I have done so. Several of the listed scope uses could easily end up in the same category in a higher-level classification. Each use is listed below with the abbreviation used in the results.

natural language (lang): Distinguish names of topics by language (e.g. "Rome" is the English name for topic Roma, "Roma" is the Italian name for topic Roma).
controlled vocabulary (voc): Names in a controlled vocabulary must be unique (e.g. ISO 2-letter language codes like "NL" and "EN" are unique within this vocabulary).
name type (type): Distinguish name types ("full name", "short name" et cetera).
occurrence validity (occ): Distinguish uses of an occurrence ("online" versus "offline").
association name direction (dir): Display a different name for an association type depending on context ("born in" versus "birthplace of").
disambiguation (tnc): Disambiguate topics with the same name for the Topic Naming Constraint ("Paris, the hero" versus "Paris, the city").
display of base name (disp): Display a different name depending on context.
association validity (val): Limit the validity of associations (e.g. "CustomerName" maps onto "client_name" within context "Sales/Europe").
email address (eml): Email addresses are unique.
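
To make two of these uses concrete, here is a minimal Python sketch of scoped names. It is not taken from any of the surveyed Topic Maps and is not a Topic Maps API: the class ScopedName, the function names_in_context and the theme names ("english", "italian", "person", "place") are all hypothetical, and the rule that a name applies when all of its themes occur in the context is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScopedName:
    """A name plus the themes (scope) in which it is considered valid."""
    value: str
    scope: frozenset = frozenset()  # empty scope: valid in any context


def names_in_context(names, context):
    """Return the names whose themes all occur in the given context.

    Assumption: a name applies only when every one of its themes
    is present in the context.
    """
    context = frozenset(context)
    return [n.value for n in names if n.scope <= context]


# "lang": names of the topic Roma, distinguished by natural language.
roma_names = [
    ScopedName("Rome", frozenset({"english"})),
    ScopedName("Roma", frozenset({"italian"})),
]

# "dir": names of one association type, distinguished by the role
# from which the association is viewed (hypothetical themes).
born_in_names = [
    ScopedName("born in", frozenset({"person"})),
    ScopedName("birthplace of", frozenset({"place"})),
]

print(names_in_context(roma_names, {"italian"}))
print(names_in_context(born_in_names, {"place"}))
```

With these definitions the first call selects only the Italian name and the second only "birthplace of"; a name with an empty scope would be selected in every context.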

Full results

Topic Map                                                          lang   voc    type   occ    dir    tnc    disp   val    eml    total
Number of Topic Maps using scope in this way                       7      5      2      1      4      1      1      2      1      24
Percentage of scope-using Topic Maps (12) using scope in this way  58 %   42 %   17 %   8 %    33 %   8 %    8 %    17 %   8 %
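
The percentages can be recomputed from the counts with a few lines of Python. This is only a sketch for checking the arithmetic: the counts, the sample size of 17 and the 12 Topic Maps using scope come from this survey, while the variable names and the rounding to whole percentages are my own choices.

```python
# Number of Topic Maps per scope use, as listed in the results.
counts = {
    "lang": 7, "voc": 5, "type": 2, "occ": 1, "dir": 4,
    "tnc": 1, "disp": 1, "val": 2, "eml": 1,
}
sample_size = 17   # Topic Maps surveyed
using_scope = 12   # Topic Maps that use scope at all


def pct(part, whole):
    """Percentage of part in whole, rounded to a whole number."""
    return round(100 * part / whole)


# Share of each scope use among the Topic Maps that use scope.
percentages = {use: pct(n, using_scope) for use, n in counts.items()}

# Share of the whole sample that does not use scope.
no_scope = pct(sample_size - using_scope, sample_size)

# Total number of scope uses; it exceeds 12 because one Topic Map
# may use scope in more than one way.
total_uses = sum(counts.values())

print(percentages)
print(no_scope, total_uses)
```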

Note: I think the use of scope to avoid merging under the Topic Naming Constraint (TNC) is probably under-represented in this sample.

Copyright Marc de Graauw 2002. The right is hereby given to all to reproduce and distribute this work in its entirety as long as the authorship of Marc de Graauw is recognized and this copyright notice is included.