The #referent Convention

Update: I learned from the TAG list that Dan Connolly already proposed using #it or #this for the same purpose, and Tim Berners-Lee proposed using #i to refer to oneself in a similar way. My idea therefore was not very original, and since I regularly read the TAG list and similar sources, it’s even possible I read the idea somewhere and (much) later thought of it as one of my own – though if this happened it was certainly unintentional.

There is a very simple solution to the entire hash-versus-slash debate: whenever you would want to identify anything with a hashless URI, suffix it with #referent. The meaning of x#referent is: I identify whatever x is about. And x is simply an information resource (about x#referent).
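To make the convention concrete, here is a minimal Python sketch of how software might interpret a URI under it. The example.org URIs and the function name `interpret` are made up for illustration; only the `#referent` fragment itself comes from the proposal above.

```python
from urllib.parse import urldefrag

REFERENT = "referent"  # the fragment name proposed in this post


def interpret(uri):
    """Return (kind, document_uri) for a URI under the #referent convention."""
    # urldefrag splits "x#frag" into the hashless part x and the fragment.
    document, fragment = urldefrag(uri)
    if fragment == REFERENT:
        # x#referent identifies whatever the document x is about.
        return ("referent", document)
    if fragment == "":
        # A hashless URI is simply an information resource.
        return ("information resource", document)
    # Any other fragment is outside the convention.
    return ("other fragment", document)


# Hypothetical URIs, for illustration only.
print(interpret("http://example.org/people/alice#referent"))
print(interpret("http://example.org/people/alice"))
```

In both calls the second element is the same hashless URI: the document to retrieve in order to learn about the referent.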

The httpRange-14 debate is about what hashless URIs (those without a #) refer to: can they refer only to documents (information resources), or to anything, e.g. persons, cars or concepts? Is it meaningful to say refers to ‘Marc de Graauw’? Or does it now identify both a web page and a person, and is this meaningful and/or desirable?

Hash URIs aren’t thought to be much of a problem in this respect. They have some drawbacks, however. It may be desirable to retrieve an entire information resource which describes what the referent of the URI is. And putting all identifiers in one file makes that file large. Norman Walsh did this: identifies Norman Walsh, and the ‘who’ file got big. So Norm switched to hashless URIs: identifies Norman Walsh. The httpRange-14 solution requires Norm to answer a GET on this URI with a 303 redirect, in this case to, which does not identify Norman, but simply is an information resource.
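For comparison, the 303 behaviour that httpRange-14 asks of a publisher can be sketched roughly as follows. This is an illustration only: the paths and the `DESCRIBED_BY` table are hypothetical and are not Norm's actual URIs or server setup.

```python
# Hypothetical mapping from hashless "thing" URIs to the documents about them.
DESCRIBED_BY = {"/who/alice": "/who/alice.html"}


def respond(path):
    """Return (status, headers) for a GET on the given path."""
    if path in DESCRIBED_BY:
        # The URI identifies a non-information resource:
        # redirect to a document that describes it.
        return 303, {"Location": DESCRIBED_BY[path]}
    if path in DESCRIBED_BY.values():
        # The URI identifies the describing document itself.
        return 200, {"Content-Type": "text/html"}
    return 404, {}


print(respond("/who/alice"))
print(respond("/who/alice.html"))
```

A hash URI never needs this server-side logic, because the client strips the fragment before the GET is sent.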

If we use the #referent convention, I can say: identifies me. And is simply an information resource, which is about me. Problem solved.

If I put in a browser, I will simply get the entire resource, which is a human-readable resource about Semantic Web software which understands the #referent convention will know refers to a non-information resource (except when web pages are about other web pages) and is simply an information resource. Chances of collision of the #referent fragment identifier are very small (apart from Semantic Web jokers who do this intentionally), and even in the case of collision with existing #referent fragment identifiers the collision seems pretty harmless. The only thing the #referent convention does not solve is all the existing hashless URIs out there which (are purported to) identify non-information resources.

In Semantic Web architecture, there is never any need for hashless URIs. The #referent convention is easier, is more explicit about what is meant, and retrieves a nice descriptive human-readable information resource in a browser, along with all necessary RDF metadata for Semantic Web applications.

2 Replies to “The #referent Convention”

  1. The main problem with this solution is that it violates the principle that URIs should be opaque. That is, you have to look at the syntactic form of the URI to work out what it refers to.

    It’s better than httpRange-14, though. But then, everything is better than httpRange-14, including no solution, because when we had no solution, at least we agreed we had no solution…

  2. Lars,

    “… this solution … violates the principle that URIs should be opaque.”

    There is no principle that URIs should be opaque. As a URI-publisher I am free to define whatever non-opaque URI scheme I want, so I can say
    and tell users to fill in whatever city they want.

    where it says: “Web software MUST NOT depend on the correctness of metadata inferred from a URI, except when the encoding of such metadata is documented by applicable standards and specifications. Such standards and specifications include … documentation provided by the URI assignment authority.”
