controlled vocabularies in facility reg spec

bobj · October 5, 2012, 3:42pm

I'm opening up this point Justin made earlier into its own thread
because I think it's important enough that we should be able to find
this discussion again.

One issue that has not been addressed, either here or in any earlier
incarnation of FRED, is controlled vocabularies. So long as we are
dealing with simple strings it is easy enough to express validity, but
for some elements we will want to restrict the value to a selection
from a controlled list.

Reproducing Justin's comment below:

<!-- BEGIN

One suggestion would be restricting the allowed values for particular
fields to a set of semantic values (this is quite restrictive and may
reduce the "permissiveness" of the schema):

  <xs:element name="status">
    <xs:simpleType>
      <xs:restriction base="xs:NCName">
        <xs:enumeration value="open">
          <xs:annotation>
            <xs:documentation>The facility is in the OPEN state which means ... </xs:documentation>
          </xs:annotation>
        </xs:enumeration>
        <xs:enumeration value="closed"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>

Another option/suggestion would be using a type akin to HL7's CX
datatype (example: 1234-5^^^LN which says code 1234-5 from LOINC)
where a code and coding scheme could be selected:

  <xs:element name="status" type="fac:codeValue"/>
  <xs:complexType name="codeValue">
    <xs:attribute name="value" type="xs:NCName">
      <xs:annotation>
        <xs:documentation>The value of the codified data</xs:documentation>
      </xs:annotation>
    </xs:attribute>
    <xs:attribute name="scheme" type="xs:anyURI">
      <xs:annotation>
        <xs:documentation>The codification scheme from which the value was selected</xs:documentation>
      </xs:annotation>
    </xs:attribute>
  </xs:complexType>

This is more permissive, closely aligned with HL7 and would allow
instances to (in the future) translate codes that are not understood
locally.

END -->

As per previous discussion, <status> is probably not the best exemplar
here. The two that come to mind as being pretty universally
implemented, in our experience across many countries, are Type
(clinic, health post, referral hospital, etc.) and Ownership (public,
private, NGO/faith-based, etc.).

I like Justin's second suggestion, but I worry that whilst it works
well for well-known codesets, the allowable values for Type and
Ownership as referred to above would almost always be locally defined.
We are not likely to find a LOINC-like equivalent, and there is IMHO no
point in trying to define a universal list of health facility types,
for example, and then trying to get countries to fit into it. Each time
you try, you are going to come up with something unexpected.

I am thinking (out loud) that any facility registry would have to
serve up such vocabulary lists somewhere at a URL. For example in
DHIS2 you would find them at (the not-particularly-intuitive)
http://apps.dhis2.org/demo/api/organisationUnitGroupSets.
ResourceMapper, I think (Ed, correct me), provides them through access
to the Layer metadata. So one could get most of the way to a solution
by utilizing Justin's suggestion above, e.g.:

<ownership value="Public" scheme="<some-base-url>/api/vocab/ownership" />

Though I think this might be an abuse of the URI intended in his
example (which is more about identifying a well-known scheme than
locating its possible values).
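
To make the identify-versus-locate distinction concrete, here is a rough sketch of how Justin's codeValue type could keep the two roles apart. This is only an illustration under my own assumptions: the "href" attribute, the urn:example namespace and the use="required" settings are hypothetical and not part of Justin's proposal or of any agreed spec.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:fac="urn:example:facility"
           targetNamespace="urn:example:facility"
           elementFormDefault="qualified">

  <xs:element name="ownership" type="fac:codeValue"/>

  <!-- Hypothetical variant of the codeValue type quoted above. -->
  <xs:complexType name="codeValue">
    <!-- The code itself, e.g. "Public". -->
    <xs:attribute name="value" type="xs:NCName" use="required"/>
    <!-- Identifies the scheme; it does not have to be resolvable. -->
    <xs:attribute name="scheme" type="xs:anyURI" use="required"/>
    <!-- Assumed addition: where the list of allowed values might be fetched. -->
    <xs:attribute name="href" type="xs:anyURI" use="optional"/>
  </xs:complexType>

</xs:schema>
```

An instance would then carry a stable scheme identifier plus, optionally, a location, along the lines of value="Public" scheme="urn:example:vocab:ownership" href="http://registry.example.org/api/vocab/ownership" (both URIs made up for illustration).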

And we are still left with the problem of defining how such codelists
should be represented in a predictable way to consumers. This might
not be too difficult a problem. Either we make use of something
existing, such as SDMX codelists (which I don't really like, but quite
a few of our systems "know" about them) or OASIS genericode (which is
kind of cool but maybe too much of the kitchen sink), or we define some
simple format which meets our immediate use cases. I think I am in
favour of the latter, with the proviso that we build in something like
Justin's "scheme" mechanism so that we can decouple at any stage
in the future.
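
As a strawman for what such a simple format might look like, here is a minimal sketch of a codelist document for Ownership. Everything in it is an assumption for discussion: the element names (codelist, code, label), the urn:example namespace and the example.org URL are invented, and the codes are just the examples mentioned earlier in this mail.

```xml
<!-- Hypothetical codelist document; all names are illustrative only. -->
<codelist xmlns="urn:example:facility:vocab"
          name="Ownership"
          scheme="http://registry.example.org/api/vocab/ownership">
  <!-- Each code carries the machine value plus a human-readable label. -->
  <code value="Public">
    <label>Public</label>
  </code>
  <code value="Private">
    <label>Private</label>
  </code>
  <code value="NGO">
    <label>NGO / faith-based</label>
  </code>
</codelist>
```

The scheme attribute on the list would be the same URI that facility records put in their own scheme attribute, so a consumer can match a coded value to its list without caring where the list was fetched from. That is the decoupling I have in mind.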

Mind you, before I start thinking that this is all very clever, it
doesn't make sense for consumers to go looking up external codelists
at the point at which they are reading the facility data. A consumer
would really need to "prime" itself with the metadata available from
the facility registry (including controlled vocabularies and anything
else in the extended namespace) before it involves itself in facility
data exchange.

For the moment, and because I really have some other less interesting
things to do, I am going to stick my head back into the sand on this
one. So long as there are no controlled vocabularies in the
current facility namespace elements we can hold the problem at bay for
a bit. If there are controlled vocab elements in the extended area,
how they are resolved is something of a local arrangement,
which might be ok to get us started. I can think of a few approaches,
but they all risk getting horribly complicated.

I'll sit back and see if anyone has any other good suggestions to
offer on this thread for now.

Have a great weekend,
Bob