georss and friends

bobj · August 10, 2012, 2:02pm

Hi Matt and all

After yesterday's call I went off and studied the rss and georss spec(s) in
some more detail last night in order to understand better what is
allowed, what is required etc. As I mentioned yesterday I am
currently more interested in the xml spec for representing a facility
than the geojson one as we will use xml for transforming and importing
data. I think if we can properly specify a facility representation in
xml, it is easy to 'port' that over to json, so I have suggested
focussing on doing this authoritatively first.

Anyway ....

1. what confused me a bit was the Atom namespace declaration in the
example. I've removed that.

2. the rss spec http://www.rssboard.org/rss-specification explains
its extension points quite clearly: "A RSS feed may contain elements
and attributes not described on this page, only if those elements and
attributes are defined in a namespace". I wish all specs were this
straightforward. As long as we define a namespace for our stuff we
can do most things without fear of falling foul of the RSS spec.

3. georss provides us with some geographic elements which we can
include in our rss feed items. Well, it's a bit confusing because it
actually refers to three different geographic schemas (georss simple,
georss gml and georss - the w3c version). If we are writing a
facility spec which can be reasonably implemented then we need to be
very clear here. The example on the google doc refers to the
namespace of the older w3c version. I will let the GIS gurus guide
us on whether this is intentional, desirable etc. Either way we need
to specify which of these geo elements are required to be understood
by consumers. Personally I suspect our requirements (to represent
points and polygons for facilities and health administrative zones)
might be more than adequately met by the point and polygon elements of
http://georss.org/simple. It certainly looks the least clunky. GIS
people please chip in ...
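
To give a feel for how little a consumer has to do for georss simple,
here is a minimal Python sketch. It assumes the GeoRSS Simple namespace
http://www.georss.org/georss and its "lat lon" whitespace-separated
encoding for both point and polygon. The sample item, its polygon
coordinates and the helper names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# GeoRSS Simple namespace. The sample item below is illustrative only.
GEORSS = "http://www.georss.org/georss"

SAMPLE = """\
<item xmlns:georss="http://www.georss.org/georss">
  <title>Sunshine Hospital</title>
  <georss:point>-1.901301 30.418267</georss:point>
  <georss:polygon>-1.90 30.41 -1.90 30.43 -1.92 30.43 -1.90 30.41</georss:polygon>
</item>
"""


def parse_coords(text):
    """Turn 'lat lon lat lon ...' into a list of (lat, lon) float pairs."""
    values = [float(v) for v in text.split()]
    if len(values) % 2:
        raise ValueError("expected an even number of coordinates")
    return list(zip(values[0::2], values[1::2]))


def read_geometry(item):
    """Return (point, polygon) for an item; either is None when absent."""
    point_el = item.find(f"{{{GEORSS}}}point")
    polygon_el = item.find(f"{{{GEORSS}}}polygon")
    point = parse_coords(point_el.text)[0] if point_el is not None else None
    polygon = parse_coords(polygon_el.text) if polygon_el is not None else None
    return point, polygon


point, polygon = read_geometry(ET.fromstring(SAMPLE))
print(point)
print(len(polygon), "vertices, closed ring:", polygon[0] == polygon[-1])
```

The same two-function reader covers facilities (points) and
administrative zones (polygons), which is part of why the simple
encoding looks attractive.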

4. despite the fact that we have a geographic/spatial oriented tool
acting as reference implementation, the geo part of specifying a
facility is in fact a very small part of the task in hand. What we
need to define are the elements which represent the facility infoset
which we are exchanging, i.e. those bits which are currently under
<properties> in the resourcemapper namespace.

5. I think a first step is to define a 'proper' namespace for these -
if we were working through oasis or w3c then we could piggyback off
one of their urls. It's not beyond the bounds of reason that we set up
our technical committee in one of those organisations, which is
something I would be in favour of. There is some cost involved but it
removes a lot of burden in terms of establishing proper process and
creating legitimacy. In my experience working in Oasis, the benefits
far outweigh the cost. Otherwise we should register something like
http://healthfacilityregistry.org and work off that.

6. Using that namespace, we are then free to define which elements
shall/must appear in a facility infoset, which elements should appear,
and how to incorporate country specific augmentation and how to
implement application specific augmentation (extension points).

7. To maximize interoperability it is desirable that we have some
minimum set of elements which are required. RSS 2.0 provides few
constraints: "All elements of an item are optional, however at least
one of title or description must be present." I would suggest that
our "item" should look something like:

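<rss version="2.0"
     xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#"
     xmlns:fac="http://healthfacilityregistry.org">
  <!-- the fac namespace uri is only the placeholder suggested in point 5 -->
  <channel>
    <title>Rwanda Health Facilities</title>
    <item>
      <title>Sunshine Hospital</title>
      <!-- RSS 2.0 dates use the RFC 822 format -->
      <pubDate>Wed, 28 Mar 2012 07:55:56 +0000</pubDate>
      <link>http://resmap-stg.instedd.org/api/features/17442.rss</link>
      <fac:facility>
         <fac:name>...</fac:name>
         <geo:lat>-1.901301</geo:lat>
         <geo:long>30.418267</geo:long>
         ... other fac elements
      </fac:facility>
    </item>
  </channel>
</rss>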

Having a self contained <facility> element within the feed has the
advantage that the same element can be reused in different envelopes.
Here we define its use within an RSS 2.0 feed but it can stand on
its own feet. I've continued to use the w3c version of georss
namespace but would be equally happy to move to georss simple - which
looks like it defines a point element rather than lat and long.
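
As a rough illustration of the "self contained" point, here is a
minimal Python sketch of a consumer that reads <fac:facility> without
caring about the envelope around it. The geo namespace is the w3c one;
the fac namespace uri is only the placeholder from point 5, and the
function names and the filled-in facility name are assumptions of mine.

```python
import xml.etree.ElementTree as ET

NS = {
    # W3C Basic Geo vocabulary (the "older w3c version")
    "geo": "http://www.w3.org/2003/01/geo/wgs84_pos#",
    # Placeholder only: no fac namespace has been agreed or registered
    "fac": "http://healthfacilityregistry.org",
}

FEED = """\
<rss version="2.0"
     xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#"
     xmlns:fac="http://healthfacilityregistry.org">
  <channel>
    <title>Rwanda Health Facilities</title>
    <item>
      <title>Sunshine Hospital</title>
      <fac:facility>
        <fac:name>Sunshine Hospital</fac:name>
        <geo:lat>-1.901301</geo:lat>
        <geo:long>30.418267</geo:long>
      </fac:facility>
    </item>
  </channel>
</rss>
"""


def facility_to_dict(facility):
    """Read one fac:facility element, ignoring whatever encloses it."""
    return {
        "name": facility.findtext("fac:name", namespaces=NS),
        "lat": float(facility.findtext("geo:lat", namespaces=NS)),
        "long": float(facility.findtext("geo:long", namespaces=NS)),
    }


def facilities_in_feed(xml_text):
    """Collect every fac:facility found below the document root."""
    root = ET.fromstring(xml_text)
    return [facility_to_dict(f) for f in root.iterfind(".//fac:facility", NS)]


facilities = facilities_in_feed(FEED)
print(facilities)
```

Because facility_to_dict only looks inside the element it is given,
the same function would work if the facility arrived in some other
envelope than RSS.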

8. What are the other fac elements and how should we define them? I
think there are a few time-tested ones like <fac:openingDate>,
<fac:contactPerson>, <fac:contactNumber>, <fac:operationalStatus> etc.
I would suggest that we gather together as many real world examples
from the field as possible, list them on the wiki and decide which are
sufficiently common to standardize their use - making a clear
distinction between mandatory (very few) and optional elements.

9. context specific extension elements. There are a number of ways
to handle these. One is through an inheritance mechanism where, for
example, a Rwanda facility schema would extend the base facility
schema and declare a Rwanda specific namespace. The other is to
provide type information along with the element, eg:

     <fac:facility>
         <fac:name>...</fac:name>
         <geo:lat>-1.901301</geo:lat>
         <geo:long>30.418267</geo:long>
         ....
         <fac:property name='Number of beds' type='xsd:integer'>32</fac:property>
         <fac:property name='Alternative contact' type='xsd:string'>Bob Jolliffe</fac:property>
     </fac:facility>
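
To show what the second option asks of a consumer, here is a minimal
Python sketch that turns the typed <fac:property> elements into native
values. It treats the type attribute as a plain string rather than
resolving it as a QName, falls back to a string for unknown types, and
again uses the placeholder fac namespace uri. The converter table and
function name are my own assumptions.

```python
import xml.etree.ElementTree as ET

FAC = "http://healthfacilityregistry.org"  # placeholder namespace uri

# Converters keyed by the literal value of the type attribute.
# Only a few XML Schema types are covered; anything else stays a string.
CONVERTERS = {
    "xsd:integer": int,
    "xsd:string": str,
    "xsd:boolean": lambda s: s.strip() in ("true", "1"),
}

SAMPLE = """\
<fac:facility xmlns:fac="http://healthfacilityregistry.org">
  <fac:property name="Number of beds" type="xsd:integer">32</fac:property>
  <fac:property name="Alternative contact" type="xsd:string">Bob Jolliffe</fac:property>
</fac:facility>
"""


def read_properties(facility):
    """Return {property name: converted value} for one facility element."""
    props = {}
    for el in facility.iterfind(f"{{{FAC}}}property"):
        convert = CONVERTERS.get(el.get("type"), str)
        props[el.get("name")] = convert(el.text or "")
    return props


props = read_properties(ET.fromstring(SAMPLE))
print(props)
```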

10. Coded (controlled vocabulary) extension elements - eg.
'ownership' one of 'private', 'public', 'faith based' etc. These are
tricky because we need an external mechanism for communicating the
codelists. Having an inheritance based schema as indicated above
could solve this. Otherwise there needs to be an api method to
retrieve this metadata. Matt's xforms suggestion is an alternative
(albeit roundabout) way of describing the constraints without using a
schema which could also work. If we do this we would also not need
the 'type' attribute for the other <property> elements above. Mind
you, if you are going to generate an xform for the purpose of
describing constraints, you might just as well generate a schema.

11. Identifiers - facility identifiers provide one of the most
important proposed functionalities of the registry as an identity
broker and we need to be clear how we are to handle this. In general
we tend to find that there might be any number of formal identifiers
and any number of application specific internal ids, uuids, URIs etc
in existence for a particular facility. Even within DHIS2 we maintain
3 - the internal database primary key, a randomly generated 14 character
uid and an arbitrary string <code> which is used for mapping against
an external identifier, typically a formal facility code such as
provided by MFL in Kenya or the FOSA id in Rwanda. So we may need
something like (off the cuff):

     <fac:id>xxxx</fac:id>
     <fac:aliases>
       <fac:alias context='fosa'>32</fac:alias>
       <fac:alias context='dhis'>ygyyt87y</fac:alias>
       <fac:alias context='hr'>21</fac:alias>
     </fac:aliases>
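
A minimal Python sketch of the identity broker idea, assuming the
markup above: it builds a lookup from (context, alias) to the registry
<fac:id>. The fac namespace uri is the placeholder from point 5, and
the function name and the choice to reject conflicting aliases are
assumptions of mine rather than agreed behaviour.

```python
import xml.etree.ElementTree as ET

NS = {"fac": "http://healthfacilityregistry.org"}  # placeholder namespace uri

SAMPLE = """\
<fac:facility xmlns:fac="http://healthfacilityregistry.org">
  <fac:id>xxxx</fac:id>
  <fac:aliases>
    <fac:alias context="fosa">32</fac:alias>
    <fac:alias context="dhis">ygyyt87y</fac:alias>
    <fac:alias context="hr">21</fac:alias>
  </fac:aliases>
</fac:facility>
"""


def index_aliases(facilities):
    """Map (context, alias value) to the registry id of the facility."""
    index = {}
    for facility in facilities:
        fac_id = facility.findtext("fac:id", namespaces=NS)
        for alias in facility.iterfind("fac:aliases/fac:alias", NS):
            key = (alias.get("context"), alias.text)
            # The same alias in the same context must not name two facilities.
            if index.get(key, fac_id) != fac_id:
                raise ValueError(f"alias {key} maps to more than one facility")
            index[key] = fac_id
    return index


index = index_aliases([ET.fromstring(SAMPLE)])
print(index[("fosa", "32")])
```

An application that only knows its own identifier can then ask the
registry "which facility is fosa 32?" and get back the shared id.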

12. Other relationships between facilities (hierarchies). We are
agreed that there is a need to express hierarchical relationships. In
dhis2 we incorporate entities like the district office, provincial
office, national office etc as special types of facility. What is
special from a geographical perspective is that we are more interested
in the polygon than the point. Other than that they are coded the same
way. Hierarchical relationships between facilities are most easily
represented by incorporating parent links within facility markup.
In dhis we just have one, but there is no reason why there could not
be an arbitrary number. eg.

<fac:parent context='admin'>56</fac:parent>
<fac:parent context='MVP ngo'>108</fac:parent>

The contents of this element might map against the <fac:id> referred to above.
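
To sketch how a consumer might walk such links, here is a minimal
Python example that follows the parent chain within one context. The
<facilities> wrapper element, the extra facilities with ids 1, 56 and
108, the function names and the placeholder fac namespace uri are all
assumptions for illustration.

```python
import xml.etree.ElementTree as ET

NS = {"fac": "http://healthfacilityregistry.org"}  # placeholder namespace uri

# The <facilities> wrapper is hypothetical; it just holds several facilities.
SAMPLE = """\
<facilities xmlns:fac="http://healthfacilityregistry.org">
  <fac:facility><fac:id>1</fac:id></fac:facility>
  <fac:facility>
    <fac:id>56</fac:id>
    <fac:parent context="admin">1</fac:parent>
  </fac:facility>
  <fac:facility><fac:id>108</fac:id></fac:facility>
  <fac:facility>
    <fac:id>17442</fac:id>
    <fac:parent context="admin">56</fac:parent>
    <fac:parent context="MVP ngo">108</fac:parent>
  </fac:facility>
</facilities>
"""


def parent_links(root):
    """Map (facility id, context) to the id of the parent facility."""
    links = {}
    for facility in root.iterfind("fac:facility", NS):
        fac_id = facility.findtext("fac:id", namespaces=NS)
        for parent in facility.iterfind("fac:parent", NS):
            links[(fac_id, parent.get("context"))] = parent.text
    return links


def ancestors(fac_id, context, links):
    """Follow parent links upwards in one context, nearest parent first."""
    chain = []
    seen = {fac_id}  # guards against accidental cycles in the data
    current = links.get((fac_id, context))
    while current is not None and current not in seen:
        chain.append(current)
        seen.add(current)
        current = links.get((current, context))
    return chain


links = parent_links(ET.fromstring(SAMPLE))
print(ancestors("17442", "admin", links))
print(ancestors("17442", "MVP ngo", links))
```

Keeping the context in the lookup key means the administrative
hierarchy and an ngo's own grouping can coexist without interfering.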

------------------------

This has turned into a much longer email than expected and I should
probably post it somewhere on the wiki instead. I am on holiday for
the next two weeks but happy to engage when I get back, specifically on
knocking up an authoritative schema for fac:facility.

Best
Bob