Implementing SOS 1.0.0 over large collection of sensors.

This document is to clarify an issue that we knew existed in the current SOS 1.0 spec and a way to address it in a short time frame (until we get to SOS 2.0). Obviously, this might imply that current implementation of SOS 1.0.0 be altered or mediated someway or even force implementation from scratch.

The philosophy of a SOS is to offer large datasets over a small collection of Sensors. This has ramification on how the services interacts with the user. In some situation, such as groundwater levels observed in water wells; We are in the inverse situation; We have a very large collection of "sensor" where very few measurements have been collected. This is at least an issue for the dataset we have in Canada (it will be different when we integrate more traditional sensor later in the project). Looking at the USGS WMS layers, I think they have a similar problem regarding the number of "Sensors", so this discussion would still hold.

After further reading of SOS, O&M and SensorML, there is an ambuiguity in what 'procedure' stands for. The UML model of 06-009r6 show that procedure property goes to Process. Although I did not find a place in O&M that says the Process is in fact a SensorML process, the list of example seems to suggest it. SensorML process can either be 'physical' or 'pure'. Physical process are sensors (a gizmo) while 'pure' are just methodology. The canadian dataset so far are made of 'pure' Process (ProcessModel in SensorML, see 8.9.1 of OGC 07-000 Sensor Model Language). One source of confusion when we tried to figure this out is that Deegree3 implementation assumes that all procedure are physical processes and insists that a list of those sensors be configured manually (including their location). Furthermore, Deegree 3 SOS implementation uses the sensor location to filter geographically (while it should be the feature of interest), further confusion the issue.

  • O&M extract diagram from SOS 1.0 spec:
    om.jpeg

Addendum: -- EricBoisvert - 27 Nov 2009

From my understanding of O&M, procedure can be a SensorML encoding or any 'Process' model you choose - actually similar to what we discussing the other day about use of features types. In a sense it is a union saying we have a sensor/process model if you like but you can specify your own and drop it inside a om:Process element.

So the Degree implementation uses the sensor location for spatial filtering of GetObservation calls? This seems strange as the spec is pretty clear that the filter is for the FoI; but its understandable if you are taking a sensor viewpoint which often is the battle within the SOS definitions.

-- PeterTaylor - 29 Nov 2009

This also offers an interesting challenge of merging very different dataset about the same "phenomenon" (groundwater level)

SOS logic

The first issue is the way that SOS presents the information to the client application. The first piece of information a client application gets from the service is the GetCapabilities document. This document describes a list of offerings, which are packages of data from a collection of sensors. The offerings describe what is being measured (observedProperty) about which feature (the featureOfInterest) and what procedures is used (a sensor). A procedure in this case is a sensor instance. The immediate problem here is that we have over half a millions "sensors" that would need to be reported in the GetCapabilities document. Size is one issue; the procedure reference is normally a URN that is a ID that can be used in the DescribeSensor , so we are looking into a 60-100 characters items including the tags, materializing in a 30-60 Mb file just for Ontario. More than just a document size issue, this kind of list is simply unmanageable in a user interface perspective.

The other issue is that BBOX query is made against the featureOfInterest, not the Sensor. Therefore, what you are filtering is the thing you want observation about.

There is a clear distinction between those featureOfInterest and Sensor, although in our case it might end up being the same 'location'. The Sensor is the gizmo that makes measurement and the featureOfInterest is the thing that we are making measurement about. For groundwater level measurements, we obviously make measurement of the water body that is located in the geologic unit (the aquifer), but many hydrogeologists would argue that the water level is the water level of the well, not the aquifer. There are artefacts related to the well that cannot be ignored when making such a measurement and it's not a direct measurement of the water level of the aquifer. Therefore, and its covered by the O&M spec, a sa:SamplingFeature can be the featureOfInterest of a measurement and the SamplingFeature is a proxy for the real thing it samples.

Observation -> featureOfInterest -> SamplingFeature -> sampledFeature -> Aquifer

Gwml:Water wells are SamplingFeatures and we measure the water level using a very primitive sensor called a "measuring tape" or a high tech version that 'beeps' when it touches water.

So the complete picture for Canada

USA has more sophisticated Sensors, but they still have heaps of them.

Leaves the Observation. WaterML 2.0 has a subtype of Observation called "WaterMonitoringObservation". It is hardly monitored in our case so we should either propose that a new Water observation sub-type created, such as "DiscreteWaterObservation" or just use the generic Observation.

Yes fair point, the naming needs work and I'll put this up for discussion.

Proposed solution

As far as SOS 1.0.0 is concerned, it simply does not fit our requirement (speaking strictly for Canada and for our water well database). The solution here would be to 'bend' the logic of SOS a bit.

By considering this :

  • "Sensor" (the procedure section of the GetCapabilities) could be used to designate the whole water well collection as a single Sensor
  • featureOfInterest can be the whole basin or the whole hydrogeologic region.

This is consistent with the BBOX query, because featureOfInterest are what is being filtered The spec says in Table 2.0 (p.24) about feature of interest.

Features or feature collections that represent the
identifiable object(s) on which the sensor systems
are making observations. In the case of an in-situ
sensor this may be a station to which the sensor is
attached representing the environment directly
surrounding the sensor. For remote sensors this
may be the area or volume that is being sensed,
which is not co-located with the sensor. The
feature types may be generic Sampling Features
(see O&M) or may be specific to the application
domain of interest to the SOS. However, features
should include spatial information (such as the
GML boundedBy) to allow the location to be
harvested by OGC service registries.

Issues for GetCapabilities

The "bend" we make is to report in the GetCapabilities a whole collection and report a particular instance (a well) in the GetObservation document.

For Sensor, we report a whole collection and the ID that is reported here cannot be used in a DescribeSensor to get a single well. But in our case, describe sensor can return a description of the whole collection in general terms (remember that a sensor can also be something like a satellite). So the whole well 'network' is presented as single entity that slowly (over a century and a half) took measurements only once at various locations.

Issues for DescribeSensor

As far as out dataset is concerned, we will return always the same SensorML description of a network of "measuring tapes". The user won't have a list of ID of individual "Sensors", but the ID of the whole network, so it can only request for that ID anyway.

Issues for the GetObservation by BBOX

A typical request for BBOX operation looks like this:

<sos:GetObservation 
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
   xmlns:gml="http://www.opengis.net/gml" 
   xmlns:ogc="http://www.opengis.net/ogc" 
   xmlns:sos="http://www.opengis.net/sos/1.0" 
   xmlns:om="http://www.opengis.net/om/1.0" 
   xsi:schemaLocation="http://www.opengis.net/sos/1.0 http://schemas.opengis.net/sos/1.0.0/sosAll.xsd" 
   service="SOS" version="1.0.0" srsName="EPSG:4326">
   <sos:offering>urn:GIN:offering:groundwaterLevel:1</sos:offering>
   <sos:observedProperty>urn:ogc:def:property:OGC:GroundWaterLevel</sos:observedProperty>
   <sos:featureOfInterest>
      <ogc:BBOX>
         <ogc:PropertyName>gml:location</ogc:PropertyName>
         <gml:Envelope>
            <gml:lowerCorner>-79 44</gml:lowerCorner>
            <gml:upperCorner>-78 45</gml:upperCorner>
         </gml:Envelope>
      </ogc:BBOX>
   </sos:featureOfInterest>
   <sos:responseFormat>text/xml; subtype=&quot;om/1.0.0&quot;</sos:responseFormat>
   <sos:resultModel xmlns:wml="http://www.wron.net.au/waterml2">wml:WaterMonitoringObservation</sos:resultModel>
   <sos:responseMode>inline</sos:responseMode>
</sos:GetObservation>

So far, this is consistent with our logic, except the use won't get featureOfInterest that we listed in the GetCapabilities. The only problem is the property used to filter the location. Actually, it's a problem unrelated to our concessions, but with SOS. The name of the geometry property (gml:location here) of the feature of interest must be known by the client. I don't know where the client is supposed to get this information (did not find a spot in the GetCapabilities). If we assume that all featureOfInterest are SamplingPoint, we can assume that the property will be sa:position.

Issues for the GetObservation by Id

Request for by ID is actually a request targeting a specific sensor (procedure). This example is what a client looking into the GetCapabilies document would create (see sos:procedure)

<sos:GetObservation 
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
   xmlns:gml="http://www.opengis.net/gml" 
   xmlns:ogc="http://www.opengis.net/ogc" 
   xmlns:sos="http://www.opengis.net/sos/1.0" 
   xmlns:om="http://www.opengis.net/om/1.0" 
   xsi:schemaLocation="http://www.opengis.net/sos/1.0 http://schemas.opengis.net/sos/1.0.0/sosAll.xsd" 
   service="SOS" version="1.0.0" srsName="EPSG:4326">
   <sos:offering>urn:GIN:offering:groundwaterLevel:1</sos:offering>
<sos:procedure xlink:href="urn:GIN:feature:MeasuringTape"/>
   <sos:observedProperty>urn:ogc:def:property:OGC:GroundWaterLevel</sos:observedProperty>
   <sos:featureOfInterest xlink:href="urn:GIN:feature:WaterWellNetwork"/>
   <sos:responseFormat>text/xml; subtype=&quot;om/1.0.0&quot;</sos:responseFormat>
   <sos:resultModel xmlns:wml="http://www.wron.net.au/waterml2">wml:WaterMonitoringObservation</sos:resultModel>
   <sos:responseMode>inline</sos:responseMode>
</sos:GetObservation>

Obviously, we need the ID of a specific sensor, but the user does not have the ID, unless it got a list from the BBOX request first. In our test demonstration, we always click on a location (because WMS only understand location) or we could click on a KML point that would have this ID. So I guess it a matter of having a user interface that performs a BBOX request before offering the client to pick one in particular.

Issues for the GetObservation by Time instant or Time Range

This one does not seem to cause any problem:

<?xml version="1.0" encoding="UTF-8"?>
<GetObservation xmlns="http://www.opengis.net/sos/1.0" xmlns:gml="http://www.opengis.net/gml" xmlns:ogc="http://www.opengis.net/ogc" xmlns:ows="http://www.opengis.net/ows/1.1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.opengis.net/sos/1.0 http://schemas.opengis.net/sos/1.0.0/sosAll.xsd" service="SOS" version="1.0.0">
   <offering>urn:MyOrg:SCAN_DATA</offering>
   <eventTime>
      <ogc:TM_During>
         <ogc:PropertyName>om:samplingTime</ogc:PropertyName>
         <gml:TimePeriod>
            <gml:beginPosition>2006-11-05T17:18:58.000-06:00</gml:beginPosition>
            <gml:endPosition>2006-11-05T21:18:59.000-06:00</gml:endPosition>
         </gml:TimePeriod>
      </ogc:TM_During>
   </eventTime>
   <observedProperty>urn:ogc:def:property:MyOrg:BAND1</observedProperty>
   <observedProperty>urn:ogc:def:property:MyOrg:BAND2</observedProperty>
   <responseFormat>text/xml; subtype=&quot;om/1.0.0&quot;</responseFormat>
</GetObservation>

The query targets the Observation itself and not the feature of interest or the sensor. But even if it would (or offer this option in the testbed), we could work under the same assumptions that the other cases we described.

Conclusion

Those problem were known and I think this has been addressed by SOS 2.0.0, but the spec is not public and there are no software out (maybe North 52 because they are intimately involved in the specification definition, but I did not find anything on their Twiki site).

We are holding up for now on SOS 2.0 because we don't have easy accept to the spec and there are no software out there, but if we end-up writing our own service (which I basically started to do), maybe we should try to look at SOS 2.0 and follow it after Dec 8th).

If we stick to 1.0.0 (probably the best solution in this short time frame), those small concessions to the spec should get us going. It's an experiment after all. Actually, it would be a nice result for the short-term experiment (we can conclude in Dec. that SOS 1.0.0 does not cover our needs).

-- EricBoisvert - 26 Nov 2009

Some of these issues are being addressed in SOS2.0 as you say, but I still think large sets of the sensors blowing out the caps documents will remain. In SOS2.0, the offering defintions will restrict an offering to contain one sensor or sensor system. This is to make the relationship between the Procedure-ObservedProperty-Feature explicit (it is possible in 1.0 that you can compose incorrect queries based on the capabilities metadata). The recommendation in 2.0 is to use sensor Systems for large groupings. The question then is: is it possible to query for individual sensor data using the IDs you parse from the sensorML document? If not, you always need to query across the whole system resulting in larger response documents. -- PeterTaylor - 29 Nov 2009

Is the proposed solution at least acceptable for the immediate future ?-- EricBoisvert - 30 Nov 2009

Should be fine. Let me make sure I have it right, the two issues (large number of sensors & large number of features) will be handled like this:

Sensors: List just a sensor grouping that represents all sensors. Benefit: No blow out of the capabilities document. Drawback: can't query on individual sensors so GetObservation calls will return all sensor data that are restricted by other filters (spatial or phenomenon based).

Features. Same idea but still possible to filter using BBOX but not individual FeatureID as client does not know the IDs as not listed. My problem with this one is that in the response to GetObservation we will initially just return a URN to the individual feature IDs (that were not listed originally), so there isn't any spatial location available for the individual feature. I see that using the test SOS you return this:

<om:member>
    <wml:WaterMonitoringObservation>
      <gml:location>
        <gml:Point srsName="EPGS:4326">
          <gml:pos>-78.8157932808174 44.8057250235482</gml:pos>
        </gml:Point>
      </gml:location>
      <om:samplingTime>
        <gml:TimeInstant>
          <gml:timePosition/>
        </gml:TimeInstant>
      </om:samplingTime>
      <om:procedure xlink:href="urn:ogc:object:Sensor:627620"/>
      <om:observedProperty xlink:href="urn:ogc:def:property:OGC:GroundWaterLevel"/>
      <om:featureOfInterest xlink:href="ca.on.gov.wells.627620"/>
      <om:result>
        <wml:TimeSeries>
          <wml:domainExtent>
            <gml:TimePeriod>
              <gml:beginPosition/>
              <gml:endPosition/>
            </gml:TimePeriod>
          </wml:domainExtent>
          <wml:defaultDataType codeSpace="http://www.bom.gov.au/std/water/xml/"
            >InstVal</wml:defaultDataType>
          <wml:element>
            <wml:TimeValuePair>
              <wml:time>T00:00:00-05:00</wml:time>
              <wml:value>
                <swe:Quantity>
                  <swe:uom code="m"/>
                  <swe:value/>
                </swe:Quantity>
              </wml:value>
            </wml:TimeValuePair>
          </wml:element>
        </wml:TimeSeries>
      </om:result>
    </wml:WaterMonitoringObservation>
  </om:member>

But the wml:WaterMonitoringObservation doesn't contain a gml:location property - which is by design through the O&M specification where spatial properties are pushed to the FoI (there is a bounding box inherited through AbstractFeature but I don't think that should be used here). O&M sect 6.4 - "...the generic Observation class does not have an inherent location property. Relevant location information should be provided by the feature of interest, or by the observation procedure, according to the specific scenario."

The way to serve the spatial properties would be to embed the sampling point description (which is perhaps what you plan to do once I remove the feature restriction) like this (a mocked up example based on your outputs):

<wml:WaterMonitoringObservation>
  <om:samplingTime>
   <gml:TimeInstant>
     <gml:timePosition/>
   </gml:TimeInstant>
  </om:samplingTime>
  <om:procedure xlink:href="urn:ogc:object:Sensor:627619"/>
  <om:observedProperty xlink:href="urn:ogc:def:property:OGC:GroundWaterLevel"/>
  <om:featureOfInterest>
   <gwml:WaterWell>
      <gml:description></gml:description>
      <gml:name>ca.on.gov.wells.627619</gml:name>
      <sa:position>
         <gml:Point srsName="EPGS:4326">
            <gml:pos>-78.8155303702256 44.8059900580633</gml:pos>
         </gml:Point>
      </sa:position>
      <!-- GWML properties continue as needed -->
   </gwml:WaterWell>
  </om:featureOfInterest>
  <!-- Continues with results.. -->

If we use a URN instead of an inline description then we would need to resolve the spatial properties through a GetFeatureOfInterest using the ID. You could still do this as you have an ID after the GetObservation but it makes a lot of calls and you get observations before you get spatial properties which is a bit back to front in some ways. As we are not validating the schema output at the moment then you could still do this with the restriction in, but if your client binds using any sort of automated XML beans with validation then we will hit issues.

-- PeterTaylor - 01 Dec 2009

Good point. This is why I added the gml:location to the Observation hoping to dodge the problem. The other issue I had with the query is that the filter must explictly target the geometry property. This assumes that the FoI is

  • schematically known prior to the query (the user knows which property has the geometry)
  • the FoI is always of the same type, or the user must list the alternatives with a big ogc:OR statement.

Actually this is a good argument to the restriction you had on featureOfInterest, because it limited the types of FoI. In the mean time, since WaterWell is also a SamplingPoint, sa:position works for us. So maybe we could state the FoI must be a SamplingFeature (at large) ?

-- EricBoisvert - 01 Dec 2009

Trying to make sense of the SOS 2.0 spec. Now it seems that each ObservationOffering in the GetCapabilities is a single Sensor or SensorSystem (there is no explicit Sensor reference).

Section 7.2.3.3 says: "An ObservationOffering groups collection of sensor observations produced by exactly one sensor system ".

-- EricBoisvert - 25 Feb 2010
I Attachment Action Size Date Who Comment
om.jpegjpeg om.jpeg manage 203 K 27 Nov 2009 - 20:11 EricBoisvert O&M extract diagram from SOS 1.0 spec
Topic revision: r8 - 25 Feb 2010, EricBoisvert
This site is powered by FoswikiThe information you supply is used for OGC purposes only. We will never pass your contact details to any third party without your prior consent.
If you enter content here you are agreeing to the OGC privacy policy.

Copyright &© by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding OGC Public Wiki? Send feedback