Missing data ['Transformations']

Mason Marchildon, modified 1 Month ago.

Rookie Crystal gazer Posts: 11 Join Date: 5/1/17
Hello again.

I'm attempting to merge three precipitation grids that have non-overlapping periods of record. The grids do not share the same spatial definition and I'd like their data averaged over a subwatershed polygon LocationSet. 

My workflow is to interpolate the grids to separate temporary timeSeriesSets (all defined with the same subwatershed LocationSet), and then merge these temporary timeSeriesSets into a single continuous timeSeriesSet written to the DB. Running this gives me the warning "Existing value overwritten by missing", and the only data that appears is from the time series with the highest merge priority.

Should this be occurring? I'm expecting the merge (simple) not to overwrite data with "missings".

Are "missing" values explicitly assigned in the database? If so, can they be removed such that they do not overwrite data during the merge process? Or, is there a way to ignore "missing" during the merge process?
Ivo Miltenburg, modified 1 Month ago.

RE: Missing data ['Transformations']

Keen Forecaster Posts: 7 Join Date: 3/12/13
Hi Mason,

What merge transformation are you using? You mention grids (valueType = grid) and polygons (valueType = scalar); you cannot go directly from one to the other in a transformation. It might be that your setup with grids/polygons is messing things up, but I'm really not sure.


How I would approach this:


1. Get all precipitation grids to the same grid definition using a "Merge Interpolation" transformation: https://publicwiki.deltares.nl/display/FEWSDOC/Merge+Interpolation . e.g. 

<transformation id="merge interpolation example">
    <merge>
        <interpolation>
            <inputVariable>
                <variableId>input1</variableId>
            </inputVariable>
            <inputVariable>
                <variableId>input2</variableId>
            </inputVariable>
            <interpolationType>closestDistance</interpolationType>
            <outputVariable>
                <variableId>output_grid</variableId>
            </outputVariable>
        </interpolation>
    </merge>
</transformation>
2. Apply a Spatial Interpolation from your merged grid to your polygon locationSet using a "Spatial Interpolation - Average" transformation. e.g.

<interpolationSpatial>
    <average>
        <inputVariable>
            <variableId>output_grid</variableId>
        </inputVariable>
        <outputVariable>
            <variableId>output_polygon</variableId>
        </outputVariable>
    </average>
</interpolationSpatial>

  
I hope this helps. Good luck.
Ivo
Mason Marchildon, modified 1 Month ago.

RE: Missing data ['Transformations']

Rookie Crystal gazer Posts: 11 Join Date: 5/1/17
Thank you, this is a more concise way of implementing my goal; however, it didn't work. I'm receiving a null pointer program error at nl.wldelft.fews.system.plugin.transformationmodule.function.implementation.merge.MergeInterpolationFunction.calculate:56, which occurred during the Merge Interpolation transformation.

Now, I took a similar approach, except that I first interpolated to the subbasin (scalar) location set and then applied the merge; that way I avoid performing two interpolations (grid->grid->scalar) to get my final product:

<transformation id="tranform_grid_basin_1">
    <interpolationSpatial>
        <average>
            <inputVariable>
                <variableId>inputGrid_1</variableId>
            </inputVariable>
            <outputVariable>
                <variableId>temporary_basin_LocationSet_1</variableId>
            </outputVariable>
        </average>
    </interpolationSpatial>
</transformation>
<transformation id="tranform_grid_basin_2">
    <interpolationSpatial>
        <average>
            <inputVariable>
                <variableId>inputGrid_2</variableId>
            </inputVariable>
            <outputVariable>
                <variableId>temporary_basin_LocationSet_2</variableId>
            </outputVariable>
        </average>
    </interpolationSpatial>
</transformation>

<transformation id="basin_merge">
    <merge>
        <simple>            
            <inputVariable>
                <variableId>temporary_basin_LocationSet_1</variableId>
            </inputVariable>
            <inputVariable>
                <variableId>temporary_basin_LocationSet_2</variableId>
            </inputVariable>
            <outputVariable>
                <variableId>ouput_basins</variableId>
            </outputVariable>
        </simple>
    </merge>
</transformation>


This method works, but with the caveat "some existing values overwritten with missings", and there are multiple "WARN - Existing value overwritten by missing by..." warnings throughout.

I still question why an existing value would be overwritten by "missing" during a merge. Is it wrong to consider a Merge transformation a means to create a continuous scalar time series from multiple disparate sources?
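
To make the expected behaviour concrete, it can be sketched outside FEWS. The following is a minimal Python illustration, not Delft-FEWS code and not a description of how the Merge transformation is implemented. The function name `merge_by_priority`, the use of NaN as the "missing" marker, and the sample values are all assumptions made for this sketch. For each time step it takes the first non-missing value in priority order, so a "missing" in a higher-priority series never hides data from a lower-priority one.

```python
import math

MISSING = float("nan")  # stand-in for a "missing" value (assumption of this sketch)


def merge_by_priority(*series):
    """Merge equally long series; earlier arguments have higher priority.

    For each time step the first non-missing value is kept. The result is
    missing only where every input series is missing.
    """
    if not series:
        return []
    length = len(series[0])
    if any(len(s) != length for s in series):
        raise ValueError("all series must cover the same time steps")
    merged = []
    for values in zip(*series):
        # Pick the first value that is not NaN, falling back to MISSING.
        chosen = next((v for v in values if not math.isnan(v)), MISSING)
        merged.append(chosen)
    return merged


# Three sources with non-overlapping periods of record (hypothetical values).
grid_1 = [1.0, 2.0, MISSING, MISSING, MISSING, MISSING]
grid_2 = [MISSING, MISSING, 3.0, 4.0, MISSING, MISSING]
grid_3 = [MISSING, MISSING, MISSING, MISSING, 5.0, 6.0]

merged = merge_by_priority(grid_1, grid_2, grid_3)
print(merged)  # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
```

With non-overlapping periods of record, the output is continuous across all three sources, which is the result expected from the merge described above.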

For the moment, I have a rather tedious workaround: I build the series incrementally by setting T0 to the end of the last dataset import, manually changing the XML file, running the transformation, and repeating. I'd like to know how to automate this.

Thanks again.