# A simple observation operator example, running in eoldas¶

## A simple observation operator example, running in eoldas¶

### Purpose of this section¶

This section of the user guide will take a simple example of combining two cost functions in a variational DA sense: an Identity (or Prior) operator and a smoothness (derivative) constraint. The examples used here are of filtering noisy time series of NDVI data.

As well as learning about these concepts if you are unfamiliar with them, this section will also take you through some examples of running the eoldas to solve this sort of problem. Towards the end of the section, we delve into writing some python code to make use of eoldas, and also mention other ways of interacting with it.

We learn that at the heart of eoldas is one or more configuration files that allow us to solve problems without the need for any code writing (though you can if you want to!).

### Introducing the observation operator concept¶

In the previous sections, we have assumed that the state of the land surface can be observed directly, and that these observations are only limited by noise in their acquisition. In general, the state of the surface and the observations will be different entities. For example, the state of the surface may include parameters such as leaf area index, chlorophyll concentration or soil moisture, whereas the observations may be of directional surface reflectance, top of atmosphere radiance or microwave temperatures. The link between these magnitudes is the observation operator, which maps the land surface parameters into quantities that are measured by a sensor. Many types of observation operator are available: from the statistical to the physics based. The simplest case, however, is the identity operator. In fact, we have already used it, as we have assumed that the observations are just direct measurements of the state vector.

### A simple data assimilation example using an identity observation operator¶

The simplest observation operator is the identity observation operator, where the observations are identical to the state vector components. We can see the use of this operator as a way of optimally smoothing univariate timeseries, for example NDVI. Additionally, the use of a DA system allows to interpolate where data points are not available. The following demonstrates how the EOLDAS prototype can be used for this task, and also allows us to explore the use of the weak assimilation paradigm conveniently.

The main way to run EOLDAS is via one or more configuration files. This is partly to make sure that a record exists of a particular experimental setup and partly to allow flexible running of the system without the need for further coiding by the user (unless he/she wants to add new classes or methods).

Here is the start of the configuration file that we are going to use:

```# An EOLDAS confioguration file for
# a simple Identity operator and a regulariser

[parameter]
location = ['time']
limits = [[1,365,1]]
names = gamma_time,NDVI
solve = 0,1
datatypes = x

[parameter.result]
filename = output/Identity/NDVI_Identity.params
format = 'PARAMETERS'
help_filename='Set the output state filename'

[parameter.x]
datatype = x
names = \$parameter.names
default = 200,0
help_default="Set the default values of the states"
apply_grid = True
sd = [1.]*len(\$parameter.names)
bounds =  [[0.000001,1000000],[-1,1]]
```

This sets up the section [parameter], which described the state variables within EOLDAS.

The section must contain fields describing the locational information of the data to be used as well as the names of the state variables we will consider. In this case, we have two state variables ‘gamma_time’ and ‘NDVI’ that we wish to estimate over the (time) range 1 to 365 (inclusive) in steps of 1 (day).

The subsection parameter.solve gives a list of codes indicating whether we wish to solve for each parameter or not.

A code of 0 tells the EOLDAS not to solve, i.e. just to use the data that are read in or any other default values.

A code of 1 tells EOLDAS to solve for that state variable at all locations (times here).

A code of 2 tells EOLDAS to solve for the state variable, but to maintain a single value over all locations.

The section parameter.result gives information on any output file name and format for the state.

Finally in this section, the field parameter.x sets up data and conditions for the state vector x. Remember that it is this state vector x that we will solve for in the EOLDAS. Here, we specify that it is of datatype x (i.e. the x state vector), that it has the same names as we set up in parameter.names, that default values to be assigned are 25 and 0 for the two variables respectively. If any data are read in, these override the default values, but in this case, we simply start with the defaults.

The text ‘help_default’ allows the field parameter.x.default to be set from the command line with the option –parameter.x.default=a,b.

The flag ‘apply_grid’, which is the default, tells the EOLDAS to produce the state vector on a grid over the bounds defined in parameter.limits.

Finally, we define default uncertainty information for the state vector (this does not directly affect the running of the EOLDAS (parameter.x.sd), and define the bounds for each state vector (use None if no bound is to be used). Here, we set the lower bound as 0 and the upper bound as 1 for both parameters.

The next section sets up general conditions:

```[general]
doplot=True
help_do_plot='do plotting'
```

In this case, we set a flag to do plotting when the results are written out. Plotting will use the filenames in any state variable section e.g. parameter.result.filename to generate a series of plots with filenames the same as this data filename. We will see some examples later.

The next section sets up the operators that we want to define here.

```[operator]
modelt.name=DModel_Operator
modelt.datatypes = x
obs.name=Operator
obs.datatypes = x,y
```

Here, we define two operators, ‘DModel_Operator’ and ‘Operator’. These names refer directly to python classes for the operators in EOLDAS. The base class is ‘Operator’ which implements the Identity operator. All other classes are derived from this. The differential operator works only on the x state vector, which is equlivalent to defining . The operator ‘Operator’ access both x and y data if it to act as a Prior constraint, so we set up x and y datatypes.

Next we set the details of these operators. First, the differential operator:

```[operator.modelt.x]
names = \$parameter.names
sd = [1.0]*len(\$operator.modelt.x.names)
datatype = x

[operator.modelt.rt_model]
model_order=1
help_model_order='The differential model order'
wraparound=periodic,365
```

where we specify which state vector elements this operator has access to (all of those in parameter.names here) and set up the default uncertainty and datatype.

We then set the parameters specific to the ‘model’ , in this case the order of the differential model (2 here) and the edge conditions (periodic, with a period of 200 (days)).

Finally, we set up the operator operator.obs, specifically, parameters for its x and y state vectors.

```[operator.obs.x]
names = \$parameter.names[1:]
sd = [1.0]*len(\$operator.obs.x.names)
help_sd='Set the observation sd'
datatype = x

[operator.obs.y]
names = \$parameter.names[1:]
sd = [0.15]*len(\$operator.obs.x.names)
help_sd="set the sd for the observations"
datatype = y
state = data/Identity/random_ndvi1.dat
help_state = "Set the y state vector"

[operator.obs.y.result]
filename = output/Identity/NDVI_fwd.params
```

specifying default uncertainty information, data types and any required output files. Here, we wish to write out the results in operator.obs.y, so we specify a filename for this. Again, we see the use of a ‘help’ variable, which here allows operator.obs.y.result.filename to be set from the command line. This interfacing to the command line means that a single configuration file can generally serve for multiple experiments and the user does not need to keep generating new ones.

The main program can be accessed in various ways. One way is to write some front end code that calls the eoldas python code.

An example is solve_eoldas_identity.py that includes three sections:

First, some code to generate a synthetic dataset.

```import pdb
import numpy as np

def create_data ( n_per=4, noise=0.15, obs_off=0.33, \
window_size=0, order=4):
"""
Create synthetic "NDVI-like" data for a fictitious time series. We return
the original data, noisy data (using IID Gaussian noise), the QA flag as well
as the time axis.

Missing observations are simulated by drawing a random number between 0 and 1
and checking against obs_off.

Parameters
----------
n_per : integer
Observation periodicity. By default, assumes every 8 days

noise : float
The noise standard deviation. By default, 0.15

obs_off : float
The threshold to decide on missing observations in the time series.

window_size : integer, odd
window size for savitzky_golay filtering. A large window size will lead
to larger data gaps by correlating the noise. Set to zero by default
which applies no smoothing.

order : integer
order of the savitzky_golay filter. By default 4.

"""
from savitzky_golay import savitzky_golay
import numpy as np

doys  = np.arange ( 1, 365+1, n_per)
ndvi_clean =  np.clip(np.sin((doys-1)/72.), 0,1)
ndvi =  np.clip(np.sin(doys/72.), 0,1)
# add Gaussian noise of sd noise
ndvi = np.random.normal(ndvi,noise,ndvi.shape[0])

# set the qa flags for each sample to 1 (good data)
qa_flag = np.ones_like ( ndvi).astype( np.int32 )
passer = np.random.rand( ndvi.shape[0])
if window_size >0:
# force odd
window_size = 2*(window_size/2)+1
passer = savitzky_golay(passer, window_size=window_size, order=order)
# assign a proportion of the qa to 0 from an ordering of the smoothed
# random numbers
qa_flag[np.argsort(passer)[:passer.size * obs_off]]  = 0

return ( doys, ndvi_clean, ndvi, qa_flag )
```

Here, we generate some NDVI data which has a trajectory of a sine wave for the first half of the year and is flat at zero for the second half. The sampling is controlled by n_per and obs_off. The parameter obs_off randomly removes a proportion of the data. If window_size and order are set then a savitzky_golay filter is used to induce correlation in the timing of the samples that are removed from thie dataset (qa=0). This mimics what we practically have in Optical Earth Observation with temporal correlation in cloud cover.

The next section calculates the ‘ideal’ value of by calculating the root mean squared deviation of the original dataset. We use this ideal gamma here for to demonstrate the physical meaning of the gamma value, though in practice this would be unknown. This section also writes the dataset to a temporary file in ‘BRDF’ format.

All of this so far standard python coding, though such datasets could be generated in many other ways. The final section interfaces to the top level of the eoldas code, which is what is of immediate concern in this tutorial.

```    import sys,tempfile
this = sys.argv[0]
sys.path.append(this + '../system_confs')
sys.path.append(this + '../confs')
sys.path.append(this + '../eoldaslib')
sys.path.append(this + '../bin')
import eoldaslib as eoldas

# SD of noise
noise=0.15
# nominal sampling period
n_per=7
# proportion of missing samples
obs_off=0.33
# order of differential model (integer)
model_order=1
# sgwindow is larger to create larger data gaps
sgwindow=10

# set up data for this experiment
file, ideal_gamma,doys,ndvi_clean,ndvi,qa_flag  = \
prepare(noise=noise,n_per=n_per,\
obs_off=obs_off,model_order=model_order,\
sgwindow=sgwindow)

# set gamma to less than the the theoretical value
gamma = ideal_gamma*0.33

# set op file names
xfile = 'output/Identity/NDVI_Identity.params'
yfile = 'output/Identity/NDVI_fwd.params'

# initialise optuions for DA overriding any in config files
cmd = 'eoldas ' +  \
' --conf=eoldas_config.conf --conf=confs/Identity.conf ' + \
' --logfile=mylogs/Identity.log ' + \
' --calc_posterior_unc ' + \
' --parameter.solve=0,1 ' + \
' --parameter.result.filename=%s '%xfile +\
' --parameter.x.default=%f,0.0 '%(gamma) + \
' --operator.obs.y.result.filename=%s'%yfile +\
' --operator.obs.y.state=%s'%file+\
' --operator.modelt.rt_model.model_order=%d '%model_order

# initialise eoldas
self = eoldas.eoldas(cmd)
# solve DA
self.solve(write=True)

```

The first part of the code extends the system path for where it searches for libraries. This is done relative to where solve_eoldas_identity.py is (in the bin directory of the distribution). After that, we set up values for the parameters for generating the synthetic dataset.

The interface to the eoldas here is mainly to make a string with a number of flags. The most important flag is –conf=confs/Identity.conf which specifies the configuration file for this experiment. In addition, –conf=eoldas_config.conf is given, which specifies a system default configuration file, eoldas_config.conf.

So, the simplest ‘top level’ interface to eoldas from python code involves:

```    cmd = 'eoldas ' +  \
' --conf=eoldas_config.conf --conf=confs/Identity.conf ' + \
' --logfile=mylogs/Identity.log ' + \
' --calc_posterior_unc ' + \
' --parameter.solve=0,1 ' + \
' --parameter.result.filename=%s '%xfile +\
' --parameter.x.default=%f,0.0 '%(gamma) + \
' --operator.obs.y.result.filename=%s'%yfile +\
' --operator.obs.y.state=%s'%file+\
' --operator.modelt.rt_model.model_order=%d '%model_order
```

and

```    # initialise eoldas
self = eoldas.eoldas(cmd)
# solve DA
self.solve(write=True)
```

where we set up the text string, initiate the eoldas object (eoldas.eoldas) and then call the eoldas.solve() method.

This command string is clearly of some importance. The flags –logfile=mylogs/Identity.log and –calc_posterior_unc correspond to items in the [general] section of the configuration file. Here, eoldas_config.conf contains the lines:

```[general]
help_datadir ="Specify where the data and or conf files are"
here = os.getcwdu()
grid = True
is_spectral = True
calc_posterior_unc=False
help_calc_posterior_unc ="Switch to calculate the posterior uncertainty"
write_results=True
help_write_results="Flag to make eoldas write its results to files"
init_test=True
help_init_test="Flag to make eoldas run a test of the cost functions before proceeding with DA"
doplot=False
plotmod=100000000
plotmovie=False

[general.optimisation]
# These are the default values
iprint=1
gtol=1e-3
maxIter=1e4
maxFunEvals=2e4
plot=0
# see http://openopt.org/NLP#Box-bound_constrained
solverfn=scipy_lbfgsb
randomise=False
help_randomise='Randomise the starting point'
no_df = False
```

To understand how e.g the flag –calc_posterior_unc can be used we can look at:

```calc_posterior_unc=False
help_calc_posterior_unc ="Switch to calculate the posterior uncertainty"
```

where we set the default value of the item (False) and also have a ‘help’ statement which allows this value to be overridden. You shouldn’t normally need to change things in the system configuration file eoldas_config.conf. This flag controls whether we calculate the posterior iuncertainty or not. The default is False because it can be quite computationally expensive and is often best done in a post processing step.

The flag –operator.obs.y.state=filename refers specifically to the section operator.obs.y.state of the configuration, which is something we have set up in Identity.conf with a ‘help’ field, so we can override the default value set in the configuration file.

```[operator.obs.y]
names = \$parameter.names[1:]
sd = [0.15]*len(\$operator.obs.x.names)
help_sd="set the sd for the observations"
datatype = y
```

If we run bin/solve_eoldas_identity.py, it sends logging information to mylogs/Identity.log and reports (to the stderr) the progress of the optimisation, ‘f’ being the total of all of the terms for this configuration. It should converge to a solution within some tens of iterations and result in a final value of of around 1800. We set the name of the logfile in solve_eoldas_identity.py:

```    if window_size >0:
```

There is a lot of detail in the log file about exactly what value terms are set to and the progress of the eoldas. It also contains information on the individual terms:

```2011-12-06 11:38:37,151 - eoldas.solver.eoldas.solver-obs-x - INFO -      J   = 105.861311
2011-12-06 11:38:37,153 - eoldas.solver.eoldas.solver-modelt-x - INFO -      J   = 48.534420
2011-12-06 11:38:37,154 - eoldas.solver.eoldas.solver-obs-x - INFO -      J   = 101.322076
2011-12-06 11:38:37,156 - eoldas.solver.eoldas.solver-modelt-x - INFO -      J   = 42.261860
2011-12-06 11:38:37,158 - eoldas.solver.eoldas.solver-obs-x - INFO -      J   = 90.318198
2011-12-06 11:38:37,160 - eoldas.solver.eoldas.solver-modelt-x - INFO -      J   = 43.232701
2011-12-06 11:38:37,161 - eoldas.solver.eoldas.solver-obs-x - INFO -      J   = 78.191014
2011-12-06 11:38:37,163 - eoldas.solver.eoldas.solver-modelt-x - INFO -      J   = 48.107531
2011-12-06 11:38:37,164 - eoldas.solver.eoldas.solver-obs-x - INFO -      J   = 74.263679
2011-12-06 11:38:37,166 - eoldas.solver.eoldas.solver-modelt-x - INFO -      J   = 46.111283
2011-12-06 11:38:37,167 - eoldas.solver.eoldas.solver-obs-x - INFO -      J   = 63.478516
2011-12-06 11:38:37,169 - eoldas.solver.eoldas.solver-modelt-x - INFO -      J   = 48.646279
2011-12-06 11:38:37,170 - eoldas.solver.eoldas.solver-obs-x - INFO -      J   = 57.174076
2011-12-06 11:38:37,172 - eoldas.solver.eoldas.solver-modelt-x - INFO -      J   = 51.503171
2011-12-06 11:38:37,173 - eoldas.solver.eoldas.solver-obs-x - INFO -      J   = 54.400634
```

A log file is important to the running of eoldas, as the processing can take quite some time for some problems.

The state vector results will be written to output/Identity/NDVI_Identity.params because we first specified this in Identity.conf:

```[parameter.result]
filename = output/Identity/NDVI_Identity.params
format = 'PARAMETERS'
```

(actually, since we also set the comamnd ‘ –operator.obs.y.result.filename=%s’%yfile in the code we have written, this will override what is in the parameter file).

The output file format is a general ‘PARAMETERS’ format that should look something like this:

```#PARAMETERS time gamma_time NDVI sd-gamma_time sd-NDVI
1.000000   42.912561    0.121577    0.000000    0.064794
2.000000   42.912561    0.133615    0.000000    0.065856
3.000000   42.912561    0.145653    0.000000    0.066659
4.000000   42.912561    0.157693    0.000000    0.067213
5.000000   42.912561    0.169733    0.000000    0.067524
6.000000   42.912561    0.181774    0.000000    0.067596
7.000000   42.912561    0.193815    0.000000    0.067428
8.000000   42.912561    0.205854    0.000000    0.067020
9.000000   42.912561    0.218111    0.000000    0.067987
```

The first line of the file is a header statement that describes the location fields (‘time’ here) and state variables (gamma_time and NDVI here) and subsequent lines give the values for those fields. This format can be used for input data, but it cannot at present be used to define a full input covariance matrix, only the standard deviation for each observation as in this example.

A graph of the result is in output/Identity/NDVI_Identity.params.plot.x.png:

This result is quite interesting for understanding how our DA system works: The plot shows the mean of the estimate of the NDVI state vector in the lower panel (the upper panel shows the value of gamma_time which we did not solve for) as a red line as we solve for the state every day (of 365 days). The 95% confidence interval is shown shaded in grey. We will show this below in a more refined plot of the results.

We also specified the ‘y’ data to be written out (in Identity.conf):

```help_state = "Set the y state vector"

[operator.obs.y.result]
filename = output/Identity/NDVI_fwd.params
```

so this goes to a file output/Identity/NDVI_fwd.params unless the flag –operator.obs.y.result.filename=somethingElse.dat is used. The format of this file is the same as above, but it shows the retrieved NDVI data since this reports .

```#PARAMETERS time mask NDVI sd-NDVI
1.000000    1.000000    0.121577    0.064794
8.000000    1.000000    0.205854    0.067020
15.000000    1.000000    0.291651    0.069036
22.000000    1.000000    0.382305    0.072210
29.000000    1.000000    0.447438    0.078439
78.000000    1.000000    0.816254    0.078975
85.000000    1.000000    0.864596    0.073198
92.000000    1.000000    0.898507    0.071030
99.000000    1.000000    0.911523    0.071268
```

The original dataset is output form convenience in the same format as the y state, being output/Identity/NDVI_fwd.params_orig here:

```#PARAMETERS time mask NDVI sd-NDVI
1.000000    1.000000   -0.097146    0.150000
8.000000    1.000000    0.196836    0.150000
15.000000    1.000000    0.262894    0.150000
22.000000    1.000000    0.533326    0.150000
29.000000    1.000000    0.521164    0.150000
78.000000    1.000000    0.842083    0.150000
85.000000    1.000000    0.950098    0.150000
92.000000    1.000000    1.022199    0.150000
99.000000    1.000000    1.169090    0.150000
```

Various other graphics are output for a ‘y’ state:

A plot at each location, in output/Identity/NDVI_fwd.params.plot.y.png (not of much relevance in this example):

A plot of the observations and modelled values as a function of location (the observations are the green dots) in output/Identity/NDVI_fwd.params.plot.y2.png:

and a plot of the x state vector associated with this operator (NDVI here).

### Example plotting data from the output files¶

The above plots are automatically generated by eoldas provided general.doplot is True but these are intended as quicklooks, and users are likely to want to form their own plots.

An example of this is implemented in example1plot.py:

```#!/usr/bin/env python
import numpy as np
import pylab as plt
#
# Some plotting of the synthetic and retrieved data

# generate the clean data
doys  = np.arange ( 1, 365+1, 1)
ndvi_clean =  np.clip(np.sin((doys-1)/72.), 0,1)

sdata = np.array([np.array(i.split()).astype(float) for i in state[1:]])
sNdvi = sdata[:,2]
sdNdvi = sdata[:,4]

# noisy sample data
ndata = np.array([np.array(i.split()).astype(float) for i in noisy[1:]])
nNdvi = ndata[:,2]
sdnNdvi = ndata[:,3]
ndoys = ndata[:,0]

# retrieved
plt.fill_between(doys,y1=sNdvi-1.96*sdNdvi,y2=sNdvi+1.96*sdNdvi,facecolor='0.8')
p1 = plt.plot(doys,sNdvi,'r')
# original
p2 = plt.plot(doys,ndvi_clean,'g')
# samples used
p3 = plt.errorbar(ndoys,nNdvi,yerr=sdnNdvi*1.96,fmt='bo')

plt.legend([p2,p3,p1],['Original state','Sampled noisy state','Retrieved state'])

#plt.show()
plt.savefig('images/example1plot.png')
```

We can see that smoothers of this sort have some difficulty maintaining sudden changes, although this is quite challenging here given the level of noise amd the rather large data gaps.

More importantly, we have used the smoother to interpolate over the missing observations and to reduce uncertainty.

If we inspect the file output/Identity/NDVI_Identity.params:

```#PARAMETERS time gamma_time NDVI sd-gamma_time sd-NDVI
1.000000   42.912561    0.121577    0.000000    0.064794
2.000000   42.912561    0.133615    0.000000    0.065856
3.000000   42.912561    0.145653    0.000000    0.066659
4.000000   42.912561    0.157693    0.000000    0.067213
5.000000   42.912561    0.169733    0.000000    0.067524
6.000000   42.912561    0.181774    0.000000    0.067596
7.000000   42.912561    0.193815    0.000000    0.067428
8.000000   42.912561    0.205854    0.000000    0.067020
9.000000   42.912561    0.218111    0.000000    0.067987
```

alongside the original data output/Identity/NDVI_fwd.params_orig:

```#PARAMETERS time mask NDVI sd-NDVI
1.000000    1.000000   -0.097146    0.150000
8.000000    1.000000    0.196836    0.150000
15.000000    1.000000    0.262894    0.150000
22.000000    1.000000    0.533326    0.150000
29.000000    1.000000    0.521164    0.150000
78.000000    1.000000    0.842083    0.150000
85.000000    1.000000    0.950098    0.150000
92.000000    1.000000    1.022199    0.150000
99.000000    1.000000    1.169090    0.150000
```

We note that the original state here (green line) lies entirely within the 95% CI. The error reduction has been of the order of 2.2 (compare sd-NDVI after the DA with that prior to it). We can see that the uncertainty at the datapoints has been reduced from 0.15 (that of the input data) to typically around 0.065. This grows slightly (to around 0.10) when the data gaps are large.

The impact of ‘filtering’ in this way (optimising a fit of the model to the observations) is to smooth the data and reduce uncertainty in the output. The reduction in uncertainty is related to the amount of smoothing that we apply. The second effect is to interpolate between observations, with uncertainty growing where we have no observations.

You might try changing the value of gamma used and seeing the effect on the results.

We could do this by modifying a few lines of solve_eoldas_identity.py to produce solve_eoldas_identity1.py:

```if __name__ == "__main__":
import sys,tempfile
this = sys.argv[0]
sys.path.append(this + '../system_confs')
sys.path.append(this + '../confs')
sys.path.append(this + '../eoldaslib')
sys.path.append(this + '../bin')
import eoldaslib as eoldas

# SD of noise
noise=0.15
# nominal sampling period
n_per=7
# proportion of missing samples
obs_off=0.33
# order of differential model (integer)
model_order=1
# sgwindow is larger to create larger data gaps
sgwindow=10

# set up data for this experiment
file, ideal_gamma,doys,ndvi_clean,ndvi,qa_flag  = \
prepare(noise=noise,n_per=n_per,\
obs_off=obs_off,model_order=model_order,\
sgwindow=sgwindow)

# set gamma to less than the the theoretical value
gamma = ideal_gamma*0.45

# set op file names
```

which write out to output/Identity/NDVI_Identity1.params and output/Identity/NDVI_fwd1.params and has a gamma value that is 0.45/0.33 of that previously used.

Now, plotting this using example1plot1.py:

This is possibly a better result, but in fact what we see is further limitation of the model that we have chosen here: we enforce wraparound (i.e. the NDVI at day 1 is expected to be the same as at day 365) and we enforce smoothness (so as we increase the gamma, the smoothness, we over-smooth at the sudden change that occurs half way through the year).

We could remove the wraparound condition, but in practice, it is better simply to weaken this constraint. We have done this in solve_eoldas_identity2.py:

which write out to output/Identity/NDVI_Identity2.params and output/Identity/NDVI_fwd2.params and has the same (higher) gamma value used above.

Now, plotting this using example1plot2.py:

If you re-run these scripts several times, so that you see different configurations for the temporal sampling, you will notice that the interpolation sometimes behaves well over the entire dataset (for some given gamma) and sometimes doesn’t. For example:

### Interfacing a little more deeply with the eoldas code¶

Whilst considering writing wrapper codes around eoldas functionality and outputs it is instructive to explore some of the data structure available.

We can re-use the example solve_eoldas_identity.py developed above and access some of the data structure as shown in solve_eoldas_identity_a.py.

```#!/usr/bin/env python
import pdb
import numpy as np

if __name__ == "__main__":
import sys,tempfile
this = sys.argv[0]
sys.path.append(this + '../system_confs')
sys.path.append(this + '../confs')
sys.path.append(this + '../eoldaslib')
sys.path.append(this + '../bin')
import eoldaslib as eoldas
from solve_eoldas_identity import *
# import the setup methods from solve_eoldas_identity
import pylab as plt

# SD of noise
noise=0.15
# nominal sampling period
n_per=7
# proportion of missing samples
obs_off=0.33
# order of differential model (integer)
model_order=1
# sgwindow is larger to create larger data gaps
sgwindow=10

# set up data for this experiment
file, ideal_gamma,doys,ndvi_clean,ndvi,qa_flag  = \
prepare(noise=noise,n_per=n_per,\
obs_off=obs_off,model_order=model_order,\
sgwindow=sgwindow)

# set gamma to thge theoretical value
gamma = ideal_gamma

# set op file names
xfile = 'output/Identity/NDVI_Identity_a.params'
yfile = 'output/Identity/NDVI_fwd_a.params'

# initialise options for DA overriding any in config files
# make sure we use some different output file names to othe scripts
cmd = 'eoldas ' +  \
' --conf=eoldas_config.conf --conf=confs/Identity.conf ' + \
' --logfile=mylogs/Identity.log ' + \
' --calc_posterior_unc ' + \
' --parameter.solve=0,1 ' + \
' --parameter.result.filename=%s '%xfile +\
' --parameter.x.default=%f,0.0 '%(gamma) + \
' --operator.obs.y.result.filename=%s'%yfile +\
' --operator.obs.y.state=%s'%file+\
' --operator.modelt.rt_model.model_order=%d '%model_order

# initialise eoldas
self = eoldas.eoldas(cmd)
# solve DA
self.solve(write=True)

# now pull some data out of the eoldas

# the 'root' of the DA data structure is in self.solver.root

root = self.solver.root

# The state vector data are stored in root.x
# with ancillary information in root.x_meta
# so the state vector is e.g. in root.x.state
# and the names are in root.x_meta.state

state_names = root.x_meta.state
state = root.x.state

# The sd representation of the posterior is in root.x.sd
# This is all set up in eoldas_Solver.py
# All storage is of type ParamStorage, an extended
# dictionary structure. You can explore it interactively
# with e.g. root.dict().keys() or self.keys() since
# self here is a straight dictionary
sd = root.x.sd

# The full inverse Hessian is in self.solver.Ihessian
Ihessian = self.solver.Ihessian

# A mask to reduce this to only the state variables
# being targeted (solve == 1) is through a call to:

# This is of shape (365,365) here.
# so now lets produce an image of it
# to visualise the structure
fig = plt.figure()
cax = ax.imshow(NDVI_Ihessian,interpolation='nearest')
ax.set_title('Posterior Uncertainty matrix for NDVI')
# Add colorbar, make sure to specify
# tick locations to match desired ticklabels
cbar = fig.colorbar(cax, ticks=[-1, 0, 1])
cbar.ax.set_yticklabels(['< -1', '0', '> 1'])# vertically oriented colorbar
# see http://matplotlib.sourceforge.net/plot_directive/mpl_examples/pylab_examples/colorbar_tick_labelling_demo.py
# save it
plt.savefig('output/IHessianNDVI_expt1.png')

```

The comments in the code should be self explanatory and anyone interested in delving much further into the eoldas codes should see the full class documentation. Here, we can see at least how to access the posterior estimate of the state and its uncertainty. We write out the uncertainty to an image using matplotlib (pylab):

Now, this is a very interesting figure for understanding how these multiple constraints are interacting. The observation uncertainty is just described by standard deviation, so lies along the leading diagonal of the a priori uncertainty. Further, it only exists where there are data points. The impact of applying the (regularisation) model constraint is to reduce the uncertainty at the observation points as we would expect. We can see a ‘sausage’ pattern from above in this figure quite clearly. The ‘pinch points’ are when the sample points are dense. Where there are large data gaps (from our simulated cloud impacts here) the uncertainty is higher, but it is ‘spread out’ from the leading diagonal by applying temporal covariance. The impact of the filtering is large where the observation impact is low (or non existant). We will see these same effects in many DA experiments.

### Running EOLDAS from the command line¶

An alternative to writing your own python code for the front end is to use eoldas.py, which can be direcly run from the command line.

Help on command line options is available by typing:

eoldaslib/eoldas.py –help

As an aid to setting up the correct python variables etc, a front end script, eoldas can also be accessed (in the bin directory).

bin/eoldas –conf=confs/Identity.conf

N.B. the double dash on the comamnd line options (even if some renderings of this document show them as a single dash)