OSIRIS Data in the ECV Initiative


In an effort by the European Space Agency (ESA) to support the Climate Change Initiative, and in conjunction with the United Nations Framework Convention on Climate Change and the Intergovernmental Panel on Climate Change, many Essential Climate Variables (ECVs) were identified for the continued understanding of the climate system.  Among these ECVs are atmospheric ozone and stratospheric aerosols.
The OSIRIS mission is considered an ESA third party mission, and as such it qualifies to contribute to the ECV initiative.  Inclusion requires a thorough and accurate analysis of random error and precision, the characterization of systematic error, the validation of the data products, and the generation of Level 3 data (zonal means and climatologies).

Level 2 Precision Analysis


The precision of OSIRIS Level 2 data is a measure of the uncertainty in the values retrieved from OSIRIS measurements of light scattered in the atmosphere.  This uncertainty comes from measurement noise and from random noise generated through the SASKTRAN forward model in the SaskMART retrieval algorithm.  Other retrieval methods generate precision information naturally as a byproduct, in the form of a covariance matrix, but by its nature SaskMART does not.  Instead, the covariance matrix can be obtained after the retrieval using the uncertainty of the original measurements and the retrieved result.  This method has been validated using matching pair statistics.  In this validation, stratospheric ozone in the tropics is considered to be stable over the timescale of a day, so retrieved ozone profiles in close geographical proximity and within 1.01 days of each other were compared.  The variation in these matching pairs compared favorably with the precision values.  This is the expected result because the precision should be a measure of the random error in the retrievals.

Precision Value Retrieval


This is a simplified explanation of the precision retrieval; for a more detailed explanation please see the JGR-Atmospheres article by A.E. Bourassa <link to final article once published>.  It is difficult to measure quantities like atmospheric ozone and aerosol density directly because rocket and balloon measurements are expensive and localized.  Because of this we measure them indirectly from light that is scattered by the atmosphere.  Modelling how light is scattered for a given set of atmospheric conditions is difficult but possible (this is done using a "forward model" like SASKTRAN).  The inverse problem is more difficult because atmospheric radiative transfer is not a linear problem, so the equations are not invertible.

Methods like SaskMART are numerical iterative solutions that use the forward model and the observations to retrieve the desired state parameter, such as ozone.  SaskMART takes an initial guess at the atmospheric state and runs it through SASKTRAN to predict the observed radiance at a set of wavelengths.  The guess is then updated based on the difference between the predicted radiance and the actual OSIRIS measurements.  During this process a weighting matrix 'W' is used to determine how much the differences in the observation at one altitude affect the state parameter at other altitudes and how important each wavelength is.  We have uncertainty values associated with the OSIRIS measurements, and from these we can make a covariance matrix for each wavelength (which is specified by the index 'i'):

The degree to which the uncertainty in our retrieved state parameter (like ozone) affects the predicted radiance given by OSIRIS is described by a kernel matrix where an element at row 'l' and column 'm' is:  

Using these things we can calculate the intermediate covariance of our retrieved state parameter based on each wavelength 'i':

To get elements of the total covariance matrix of the retrieved state parameter we use the weighting matrix 'W' from SaskMART:

Validation of Precision - Matching Pair Statistics



The obtained precision values were validated following a method presented by Piccolo and Dudhia (2007). It is generally accepted that at low latitudes ozone concentrations are relatively stable above the tropopause. Based on this we assume, as did the study by Toohey et al. (2010), that two measurements of low latitude stratospheric ozone taken a day apart should be approximately the same.  Any difference between the profiles could be attributed to random measurement and retrieval noise.   We assemble a set of the differences between matched pairs and find the variance of these differences.  This variance is related to  the random error associated with a profile.  

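Mathematically, the difference between the two profiles of a matched pair is

$\delta = x_1 - x_2$

so we find the variance in the set of $N$ differences:

$\sigma_\delta^2 = \frac{1}{N-1} \sum_{j=1}^{N} (\delta_j - \bar{\delta})^2$

For a large set of matched pairs the mean uncertainty in the profiles is $\bar{\sigma}$.  Assuming that the random errors in the two profiles are independent, it is related to the variance of differences by the equation

$\sigma_\delta^2 = 2 \bar{\sigma}^2$

or we can write

$\bar{\sigma} = \sigma_\delta / \sqrt{2}$

and this value should match the average precision value.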

They are compared in the figure below using 


For this study we considered scans to be a matched pair if they were taken within a 1.01 day window, and were within 1.5° longitude  and 0.5° latitude of each other.
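
As a minimal sketch of the matched-pair test described in this section, the MATLAB snippet below pairs scans using the 1.01 day, 1.5° longitude, and 0.5° latitude windows and estimates the random error from the variance of the differences.  The scan times, positions, and ozone values are invented for illustration and the variable names are hypothetical.  The longitude wrap-around handling and the division of the variance by two (equal, independent errors in both profiles) are assumptions, not a description of the actual OSIRIS processing code.

```matlab
% Hypothetical toy inputs: one retrieved ozone value per scan at a single altitude.
t   = [0.00 0.98 0.50 3.00 3.99];   % scan time in days
lon = [10.0 10.9 40.0 80.0 81.2];   % longitude in degrees
lat = [ 2.0  2.3  5.0 -4.0 -4.4];   % latitude in degrees
o3  = [5.10 5.30 4.00 6.00 5.80];   % ozone in arbitrary units

% Matching windows quoted in the text.
maxDt = 1.01;  maxDlon = 1.5;  maxDlat = 0.5;

d = [];                              % differences between matched pairs
n = numel(t);
for a = 1:n-1
    for b = a+1:n
        dlon = abs(lon(a) - lon(b));
        dlon = min(dlon, 360 - dlon);    % assumption: treat longitude as cyclic
        if abs(t(a) - t(b)) <= maxDt && dlon <= maxDlon && abs(lat(a) - lat(b)) <= maxDlat
            d(end+1) = o3(a) - o3(b);    %#ok<AGROW>
        end
    end
end

% Assumption: both profiles carry the same independent random error,
% so var(d) = 2*sigma^2.
sigmaEst = sqrt(var(d) / 2);
fprintf('%d matched pairs, estimated random error %.3f\n', numel(d), sigmaEst);
```

In a real comparison the same calculation would be repeated at every altitude, and sigmaEst would be compared with the mean of the retrieved precision values at that altitude.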

Sample Results of Precision Validation



This figure demonstrates that estimated precision (represented by the "Percent Mean Uncertainty") agrees well with the deviation between what should approximately represent measurements of identical atmospheres (represented by "Percent Standard Deviation").  This supports the precision retrieval process.

At lower altitudes (at and below the tropopause) the atmosphere varies too greatly over the course of 24 hours for the Percent Standard Deviation to represent the uncertainty.  This is why the red stars no longer match the grey area at low altitudes.

Precision Retrieval in the Processing Chain


In the OSIRIS MART retrieval software a C++ class called MART_OSIRISLevel2 acts as the main driver of events.  Given a set of initial conditions (such as a specific OSIRIS scan and the desired output species) this class sets up the system and coordinates the operations necessary for the retrieval.  To carry out the precision retrievals two new functions were added, called ErrorAnalysis_HybridAerosol and ErrorAnalysis_O3.  This level of the program handles the OSIRIS-specific operations such as obtaining the necessary OSIRIS Level 1 data.

When these functions are called they obtain critical information from other components of the program (species profile, observed radiance, and uncertainty in observed radiance) and call MARTRetrieval::RetrieveError.  The MARTRetrieval class uses this information to perform the precision retrieval that is outlined in previous pages.  This level of the program is designed to operate independently of the source of the input data, so the RetrieveError function could easily be applied to other similar problems.

To completely retrieve ozone and aerosol along with their precisions the following parameters are retrieved in this order:

  1. NO2
  2. Albedo
  3. Aerosol
  4. Albedo
  5. Aerosol
  6. Aerosol Precision
  7. O3
  8. O3 Precision




Level 3 Data Products


The OSIRIS Level 2 data consist of profiles of ozone and aerosol.  These have been used to produce Level 3 data: a series of monthly zonal averages.

Data Product Description


Time Bins

The Level 3 data products are created by averaging within monthly time bins.  This means that the time bins vary slightly in size with the days in each month.  The centers of these time bins are stored in the variable "time", which is measured in days since January 1, 1950.
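
A short sketch of how a value of "time" could be turned into a calendar date in MATLAB is given below.  It assumes that the count starts at 00:00 on January 1, 1950, and the value used is a made-up example, not one read from a data file.

```matlab
% Assumption: "time" counts days from 1950-01-01 00:00.
timeValue = 21595;                      % hypothetical bin center

epoch = datenum(1950, 1, 1);            % serial date number of the reference day
dateVector = datevec(epoch + timeValue);               % [year month day hour minute second]
dateString = datestr(epoch + timeValue, 'yyyy-mm-dd'); % human-readable form
disp(dateString);
```

The datenum, datevec, and datestr functions are used here because they are available in older MATLAB releases as well as current ones.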

In a given region of the atmosphere OSIRIS measurements may not be uniformly distributed in time.  Because they are measurements of scattered sunlight, OSIRIS measurements can only be taken in the summer months, and at different times OSIRIS has been powered off for various reasons.  Therefore the data used to compute a monthly average may not accurately represent the entire month.  To help users evaluate this effect, a variable called "AVE_DOM" provides the "average day of the month" of the data in each bin.

In each latitude bin for each month the mean, minimum, and maximum local solar time are also provided in the variables called "LST_MEAN", "LST_MIN", and "LST_MAX".  These may be useful in identifying features in the time series that might depend on diurnal variation and biased daily sampling.

Latitude Bins

The Level 3 data products are created by averaging within zonal bins that are 5° of latitude wide.  This provides 36 latitude bins in total.  The latitude centers of these bins are stored in a variable called "lat".

Because OSIRIS begins a scan at the beginning of each orbit, the latitudinal distribution of the scans is often regularly spaced.  This can lead to averages that are latitudinally biased.  To help users evaluate this influence, a variable called "AVE_LAT" provides the "average latitude" of the data in that bin.
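
The sketch below illustrates the latitude binning and the kind of offset that "AVE_LAT" is meant to reveal.  It assumes that the 36 bins run from -90° to 90° with edges at multiples of 5°; the scan latitudes and all variable names are hypothetical.

```matlab
binWidth = 5;                                   % degrees of latitude
latEdges = -90:binWidth:90;                     % assumption: edges at multiples of 5 degrees
latCenters = latEdges(1:end-1) + binWidth/2;    % 36 bin centers, -87.5 to 87.5

% Hypothetical scan latitudes, clustered toward the southern side of one bin.
scanLat = [0.4 0.9 1.3 3.1 7.2 -12.6];

% Bin index of each scan; the min() keeps a latitude of exactly 90 in the last bin.
binIndex = min(floor((scanLat + 90) / binWidth) + 1, numel(latCenters));

targetBin = 19;                                 % the bin covering 0 to 5 degrees
inBin = (binIndex == targetBin);
aveLat = mean(scanLat(inBin));                  % analogue of AVE_LAT for this bin
latOffset = aveLat - latCenters(targetBin);     % negative: sampling is south of center
fprintf('Bin center %.1f, average latitude %.3f, offset %.3f\n', ...
    latCenters(targetBin), aveLat, latOffset);
```

A large offset relative to the bin width suggests that the bin average is weighted toward one side of the bin.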

Altitude to Pressure Grid

The Level 3 data products are defined on a pressure grid while the Level 2 data products are retrieved on a regularly spaced altitude grid.  Each Level 2 profile is retrieved using a time and location dependent pressure profile that is subsequently stored along with the retrieved ozone and aerosol.

Prior to computing the averages in each bin, the ozone and aerosol values in each profile are interpolated to the Level 3 pressure grid.  These pressure grid values are constant for all time and latitude bins.  They are stored in a variable called "plev", which stands for "pressure levels".
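
One way such an interpolation could be done in MATLAB is sketched below.  The text does not state which interpolation scheme is used, so linear interpolation in the logarithm of pressure is an assumption made here for illustration, and the profile values and pressure levels are invented.

```matlab
% Hypothetical Level 2 profile: pressure (hPa) stored with the retrieved values.
profilePressure = [100 50 25 12.5];
profileOzone    = [1.0 2.0 3.0 4.0];        % arbitrary units

% Hypothetical target pressure levels (hPa); the last lies outside the profile.
plev = [sqrt(100 * 50) 50 25 5];

% Assumption: interpolate linearly in log(pressure).
% Sort first so the sample points are in increasing order.
[logP, order] = sort(log(profilePressure));
ozoneOnPlev = interp1(logP, profileOzone(order), log(plev), 'linear');

disp(ozoneOnPlev);   % levels outside the profile range come back as NaN
```

Levels that fall outside the pressure range of a profile are returned as NaN by interp1 with the 'linear' method, which makes them easy to exclude from the bin average.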

Average Values

Within each time and latitude bin, and at each position on the pressure grid, the average ozone density and aerosol extinction are calculated.

Along with this mean the standard deviation and the number of values that contributed to the average are stored.  These are also included to allow users to estimate how accurate or valuable the mean might be.
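
As an illustration, the three statistics for a single bin and pressure level could be computed as follows.  The values are invented, and marking missing data with NaN is an assumption made for this sketch.

```matlab
% Hypothetical interpolated values from four profiles that fall in one bin;
% NaN marks a profile with no value at this pressure level (assumption).
binValues = [2 4 NaN 6];

valid = ~isnan(binValues);
binMean  = mean(binValues(valid));   % average stored for the bin
binStd   = std(binValues(valid));    % sample standard deviation (N-1 normalization)
binCount = nnz(valid);               % number of values that contributed

fprintf('mean %.2f, std %.2f, count %d\n', binMean, binStd, binCount);
```

A bin with a small count or a large standard deviation relative to its mean deserves more caution than one built from many consistent profiles.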

Ancillary Data

Each file header also contains other information describing its origins, versions, and contact information for the OSIRIS research group.


Download Level 3 Data Products


The Level 3 data products are available for download:

Zipped Aerosol Extinction Level 3 Data

Zipped Ozone Level 3 Data




MATLAB and the Level 3 Data File Format


File Format

The Level 3 data products are stored in Network Common Data Form (NetCDF) files.  The most recent version of the NetCDF format (NetCDF-4) is built on the HDF5 format.  These files contain a self-descriptive header that informs the reading software about the structure of the hierarchically stored data.

There are several platform independent software libraries that support the use of NetCDF files.  The NetCDF project is maintained as part of the Unidata Program, which is run by the University Corporation for Atmospheric Research.  These software libraries are available in several programming languages from that website.

Access in MATLAB

MATLAB (R2011 and later, at least) comes equipped with a NetCDF function library.  This library is made up of high level and low level functions.  According to the MATLAB Help documentation, the high level functions are:

nccreate        Create variable in NetCDF file
ncdisp          Display contents of NetCDF file in Command Window
ncinfo          Return information about NetCDF file
ncread          Read data from variable in NetCDF file
ncreadatt       Read attribute value from NetCDF file
ncwrite         Write data to NetCDF file
ncwriteatt      Write attribute to NetCDF file
ncwriteschema   Add NetCDF schema definitions to NetCDF file
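
The following self-contained sketch exercises several of these functions on a small temporary file, so it can be run without downloading any data.  The file is a stand-in created only for this example: it holds a "lat" variable like the one described for the Level 3 products, and the 'units' attribute value is an assumption, not taken from an actual OSIRIS file.

```matlab
% Create a small temporary NetCDF file (a stand-in, not an OSIRIS product).
fname = [tempname '.nc'];
latCenters = (-87.5:5:87.5)';                       % 36 bin centers

nccreate(fname, 'lat', 'Dimensions', {'lat', numel(latCenters)});
ncwrite(fname, 'lat', latCenters);
ncwriteatt(fname, 'lat', 'units', 'degrees_north'); % assumed attribute value

% Inspect the header, then read the variable and its attribute back by name.
info = ncinfo(fname);
names = {info.Variables.Name};
latBack = ncread(fname, 'lat');
units = ncreadatt(fname, 'lat', 'units');

fprintf('Variables: %s (units: %s)\n', strjoin(names, ', '), units);
delete(fname);                                      % remove the temporary file
```

Reading variables by name, as done here with 'lat', avoids depending on the order in which variables appear in the file header.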


To load and display one month of data from the NetCDF file one may use the following standard MATLAB functions:

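filename = 'SPARC_DI_T2Mz_AER_2009_OSIRIS_v05-05_i01.nc';
info = ncinfo( filename );
% Variables are selected by their position in the file header; ncdisp( filename ) lists them.
data = ncread( filename, info.Variables(4).Name );            % latitude x pressure x month
latBinCenters = ncread( filename, info.Variables(1).Name );
pressBinCenters = ncread( filename, info.Variables(2).Name );
month = 2;
% Draw the selected month with latitude on the x axis and pressure on the y axis.
imagesc( latBinCenters, pressBinCenters, data(:,:,month)' );
set( gca, 'CLim', [ 0  max(max(data(:,:,month))) ] );
axis xy;
xlabel( 'Latitude' );
ylabel( 'Pressure (hPa)' );
title('Zonal Average Aerosol Extinction for February 2009');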

To load and display a 1 year time series in one latitude and pressure bin from 2009 one may use the following standard MATLAB commands:

filename = 'SPARC_DI_T2Mz_AER_2009_OSIRIS_v05-05_i01.nc';
info = ncinfo( filename );
% Variables are selected by their position in the file header; ncdisp( filename ) lists them.
data = ncread( filename, info.Variables(4).Name );            % latitude x pressure x month
latBinCenters = ncread( filename, info.Variables(1).Name );
pressBinCenters = ncread( filename, info.Variables(2).Name );
latBin = 12;
pressBin = 8;
months = 1:12;
% Collect the value in the chosen latitude and pressure bin for each month.
plotdata = zeros( size(months) );
for monthctr = 1:length(months)
    plotdata(monthctr) = data(latBin,pressBin,monthctr);
end
% Plot only the months with values greater than zero.
hold on;
plot( months(plotdata>0), plotdata(plotdata>0), 'r*:');
ylim([0 max(plotdata)]);
xlim([1 12])
xlabel('Months in 2009');
ylabel('Aerosol Extinction');
title(['Aerosol Extinction Time Series at ' num2str(latBinCenters(latBin)) '^o at ' num2str(pressBinCenters(pressBin)) 'hPa']);