Methodology
Methodology for indicator calculation
Data source: The data in Waterbase are collected through the Eurowaternet process. They are therefore sub-samples of national data, assembled to provide comparable indicators of pressures, state and impact of waters on a Europe-wide scale. The data sets are not intended for assessing compliance with any European Directive or any other legal instrument. Information on sub-national scales should be sought from other sources.
Station selection: No criteria are used for station selection (except for time series and trend analysis; see below).
Determinants: The determinants selected for the indicator and extracted from Waterbase are BOD5, BOD7, total ammonium and ammonium.
Most countries monitor BOD5. Finland monitors BOD7. Lithuania monitored BOD5 up to 1995 and started monitoring BOD7 in 1996. Latvia monitored BOD7 from 1996 to 2001. Estonia monitored BOD7 up to 2009 and BOD5 in 2010. The term BOD is commonly used to mean BOD5. For countries reporting BOD7, the values have been converted to BOD5 (using BOD7 = 1.16 BOD5) for reasons of comparability.
Most countries monitor total ammonium. Liechtenstein monitors ammonium. In recent years some countries monitored ammonium, or both ammonium and total ammonium. Germany, Luxembourg and Slovenia monitored ammonium for 2008. Austria and Slovenia monitored ammonium for 2009. Austria, Bulgaria, Estonia, France, Latvia, Norway and Slovenia monitored ammonium for 2010. For these countries ammonium is selected for the indicator. Some countries monitored ammonium and total ammonium at different stations: Belgium for 2010 and Germany for 2009 and 2010. For these countries both ammonium and total ammonium are selected for the indicator.
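The conversion can be illustrated with a minimal Python sketch. Only the factor 1.16 comes from the text; the function name `to_bod5` and the determinant labels passed as strings are assumptions made for illustration.

```python
# Conversion factor from the text: BOD7 = 1.16 * BOD5
BOD7_TO_BOD5_FACTOR = 1.16


def to_bod5(value, determinant):
    """Return a BOD5-equivalent concentration (mg O2/l).

    `determinant` is a hypothetical label, either "BOD5" or "BOD7".
    """
    if determinant == "BOD5":
        return value
    if determinant == "BOD7":
        # Invert BOD7 = 1.16 * BOD5 to express the value as BOD5
        return value / BOD7_TO_BOD5_FACTOR
    raise ValueError(f"unsupported determinant: {determinant}")


# A reported BOD7 of 2.32 mg O2/l corresponds to about 2.0 mg O2/l as BOD5
bod5_equivalent = to_bod5(2.32, "BOD7")
```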
All values are labeled as BOD5/total ammonium in the graphs, but it is indicated in the graph notes for which countries BOD7/ammonium data are used.
Quality checked data: In the table on nutrients (“Waterbase_rivers_v12_Nutrients”), QA-fields are treated as follows:
- Field “QA_MVissues”: all flagged values are excluded from the indicator calculation, except for zero values (flag 103).
- Field “QA_LRviolation”: all flagged values are allowed, except for flagged values that break rule “Mean >= Minimum” (flag 201) and “Mean <= Maximum” (flag 202).
- Field “QA_outlier”: all flagged values are excluded from the indicator calculation, except for outliers confirmed by country (flags 491, 493).
- Field “QA_station_issues”: all flagged values are allowed (including wrong coordinates or missing coordinates), except for “Water Category value is incompatible with this particular dataset” (flag 511) and “station is not defined in the station table” (flag 599).
- Field “QA_CR violation”: all flagged values are allowed.
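The QA rules above can be sketched as a simple filter. This is an illustration only: it assumes each record is a Python dict in which a QA field holds either `None` (not flagged) or a single integer flag code. How Waterbase actually stores flags may differ, and the function name `passes_qa` is hypothetical. The field that allows all flagged values needs no rule and is therefore not listed.

```python
# Fields where any flag excludes the value, except the listed flag codes
KEEP_DESPITE_FLAG = {
    "QA_MVissues": {103},       # zero values are kept
    "QA_outlier": {491, 493},   # outliers confirmed by country are kept
}

# Fields where flagged values are allowed, except the listed flag codes
DROP_ON_FLAG = {
    "QA_LRviolation": {201, 202},     # Mean >= Minimum, Mean <= Maximum broken
    "QA_station_issues": {511, 599},  # incompatible water category, unknown station
}


def passes_qa(record):
    """Return True if the record may enter the indicator calculation."""
    for field, kept_flags in KEEP_DESPITE_FLAG.items():
        flag = record.get(field)
        if flag is not None and flag not in kept_flags:
            return False
    for field, dropped_flags in DROP_ON_FLAG.items():
        if record.get(field) in dropped_flags:
            return False
    return True


records = [
    {"station": "A", "QA_MVissues": 103},     # kept: zero value
    {"station": "B", "QA_LRviolation": 201},  # dropped: mean below minimum
    {"station": "C"},                         # kept: no flags
]
accepted = [r["station"] for r in records if passes_qa(r)]
```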
Mean: Annual mean concentrations are used in the time series and present concentration graphics. Countries are asked to substitute any sample results below the limit of detection/determination by a value equivalent to half of the limit of detection/determination before calculating the station annual mean values. Mean concentration values of zero are included in the indicator calculation as zero (0).
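The substitution rule for results below the limit of detection/determination can be sketched as follows. The input layout is an assumption: each sample is a pair `(value, below_limit)`, where `value` holds the reported limit when `below_limit` is true. The function name is hypothetical.

```python
def station_annual_mean(samples):
    """Annual mean for one station and year.

    `samples` is a list of (value, below_limit) pairs. For results below
    the limit of detection/determination, `value` is the limit itself and
    is replaced by half of that limit before averaging.
    """
    if not samples:
        raise ValueError("no samples for this station and year")
    adjusted = [value / 2 if below_limit else value
                for value, below_limit in samples]
    return sum(adjusted) / len(adjusted)


# Three samples (mg N/l); the second is below a limit of 0.02 and counts as 0.01
mean_value = station_annual_mean([(0.10, False), (0.02, True), (0.06, False)])
```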
Inter/extrapolation and consistent time series
For time series and trend analyses, only series that are complete after inter/extrapolation (i.e. no missing values in the station data series) are used. This is to ensure that the aggregated data series are consistent, i.e. including the same stations throughout the time series. In this way assessments are based on actual changes in concentration, and not changes in the number of stations.
Changes in methodology: Station selection and inter/extrapolation.
Until 2006, only complete time series (values for all years from 1992 to 2004) were included in the assessment. However, a large proportion of the stations was excluded by this criterion. To allow the use of a considerably larger part of the available data, it was decided in 2007 (i.e. when analysing data up until 2005) to include all time series with at least seven years of data. This was a trade-off between the need for statistical rigour and the need to include as much data as possible in the assessment. However, the shorter series included might represent different parts of the whole time interval, and the overall picture may therefore not be reliable. In 2009, it was decided instead to inter/extrapolate all gaps of 1-2 years of missing values for each station. At the beginning or end of the data series, 1 missing value was replaced by the first or last value of the original data series, respectively. In the middle of the data series, missing values were replaced by the values next to them for gaps of 2 years and by the average of the two neighbouring values for gaps of 1 year.
In 2010 this approach was modified, allowing for gaps of up to 3 years, both at the ends and in the middle of the data series. At the beginning or end of the data series, up to 3 years of missing values are replaced by the first or last value of the original data series, respectively. In the middle of the data series, missing values are replaced by the values next to them, except for gaps of 1 year and for the middle year in gaps of 3 years, where missing values are replaced by the average of the two neighbouring values. Only time series with no missing years for the whole period 1992-2010 after such inter/extrapolation are included in the assessment. The number of gaps per series is not limited; only the maximum length of each gap (3 years) is restricted. This procedure increases the number of stations that can be included in the time series/trend analysis. Still, the number of stations is markedly reduced compared to the analysis of the present situation, where all available data can be used.
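The 2010 inter/extrapolation rules can be sketched in Python as below. This is an illustrative reading of the rules, not the code used for the indicator: a station series is assumed to be a list of annual means ordered by year, with `None` marking a missing year, and the name `fill_gaps` is hypothetical.

```python
MAX_GAP = 3  # longest gap (in years) that may be filled


def fill_gaps(series):
    """Return the gap-filled series, or None if it cannot be completed."""
    n = len(series)
    filled = list(series)
    i = 0
    while i < n:
        if series[i] is not None:
            i += 1
            continue
        # Find the end of the current run of missing years: series[i:j]
        j = i
        while j < n and series[j] is None:
            j += 1
        length = j - i
        left = series[i - 1] if i > 0 else None
        right = series[j] if j < n else None
        if length > MAX_GAP or (left is None and right is None):
            return None  # gap too long, or no observed values at all
        if left is None:            # gap at the beginning: use first value
            fill = [right] * length
        elif right is None:         # gap at the end: use last value
            fill = [left] * length
        elif length == 1:           # single year: average of the neighbours
            fill = [(left + right) / 2]
        elif length == 2:           # two years: each takes the adjacent value
            fill = [left, right]
        else:                       # three years: adjacent, average, adjacent
            fill = [left, (left + right) / 2, right]
        filled[i:j] = fill
        i = j
    return filled


raw = [None, 2.0, None, 4.0, None, None, 6.0, None, None, None, 10.0, None]
completed = fill_gaps(raw)
# completed == [2.0, 2.0, 3.0, 4.0, 4.0, 6.0, 6.0, 6.0, 8.0, 10.0, 10.0, 10.0]
```

A series for which `fill_gaps` returns `None` contains a gap longer than 3 years and would be left out of the time series and trend analysis.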
Aggregation of time series
The selected time series (see above) must be aggregated into a smaller number of groups and averaged before the aggregated series can be displayed in a time series plot. Determinants are grouped into five geographical regions of Europe, which contain the following countries:
Eastern: CZ, EE, HU, LT, LV, PL, SI, SK.
Northern: FI, IS, NO, SE.
Southern: CY, ES, GR, IT, MT, PT.
South-Eastern: AL, BA, BG, HR, ME, MK, RO, RS, TR, XK.
Western: AT, BE, CH, DE, DK, FR, IE, LI, LU, NL, UK.
(List of country codes can be found here )
Not all listed countries per region are included in the figures, either because no data were reported or because no stations have complete time series after inter/extrapolation. Because of changes to monitoring networks (adaptation to the monitoring networks under the Water Directives), the time series are broken and only a limited number of time series is available for some countries.
Determinants are in addition grouped into six sea region catchments, which are defined not by countries but by river basin districts, or by river basin district subunits where these are consistent with the catchment areas of seas. The data thus represent rivers or river basins draining into that particular sea. The sea regions are defined as Arctic Ocean, Greater North Sea, Celtic Seas together with the Bay of Biscay and the Iberian Coast, Baltic Sea, Black Sea and Mediterranean Sea. The sea region delineation follows the Marine Strategy Framework Directive (MSFD) Article 4, with the Arctic Ocean added as a separate region. As the catchment area draining into what the MSFD defines as the North-east Atlantic Ocean region is very large, it was decided to use the sub-region level here instead, but merging the Celtic Seas with the Bay of Biscay and the Iberian Coast.
Determinants are also aggregated for the whole of Europe.
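The regional aggregation can be sketched as follows. The country lists are copied from the regions above, but the averaging itself is an assumption: the text says only that series are "averaged", so an unweighted mean across stations per year is used here. The function name and input layout are hypothetical.

```python
REGIONS = {
    "Eastern": ["CZ", "EE", "HU", "LT", "LV", "PL", "SI", "SK"],
    "Northern": ["FI", "IS", "NO", "SE"],
    "Southern": ["CY", "ES", "GR", "IT", "MT", "PT"],
    "South-Eastern": ["AL", "BA", "BG", "HR", "ME", "MK", "RO", "RS", "TR", "XK"],
    "Western": ["AT", "BE", "CH", "DE", "DK", "FR", "IE", "LI", "LU", "NL", "UK"],
}
COUNTRY_TO_REGION = {code: region
                     for region, codes in REGIONS.items()
                     for code in codes}


def aggregate_by_region(stations):
    """Average complete station series per region and year.

    `stations` is a list of (country_code, series) pairs; all series are
    assumed complete (gap-filled) and of equal length.
    """
    grouped = {}
    for country, series in stations:
        grouped.setdefault(COUNTRY_TO_REGION[country], []).append(series)
    return {
        region: [sum(year_values) / len(year_values)
                 for year_values in zip(*series_list)]
        for region, series_list in grouped.items()
    }


regional = aggregate_by_region([
    ("FI", [1.0, 2.0]),
    ("SE", [3.0, 4.0]),
    ("DE", [5.0, 5.0]),
])
# regional == {"Northern": [2.0, 3.0], "Western": [5.0, 5.0]}
```

Because every station contributes a value for every year, the regional mean changes only when concentrations change, not when the set of stations changes.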
Trend analyses
Trends are analysed by the Mann-Kendall method (McLeod 2005) in the free software R (R Development Core Team 2006). The test was suggested by Mann (1945) and has been extensively used with environmental time series (Hipel and McLeod, 2005). Mann-Kendall is a test for monotonic trend in a time series y(x), which in this analysis is nutrient concentration (y) as a function of year (x). The test is based on Kendall's rank correlation, which measures the strength of monotonic association between the vectors x and y. In the case of no ties in the x and y variables, Kendall's rank correlation coefficient, tau, may be expressed as tau = S/D, where S = sum_{i<j} (sign(x[j]-x[i])*sign(y[j]-y[i])) and D = n(n-1)/2. S is called the score and D, the denominator, is the maximum possible value of S. In the case of no ties, the p-value of tau under the null hypothesis of no association is computed using an exact algorithm given by Best and Gipps (1974). The tests reported here are two-sided (testing for both increasing and decreasing trends). Data series with p-value < 0.05 are reported as significantly increasing or decreasing ("strong trends"), while data series with p-value >= 0.05 and < 0.10 are reported as marginally significant ("weak trends"). Data series with p-value >= 0.10 have no significant trend. The test is non-parametric, which means that the amount of change from year to year is not considered, only the direction of the change.
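The score, tau and the trend classes can be illustrated with a short Python sketch. It is not the R Kendall package and not the Best and Gipps algorithm: the exact p-value here is obtained by counting permutations by their number of inversions, which is valid only when there are no ties. All function names are hypothetical, and treating a p-value of exactly 0.10 as "no significant trend" is an assumption.

```python
def kendall_score(x, y):
    """Return (S, D) with S the Kendall score and D = n(n-1)/2."""
    def sign(v):
        return (v > 0) - (v < 0)
    n = len(x)
    s = sum(sign(x[j] - x[i]) * sign(y[j] - y[i])
            for i in range(n) for j in range(i + 1, n))
    return s, n * (n - 1) // 2


def exact_two_sided_p(s, n):
    """Exact P(|S| >= |s|) under no association, assuming no ties."""
    # counts[k] = number of permutations of the current size with k inversions
    counts = [1]
    for m in range(2, n + 1):
        new = [0] * (len(counts) + m - 1)
        for k, c in enumerate(counts):
            for shift in range(m):  # the m-th item adds 0..m-1 inversions
                new[k + shift] += c
        counts = new
    d = n * (n - 1) // 2
    # A permutation with k inversions has score S = D - 2k
    extreme = sum(c for k, c in enumerate(counts) if abs(d - 2 * k) >= abs(s))
    return extreme / sum(counts)


def classify(s, p):
    """Map score and p-value to the trend classes used in the text."""
    if p < 0.05:
        strength = "strong"
    elif p < 0.10:
        strength = "weak"
    else:
        return "no significant trend"
    return f"{strength} {'increasing' if s > 0 else 'decreasing'} trend"


years = list(range(1992, 2002))
conc = [5.0, 4.6, 4.4, 4.1, 3.9, 3.5, 3.2, 3.0, 2.7, 2.5]  # made-up values
s, d = kendall_score(years, conc)
tau = s / d
label = classify(s, exact_two_sided_p(s, len(years)))
```

With these made-up, strictly decreasing values, S equals -D, tau is -1 and the series is classed as a strong decreasing trend.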
Present concentration distributions
Concentration data for the latest available year are extracted from Waterbase for the selected river stations. The number of stations with annual mean concentrations falling in each of the selected concentration bands or classes is then calculated and presented. The allocation of a station to a particular class is based only on the face value concentration and not on the likely statistical distribution around the mean values.
- The new/revised class-defining values for BOD5 concentrations (mg O2/l): <1.4, 1.4 to 1.99, 2 to 2.99, 3 to 3.99, 4 to 4.99, >=5. The two highest classes are merged to >=4.
- The new/revised class-defining values for total ammonium concentrations (mg N/l): <0.04, 0.04 to 0.09, 0.1 to 0.19, 0.2 to 0.39, 0.4 to 0.99, >=1. The two highest classes are merged to >=0.4.
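Allocating stations to the merged classes can be sketched as below. The class edges are taken from the two lists above. The handling of values that fall exactly on an edge (assigned to the higher class) and the function names are assumptions.

```python
from bisect import bisect_right
from collections import Counter

# Lower edges of classes 2..5; class 1 is everything below the first edge,
# and the last class is the merged top class.
BOD5_EDGES = [1.4, 2.0, 3.0, 4.0]          # mg O2/l
AMMONIUM_EDGES = [0.04, 0.1, 0.2, 0.4]     # mg N/l


def class_index(value, edges):
    """Return the 0-based class of an annual mean (face value only)."""
    return bisect_right(edges, value)


def class_counts(values, edges):
    """Number of stations per class, lowest class first."""
    counts = Counter(class_index(v, edges) for v in values)
    return [counts.get(i, 0) for i in range(len(edges) + 1)]


# Six made-up station means for BOD5
bod5_counts = class_counts([0.9, 1.4, 1.99, 2.5, 4.0, 7.2], BOD5_EDGES)
# bod5_counts == [1, 2, 1, 0, 2]
```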
More information is given in the WISE maps on Water quality in rivers and lakes under section "Help": http://www.eea.europa.eu/themes/water/interactive/soe-rl (BOD in rivers, Total ammonium in rivers).
Methodology for gap filling
The methodology for gap filling is described above under "Inter/extrapolation and consistent time series".
Methodology references
- Hipel, K.W. and McLeod, A.I. (2005). Time Series Modelling of Water Resources and Environmental Systems. Electronic reprint of our book originally published in 1994.
- Mann, H.B. (1945). Nonparametric tests against trend. Econometrica, 13, 245-259.
- McLeod, A.I. (2005). Kendall: Kendall rank correlation and Mann-Kendall trend test. R package version 2.0.
- R Development Core Team (2006). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0.