# Renewal and Update of MTO IDF Curves: Defining the Uncertainty

## Abstract

The University of Waterloo has been commissioned by the Design and Contract Standards Office of the Ministry of Transportation of Ontario (MTO) to update the intensity–duration–frequency (IDF) curves that are used to estimate design storms for drainage infrastructure. Environment Canada provides analyzed rainfall data for weather stations in Ontario. Project engineers determine the design storm parameters using a variety of methods. Thus, there is a need for a consistent interpolation procedure that can deal with the sparseness and unevenness of the Ontario station network. This paper describes the procedures that form the basis for the MTO IDF Curve Lookup system. Confidence limits assist in the design process and allow comparison with other results. These were used to assess the spatial bias in the error fields. There appears to be none, but there is room for improvement where there are sharp changes in geography, such as the highlands west of Thunder Bay. Validation curves are presented for several Canadian stations and one American station.

A single set of ten regression equations is sufficient for the whole province, eliminating the need for regional IDF curves and simplifying the design process.

## 1 Introduction

Good estimates of peak rainfall intensity are essential when designing highway drainage infrastructure such as culverts, bridges, sewer systems and roadside ditches. Quality rainfall data enables designers to make calculations that meet drainage capacity design standards and avoid the over- or under-design of drainage elements. Both can be expensive: over-design may waste resources and under-design can result in additional maintenance or repair costs.

Intensity–duration–frequency (IDF) curves are used to characterize the statistics of extreme rainfall for the design of flood protection infrastructure on small watersheds. Environment Canada (EC) periodically publishes curve parameters, in the form of *AB* coefficients, at selected meteorological station locations in Ontario and elsewhere.

Engineers are faced with the problem that data are only available for certain spots in the province. Data for other locations must be interpolated for every project. The Ontario Ministry of Transportation (MTO) IDF Curve Lookup tool uses the latest EC data available from 147 Meteorological Service of Canada (MSC) stations across Ontario to determine rainfall intensities for any location in the province. The goal of the tool is to provide a consistent and objective method.

The U. S. Geological Survey (USGS) GTOPO30 digital elevation dataset (USGS 1997) is used to derive physiographic characteristics that become independent variables in a stepwise multiple linear regression analysis with MSC station statistics. The regression produces a set of generating equations for the parameters used to produce IDF curves. The technique is modified for this study and weighs station data by length of record to ensure that more reliable data have greater influence on the interpolation.

MTO began revising its IDF curves in the late 1990s due to administrative changes, the availability of new data sets from MSC, and the concern for the impact that climate change could have on infrastructure design. Historically, MTO provided IDF curves for each MTO district in a hard copy document developed with EC data (Hogg, 1989). Ontario has the problem of varying gauge density throughout its geography. It has a densely gauged south, and a sparsely gauged north. The IDF Curve Lookup Tool uses the regional and local topography of the area of interest to interpolate the climate parameters that are relevant for a design.

The variables of interest in this study are the statistics that define the precipitation frequency.

### 1.1 IDF Curves

IDF curves summarize the annual probability of exceedance of storm intensity given a storm duration, P*(R,* *τ**)* of a volume of rainfall *R* in a single event of duration *τ* (h) and intensity *RR* (mm/h). It is common practice to express*(R, **τ**)* by return period *T*:

(1) |

In this document, rainfall intensity and volume are expressed by:

(2) |

where:

(3) |

A, B |
= | AB coefficients that vary with location and return period, |

RR |
= | rainfall rate in mm/h, |

R |
= | rainfall volume in mm, |

τ |
= | event duration in hours, and |

T |
= | return period in years. |

### 1.2 MSC Data Set

The National Climate Data Information Archive of Environment Canada (MSC 2012) provided the rainfall dataset used in this study. The values for the intensity of rainfall, *R*, and the *AB* parameters used in the modeling of IDF curves for the 147 Ontario stations are for return periods of 2 y, 5 y, 10 y, 25 y, 50 y and 100 y. One of the 147 stations was removed from the analysis as an outlier. Figure 1 shows the station locations. Record lengths varied from 9 y to 76 y with an average record length of 28 y.

Figure 1 Map of Ontario showing all weather stations used in this study.

### 1.3 Sample station data analysis

EC publishes *AB* coefficients with the data for the stations. EC uses the classical Gumbel extreme value distribution method, and finds the relevant quantiles by least squares fit to the histogram. The *AB* coefficients are found by method of moments for each return period of interest (Hogg et al. 1989).

Figure 2 shows a typical curve on a log–log plot. Log*A* determines the 1 h intercept and *B* determines the slope. Figure 3 shows sample station data with the standard error line highlighted. The standard deviation of the mean for this sample station ranges from 7.9 mm for the 2 y return period to 19.3 mm for the 100 y return period.

Figure 2 Sample IDF curve (MSC 2012).

Figure 3 Sample station statistics (MSC 2012), standard error line highlighted.

### 1.4 Spatial Interpolation

The IDF Curve Lookup tool uses the Waterloo Multiple Physiographic Parameter Regression (WATMAPPR) model (Seglenieks 2009), which is based on the square grid technique (Solomon et al. 1968), to estimate the curve parameters. The square grid technique uses UTM 10 km grid squares as elementary subcatchments. The premise of the tool is that local and regional topography strongly influenced local climate. Thus, topographic parameters are useful interpolators of surface fields of interest, such as temperature, runoff and, in this case, IDF curve *AB* parameters.

The *A* coefficients were determined using stepwise-linear regression. The independent variables were station latitude and longitude, station elevation, barrier height, local slope of *X* and *Y*, and the station’s distance to water. The dependent variables are log transformed Gumbel quantiles. *A* coefficients depend on both return period and station location. *B* typically does not show a dependence on station properties and is adequately represented by a single number. This is equivalent to having parallel IDF curves as shown in Figure 2 above.

The interpolation method uses a large domain digital elevation model (DEM) to derive a comprehensive set of secondary physiographic characteristics. The characteristics used for each grid square are east–west slope, north–south slope, distance to major water bodies, and barrier height from the west. These are combined with basic station characteristics—latitude, longitude and elevation—to generate a powerful set of independent variables that have been used in a number of climatological studies.

In the stepwise regression, each independent variable was put into the regression individually, and the resulting value of the objective function was determined. The independent variable that resulted in the lowest value of the objective function was the first variable to enter the regression. The process was repeated with the first variable and each of the remaining variables entered individually into the regression. The variable that gave the lowest value of the objective function was the second variable to enter the regression. Other variables were added to the regression until the resulting objective function no longer decreased by a certain threshold amount.

The results of this process were equations for each return period which related the value of the maximum precipitation to the physiographic parameters:

(4) |

where:

Ŷ |
= | the vector of observations (A or B), |

X |
= | the matrix of independent variables, |

β |
= | the vector of parameters to be estimated, and |

ε |
= | a vector of errors. |

These equations were used to calculate the expected value of the maximum precipitation at all points in the study area based on the values of the physiographic parameters at those points (see Table 1). The WATMAPPR technique is a non-exact interpolation, since the interpolated value at a measurement station may not necessarily be the measured value at that station.

Table 1 Typical partial regression coefficients for determining *A* parameters, standard error highlighted.

SUMMARY OUTPUT | |||||

Regression Statistics | |||||

Multiple R | 0.9996 | ||||

R Square | 0.9991 | ||||

Adjusted R Square | 0.9918 | ||||

Standard Error | 0.0437 | ||||

Observations | 146 | ||||

Coefficients | Standard Error | RMS Error | t Stat | P-value | |

Intercept | 0 | #N/A | #N/A | #N/A | #N/A |

X Variable 1 | 0.4200 | 0.1145 | 0.7692 | 3.6692 | 0.0003 |

X Variable 2 | -6.1635 | 0.7841 | 48.7802 | -7.8606 | 1.01E-12 |

X Variable 3 | 1.7353 | 0.4887 | 10.4234 | 3.5506 | 0.0005 |

X Variable 4 | 7.25E-05 | 4.14E-05 | 0.0002 | 1.7501 | 0.0823 |

X Variable 5 | 0.0020 | 0.0034 | 0.0170 | 0.5866 | 0.5585 |

X Variable 6 | -0.0003 | 0.0002 | 0.0011 | -1.3035 | 0.1946 |

X Variable 7 | -0.0010 | 0.0020 | 0.0101 | -0.5027 | 0.6160 |

X Variable 8 | 0.0570 | 0.0353 | 0.1923 | 1.6141 | 0.1088 |

X Variable 9 | 0.4599 | 0.0169 | 0.0877 | 27.2628 | 3.38E-57 |

The Cartesian standard error, *E*^{2}, is found using

(5) |

The objective function for the regression is the expected value of *E*^{2}. The multiplication of each observation by its number of years of record recovers the total square error for each station dataset. The objective function of the regression therefore includes spatial errors, temporal errors, and model errors for the entire dataset. The results for this are under the column *RMS for A Weighted Regression* of Table 2.

Table 2 MSC station standard errors.

Return Period | MSC Mean Station Value | MSC Standard Deviation | Weighted Station Mean | RMS for A Weighted Regression |

2 y | 20.20 | 2.32 | 20.35 | 0.70 |

5 y | 27.34 | 3.36 | 27.49 | 1.43 |

10 y | 32.04 | 4.16 | 32.20 | 2.27 |

25 y | 37.90 | 5.30 | 38.08 | 3.64 |

50 y | 42.38 | 6.06 | 42.56 | 5.00 |

100 y | 46.75 | 6.89 | 46.93 | 6.60 |

was used indirectly in the development of the MTO confidence intervals. , an aggregate of the mean intensity across durations, has a standard deviation as large as the value itself. Used directly, proved little help for regionalization. Instead of , the *AB* parameters were used directly, and each return period was analyzed separately. While this process removes most of the uncertainty from , some of it survives, and is carried forward in the standard error of each IDF curve.

The standard error is used to calculate confidence limits in Equation 6:

(6) |

where:

Ŷ_{i} |
= | the expected value of rainfall at point i, |

t_{0.975} |
= | the student t-distribution right tail probability, |

se |
= | the standard error of the regression, |

X_{i} |
= | a 1 × 7 column vector that contains the difference between each independent variable and its mean for the station i, and |

C |
= | the covariance matrix (Draper and Smith 1966). |

The square root term describes the curvature in the confidence limits.

In the development of confidence limits, *A* and *B* are treated as stationary in time and space. The confidence limits, therefore, reflect both temporal and spatial uncertainty. While Figure 4 and Figure 5 show that spatial trends do not exist in the residuals, there are likely time trends. The 95% confidence limits allow for these potential time trends. Their potential identification and quantification is part of the next phase of research. The investigation may allow the further reduction of the standard error.

## 2 Results

### 2.1 Examining the Residuals

The residual shown in Figures 4 and 5 overleaf indicates station *R*–positive errors (white dots) and station *R*–negative (black dots). Those stations that are between these extremes are indicated with varying degrees of shading. There is no trend in the vicinity of major topographical elements such as the Niagara Escarpment or the area west of Thunder Bay.

Figure 4 The locations of positive and negative residuals in Ontario.

### 2.2 Comparison to Station Data

There are no identifiable spatial trends in the standard error, even in unusual geographic areas. The errors occur in specific geographic areas including the northern shore of Lake Superior and the area below the escarpment on Lake Ontario. These are well-known climate anomalies in the province.

Figures 6 and 7 overleaf show the upper 95% confidence limits and the expected values for a 100 y 1 h storm in the Ontario dataset. In both of the figures, the reddest colors represent the lowest value of the plus or minus term, and the bluest colors represent the highest value of the plus or minus term.

Figure 6 Upper 95% confidence limit for 100 y 1 h storm.

Figure 7 Expected values for 100 y 1 h storm.

### 2.3 Comparison with Canadian Data

Typical results from selected MSC weather stations are in Figures 8 through 13 below. The points are the MTO database interpolation for the station. For the comparison of station data for the model validation, the original MSC data was excluded. When the data was excluded, the regression equation changed little due to the individual length of station records within the overall size of the dataset.

Figure 8 Comparison of MSC *A* coefficient for 2 y to 100 y 1 h storm for Chatham WPCP (6131415) and MTO values with 95% confidence limits.

Figure 9 Comparison of MSC *A* coefficient for 2 y to 100 y 1 h storm for Toronto City (6158355) and MTO values with 95% confidence limits.

Figure 10 Comparison of MSC *A* coefficient for 2 y to 100 y 1 h storm for Ottawa CDA RCS (6105978) and MTO values with 95% confidence limits.

Figure 11 Comparison of MSC *A* coefficient for 2 y to 100 y 1 h storm for North Bay A (6085700) and MTO values with 95% confidence limits.

Figure 12 Comparison of MSC *A* coefficient for 2 y to 100 y 1 h storm for Moosonee RCS (6075435) and MTO values with 95% confidence limits.

Figure 13 Comparison of MSC *A* coefficient for 2 y to 100 y 1 h storm for Kenora A (6034075) and MTO values with 95% confidence limits.

The *A* coefficient for the 1 h duration curve is presented, for ease of interpretation, in mm/h. This is possible because, in this case, where *τ* = 1 h, *RR = A*, as seen in Equation 2. The figures above illustrate how, at station locations, the MTO data is comparable to the MSC weather stations. The Kenora values show the least similarity. This is due to the orographic effect between Lake Winnipeg and Kenora. Further research will address this issue. In the meantime, the extent of the confidence limits compensates for this. The MTO dataset, with 4 073 station-years, has the advantage of having values based on the entire MSC dataset and consistency between stations.

### 2.4 Comparison with NOAA Data

The U.S. National Oceanic and Atmospheric Administration (NOAA) uses a three parameter *ABC* IDF model that plots as a curve in a log–log space rather than a straight line (Perica et al. 2013). Data is converted so it could be compared with the Canadian *AB* data. Figures 14 to 17 below show an example of an informal comparison between the two datasets. Confidence levels are at 95% for Canadian curves, and 90% for the American curves.

Figure 14 25 y storm comparison of MTO and NOAA models for Detroit Airport with confidence limits.

Figure 15 25 y storm comparison of MTO and NOAA model curves for Detroit Airport without confidence limits.

Figure 16 25 y storm comparison of the MTO interpolated curve and NOAA 90% confidence limits for Detroit Airport.

Figure 17 25 y storm comparison of MTO 95% confidence limits and NOAA interpolated curve for Detroit Airport.

The NOAA curve is determined by fitting the three-parameter *ABC* model to the data. The Canadian model uses the two-parameter *AB* model. There are advantages and disadvantages to each but in a data-sparse environment, such as Canada, having one less parameter to estimate is often helpful. The linear best fit to the American data points is also shown and it matches the Canadian *AB* curve. Based on preliminary studies, with visual comparisons, the difference between the two curves is insignificant at a 95% confidence level for individual observations.

## 3 Results and Implementation

The new IDF curves are project location based. The limits of a highway project are identified on the user interface by two representative locations. These locations are defined either by entering their latitude and longitude, or by selecting the locations from the Google Maps interface.

The system determines the IDF curve parameters representative of these areas covered by a project. The display shows the percentage error at the two identified points by assigning a representative colour to the icons. If the percentage error is greater than the acceptable value, the highway section can be divided into two sections by adding a third intermediate point. This will result in dividing the highway project into two areas each with a different IDF curve. The produced IDF curves have similar uncertainty to the curves provided at individual MSC stations. The similarity exists because the introduced spatial error offsets the eliminated temporal error. The obvious advantage is that, with the site-specific temporal error removed, the MTO model can generate IDF curves for anywhere in Ontario, not just at MSC stations. With the interpolated MTO curves, designers have calibrated rainfall extremes for projects, such as highways, that are too far from MSC stations for the MSC curves to be reliable.

## 4 Conclusions

The MTO IDF Look-Up Tool is a reliable interpolator of IDF curves in Ontario, and provides a consistent estimate of *AB* parameters. The key to its successful interpolation of extreme rainfall statistics is the integration of physiographic characteristics. One equation for each return period represents the entire province, with only one value for *B*. Despite this, the errors show no regional bias, and the standard error is 10%, which is the standard error of the individual stations. Time trends are a potential source of error, and are the focus of further study. In the meantime, the confidence limits are broad enough to accommodate any error that time trends introduce. The interpolator was validated with the use of select NOAA and MSC stations. The MTO model results match the data and RMS error from these stations.

Future work will include the addition of data from other sources from within Ontario, along with neighbouring states and provinces. It will also investigate time trends in the data for a more refined interpolation of extreme rainfall estimates.

With the MTO Look-Up Tool, engineers have a consistent set of interpolated IDF curves for the entire province. In urban areas, where existing MSC IDF curves exist, the MTO tool provides results that match the RMS error for the MSC stations. In rural setting, the tool is of particular use. There, it provides reliable IDF curves where curves do not otherwise exist.

The interface tool is available at http://www.mto.gov.on.ca/IDF_Curves/.

## References

- Draper, N. R. and H. Smith. 1966.
*Applied Regression Analysis.*New York: John Wiley and Sons. - Hogg, W. D., D. A. Carr and B. Routledge. 1989.
*Rainfall Intensity–Duration–Frequency Values for Canadian Locations*. Ottawa: Environment Canada, Atmospheric Environment Service. - Meteorological Service of Canada. 2012.
*Notes on the EC IDF V2-20-20120209_E.pdf*. Ottawa: Environment Canada. ftp.tor.ec.gc.ca/Pub/Engineering_Climate_Dataset/IDF/ - Perica, S., D. Martin, S. Pavlovic, I. Roy, M. St. Laurent, C. Trypaluk, D. Unruh, M. Yekta and G. Bonnin. 2013.
*Prediction–Frequency Atlas of the United States*, NOAA Atlas 14. Washington, DC: U.S. Department of Commerce. - Seglenieks, F. 2009.
*Creation of a Gridded Time Series of Hydrological Variables for Canada*. Waterloo, Ontario: University of Waterloo. PhD Thesis. - Solomon, S. I., J. P. Denouvilliez, E. Chart, J. Woolley and C. Cadou. 1968. “The Use of a Square Grid System for Computer Estimation of Precipitation, Temperature and Runoff.”
*Water Resources Research*4 (5): 919–29. - U.S. Geological Survey. 1997.
*GTOPO30 Global 30 Arc Second Elevation Data Set*. Sioux Falls, SD: USGS EROS Data Center. https://lta.cr.usgs.gov/GTOPO30.