Suspended Sediment Concentration Modeling Using Conventional and Machine Learning Approaches in the Thames River, London Ontario
Abstract
Water resources management, hydraulic design, environmental conservation, reservoir operation, river navigation and hydro-electric power generation all require reliable information and data about suspended sediment concentration (SSC). To predict such data, direct sampling and sediment rating curves (SRC) are commonly used. Direct sampling can be risky during extreme weather events and SRC may not provide satisfactory or dependable results, so engineers are developing new, more precise forecasting approaches. Various soft computing techniques have been used to model different hydrological and environmental problems, and have shown promising results. Prediction of SSC is a site-specific phenomenon and ought to be modeled for every river and creek. In this study, adaptive neuro-fuzzy inference system (ANFIS) and artificial neural network (ANN) models were compared with conventional SRC and linear regression methods. Several models were trained using different combinations of observed SSC data and simultaneous stream discharge, water temperature, and electrical conductivity data for the Thames River at Byron Station, London Ontario from 1993 to 2016. Each model was evaluated using mean absolute error, root mean square error and the Nash–Sutcliffe efficiency coefficient. Results show that ANN models are more accurate than the other modeling approaches for predicting SSC in this river.
1 Introduction
Suspended sediment carried in a river has an impact on various aspects of river use (e.g. water quality, navigation, fisheries and aquatic habitat). It is a site-specific problem that depends on several factors (e.g. the catchment area, rainfall intensity, vegetation cover) and should be studied for every river, creek or channel. According to Heng and Suetsugi (2013) measurement of sediment concentration is inadequate in most parts of the world. Several hydrological variables such as bedform geometry, flow rate, friction factor and discharge have been used to develop different models for predicting sediment concentration in rivers (Karim and Kennedy 1990; Lopes and Ffolliott 1993). Direct analysis of the suspended sediment concentration (SSC) and the sediment rating curve (SRC) are among a wide range of tools used to observe the suspended sediment load. Although direct analysis is reliable, it is very costly and time consuming, and in many cases problematic for inaccessible sections, especially during severe storm events, and cannot be used for all river gauge stations (Bayram et al. 2013). On the other hand, because SSC transport in a river is a complex hydrological phenomenon due to several parameters, such as the spatial variability of basin characteristics, river discharge patterns and the inherent nonlinearity of hydro-meteorological parameters, the conventional SRC method may not be suitable for estimating SSC (Joshi et al. 2015); nor are regression models (RM), in which the system is assumed to be static (Ghorbani et al. 2013). During the last two decades, artificial intelligence techniques have been used to estimate and predict various hydrological phenomena (Tachi 2017). Adaptive neuro-fuzzy inference systems (ANFIS) and artificial neural networks (ANN) are two well-known tools for prediction and simulation in hydrology and hydraulics, such as the prediction of SSC (Angabini et al. 2014). 
Uncertainty in suspended sediment curves was investigated by McBean and Al-Nassr (1988), who concluded that using sediment load versus discharge is misleading because the goodness of fit implied by this relation is spurious. They recommended using regression between sediment concentration and discharge as an alternative. Kisi et al. (2006) used a fuzzy logic modeling approach to predict SSC and compared the results with those given by the SRC method. The study used a 5 y period of continuous streamflow and SSC data from the Quebrada Blanca station operated by the U.S. Geological Survey (USGS). Nine different fuzzy logic models and two SRC models were compared and the results showed that the fuzzy logic modeling approach is more accurate than the conventional SRC method. Cigizoglu and Alp (2003) developed a feed-forward back-propagation three-layer learning ANN algorithm to simulate the relationship between suspended sediment, precipitation and river flow by using hydro-meteorological data. The ANN models performed better than multiple linear regression and the study suggested that ANN is an important tool for forecasting suspended sediment. Kisi (2004) established three different ANN modeling techniques: multi-layer perceptron (MLP), generalized regression neural networks (GRNN) and radial basis function (RBF), using the Levenberg–Marquardt algorithm to predict daily SSC at two stations on the Tongue River in Montana. The study included various combinations of inputs to better predict the daily SSC, including water discharges at both current and previous time steps, sediment concentrations at previous time steps at the station of interest, as well as data from the upstream station. The study concluded that the MLP method generally gives better SSC estimates than other neural network techniques and the conventional statistical method of multiple linear regression (MLR). Zhu et al. (2007) used ANN to model the monthly suspended sediment flux (i.e. 
SSC multiplied by the water discharge) from 1960 to 2011 in the Longchuanjiang River in the Upper Yangtze catchment in China. Average rainfall, rainfall intensity, temperature and streamflow discharge were taken as input parameters for constructing the various models of the study. ANN was more accurate in predicting the monthly sediment flux than two conventional models which used MLR and power relation (PR) approaches. Melesse et al. (2011) used precipitation, discharge and antecedent sediment data from three major rivers (Mississippi, Missouri and Rio Grande) as inputs and also found that ANN better simulated and predicted daily and weekly suspended sediment loads than MLR, multiple nonlinear regression (MNLR) and autoregressive integrated moving average (ARIMA) models.
Kisi (2005) evaluated the ability of ANFIS and ANN to model the relationship between streamflow and suspended sediment for two USGS stations in Puerto Rico, Quebrada Blanca at El Jagual and Rio Valenciano near Juncos, using daily time series streamflow and suspended sediment concentrations data from 1994 and 1995. The study used SRC and MLR to build the various models. However, comparison showed that an ANFIS model performed better. Cobaner et al. (2009) used hydro-meteorological data and estimated SSC using both ANFIS and ANN. They used data for different combinations of current daily rainfall, streamflow and past daily streamflow, and suspended sediment from the Mad River catchment near Arcata, California. They compared predictions from ANFIS with those of three different ANN techniques and two different SRC models. The results showed that ANFIS performed best in predicting SSC. Rajaee et al. (2009) compared ANN, ANFIS, MLR and SRC models in simulating SSC using daily river discharge and SSC data from the Little Black River and Salt River gauging stations in the United States. The ANFIS model performed best in predicting SSC. Results from Kisi et al. (2009) show the superiority of ANFIS over ANN and SRC techniques in predicting monthly suspended sediment. The study used monthly streamflow and suspended sediment time series data from Kuylus and Salur Koprusu stations in the Kizilirmak basin in Turkey. Ghorbani et al. (2013) used ANN and ANFIS models to model suspended sediment load using daily river discharge data (1994–1995) from Rio Chama, in New Mexico and Colorado. They showed that the ANN model was more accurate. Two recent studies by Rezaei and Fereydooni (2015) and Tahmoures et al. (2015) used ANFIS and ANN to simulate the suspended sediment load in the River Dalaki, Iran. Both studies concluded that ANFIS gave better results than other approaches.
We developed five different models to predict SSC for the Thames River at the Byron station in London Ontario using the data from the period 1993–2016. We used conventional approaches (SRC, simple linear regression, SLR, and MLR) and machine learning approaches (ANFIS and ANN). Lack of continuous data for SSC in the Thames River can produce errors when using SRC and RM; machine learning models can give more accurate estimates. We used three statistical criteria (MAE, RMSE and NSE) to evaluate the performance of the different models.
2 Methods
2.1 Sediment Rating Curves
The first documented example of the use of the sediment rating curve (SRC) method is a study conducted by Campbell and Bauder (1940). They developed a silt rating curve by plotting daily suspended sediment load against daily stream discharge for the Red River in Texas on a logarithmic scale (Kisi et al. 2006). The rating curve method relates the sediment concentration to the discharge (Q) in the form of a graph or equation which can then be used to simulate the relationship between SSC and Q using the documented streamflow and SSC data. This relationship is usually represented as a power function:
SSC = aQ^{b}  (1)
The values of the constants a and b, which are unique to every individual river, creek, tributary and stream, depend on the stream characteristics and can be obtained by plotting logQ on the x-axis against logSSC on the y-axis (Equation 2). In this linear relationship, the gradient of the line represents the value of b, while log a is the y intercept.
log SSC = log a + b log Q  (2)
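As an illustration, the constants a and b can be recovered by ordinary least squares on the log-transformed data of Equation 2. The sketch below is ours, not the software used in the study (which relied on spreadsheet trend lines); the function and variable names are hypothetical:

```python
import math

def fit_src(q, ssc):
    """Fit SSC = a * Q**b by least squares on log10-transformed data
    (Equation 2): the slope is b and the intercept is log10(a)."""
    x = [math.log10(v) for v in q]
    y = [math.log10(v) for v in ssc]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    a = 10 ** (my - b * mx)
    return a, b

def src_predict(a, b, q):
    """Equation 1: SSC predicted from discharge Q."""
    return a * q ** b
```

For data lying exactly on a power curve the fit is exact; with field data the constants are only least-squares estimates.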
2.2 Simple Linear Regression
Simple linear regression is expressed by:
Y = β_{0} + β_{1}X + ε  (3)
where:
Y = dependent variable (output),
X = independent variable (input),
β_{0}, β_{1} = regression coefficients or regression parameters, and
ε = an error term to account for the difference between the predicted data using Equation 3 and the observed data.
The predicted value form of Equation 3 is:
Ŷ = b_{0} + b_{1}X  (4)
where:
Ŷ = fitted or predicted value, and
b_{0}, b_{1} = estimates of the regression coefficients β_{0} and β_{1}.
The regression coefficients can be found by plotting the linear relationship between SSC and Q. The values of b_{0} and b_{1} correspond to the y intercept and the slope of the fitted line, respectively.
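The least-squares estimates of Equation 4 can be sketched as follows (our illustrative code, not the Excel regression tool used in the study):

```python
def fit_slr(x, y):
    """Ordinary least-squares estimates of b0 (intercept) and
    b1 (slope) for the fitted line of Equation 4."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b1 = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
          / sum((xi - mx) ** 2 for xi in x))
    b0 = my - b1 * mx
    return b0, b1
```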
2.3 Multiple Linear Regression
MLR is the generalization of the SLR model. MLR includes more than one input variable. If it is believed that the dependent variable Y is influenced by n independent variables, X_{1}, X_{2}, …, X_{n}, then the regression equation of Y can be represented as:
Y = β_{0} + β_{1}X_{1} + β_{2}X_{2} + … + β_{n}X_{n} + ε  (5)
where:
Y = dependent variable (output),
X_{i} = independent variables (inputs),
n = number of independent variables,
β_{i} = regression coefficients or regression parameters, and
ε = an error term to account for the difference between the predicted and the observed data.
The predicted value form of Equation 5 is:
Ŷ = b_{0} + b_{1}X_{1} + b_{2}X_{2} + … + b_{n}X_{n}  (6)
where:
Ŷ = predicted value of the variable Y when the independent variables take the values X_{1}, X_{2}, …, X_{n}.
The estimated regression coefficients b_{0}, b_{1}, …, b_{n} are evaluated similarly to SLR, by minimizing the sum of the squared distances e_{yi} of the observation points from the plane expressed by the regression equation:
Σ_{i}e_{yi}^{2} = Σ_{i}(Y_{i} − Ŷ_{i})^{2}  (7)
In this study, the values of b_{0}, b_{1}, …, b_{n} were determined using a Microsoft Excel 2016 spreadsheet.
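The minimization in Equation 7 leads to the normal equations (XᵀX)b = Xᵀy. The sketch below (our illustration; the study itself used the Excel regression add-in) solves them by Gaussian elimination for a small number of predictors:

```python
def fit_mlr(X, y):
    """Least-squares estimates b0..bn of Equation 6 via the normal
    equations (X'X) b = X'y, solved by Gaussian elimination."""
    rows = [[1.0] + list(r) for r in X]          # prepend intercept column
    p = len(rows[0])
    # augmented normal-equation matrix: [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    for c in range(p):                           # forward elimination with pivoting
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(c + 1, p):
            f = A[r][c] / A[c][c]
            A[r] = [ar - f * ac for ar, ac in zip(A[r], A[c])]
    b = [0.0] * p
    for i in reversed(range(p)):                 # back substitution
        b[i] = (A[i][p] - sum(A[i][j] * b[j] for j in range(i + 1, p))) / A[i][i]
    return b
```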
2.4 Adaptive Neuro-Fuzzy Inference System
The adaptive neuro-fuzzy inference system (ANFIS) is a hybrid system developed by Jang (1993). ANFIS integrates ANN and fuzzy logic principles, using the learning ability of the ANN to generate the fuzzy If–Then rules that approximate nonlinear functions and, in turn, drive the inference. ANFIS can be described simply as an adaptive network that uses supervised learning and is functionally similar to the Takagi–Sugeno fuzzy inference model. In other words, ANFIS integrates the learning capabilities of an ANN with the knowledge representation and inference abilities of fuzzy logic, and can modify its membership functions to achieve a desired performance. The structure of the fuzzy reasoning mechanism for the Takagi–Sugeno model is shown in Figure 1, and the corresponding ANFIS schema is shown in Figure 2. We assume that the system has two inputs x and y and one output f. The two If–Then rules of the Takagi–Sugeno model are:
Rule 1: If x is A_{1} and y is B_{1} Then f_{1} = p_{1}x + q_{1}y + r_{1}
Rule 2: If x is A_{2} and y is B_{2} Then f_{2} = p_{2}x + q_{2}y + r_{2}
where:
A_{1}, A_{2} and B_{1}, B_{2} = membership functions of inputs x and y, and
p_{1}, q_{1}, r_{1} and p_{2}, q_{2}, r_{2} = linear parameters of the Takagi–Sugeno fuzzy inference model.
Figure 1 Reasoning from a two-input first order Sugeno fuzzy model with two rules (Foroozesh et al. 2013).
Figure 2 ANFIS architecture corresponding to Figure 1 (Foroozesh et al. 2013).
The ANFIS architecture shown in Figure 2, which corresponds to Figure 1, has five layers. Layers 1 and 4 contain adaptive nodes (rectangular), whereas the other layers contain fixed nodes (circular). Jang et al. (1997) provided a brief explanation of each layer as follows.
Layer 1
Every node i in this layer is an adaptive node with a node function. The output of each node is the degree of membership obtained by applying the node's membership function to the input. For instance, the membership function can be Gaussian (Equation 8) or a generalized bell (Equation 9) membership function. Note that there are several other types of membership function, including but not limited to the triangular, the trapezoidal, the Pi-shaped curve and the sigmoid curve. The parameters in this layer are typically referred to as the premise parameters.
µ_{Ai}(x) = exp[−(x − c_{i})^{2} / (2a_{i}^{2})]  (8)
µ_{Ai}(x) = 1 / [1 + |(x − c_{i}) / a_{i}|^{2b_{i}}]  (9)
O_{1,i} = µ_{Ai}(x), i = 1, 2  (10)
O_{1,i} = µ_{B(i−2)}(y), i = 3, 4  (11)
where:
µ_{Ai}, µ_{B(i−2)} = degrees of membership for the fuzzy sets A_{i} and B_{i−2}, and
a_{i}, b_{i}, c_{i} = parameters of a membership function that control its shape.
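The two membership functions can be written directly. The parameterization below follows the common gaussmf/gbellmf forms of the MATLAB Fuzzy Logic Toolbox; this is our assumption about the exact variant, since several exist:

```python
import math

def gauss_mf(x, c, a):
    """Gaussian membership function (Equation 8); c is the centre
    and a controls the width."""
    return math.exp(-((x - c) ** 2) / (2 * a ** 2))

def gbell_mf(x, a, b, c):
    """Generalized bell membership function (Equation 9); membership
    is 1 at the centre c and 0.5 at distance a from it."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))
```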
Layer 2
Nodes in this layer are fixed (non-adaptive), and each circle node is marked Π. The output of a node is the product of all incoming signals and is delivered to the next layer; it represents the firing strength of the corresponding rule. In this layer, a T-norm operator (such as AND) is applied to obtain the output:
w_{i} = µ_{Ai}(x) × µ_{Bi}(y), i = 1, 2  (12)
where:
w_{i} = the output that represents the firing strength of each rule.
Layer 3
Nodes in this layer are fixed (non-adaptive), and each circle node is marked N. Each node calculates the ratio of the ith rule's firing strength to the sum of all rules' firing strengths. The result is known as the normalized firing strength.
w̄_{i} = w_{i} / (w_{1} + w_{2}), i = 1, 2  (13)
Layer 4
Every node in this layer is an adaptive node, with a node function defined as:
O_{4,i} = w̄_{i}f_{i} = w̄_{i}(p_{i}x + q_{i}y + r_{i})  (14)
where:
w̄_{i} = the normalized firing strength from the previous layer (layer 3), and
p_{i}, q_{i}, r_{i} = the parameter set of the node.
The parameters in this layer are referred to as consequent parameters.
Layer 5
The single node in this layer is a fixed (non-adaptive) node that computes the overall output as the summation of all incoming signals from the previous layer. The circle node in this layer is labeled Σ.
O_{5} = Σ_{i}w̄_{i}f_{i} = Σ_{i}w_{i}f_{i} / Σ_{i}w_{i}  (15)
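Layers 2 through 5 can be condensed into a short forward pass for the two-rule model of Figure 2. This is a sketch with hypothetical names, assuming the membership functions and consequent parameters are already fixed:

```python
def anfis_forward(x, y, mfs_x, mfs_y, consequents):
    """Forward pass through the two-rule Sugeno ANFIS of Figure 2.
    mfs_x, mfs_y: per-rule membership functions of x and y (layer 1);
    consequents: per-rule (p, q, r) linear parameters (layer 4)."""
    w = [mx(x) * my(y) for mx, my in zip(mfs_x, mfs_y)]  # layer 2 (Eq. 12)
    total = sum(w)
    wn = [wi / total for wi in w]                        # layer 3 (Eq. 13)
    f = [p * x + q * y + r for p, q, r in consequents]   # rule outputs
    return sum(wi * fi for wi, fi in zip(wn, f))         # layers 4-5 (Eqs. 14-15)
```

In training, the consequent (p, q, r) parameters are tuned by least squares and the premise parameters by gradient descent; only the inference step is shown here.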
2.5 Artificial Neural Network (ANN)
An artificial neural network (ANN) is a massive parallel distributed system for information processing using the biological structure of the human brain and nervous system as a paradigm. Information is processed by interrelated neurons or nodes (Ghorbani et al. 2013). ANN is a powerful artificial intelligence (AI) technique that hydrologists have used for the past two decades. Researchers have used ANN to manage all sorts of data. Its capability to identify and recognize complex relationships between inputs and outputs has enabled researchers to model different nonlinear phenomena. Feed forward back propagation (FFBP) is the most common type of predictive ANN (Nagy et al. 2002). We chose this type of algorithm to train various ANN models for this study.
There are two phases in training a network with an FFBP algorithm. In the forward phase the inputs are presented and propagated forward through the network, which calculates the output of every processing element. In the backward phase the error is propagated back through the network and the weights are adjusted. The algorithm stops training when the value of the error function becomes insignificant.
Rojas (1996) broke down the back propagation algorithm into four steps. After randomly selecting the weights of the network, corrections are calculated using the back propagation algorithm. The four main steps forming the algorithm are as follows:
- feed forward computation;
- back propagation to the output layer;
- back propagation to the hidden layer; and
- weight updates.
This technique is a gradient descent method that minimizes the total squared error of the output calculated by the network. Back propagation is a systematic method for training multilayer artificial neural networks; it is the most widely used neural network model and has been applied successfully in a wide range of applications. A back propagation network consists of layers, each layer fully linked to the layers below and above it, as shown in Figure 3.
Figure 3 Architecture of MLP feed forward ANN (Nastos et al. 2011).
In a back propagation network, the hidden and output layers apply activation functions. The sigmoid activation function, whose output lies between 0 and 1, is the most commonly used, although a Gaussian function can also serve as an activation function. The output value of the feed forward phase is compared with the expected output value, and the difference between them is taken as the error to back propagate. The predefined error function is:
E = ½Σ_{n}(t_{n} − z_{n})^{2}  (16)
where:
E = total error, and
t_{n}, z_{n} = calculated and measured outputs for input n.
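For a single sigmoid neuron the two phases reduce to a forward evaluation, the error of Equation 16, and a gradient step on the weights. A deliberately minimal sketch (ours; not the MATLAB training routine used later in this study):

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def total_error(t, z):
    """Equation 16: half the sum of squared differences between
    calculated outputs t and measured outputs z."""
    return 0.5 * sum((ti - zi) ** 2 for ti, zi in zip(t, z))

def backprop_step(w, b, x, z, lr=0.5):
    """One forward/backward pass for a single sigmoid neuron and
    one training sample (x, z); returns updated weight and bias."""
    t = sigmoid(w * x + b)            # forward phase
    delta = (t - z) * t * (1.0 - t)   # backward phase: dE/d(net input)
    return w - lr * delta * x, b - lr * delta
```

Repeating the step drives the error toward a (local) minimum, which is all the FFBP algorithm does at larger scale.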
3 Area of Study and Data Collection
The study area, shown in Figure 4, is the River Bend watershed, the most downstream of the 28 watersheds of the Upper Thames River basin. Its total area is 5830 ha. The Upper Thames River basin covers an area of 3362 km². All water from the upstream watersheds passes through the study area. Approximately weekly time series data of river discharge (Q), water temperature (T), electrical conductivity (C) and suspended sediment concentration (SSC) from 1993 to 2016 were obtained from the water quality monitoring site at Byron (42°57’46.9” N, 81°19’54.9” W) and were used to develop the various models. The data were downloaded from the City of London web server.
Figure 4 Upper Thames River basin (upper left) and River Bend subwatershed, the study area (UTRCA 2012).
The 5 y (2006–2010) mean annual flow was 46.1 m^{3}/s and the 15 y mean annual flow was 41.8 m^{3}/s. Measurements were made near Byron, shown as the Water Quality Monitoring Site in Figure 4. The River Bend watershed has a total of 76 km of watercourses; 81% is natural and the rest is either buried or channelized. Flow in the watershed is 66% permanent, ~20% intermittent, and nearly 13% buried. Data for Q, T, C and SSC were collected during the study period.
To model SSC using SRC and SLR, the only independent variable is the river discharge (m^{3}/s). The data contain some identical values of Q yielding different SSC values, which creates difficulties for validating and comparing the predictions of the models. For each such case, the single value of Q thought to be most representative, considering the trend of observed SSC values, was retained to ensure a fair comparison between conventional and machine learning models and to provide better estimates in scenarios where Q is the only input. In total, 470 data records for each of the input and output variables were used in the study. Outlier detection was performed using the Grubbs test (Grubbs 1969) at the 95% confidence level; outliers constituted 0.088% of the raw data and were removed.
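The Grubbs screening can be sketched as follows. Computing the critical value requires the t-distribution, so this illustration (ours) returns only the test statistic G, to be compared against a tabulated threshold:

```python
import statistics

def grubbs_statistic(data):
    """Two-sided Grubbs test statistic G = max|x_i - mean| / s.
    A point is flagged as an outlier when G exceeds the tabulated
    critical value for the chosen significance level."""
    m = statistics.mean(data)
    s = statistics.stdev(data)
    return max(abs(x - m) for x in data) / s
```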
3.1 Training Dataset
Inputs were selected based on previous studies of the phenomenon. Dai et al. (2009) showed that there is a good relationship between SSC and electrical conductivity, and we used electrical conductivity data as a surrogate for turbidity. Together with electrical conductivity, temperature and discharge data were used as inputs to study their effect on SSC. The training dataset was selected randomly: 420 weekly data records for each input and output variable were used to train the various models of this study.
3.2 Testing Dataset
The testing dataset was also randomly selected; 50 weekly data records of the total data were used for each input and output variable in testing the various models to determine the best model.
Statistics for temperature, discharge, electrical conductivity and suspended sediment concentration are shown in Table 1.
Table 1 Statistical parameters of input and output datasets.
Data Type | x_{mean} | S_{x} | x_{max} | x_{min} |
Training | ||||
River temperature (°C) | 11.593 | 7.452 | 26.500 | −1.000 |
Flow (m^{3}/s) | 36.082 | 30.951 | 146.300 | 6.180 |
Conductivity (µS/cm) | 662.336 | 113.857 | 1037.000 | 317.000 |
SSC (mg/L) | 13.581 | 7.876 | 44.000 | 1.000 |
Testing | ||||
River temperature (°C) | 12.978 | 9.136 | 26.800 | 0.200 |
Flow (m^{3}/s) | 54.262 | 57.183 | 172.700 | 4.800 |
Conductivity (µS/cm) | 635.120 | 120.836 | 890.000 | 400.000 |
SSC (mg/L) | 15.040 | 10.045 | 44.000 | 1.000 |
3.3 Model Performance Evaluation
Three statistical measures were used to indicate how well the results of each model compared with the measured data: mean absolute error, root mean square error, and Nash–Sutcliffe efficiency.
Mean Absolute Error
MAE measures the average magnitude of the errors without considering their direction. It is the average of the absolute differences between the calculated values and the corresponding observed data. MAE ranges from 0 to infinity, and the smaller the value, the better the model. It is calculated by:
MAE = (1/n)Σ_{i=1}^{n}|x_{i} − y_{i}|  (17)
where:
x_{i} = predicted data,
y_{i} = observed data, and
n = number of observed data points.
Root Mean Square Error
RMSE measures the average magnitude of the error. RMSE is the square root of the sum of the squares of the difference between computed and corresponding observed values averaged over the sample. Because the errors are squared before they are averaged, RMSE weights large errors highly. RMSE can range from 0 to infinity; the smaller RMSE is, the better the forecasting model. RMSE is calculated by:
RMSE = √[(1/n)Σ_{i=1}^{n}(x_{i} − y_{i})^{2}]  (18)
Nash–Sutcliffe Efficiency
The Nash–Sutcliffe efficiency factor (NSE, Nash and Sutcliffe 1970) is:
NSE = 1 − [Σ_{i=1}^{n}(y_{i} − x_{i})^{2} / Σ_{i=1}^{n}(y_{i} − ȳ)^{2}]  (19)
where:
ȳ = the mean value of the observed data.
NSE can range from minus infinity to one, where a value of 1.0 indicates a perfect match between modeled results and observed records. A value of 0 indicates that the model predictions are as accurate as the mean value of the observed time series data, and a negative value indicates that the mean value of the observed time series would have been a better predictor than the model.
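The three criteria translate directly into code; the following is a straightforward implementation of Equations 17–19, with x_{i} the predicted and y_{i} the observed values:

```python
import math

def mae(pred, obs):
    """Equation 17: mean absolute error."""
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(obs)

def rmse(pred, obs):
    """Equation 18: root mean square error."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

def nse(pred, obs):
    """Equation 19: Nash-Sutcliffe efficiency."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - p) ** 2 for p, o in zip(pred, obs))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den
```

Predicting the observed mean everywhere gives NSE = 0, and a perfect match gives NSE = 1, as described above.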
4 Model Applications
Four different scenarios were created using different combinations of the various inputs (measured simultaneously) that affect the output, SSC. Table 2 shows the different scenarios and the data used for training and testing. The SRC and simple linear regression (SLR) form a one input–one output model which was developed using scenario S1.
Table 2 Different scenarios proposed for this study.
Scenario | Input | No. of training datasets for each variable | No. of testing datasets for each variable |
S1 | Streamflow (Q) | 420 | 50 |
S2 | Streamflow + Temperature (Q & T) | 420 | 50 |
S3 | Streamflow + Conductivity (Q & C) | 420 | 50 |
S4 | Streamflow + Temperature + Conductivity (Q, T & C) | 420 | 50 |
4.1 Training the Models for S1
Four different S1 models were developed using SRC, SLR, ANFIS and ANN. Streamflow (Q) was the only input used in the model and the only output was SSC.
SRC Model, S1
The training dataset was used to train the SRC model and the plot of log(Q) against log(SSC) is shown in Figure 5. As explained in section 2.1, the slope of the trend line represents the b value, while log a is the y intercept.
Figure 5 SRC used for the training dataset S1.
The model equation for SSC using the SRC training dataset is:
(20) |
Equation 20 was used for both the training and the testing datasets and an Excel spreadsheet was used to calculate MAE, RMSE and NSE. The results are shown in Table 4 below.
SLR Model, S1
The training dataset was used to train the SLR model. The regression add-in tool in Excel was used to analyze the model. The regression significance level P and adjusted R^{2} were respectively 2.474 36 × 10^{−43} and 0.365. A summary of various regression values is shown in Table 3.
Table 3 SLR outputs using S1 training dataset.
Coefficients | Standard Error | P-value | |
Intercept | 8.023 657 7 | 0.470 740 56 | 7.673 × 10^{−50} |
Q | 0.154 018 47 | 0.009 907 37 | 2.474 × 10^{−43} |
The fitted values of b_{0} and b_{1} were then applied to the training and testing datasets in an Excel spreadsheet to determine the statistical indicators. Table 4 below shows the values of each performance indicator for the training and testing phases.
ANFIS Model, S1
Several models, using different numbers and structures of membership functions (MF), were trained using the ANFISEDIT toolbox in MATLAB R2016b. The various fuzzy inference systems (FIS) were trained with hybrid optimization, and a constant MF type was selected for the output. After several trials, the model giving the minimum RMSE was the one with 9 MFs of the GBELL type. MAE, RMSE and NSE for the training and testing phases are shown in Table 4 below.
ANN Model, S1
Several models were trained using the NNTOOL toolbox in MATLAB R2016b. An FFBP network type, the Levenberg–Marquardt (LM) training function and two hidden layers were chosen to train the various models. Different trials were performed using different numbers of neurons and different types of transfer functions. For each candidate model, the calculated output was used to compute the performance indicators on the same datasets chosen for training and testing, to ensure a fair comparison with the previous approaches. After several trials, the best performing model was the one with 20 neurons in hidden layer 1 and transfer functions TANSIG and PURELIN for hidden layers 1 and 2, respectively. MAE, RMSE and NSE for the training and testing phases are shown in Table 4.
Table 4 Performance indicators for all models of S1.
MAE | RMSE | NSE | ||
SRC | Training Phase | 4.824 | 6.925 | 0.225 |
Testing Phase | 6.936 | 8.709 | 0.233 | |
SLR | Training Phase | 4.626 | 6.262 | 0.366 |
Testing Phase | 5.997 | 7.563 | 0.421 | |
ANFIS | Training Phase | 4.277 | 5.901 | 0.437 |
Testing Phase | 5.194 | 6.738 | 0.541 | |
ANN | Training Phase | 4.013 | 5.641 | 0.486 |
Testing Phase | 4.25 | 5.579 | 0.685 |
Figure 6 shows observed and calculated SSC (mg/L) in the training phase for the best model using scenario S1. Figure 7 displays the extent of the match between measured and predicted SSC (mg/L) by the best S1 model as a scatter diagram using the testing data. Figure 8 shows the observed and calculated SSC (mg/L) of some selected peaks from the testing phase using SRC, SLR, ANFIS and ANN in scenario S1. The ANFIS and ANN results show the better performance of the machine learning approaches over the conventional approaches (SRC and SLR), with the ANN model outperforming the ANFIS model.
Figure 6 Observed and calculated SSC (mg/L), the training period using ANN, S1.
Figure 7 Scatter plot comparing predicted and observed SSC (mg/L) using ANN testing data, S1.
Figure 8 Observed and calculated selected peaks of SSC (mg/L), testing phase of S1.
4.2 Training Various Models for S2
Three different models were developed using MLR, ANFIS and ANN. For these models, temperature (T) and streamflow (Q) were used as inputs in order to model the output (SSC).
MLR Model, S2
The MLR model was trained on the training dataset using the regression add-in tool in Excel. The regression significance P and adjusted R^{2} were respectively 1.766 26 × 10^{−47} and 0.4, an improvement over the S1 SLR model, showing that the new input (T) influences the output. A summary of various regression values is given in Table 5.
Table 5 MLR outputs using the S2 training dataset.
Coefficients | Standard Error | P-value | |
Intercept | 4.305 846 95 | 0.862 959 38 | 8.899 × 10^{−7} |
T | 0.233 848 36 | 0.046 029 28 | 5.693 × 10^{−7} |
Q | 0.181 922 69 | 0.011 082 59 | 4.544 × 10^{−47} |
The exact values of b_{0}, b_{1} and b_{2} were used to determine the various statistical measures for the training and testing datasets. An Excel spreadsheet was used for the calculations, and Table 6 below shows the value of each performance indicator for training and testing phases.
ANFIS Model, S2
The procedure described above for the S1 ANFIS model was used. After several trials, the model that gave the lowest RMSE used GAUSS2 MFs, with 5 and 2 MFs for the T and Q inputs respectively. MAE, RMSE and NSE for the training and testing phases are shown in Table 6 below.
ANN Model, S2
The procedure described above for the S1 ANN model was used. After several trials, the best performing model was the one with 20 neurons in hidden layer 1 and TANSIG and PURELIN transfer functions for hidden layers 1 and 2, respectively. Table 6 shows MAE, RMSE and NSE for the training and testing phases.
Table 6 Performance indicators for all models of S2.
MLR | ANFIS | ANN | ||||
Training Phase | Testing Phase | Training Phase | Testing Phase | Training Phase | Testing Phase | |
MAE | 4.422 | 5.641 | 4.641 | 5.420 | 3.533 | 3.590 |
RMSE | 6.077 | 7.266 | 5.776 | 6.813 | 4.865 | 4.869 |
NSE | 0.403 | 0.466 | 0.461 | 0.531 | 0.617 | 0.760 |
Figure 9 shows the observed and calculated SSC (mg/L) in the training phase for the best S2 model. Figure 10 shows the scatter diagram of the match between the measured and predicted SSC (mg/L) for the best S2 model. Figure 11 shows the observed and calculated SSC (mg/L) of some selected peaks in the testing phase for the S2 MLR, ANFIS and ANN models. ANFIS and ANN show the better performance of the machine learning approaches over the conventional approach (MLR), with the ANN model outperforming the ANFIS model.
Figure 9 Observed and calculated SSC (mg/L), the training period using ANN (S2).
Figure 10 Scatter plot comparing predicted and observed SSC (mg/L) using ANN (S2), testing data.
Figure 11 Selected observed and calculated peaks of SSC (mg/L) in the testing phase, S2.
4.3 Training Various Models for S3
Three different models were developed using MLR, ANFIS and ANN. For these models, electrical conductivity (C) and streamflow (Q) were used as inputs to model the output (SSC).
MLR Model, S3
The regression significance P and adjusted R^{2} were 1.517 17 × 10^{−42} and 0.370, respectively. The various regression values are given in Table 7.
Table 7 MLR outputs using S3 training dataset.
Coefficients | Standard Error | P-value | |
Intercept | 11.015 064 2 | 2.016 088 0 | 8.040 × 10^{−8} |
Q | 0.149 594 53 | 0.010 307 84 | 6.3907 × 10^{−39} |
C | −0.004 275 44 | 0.002 802 08 | 0.127 815 85 |
The exact values of b_{0}, b_{1} and b_{2} were used for the training and testing datasets to determine the various statistical indicators. Table 8 below shows the value of each performance indicator for the training and testing phases.
ANFIS Model, S3
The best RMSE was obtained with 5 MFs for each of the Q and C inputs, using a GBELL MF type. MAE, RMSE and NSE for training and testing are shown in Table 8 below.
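The two membership function shapes named in this study (GBELL here, GAUSS for S4) follow the standard generalized-bell and Gaussian forms; a sketch with illustrative parameters (not the fitted values):

```python
import numpy as np

def gbellmf(x, a, b, c):
    """Generalized bell MF: 1 / (1 + |(x - c)/a|^(2b))."""
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2 * b))

def gaussmf(x, sigma, c):
    """Gaussian MF: exp(-(x - c)^2 / (2*sigma^2))."""
    return np.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))

# Illustrative parameters: an MF centred on Q = 50 m^3/s
x = np.linspace(0.0, 100.0, 5)
print(gbellmf(x, a=20.0, b=2.0, c=50.0))
print(gaussmf(x, sigma=15.0, c=50.0))
```

In ANFIS training, the parameters (a, b, c or sigma, c) of each input's MFs are tuned alongside the consequent parameters.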
ANN Model, S3
The best model had 20 neurons in hidden layer 1 with TANSIG and PURELIN transfer functions for hidden layers 1 and 2, respectively. Table 8 shows MAE, RMSE and NSE for training and testing.
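A minimal forward-pass sketch of this 2-20-1 structure, assuming TANSIG and PURELIN behave as the MATLAB transfer functions of the same names; the weights are random placeholders, not the trained values.

```python
import numpy as np

rng = np.random.default_rng(0)

def tansig(x):
    """Hyperbolic tangent sigmoid (MATLAB's TANSIG)."""
    return np.tanh(x)

def purelin(x):
    """Linear (identity) transfer function (MATLAB's PURELIN)."""
    return x

# Untrained 2-20-1 network: inputs Q and C (as in S3), 20 hidden neurons.
W1 = rng.normal(size=(20, 2)); b1 = rng.normal(size=20)
W2 = rng.normal(size=(1, 20)); b2 = rng.normal(size=1)

def forward(q, c):
    x = np.array([q, c])
    h = tansig(W1 @ x + b1)       # hidden layer: 20 neurons, TANSIG
    return purelin(W2 @ h + b2)   # output layer: 1 neuron, PURELIN

print(forward(45.0, 550.0))
```

Training (e.g. with the Levenberg-Marquardt algorithm cited by Kisi 2004) adjusts W1, b1, W2 and b2 to minimize the error against observed SSC.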
Table 8 Performance indicators for all models of S3.
| | MLR Training | MLR Testing | ANFIS Training | ANFIS Testing | ANN Training | ANN Testing |
| MAE | 4.597 | 6.038 | 3.833 | 4.906 | 3.586 | 3.473 |
| RMSE | 6.244 | 7.555 | 5.311 | 6.770 | 5.134 | 4.861 |
| NSE | 0.370 | 0.423 | 0.544 | 0.536 | 0.574 | 0.761 |
Figure 12 shows the observed and calculated SSC (mg/L) in the training phase for the best S3 model. Figure 13 shows the scatter diagram comparing measured and predicted SSC (mg/L) for the best S3 model. Figure 14 shows the observed and calculated SSC (mg/L) of some selected peaks from the testing phase using the S3 SRC, SLR, ANFIS and ANN models. The ANFIS and ANN results again show that the machine learning approaches outperform the conventional approaches (SRC and SLR), and the ANN model performs better than the ANFIS model.
Figure 12 Observed and calculated SSC (mg/L) for the training period using ANN, S3.
Figure 13 Scatter plot comparing predicted and observed SSC (mg/L) using ANN testing data, S3.
Figure 14 Observed and calculated selected peaks of SSC (mg/L) in the testing phase, S3.
4.4 Training Various Models for S4
Three different models were developed using various modeling techniques, namely, MLR, ANFIS and ANN. For these models, the temperature (T), the electrical conductivity (C) and the streamflow (Q) were used as inputs to model the targeted output (SSC).
MLR Model (S4)
The regression significance F and adjusted R^{2} were 2.215 68 × 10^{−46} and 0.3992, respectively. A summary of the regression results is presented in Table 9.
Table 9 MLR outputs using S4’s training dataset.
Coefficients | Standard Error | P-value | |
Intercept | 3.423 115 47 | 2.513 074 1 | 0.173 896 2 |
T | 0.240 861 23 | 0.049 745 1 | 1.8165 × 10^{−6} |
Q | 0.183 900 08 | 0.122 891 | 8.1983 × 10^{−41} |
C | 0.001 102 28 | 0.002 946 | 0.708 558 6 |
The fitted values of b_{0}, b_{1}, b_{2} and b_{3} were applied to the training and testing datasets to determine the various statistical indicators. Table 10 below shows the value of each performance indicator for the training and testing phases.
ANFIS Model, S4
The best model used 3 MFs for the T input, 2 MFs for the Q input and 3 MFs for the C input, all of a GAUSS MF type. Table 10 below shows MAE, RMSE and NSE for the training and testing phases.
ANN Model, S4
Hidden layer 1 used 20 neurons, with TANSIG and PURELIN transfer functions for hidden layers 1 and 2, respectively. Table 10 shows MAE, RMSE and NSE for the training and testing phases.
Table 10 Performance indicators for all models of S4.
| | MLR Training | MLR Testing | ANFIS Training | ANFIS Testing | ANN Training | ANN Testing |
| MAE | 4.413 | 5.614 | 4.061 | 5.752 | 3.456 | 2.823 |
| RMSE | 6.076 | 7.269 | 5.666 | 7.082 | 4.736 | 3.720 |
| NSE | 0.403 | 0.466 | 0.481 | 0.493 | 0.638 | 0.860 |
Figure 15 shows the observed and calculated SSC (mg/L) in the training period for the best S4 model. Figure 16 shows the scatter diagram comparing measured and predicted SSC (mg/L) for the best S4 model. Figure 17 shows the observed and calculated SSC (mg/L) for some selected peaks from the testing phase of the S4 SRC, SLR, ANFIS and ANN models. ANFIS and ANN perform better than the conventional approaches (SRC and SLR), and the ANN model is better than the ANFIS model.
Figure 15 Observed and calculated SSC (mg/L) for the training period using ANN, S4.
Figure 16 Scatter plot comparing predicted and observed SSC (mg/L) using ANN testing data, S4.
Figure 17 Observed and calculated selected peaks of SSC (mg/L) for the testing phase, S4.
Table 11 summarizes the final architecture of the best ANFIS and ANN models in each scenario. Table 12 summarizes the testing performance of various models developed in this study for each scenario.
Table 11 Final architecture of the best machine learning models.
Scenario No. | Scenario Inputs | ANFIS MFs Type | Number of ANFIS MFs | ANN Structure |
S1 | Q | GBELL | (9) | (1, 2, 1) |
S2 | T & Q | GAUSS2 | (5, 2) | (2, 2, 1) |
S3 | Q & C | GBELL | (5, 5) | (2, 2, 1) |
S4 | T, Q & C | GAUSS | (3, 2, 3) | (3, 2, 1) |
Table 12 Testing performance for various models.
| Model | Indicator | S1 (Q) | S2 (T & Q) | S3 (Q & C) | S4 (T, Q & C) |
| SRC | MAE | 6.936 | - | - | - |
| | RMSE | 8.709 | - | - | - |
| | NSE | 0.233 | - | - | - |
| SLR | MAE | 5.997 | - | - | - |
| | RMSE | 7.563 | - | - | - |
| | NSE | 0.421 | - | - | - |
| MLR | MAE | - | 5.641 | 6.038 | 5.614 |
| | RMSE | - | 7.266 | 7.555 | 7.269 |
| | NSE | - | 0.466 | 0.423 | 0.466 |
| ANFIS | MAE | 5.194 | 5.420 | 4.906 | 5.752 |
| | RMSE | 6.738 | 6.813 | 6.770 | 7.082 |
| | NSE | 0.541 | 0.531 | 0.536 | 0.493 |
| ANN | MAE | 4.250 | 3.590 | 3.473 | 2.823 |
| | RMSE | 5.579 | 4.869 | 4.861 | 3.720 |
| | NSE | 0.685 | 0.760 | 0.761 | 0.860 |
5 Conclusions and Recommendation
In this study, SRC, SLR, MLR, ANFIS and ANN models were investigated for modeling suspended sediment concentration in the Thames River, London Ontario. Weekly records of river temperature, discharge, water electrical conductivity and suspended sediment concentration were used to train the various models. Models were evaluated using MAE, RMSE and NSE as performance indicators in order to compare and select the best model. The machine learning models, ANFIS and ANN, performed better than the conventional models, SRC, SLR and MLR, in estimating SSC. Adding additional input variables (T or C) to the major input (Q) improved the various model estimates of SSC.
Comparing models that used Q as the only input, NSE increased by 194% for the ANN model relative to the SRC model, while the SRC model's MAE and RMSE were 63% and 56% higher than the ANN's. The ANN model using three inputs (discharge, temperature and electrical conductivity) performed better than all other models, with respective NSE increases of 269% and 26% over the SRC and ANN models that used Q as the only input.
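These percentage changes can be checked directly against the testing NSE values in Table 12; a minimal sketch:

```python
# Testing NSE values from Table 12
nse_src_s1 = 0.233   # SRC, Q only
nse_ann_s1 = 0.685   # ANN, Q only
nse_ann_s4 = 0.860   # ANN, T, Q & C

def pct(new, old):
    """Percent increase of `new` over `old`."""
    return 100.0 * (new - old) / old

print(round(pct(nse_ann_s1, nse_src_s1)))  # ANN vs SRC, Q only -> 194
print(round(pct(nse_ann_s4, nse_src_s1)))  # ANN (T, Q & C) vs SRC -> 269
print(round(pct(nse_ann_s4, nse_ann_s1)))  # ANN (T, Q & C) vs ANN (Q only) -> 26
```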
The least accurate model was the SRC, followed by the regression models; the machine learning models estimated SSC more accurately, with ANN the best overall approach.
Continuous daily time series sampling of SSC is recommended in order to better support all of the models. Including other input variables (e.g. rainfall intensity) may further improve the accuracy of all models in estimating SSC.
References
- Angabini, S., H. Ahmadi, S. Feiznia, B. Vaziri and S. Ershadi. 2014. “Using Intelligence Models to Estimate Suspended Sediment System Case Study: Jagin Dam.” Bulletin of Environment, Pharmacology and Life Sciences 3 (3): 166–72.
- Bayram, A., M. Kankal, G. Tayfur and H. Önsoy. 2013. “Prediction of Suspended Sediment Concentration from Water Quality Variables.” Neural Computing and Applications 24 (5): 1079–87. https://doi.org/10.1007/s00521-012-1333-3
- Campbell, F. B. and H. A. Bauder. 1940. “A Rating-Curve Method for Determining Silt-Discharge of Streams.” Eos, Transactions, American Geophysical Union 21 (2): 603–7. https://doi.org/10.1029/TR021i002p00603
- Cigizoglu, H. and M. Alp. 2003. “Suspended Sediment Forecasting by Artificial Neural Networks Using Hydro Meteorological Data.” World Water & Environmental Resources Congress 90:1–8. https://doi.org/10.1061/40685(2003)173
- Cobaner, M., B. Unal and O. Kisi. 2009. “Suspended Sediment Concentration Estimation by Adaptive Neuro-Fuzzy and Neural Network Approaches Using Hydro-Meteorological Data.” Journal of Hydrology 367 (1–2): 52–61. https://doi.org/10.1016/j.jhydrol.2008.12.024
- Dai, Q., H. Shan, Y. Jia and W. Cui. 2009. “Laboratory Study on the Relationships between Suspended Sediment Concentration and Electrical Conductivity.” In ASME 2009, 28th International Conference on Ocean, Offshore and Arctic Engineering: Volume 7, 179–86. New York: ASME (American Society of Mechanical Engineers). https://doi.org/10.1115/OMAE2009-79211
- Foroozesh, J., A. Khosravani, A. Mohsenzadeh and A. Mesbahi. 2013. “Application of Artificial Intelligence (AI) Modeling in Kinetics of Methane Hydrate Growth.” American Journal of Analytical Chemistry 4 (11): 616–22. https://doi.org/10.4236/ajac.2013.411073
- Ghorbani, M., S. Hosseini, M. Fazelifard and H. Abbasi. 2013. “Sediment Load Estimation by MLR, ANN, NF and Sediment Rating Curve (SRC) in Rio Chama River.” Journal of Civil Engineering and Urbanism 3 (4): 136–41.
- Grubbs, F. 1969. “Procedures for Detecting Outlying Observations in Samples.” Technometrics 11 (1): 1–21.
- Heng, S., and T. Suetsugi. 2013. “Using Artificial Neural Network to Estimate Sediment Load in Ungauged Catchments of the Tonle Sap River Basin, Cambodia.” Journal of Water Resource and Protection 5 (2): 111–23. https://doi.org/10.4236/jwarp.2013.52013
- Jang, J.-S. 1993. “ANFIS: Adaptive-Network-Based Fuzzy Inference System.” IEEE Transactions on Systems, Man, and Cybernetics 23 (3): 665–85.
- Jang, J.-S., C.-T. Sun and E. Mizutani. 1997. Neuro-Fuzzy and Soft Computing: A Computational Approach to Learning and Machine Intelligence. Upper Saddle River, NJ: Prentice-Hall Inc.
- Joshi, R., K. Kumar and V. Adhikari. 2015. “Modeling Suspended Sediment Concentration Using Artificial Neural Networks for Gangotri Glacier.” Hydrological Processes 1366 (November): 1354–66. https://doi.org/10.1002/hyp.10723.
- Karim, M. F. and J. F. Kennedy. 1990. “Menu of Coupled Velocity and Sediment-Discharge Relations for Rivers.” Journal of Hydraulic Engineering 116 (8): 978–96. https://doi.org/10.1061/(ASCE)0733-9429(1990)116:8(978)
- Kisi, Ö. 2004. “Multi-layer perceptrons with Levenberg-Marquardt training algorithm for suspended sediment concentration prediction and estimation/Prévision et estimation de la concentration en matières en suspension avec des perceptrons multi-couches et l’algorithme d’apprentissage de Levenberg-Marquardt.” Hydrological Sciences Journal 49 (6). https://doi.org/10.1623/hysj.49.6.1025.55720
- Kisi, Ö. 2005. “Suspended sediment estimation using neuro-fuzzy and neural network approaches/Estimation des matières en suspension par des approches neurofloues et à base de réseau de neurones.” Hydrological Sciences Journal 50 (4). https://doi.org/10.1623/hysj.2005.50.4.683
- Kisi, Ö., T. Haktanir, M. Ardiclioglu, Ö. Ozturk, E. Yalcin and S. Uludag. 2009. “Adaptive Neuro-Fuzzy Computing Technique for Suspended Sediment Estimation.” Advances in Engineering Software 40 (6): 438–44. https://doi.org/10.1016/j.advengsoft.2008.06.004
- Kisi, Ö., M. Karahan and Z. Şen. 2006. “River Suspended Sediment Modeling Using a Fuzzy Logic Approach.” Hydrological Processes 20 (20): 4351–62. https://doi.org/10.1002/hyp.6166
- Lopes, V. L. and P. F. Ffolliott. 1993. “Sediment Rating Curves for a Clearcut Ponderosa Pine Watershed in Northern Arizona.” Water Resources Bulletin, JAWRA Journal of the American Water Resources Association 29 (3): 1–14. https://doi.org/10.1111/j.1752-1688.1993.tb03214.x
- McBean, E. A. and S. Al-Nassr. 1988. “Uncertainty in Suspended Sediment Transport Curves.” Journal of Hydraulic Engineering 114 (1): 63–74.
- Melesse, A. M., S. Ahmad, M. E. McClain, X. Wang and Y. H. Lim. 2011. “Suspended Sediment Load Prediction of River Systems: An Artificial Neural Network Approach.” Agricultural Water Management 98 (5): 855–66. https://doi.org/10.1016/j.agwat.2010.12.012
- Nagy, H., K. Watanabe and M. Hirano. 2002. “Prediction of Sediment Load Concentration in Rivers Using Artificial Neural Network Model.” Journal of Hydraulic Engineering 128 (6): 588–95. https://doi.org/10.1061/(asce)0733-9429(2002)128:6(588)
- Nash, J. E. and J. V. Sutcliffe. 1970. “River Flow Forecasting Through Conceptual Models Part I—A Discussion of Principles.” Journal of Hydrology 10:282–90. https://doi.org/10.1016/0022-1694(70)90255-6
- Nastos, P., K. Moustris, I. Larissi and A. Paliatsos. 2011. “Air Quality and Bioclimatic Conditions within the Greater Athens Area, Greece—Development and Applications of Artificial Neural Networks.” In Advanced Air Pollution, edited by F. Nejadkoorki. London: InTechOpen. https://doi.org/10.5772/710
- Rajaee, T., S. Mirbagheri, M. Zounemat-Kermani and V. Nourani. 2009. “Daily Suspended Sediment Concentration Simulation Using ANN and Neuro-Fuzzy Models.” Science of the Total Environment 407 (17): 4916–27. https://doi.org/10.1016/j.scitotenv.2009.05.016
- Rezaei, M. and M. Fereydooni. 2015. “Comparative Evaluation of Adaptive Neuro-Fuzzy Inference System (ANFIS) and Artificial Neural Network (ANN) in Simulation of Suspended Sediment Load (Case Study: Dalaki River, Cham Chit Station).” Indian Journal of Fundamental and Applied Life Sciences 5 (S1): 3598–606.
- Rojas, R. 1996. Neural Networks: A Systematic Introduction. New York: Springer-Verlag. https://doi.org/10.1016/0893-6080(94)90051-5
- Tachi, S. 2017. “Contribution to the Characterization and the Modeling of Sediment Transport in Urban Hydro-Systems.” Chlef, Algeria: Université Hassiba Benbouali de Chlef. Doctoral dissertation. http://dspace.univ-chlef.dz:8080/jspui/handle/123456789/1254
- Tahmoures, M., A. R. Moghadamnia and M. Naghiloo. 2015. “Modeling of Streamflow–Suspended Sediment Load Relationship by Adaptive Neuro-Fuzzy and Artificial Neural Network Approaches (Case Study: Dalaki River, Iran).” Desert 2:177–95. https://doi.org/10.22059/jdesert.2015.56481
- UTRCA (Upper Thames River Conservation Authority). 2012. River Bend Watershed Report Card. London, Ontario: Upper Thames River Conservation Authority.
- Zhu, Y. M., X. X. Lu and Y. Zhou. 2007. “Suspended Sediment Flux Modeling with Artificial Neural Network: An Example of the Longchuanjiang River in the Upper Yangtze Catchment, China.” Geomorphology 84 (1–2): 111–25. https://doi.org/10.1016/j.geomorph.2006.07.010