Assessment of Suitability of Gridded Precipitation Data for Hydrological Simulation in Eastern Himalaya: A Case Study

Abstract
Gridded precipitation datasets have been effectively employed in hydrological modeling in the absence of gauge data. This study assessed the applicability of five spatially distributed precipitation datasets, India Meteorological Department [IMD] (gauge-interpolated), Climate Forecast System Reanalysis [CFSR] (reanalysis), Tropical Rainfall Measuring Mission [TRMM] (satellite-based), Precipitation Estimation From Remotely Sensed Information using Artificial Neural Networks [PERSIANN-CDR] (satellite-based), and Asian Precipitation – Highly-Resolved Observational Data Integration Towards Evaluation of Water Resources [APHRODITE] (gauge-interpolated), for hydrological modeling in an Eastern Himalayan basin. These gridded datasets were input to the Soil and Water Assessment Tool (SWAT), which was calibrated using the SWAT-CUP SUFI2 algorithm. Based on monthly simulated results, the CFSR gridded dataset outperformed the others, and its streamflow underprediction was acceptable for the entire study period. IMD and TRMM performed satisfactorily in calibration but not in validation. APHRODITE and PERSIANN showed good correlation, but because of their overall low rainfall estimates, they failed to produce satisfactory results and are hence considered unsuitable for hydrological simulation. The TRMM model simulation had the best overall trend against the observed data but failed to match the peaks. The study concluded that CFSR can be used as an alternative for modeling in the absence of gauge data for the mountainous river basins of Eastern Himalaya.
1 INTRODUCTION
A hydrological model is a simplified representation of an actual system, used to improve the understanding, prediction, and management of water resources. While several hydrological models are available, the Soil and Water Assessment Tool (SWAT) has been successfully implemented in various regions (Acharya 2018; Devia et al. 2015; Tegegne et al. 2017) and has a large user base, detailed documentation (Neitsch et al. 2011), and an easy-to-use interface. SWAT is a continuous-time, semi-distributed, process-based river basin model (Arnold et al. 2012).
Precipitation, which varies in both space and time, is the fundamental input to a hydrological model. In recent years, gridded precipitation datasets have garnered significant attention because of their extensive spatial and temporal coverage and their easy access, particularly in regions with limited resources and accessibility constraints (Meher and Das 2019). These gridded datasets are developed using ground data, satellite estimations, radar estimations, or a combination of these. Their practical applicability can be ascertained by comparing the outputs of models developed with each of them. Many past studies have demonstrated the dependability of SWAT with the SUFI-2 (sequential uncertainty fitting) algorithm in SWAT-CUP for streamflow modeling (Zhou et al. 2014; Hussain et al. 2017; Chiphang et al. 2020). Nevertheless, in mountainous regions, models driven by gridded datasets have been found to perform worse than those in plain regions (Musie et al. 2019; Yang et al. 2014). This is because precipitation behaves differently in watersheds with varied elevations, especially because of orographic influences (Houze 2012). Hence, acknowledging the uncertainties associated with hydrological modeling in a mountainous region and the scarcity of recorded precipitation data, this study was conducted with the objective of evaluating the applicability of different gridded rainfall datasets for modeling streamflow over the Mago River basin of Arunachal Pradesh.
Many similar studies have been carried out around the world. Using the ParaSol approach, Vu et al. (2012) employed several gridded precipitation datasets to create discharge models in Vietnam's central highlands and reported that the APHRODITE dataset could simulate a good model with NSE = 0.54 and R2 = 0.55, followed by GPCP with NSE = 0.46 and R2 = 0.51. In the absence of data for some periods, Sayyad et al. (2015) used a Climate Research Unit (CRU) gridded dataset as a complement to observed data in a SWAT model, and the results provided a satisfactory discharge prediction. Thom et al. (2017) developed a SWAT hydrological model using APHRODITE, CFSR, PERSIANN, TRMM, and gauge-based observed rainfall data for the Srepok River basin in Vietnam. The study showed that the station data produced an overall good fit with the station-recorded discharge, and the TRMM and APHRODITE data produced satisfactory results for useful application in the region. Using the SWAT-CUP SUFI2 optimization algorithm, Singh and Saravanan (2020) compared the performance of the APHRODITE, TRMM, GPCP, and CFSR gridded rainfall datasets in the Wunna basin in Maharashtra state, Central India, and concluded that the best-suited products for streamflow modeling in the region were GPCP and TRMM, whereas the least suited products were APHRODITE and CFSR. Tarek et al. (2020) compared nine global precipitation datasets in Africa, which included satellite-based data corrected with respect to ground data, gauge-based datasets, a merged dataset, and reanalysis datasets. The merged dataset ‘Multi-Source Weighted-Ensemble Precipitation’ (MSWEP) outperformed all the other datasets. Venkatesh et al. (2020) compared 16 precipitation datasets and concluded that in India's mountainous tropical basin, TRMM_3B42_v7 was the best for streamflow modeling, and APHRODITE performed well in detecting rainfall. Tran et al. (2023) evaluated three satellite-based products, namely Integrated Multi-satellite Retrievals for GPM (IMERG) Final run V6, Soil Moisture to Rain (SM2RAIN) – Advanced Scatterometer (ASCAT) V1.5, and Multi-Source Weighted-Ensemble Precipitation (MSWEP) V2.2, for a subbasin in the Mekong River basin; IMERG showed the best performance, followed by SM2RAIN-ASCAT.
Rajeevan et al. (2006) developed an IMD gridded dataset with a high resolution (1° × 1°) for India by interpolating data from 1803 stations for the period from 1951 to 2003 and compared it with similar global gridded rainfall datasets. The comparison showed that the developed gridded data gave a more faithful depiction of India’s spatial rainfall distribution. Later, Pai et al. (2014) interpolated daily rainfall records from 6995 rain gauge stations in India to create an even higher resolution IMD gridded dataset of 0.25° × 0.25° from 1901 to 2010.
In our study, the SWAT parameters were taken from a previous study by Chiphang et al. (2020), and their ranges were kept within their absolute limits to produce realistic results. Compared to previous works, this study considers five gridded rainfall datasets, namely IMD (gauge-interpolated), CFSR (reanalysis product), TRMM (satellite-based), APHRODITE (gauge-based), and PERSIANN (satellite-based). The objectives of this study were to:
- Compare the gridded precipitation data characteristics with gauge data (China bridge station);
- Study the suitability of gridded precipitation datasets for simulations of streamflow; and
- Evaluate the influence of precipitation and fitted parameters on the models' hydrological balance components.
2 METHODS
2.1 Study region
The Mago basin in Arunachal Pradesh (Republic of India) was chosen as the research region. It has an areal coverage of 841 km2 and lies between latitudes 27.57° N and 27.88° N and longitudes 92.07° E and 92.47° E (Figure 1). The catchment’s elevation ranges from around 2,355 meters at the outlet to 6,436 meters at the highest point. The region experiences snowfall from late October up to March. The region is very sparsely inhabited, without any major disturbances to the habitat. The river flow is maintained perennially by rain, snow, groundwater, and glaciers, and the water is free from pollutants. About 37% of the region is forest covered, 26% is covered by range-grasses, and more than 20% is covered by either snow or glaciers towards the higher elevation ranges (Figure 2). The general soil composition over the region is loam, with about 60% coverage. While most of the soil classes have a loam texture, they differ in their water holding capacity, hydraulic conductivity, and moist bulk density. All associated details were taken from the ‘usersoil’ database in the SWAT2012.mdb file. In the basin, eight distinct soil types and two miscellaneous land categories were identified. The latter comprise rocky mountains covered with perpetual snow/glacier, and various water bodies. The classes are listed in the table included with Figure 3.
Figure 1 DEM of the Mago basin, Arunachal Pradesh.
Figure 2 Land use land cover map.
Figure 3 Soil classes. (*unweathered bedrock)
2.2 Data acquisition
The study compared five gridded rainfall datasets, namely IMD, CFSR, TRMM, APHRODITE, and PERSIANN, and their applicability for model simulation. Daily precipitation time-series data were extracted for the period 2004–2012 from the specified rainfall datasets. Only the grids that fell under the basin region, and the grids that were near the basin, were selected for extraction. Information on the rainfall datasets is highlighted below.
IMD is part of India's Ministry of Earth Sciences and is largely responsible for the country's meteorological observations, weather forecasting, and seismology. The current study uses the 0.25° × 0.25° resolution gridded dataset, which is available for the period from 1901 to the present time in ‘GRD’ format (Pai et al. 2014).
CFSR (Fuka et al. 2014) is a high resolution third-generation reanalysis product coupled with atmospheric, oceanic, and surface-modeling components, covering the entire globe with a gridded precipitation of up to 0.3125° × 0.3125° resolution prepared for a period of 36 years, from 1979 to 2014, and is directly available in the SWAT format.
TRMM is a NASA-Japan Aerospace Exploration Agency (JAXA) collaborative space mission that monitors global tropical and subtropical rainfall and was launched in 1997 (Huffman et al. 2007). Out of the many temporal resolutions of data available under TRMM, the daily rainfall TRMM 3B42_v.7 having a resolution of 0.25° × 0.25° was opted for in this study.
APHRODITE (v1901) is a dataset that utilizes a compilation of Asian rain-gauge data (Yatagai et al. 2012), encompassing the Himalayas, South and Southeast Asia, and the mountainous regions of the Middle East. It delivers datasets in the spatial resolution of 0.25° × 0.25° and 0.5° × 0.5° (the former was used for this study). The input data is said to be varying in space and time, bringing the possibility of a weak trend analysis and characterization of extreme events.
PERSIANN-CDR is an estimation of precipitation using remote sensing involving Artificial Neural Networks (Ashouri et al. 2015). It has recorded data from 1983 until near present time for the latitude band of 60°S – 60°N. It is based on data from the GridSat-B1 infrared satellite along with adjustments brought by NCEP–National Centers for Environmental Prediction.
Table 1 highlights the description (version, spatial resolution, and temporal resolution) of the rainfall datasets used in this analysis. The data for maximum and minimum temperatures, relative humidity, gauge precipitation, and observed streamflow were taken from the weather station set up by Central Water Commission (CWC) at the coordinates of 27°37'22"N and 92°00'58"E. CWC stands as a prominent technical organization in India specializing in water resources, operating as an attached office under the Ministry of Jal Shakti, within the Department of Water Resources, River Development, and Ganga Rejuvenation, under the Government of India. Solar radiation and wind velocity information were acquired using IMD gridded weather data that was converted to the SWAT format and made accessible at https://swat.tamu.edu/data/india-dataset. All data, except precipitation data, were kept the same while developing the models.
Table 1 Description of rainfall datasets.
Rainfall Dataset | Version | Spatial resolution | Temporal resolution |
CWC | N/A | point source | daily |
IMD | IMD4 | 0.25° | daily |
CFSR | ds093.1 | 0.3125° | daily |
TRMM | 3B42_v7 | 0.25° | daily |
APHRODITE | V1801_R1 | 0.25° | daily |
PERSIANN-CDR | v01r01 | 0.25° | daily |
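As a rough illustration of how a point station can be matched to a cell of a regular grid, the sketch below converts the CWC station coordinates quoted above to decimal degrees and finds the center of the enclosing 0.25° cell. The function names are hypothetical, and the grid alignment (cell edges on whole multiples of the resolution) is an assumption; each product defines its own grid origin, which should be checked in that product's documentation.

```python
import math

def dms_to_decimal(degrees, minutes, seconds):
    """Convert degrees, minutes, seconds to decimal degrees."""
    return degrees + minutes / 60.0 + seconds / 3600.0

def enclosing_cell_center(value, resolution, origin=0.0):
    """Center of the grid cell containing `value`.

    Assumption: cell edges lie at origin + k * resolution. Real datasets
    may use a different alignment.
    """
    k = math.floor((value - origin) / resolution)
    return origin + (k + 0.5) * resolution

# CWC station coordinates as given in the text
station_lat = dms_to_decimal(27, 37, 22)
station_lon = dms_to_decimal(92, 0, 58)

# Enclosing cell on a hypothetical 0.25 degree grid aligned to whole degrees
cell_lat = enclosing_cell_center(station_lat, 0.25)
cell_lon = enclosing_cell_center(station_lon, 0.25)

print(f"station: {station_lat:.4f} N, {station_lon:.4f} E")
print(f"cell center: {cell_lat} N, {cell_lon} E")
```

For products whose grid points sit on multiples of the resolution rather than between them, passing `origin=-resolution / 2` would shift the alignment accordingly.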
The State Remote Sensing Application Centre (SRSAC) in Itanagar supplied a Land Use and Land Cover (LULC) map (Figure 2) for Arunachal Pradesh. The soil map was obtained from the State Land Use Board (SLUB), Arunachal Pradesh, and converted into a raster in ERDAS IMAGINE 2014 (Figure 3). Both the shapefile and the raster file were resampled to the DEM resolution and re-projected to WGS 1984 UTM Zone 46N.
2.3 SWAT model setup
The watershed was divided into 15 subbasins, within which 508 hydrologic response units were identified using the ArcSWAT 2012.10.5.24 software (Winchell et al. 2013). Five models were developed, one for each of the IMD, CFSR, TRMM, APHRODITE, and PERSIANN gridded precipitation datasets, while keeping the rest of the SWAT inputs the same, in order to identify the best gridded rainfall dataset for hydrological simulation. The higher reaches of the basin are covered with glaciers, and hence, to account for their influence on the downstream discharge, ten elevation bands were generated (Grusson et al. 2015). Chiphang et al. (2020) used single-point gauge-based precipitation data to simulate a hydrological model in the same (Mago) catchment and selected 19 parameters for the model after conducting a sensitivity analysis. Because of the lack of nearby stations, it was found that generating elevation bands, with temperature and precipitation lapse rates included in the parameter list, helped in developing a better model. In the present study, the precipitation lapse rate (PLAPS) was not used because the gridded precipitation datasets already contain spatial variation. Table 2 presents the list of parameters used for calibration. After the preparation of the raw model in ArcSWAT, calibration and validation were done in the SWAT Calibration and Uncertainty Program (SWAT-CUP) using SUFI-2, which is known for its computational efficiency and has been successfully used to model varying watersheds (Rostamian et al. 2008; Chiphang et al. 2020; Singh and Saravanan 2020). Following the guidance of Moriasi et al. (2007), the model's performance was assessed using three metrics: Coefficient of Determination (R2), Nash–Sutcliffe Efficiency (NSE), and Percent Bias (PBIAS). Values above 0.5 for NSE and R2 and within ±25% for PBIAS are considered satisfactory.
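The three performance metrics can be sketched in a few lines of NumPy. This is a minimal illustration, not the SWAT-CUP implementation: the function names and the example flows are hypothetical, and the sign of PBIAS is an assumption chosen so that negative values mean underprediction, which is how Table 3 is read in this paper (some references define it with the opposite sign).

```python
import numpy as np

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus the ratio of error variance
    to the variance of the observations."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def r_squared(obs, sim):
    """Coefficient of determination, taken here as the squared Pearson r."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return np.corrcoef(obs, sim)[0, 1] ** 2

def pbias(obs, sim):
    """Percent bias. Assumed sign convention: negative = underprediction."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 100.0 * np.sum(sim - obs) / np.sum(obs)

# Hypothetical monthly discharges (m3/s), for illustration only
observed = [10.0, 20.0, 30.0, 40.0]
simulated = [8.0, 18.0, 33.0, 35.0]

print(f"NSE   = {nse(observed, simulated):.3f}")
print(f"R2    = {r_squared(observed, simulated):.3f}")
print(f"PBIAS = {pbias(observed, simulated):.1f} %")
```

The example shows why the metrics are reported together: R2 only measures how well the two series co-vary, so a simulation can score a high R2 while still carrying a large volume bias that NSE and PBIAS expose.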
Table 2 Parameters considered for calibration and validation and their absolute ranges.
Parameters | Description | Min | Max |
GW_DELAY | Groundwater delay (days) | 0 | 500 |
RCHRG_DP | Deep aquifer percolation fraction | 0 | 1 |
GWQMN | Threshold in the shallow aquifer for return flow to occur (mm H2O) | 0 | 5,000 |
ALPHA_BF | Base flow alpha factor (1/day) | 0 | 1 |
CN2 (relative test) | SCS runoff curve number | -0.4 | 0.4 |
SOL_AWC (relative test) | Available water capacity of the soil layer (mm H2O/ mm soil) | -0.4 | 0.4 |
SOL_K (relative test) | Saturated hydraulic conductivity (mm/hr) | -0.4 | 0.4 |
SOL_BD (relative test) | Moist bulk density (g/cm3) | -0.4 | 0.4 |
CH_N2 | Manning’s “n” value for main channel | 0.025 | 0.3 |
CH_K2 | Effective hydraulic conductivity in the main channel (mm/hr) | 0.025 | 500 |
ALPHA_BNK | Base flow alpha factor for bank storage (1/day) | 0 | 1 |
ESCO | Soil evaporation compensation factor | 0.01 | 1 |
SLSOIL | Slope length for lateral subsurface flow (m) | 0 | 150 |
SFTMP (TS) | Snowfall temperature/ Critical temperature (°C) | -5 | 5 |
SMTMP (Tm) | Snowmelt base temperature/ Melt threshold (°C) | -5 | 5 |
TIMP(β) | Snowpack temperature lag factor | 0 | 1 |
SMFMX (αmax) | Maximum melt rate for snow during year (summer solstice) (mm H2O/°C/day) | 1.5 | 7 |
TLAPS | Temperature lapse rate (°C/km) | -6 | 0 |
2.4 Comparison, sensitivity analysis, calibration, and validation
To compare the different gridded rainfall datasets with the ground-recorded rainfall dataset, only the grid cell that corresponds to the location of the station was considered. Daily time series data were extracted for that grid point from each of the gridded precipitation datasets, and the correlation coefficient was then determined between each pair of rainfall datasets. A warm-up period of two years (2004–2005) was used to initialize the model properly before calibration. Based on the observed rainfall and streamflow data procured, a calibration period of four years (2006–2009) was defined, followed by a validation period of three years (2010–2012). The global sensitivity analysis was checked from the first iteration, comprising 500 simulations for each model. The raw model obtained from ArcSWAT prior to any calibration was imported into SWAT-CUP, and the driving parameters were given a logical range for calibration (Arnold et al. 2012). SUFI-2 offers three methods to apply a change to a parameter: replace, absolute, and relative. We used ‘replace’, which directly replaces the existing parameter value with the given value, for 14 of the parameters, and ‘relative’, which multiplies the existing parameter value by (1 + given value), for 4 parameters (SWAT-CUP manual). The model was calibrated on a monthly time scale for 2006–2009, with multiple iterations of 500 simulations each. After each iteration, a new set of parameter ranges, narrower than those of the previous iteration and centered on the best parameters obtained from it, was imported into SWAT-CUP. The iterations were continued until P-factor and R-factor values of around 0.7 and 1, respectively, were attained (Abbaspour 2010).
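The two uncertainty measures used as the stopping criterion can be illustrated as follows. This is a sketch under stated assumptions rather than the SWAT-CUP code: the P-factor is taken as the fraction of observations bracketed by the 95% prediction uncertainty (95PPU) band, the band is taken between the 2.5th and 97.5th percentiles of the ensemble of simulations, and the R-factor is the mean band width divided by the standard deviation of the observations. The function name, the use of the sample standard deviation, and the toy ensemble are all hypothetical.

```python
import numpy as np

def p_and_r_factor(obs, sims):
    """Return (P-factor, R-factor) for an ensemble of simulations.

    obs  : 1-D array of observed values, length n_steps
    sims : 2-D array of simulated values, shape (n_simulations, n_steps)
    """
    obs = np.asarray(obs, float)
    sims = np.asarray(sims, float)
    # 95PPU band, assumed here to span the 2.5th to 97.5th percentiles
    lower = np.percentile(sims, 2.5, axis=0)
    upper = np.percentile(sims, 97.5, axis=0)
    # P-factor: share of observations falling inside the band
    p_factor = np.mean((obs >= lower) & (obs <= upper))
    # R-factor: average band width relative to the spread of the observations
    r_factor = np.mean(upper - lower) / np.std(obs, ddof=1)
    return p_factor, r_factor

# Toy ensemble: three simulations offset by -1, 0, and +1 from the observations
observed = np.array([1.0, 2.0, 3.0, 4.0])
ensemble = observed + np.array([[-1.0], [0.0], [1.0]])

p_factor, r_factor = p_and_r_factor(observed, ensemble)
print(f"P-factor = {p_factor:.2f}, R-factor = {r_factor:.2f}")
```

The two measures pull against each other: widening the parameter ranges brackets more observations (higher P-factor) but also thickens the band (higher R-factor), which is why the ranges are narrowed gradually over successive iterations.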
After successfully calibrating the model, the best parameters obtained from the latest iteration were employed for model validation.
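The three ways of applying a fitted value to an existing parameter can be written out explicitly. In this sketch the function name is hypothetical, the initial curve number of 75 is an invented example, and the ‘absolute’ rule (adding the given value to the existing one) is stated here as an assumption since the text only describes ‘replace’ and ‘relative’. The fitted values are those reported for the CFSR model in Table 4.

```python
def apply_change(current_value, fitted_value, method):
    """Apply a calibrated value to an existing SWAT parameter."""
    if method == "replace":
        # The existing value is discarded
        return fitted_value
    if method == "relative":
        # The existing value is scaled by (1 + fitted value)
        return current_value * (1.0 + fitted_value)
    if method == "absolute":
        # Assumed rule: the fitted value is added to the existing value
        return current_value + fitted_value
    raise ValueError(f"unknown method: {method}")

# CN2 is a 'relative' parameter: a hypothetical HRU curve number of 75
# with the CFSR best-fit value of -0.29964 is reduced by about 30%.
new_cn2 = apply_change(75.0, -0.29964, "relative")

# GW_DELAY is a 'replace' parameter: the default is simply overwritten.
new_gw_delay = apply_change(31.0, 457.5221, "replace")

print(f"CN2: 75 -> {new_cn2:.3f}")
print(f"GW_DELAY: 31 -> {new_gw_delay} days")
```

A relative change preserves the spatial pattern of a parameter across HRUs (every curve number shrinks by the same proportion), whereas a replacement assigns one value everywhere, which is why spatially distributed soil and land use parameters are usually changed relatively.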
3 RESULTS AND DISCUSSION
3.1 Comparison of gridded rainfall datasets with CWC gauge data
Monthly rainfall data obtained from the gridded datasets for the station grid cell revealed varying relationships with the observed dataset (CWC). The correlation coefficient was determined between the observed rainfall and each of the gridded datasets, as well as between the different gridded rainfall datasets themselves, using the time series acquired from the station grid cell. The resulting correlation coefficient matrices are shown in Figure 4 for the entire period (2006–2012; Figure 4(a)), the calibration period (2006–2009; Figure 4(b)), and the validation period (2010–2012; Figure 4(c)). IMD showed the lowest correlation (0.66) with the gauge data, while APHRODITE showed the best (0.82). The matrix also shows the correlation between the various gridded datasets, which aided in understanding their relationship. The correlation coefficients between all the gridded datasets and the gauge data were good during the calibration period. However, IMD and CFSR showed lower correlation with the gauge data for the validation period. The correlation coefficient indicates how linearly the trends of two datasets are related, but not how well their magnitudes agree. While the correlation coefficient showed a comparatively good relationship between observed (CWC), TRMM, APHRODITE, and PERSIANN, the accumulated precipitation of the different rainfall datasets shown in Figure 5 revealed that TRMM, APHRODITE, and PERSIANN hugely underestimated the precipitation. Although IMD and CFSR showed less similarity in trend with the observed data, they captured the overall precipitation volume more accurately.
The total rainfall for the period of study showed 20,442.4 mm in the gauge data and 19,595.3 mm, 22,160.9 mm, 10,764.64 mm, 5,600.8 mm, and 8,712.17 mm for IMD, CFSR, TRMM, APHRODITE, and PERSIANN, respectively. APHRODITE, TRMM, and PERSIANN underestimated the rainfall, while CFSR slightly overestimated the rainfall. Singh and Saravanan (2020) also found that APHRODITE tended to under-predict rainfall while CFSR over-predicted the rainfall. Overall, IMD and CFSR showed the best resemblance in comparison to the gauge recorded precipitation data.
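To make the volumetric comparison concrete, the totals quoted above can be expressed as percentage differences from the gauge total. The short sketch below does only that arithmetic; the variable and function names are illustrative.

```python
# Total rainfall (mm) over the study period, as reported in the text
gauge_total_mm = 20442.4
gridded_totals_mm = {
    "IMD": 19595.3,
    "CFSR": 22160.9,
    "TRMM": 10764.64,
    "APHRODITE": 5600.8,
    "PERSIANN": 8712.17,
}

def percent_difference(total, reference):
    """Difference of `total` from `reference`, as a percentage of `reference`."""
    return 100.0 * (total - reference) / reference

differences = {
    name: round(percent_difference(total, gauge_total_mm), 1)
    for name, total in gridded_totals_mm.items()
}

for name, diff in differences.items():
    print(f"{name:10s} {diff:+.1f} %")
```

On these figures, IMD and CFSR fall within roughly 4% and 8% of the gauge total, while TRMM, PERSIANN, and APHRODITE fall short by roughly 47%, 57%, and 73%, respectively.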
Figure 4 Correlation matrix showing the monthly rainfall relationship between IMD, CFSR, TRMM, APHRODITE, and PERSIANN precipitation datasets, and CWC gauge data: (a) entire period (2006–2012), (b) calibration period (2006–2009), and (c) validation period (2010–2012).
Figure 5 Monthly accumulated precipitation of the five rainfall datasets against gauge data.
3.2 Suitability of gridded precipitation datasets for streamflow simulation
The summary statistics (NSE, R2, and PBIAS) of the models developed with the gridded rainfall datasets for the Mago basin are recorded in Table 3. The results were accepted after multiple iterations in SWAT-CUP, and as a result, a different set of best-fit parameters, lying within the initially set parameter ranges, was obtained for each model. The optimal set of parameters for each model is documented in Table 4. Though it may appear unusual, this uncertainty in the parameter values is unavoidable: multiple parameter sets can produce models of nearly equal performance, so the parameter settings can differ vastly for only a small difference in model performance. The results of the simulation of the five models plotted against observed discharge (CWC) and their corresponding precipitation are shown in Figure 6. A visual inspection of the calibration-validation graphs of the simulated discharge makes it evident that the simulation trend is significantly influenced by the trend of the precipitation data. The axis scales in Figure 6 have been kept consistent across the panels to portray the direct impact of rainfall on the simulation curve. APHRODITE recorded the least rainfall and showed the largest underprediction of streamflow, followed by PERSIANN, TRMM, CFSR, and IMD in sequence.
Table 3 Summary statistics for evaluation of monthly calibration and validation for the years 2006–2009 and 2010–2012, respectively.
Performance Indicator | IMD (Cal) | IMD (Val) | CFSR (Cal) | CFSR (Val) | TRMM (Cal) | TRMM (Val) | APHRODITE (Cal) | APHRODITE (Val) | PERSIANN (Cal) | PERSIANN (Val) |
R2 | 0.81 | 0.41 | 0.60 | 0.65 | 0.85 | 0.85 | 0.77 | 0.59 | 0.82 | 0.53 |
NSE | 0.80 | 0.28 | 0.53 | 0.56 | 0.50 | 0.23 | -1.70 | -1.42 | -0.38 | -0.29 |
PBIAS (%) | 2.59 | -14.11 | -4.57 | -15.05 | -31.10 | -40.58 | -79.10 | -86.30 | -57.90 | -59.80 |
Table 4 Best-fit parameters obtained for the five gridded precipitation datasets.
Parameters | IMD | CFSR | TRMM | APHRODITE | PERSIANN |
GW_DELAY | 360.6556 | 457.5221 | 241.5159 | 179.1 | 0.9 |
RCHRG_DP | 0.668119 | 0.488364 | 0.110055 | 0.477 | 0.181 |
GWQMN | 3609.979 | 4975.943 | 797.087 | 1885 | 2325 |
ALPHA_BF | 0.637633 | 0.004856 | 0.281678 | 0.157 | 0.801 |
CN2 | -0.29539 | -0.29964 | -0.3402 | -0.3767 | -0.3503 |
SOL_AWC | 0.077611 | -0.04466 | -0.21034 | -0.3928 | -0.0056 |
SOL_K | 0.248987 | 0.100758 | -0.26233 | 0.0088 | -0.2392 |
SOL_BD | 0.160678 | -0.16588 | 0.367632 | 0.3624 | -0.1448 |
CH_N2 | 0.268959 | 0.135169 | 0.102312 | 0.233725 | 0.292575 |
CH_K2 | 201.3764 | 218.9583 | 338.5111 | 317.5091 | 462.5 |
ALPHA_BNK | 0.061301 | 0.288062 | 0.296914 | 0.289 | 0.185 |
ESCO | 0.674568 | 0.552284 | 0.985138 | 0.99307 | 0.87 |
SLSOIL | 101.2861 | 118.2169 | 35.35619 | 0.45 | 0.15 |
TIMP | 0.777583 | 0.368396 | 0.294831 | 0.237 | 0.089 |
TLAPS | -4.36458 | -4.96242 | -3.83886 | -3.80036 | -4.14757 |
SFTMP | 5.551169 | 4.90511 | 3.819588 | -1.996 | -1.3 |
SMFMX | 5.763957 | 4.532957 | 4.391909 | 3.302 | 3.446 |
SMTMP | -1.65272 | -0.83551 | 0.227272 | 0.86 | 1.724 |
Figure 6 Simulated discharge of the five gridded rainfall dataset models against observed discharge for calibration and validation periods.
A comparison of the statistical indices of the five models indicated that the overall model efficiency of CFSR was the best among the five datasets, with a satisfactory NSE of 0.53 and 0.56 and an R2 of 0.60 and 0.65 for calibration and validation, respectively, based on the performance ratings set by Moriasi et al. (2007). For CFSR, the simulation in the validation period performed better than that in the calibration period (NSE and R2). This does not always happen, as seen in the cases of IMD, TRMM, APHRODITE, and PERSIANN-CDR. The presence of better-quality weather data during the validation period can result in better simulation for that period, as the model is primarily dependent upon weather inputs. Muche et al. (2020) tested simulated discharge from multiple gridded precipitation datasets using SWAT, and their results demonstrated better NSE and R2 values for the validation period for some of the gridded datasets. The CFSR-simulated streamflow showed an underprediction of 4.57% during calibration and 15.05% during validation, qualifying under the good category. While IMD achieved an excellent NSE of 0.80 during calibration, the validation result was poor, with an NSE of 0.28. The same was seen for R2, with values of 0.81 and 0.41 for calibration and validation, respectively. A closer look at the daily IMD precipitation values revealed possible outliers in the rainfall, with some days recording rainfall greater than 150 mm and the maximum daily rainfall reaching 310 mm. Additionally, the poor NSE during validation can be explained by the poor correlation (0.67) of the IMD gridded dataset with the gauge data for the validation period (Figure 4c). TRMM was calibrated with a satisfactory NSE of 0.50; however, it could not perform well during the validation period (NSE = 0.23).
APHRODITE and PERSIANN produced unacceptable NSE values in both periods (-1.42 and -0.29, respectively, during validation), rendering these gridded precipitation datasets unfit for use in their raw form. However, the R2 values told a different story, with TRMM having the best R2 values of all the models (0.85), and APHRODITE and PERSIANN also showing good trends during the calibration period with 0.77 and 0.82, respectively.
Many past studies have pointed out the presence of inherent bias in gridded precipitation data. The use of bias correction methods such as quantile mapping (QM) can make these biased datasets promising for hydrological use. Shukla et al. (2019) conducted bias correction of TRMM data in the Himalayan region using QM, yielding satisfactory outcomes and suggesting its suitability for the area. Similarly, Dangol et al. (2022) applied a linear scaling technique to correct biases in APHRODITE and PERSIANN-CDR data in Nepal, demonstrating enhanced data quality. Qi et al. (2023) developed a nonlinear model featuring a multiplicative-exponential equation to correct gridded precipitation in a poorly gauged basin, and the results were promising. It can therefore be expected that bias correction of the gridded precipitation values of TRMM, APHRODITE, and PERSIANN could improve the modeling characteristics of these rainfall datasets. However, it must be kept in mind that an ideal bias correction of these datasets would require a long period of observed precipitation data. Future research, combined with long-term data availability, could explore the correction of these rainfall datasets to produce better models for the region.
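The two families of correction mentioned above can be sketched generically. The code below is a simplified illustration and not the procedure of any of the cited studies: linear scaling is reduced to a single multiplicative factor (in practice it is often computed month by month), quantile mapping is the plain empirical variant, and all function names and sample values are hypothetical.

```python
import numpy as np

def linear_scaling(gridded, observed):
    """Scale gridded rainfall so that its mean matches the observed mean."""
    gridded = np.asarray(gridded, float)
    observed = np.asarray(observed, float)
    factor = observed.mean() / gridded.mean()
    return gridded * factor

def quantile_mapping(gridded_ref, observed_ref, gridded_new, n_quantiles=100):
    """Empirical quantile mapping.

    Each new gridded value is mapped to the observed value that has the same
    non-exceedance probability over the reference period. Values outside the
    reference range are clamped to the end quantiles by np.interp. Long runs
    of identical values (e.g. many dry days) produce tied quantiles and need
    extra care that this sketch does not provide.
    """
    probs = np.linspace(0.0, 1.0, n_quantiles + 1)
    gridded_q = np.quantile(np.asarray(gridded_ref, float), probs)
    observed_q = np.quantile(np.asarray(observed_ref, float), probs)
    return np.interp(np.asarray(gridded_new, float), gridded_q, observed_q)

# Hypothetical reference-period rainfall (mm): the gridded product records
# half of what the gauge records.
gridded_ref = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
observed_ref = np.array([2.0, 4.0, 6.0, 8.0, 10.0])

scaled = linear_scaling(gridded_ref, observed_ref)
mapped = quantile_mapping(gridded_ref, observed_ref, np.array([3.0, 4.5]))

print("linear scaling  :", scaled)
print("quantile mapping:", mapped)
```

Linear scaling corrects only the mean volume, whereas quantile mapping reshapes the whole distribution, which matters when a product misrepresents high-intensity events; both depend on the length and quality of the observed record used as the reference.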
3.3 Evaluation of the impact of precipitation and fitted parameters on the models' hydrological balance components
Figure 7 shows the annual water balance scenarios corresponding to the five gridded rainfall datasets. While a basin cannot have five water balance scenarios simultaneously, examining the different scenarios gives a general understanding of how the basin reacts to varying precipitation conditions. With that in mind, some important relationships between the components can be derived. The hydrological simulations show that percolation has a positive relationship with the magnitude of precipitation. In the case of CFSR, the annual precipitation was the highest for the years 2006–2010, which was also reflected in the degree of percolation. For 2011–2012, when the data showed lower rainfall, percolation decreased correspondingly. APHRODITE recorded the least rainfall of all the rainfall datasets, and the same can be said for its simulated percolation. This behaviour was consistent for all the rainfall models. The same behaviour has been reported by Muche et al. (2020) for the water balance components of models developed using gridded datasets for the region of Kansas. It was also seen that more evapotranspiration was recorded in the case of lower precipitation. However, amongst the water balance scenarios plotted, we regarded the CFSR water balance components as the most accurate representation of the actual hydrological processes occurring in the catchment. The CFSR model translated 15.8% of the precipitation into evapotranspiration losses, 11% into lateral flow discharge, and 48.7% into water percolating through the soil.
Figure 7 Water balance components of the five models using a yearly scale (calibration: 2006–2009; validation: 2010–2012).
4 CONCLUSIONS
The current study evaluated the feasibility of five rainfall gridded datasets, namely IMD, CFSR, TRMM, APHRODITE, and PERSIANN. An attempt was made to set up a suitable model using ArcSWAT 2012 for the Mago basin using five gridded precipitation datasets, keeping all aspects (temperature, relative humidity, solar radiation, and wind velocity) the same, while only changing the rainfall data. The models were further calibrated and validated using SWAT-CUP to produce the final models.
Comparison of the selected gridded precipitation datasets revealed that the gridded rainfall datasets show good correlation at the monthly time step. APHRODITE has the best correlation with the gauge-based reference rainfall. However, examination of the accumulated precipitation data revealed that TRMM, APHRODITE, and PERSIANN consistently recorded lower rainfall, whereas IMD and CFSR show a better match in volumetric terms. With multiple parameter adjustments in the quest for a better performing model, it was evident that different models, or the same model, can perform at similar efficiency with completely different sets of driving parameters, and hence there is uncertainty in the parameterization. Among the five gridded rainfall datasets, CFSR was found to be the most suitable gridded precipitation dataset for hydrological simulation for the Mago basin, with NSE and R2 values of 0.53 (0.56) and 0.6 (0.65) for calibration (validation). TRMM did not produce a satisfactory model overall, but its simulated discharge had the best trend (R2) with respect to observed discharge. The APHRODITE and PERSIANN datasets were found to be unsuitable for modeling in the study region. Nevertheless, by applying correction techniques to align the rainfall data with the observed data, these datasets could become more valuable for modeling purposes. While IMD is a gauge-interpolated dataset, comparing the IMD gridded data with the station data indicated that proper interpolation of precipitation data for Northeast India is still lacking. Including data from the Automatic Weather Stations (AWS) that have been functioning around Northeast India in the interpolation of the gridded data could vastly improve the modeling capabilities of the IMD dataset for the region.
Finally, the research indicated that, in the absence of gauge precipitation data, the CFSR gridded precipitation dataset could be used as an alternative to satisfactorily model streamflow for the Mago basin. This finding is especially significant for regions with sparse gauge data, and the study underscores the reliability of CFSR as an alternative for hydrological modeling in data-scarce regions. By supporting wider adoption of the CFSR precipitation dataset in such regions, this research contributes to improving water resource management strategies globally.
ACKNOWLEDGEMENTS
The authors gratefully acknowledge the help, encouragement, and financial support provided by the Science and Engineering Research Board, Department of Science and Technology, Government of India, through Grant No. EMR/2016/005189.
The discharge data that support the findings of this study were obtained from the Central Water Commission (CWC) of India. These data were used under license for this work, and restrictions apply to their availability. The data are available from the authors with the consent of CWC Headquarters, New Delhi. All other data used in this study are openly available in the public domain.
The SWAT model used in this study is free, open-source software available for download in the public domain. ArcMap was used under a license procured from ESRI.
References
- Abbaspour, K.C. 2010. SWAT-CUP Manual. 130 (8): 965–970. EAWAG: Swiss Federal Institute of Aquatic Science and Technology, Dübendorf.
- Acharya, A. 2018. “Evaluating the suitability of application of hydrological models in a mixed land use watershed.” Journal of Water Management Modeling 26, C456. https://doi.org/10.14796/JWMM.C456
- Arnold, J.G., D.N. Moriasi, P.W. Gassman, K.C. Abbaspour, M.J. White, R. Srinivasan, C. Santhi, et al. 2012. “SWAT: Model use, calibration, and validation.” Transactions of the ASABE 55 (4): 1491–1508. http://swatmodel.tamu.edu
- Ashouri, H., K.L. Hsu, S. Sorooshian, D.K. Braithwaite, K.R. Knapp, L.D. Cecil, et al. 2015. “PERSIANN-CDR: Daily precipitation climate data record from multi-satellite observations for hydrological and climate studies.” Bulletin of the American Meteorological Society 96 (1): 69–83. https://doi.org/10.1175/BAMS-D-13-00068.1
- Chiphang, N., A. Bandyopadhyay, and A. Bhadra. 2020. “Assessing the Effects of Snowmelt Dynamics on Streamflow and Water Balance Components in an Eastern Himalayan River Basin Using SWAT Model.” Environmental Modeling and Assessment 25 (6): 861–883. https://doi.org/10.1007/s10666-020-09716-8
- Dangol, S., R. Talchabhadel, and V.P. Pandey. 2022. “Performance evaluation and bias correction of gridded precipitation products over Arun River Basin in Nepal for hydrological applications.” Theoretical and Applied Climatology 148 (3–4): 1353–1372. https://doi.org/10.1007/s00704-022-04001-y
- Devia, G.K., B.P. Ganasri, and G.S. Dwarakish. 2015. “A Review on Hydrological Models.” Aquatic Procedia 4, 1001–1007. https://doi.org/10.1016/j.aqpro.2015.02.126
- Fuka, D.R., M.T. Walter, C. Macalister, A.T. Degaetano, T.S. Steenhuis, and Z.M. Easton. 2014. “Using the Climate Forecast System Reanalysis as weather input data for watershed models.” Hydrological Processes 28 (22): 5613–5623. https://doi.org/10.1002/hyp.10073
- Grusson, Y., X. Sun, S. Gascoin, S. Sauvage, S. Raghavan, F. Anctil, and J.M. Sánchez-Pérez. 2015. “Assessing the capability of the SWAT model to simulate snow, snow melt and streamflow dynamics over an alpine watershed.” Journal of Hydrology 531, 574–588. https://doi.org/10.1016/j.jhydrol.2015.10.070
- Houze, R.A. 2012. “Orographic effects on precipitating clouds.” Reviews of Geophysics 50 (1). https://doi.org/10.1029/2011RG000365
- Huffman, G.J., R.F. Adler, D.T. Bolvin, G. Gu, E.J. Nelkin, and K.P. Bowman. 2007. “The TRMM Multi-satellite Precipitation Analysis (TMPA): Quasi-global, multiyear, combined-sensor precipitation estimates at fine scales.” Journal of Hydrometeorology 8 (1): 38–55. https://doi.org/10.1175/JHM560.1
- Hussain, S., X. Song, G. Ren, I. Hussain, D. Han, and M.H. Zaman. 2017. “Evaluation of gridded precipitation data in the Hindu Kush–Karakoram–Himalaya mountainous area.” Hydrological Sciences Journal 62 (14): 2393–2405. https://doi.org/10.1080/02626667.2017.1384548
- Meher, J.K., and L. Das. 2019. “Gridded data as a source of missing data replacement in station records.” Journal of Earth System Science 128 (3). https://doi.org/10.1007/s12040-019-1079-8
- Moriasi, D.N., J.G. Arnold, M.W. Van Liew, R.L. Bingner, R.D. Harmel, and T.L. Veith. 2007. “Model evaluation guidelines for systematic quantification of accuracy in watershed simulations.” Transactions of the ASABE 50 (3): 885–900.
- Muche, M.E., S. Sinnathamby, R. Parmar, C.D. Knightes, J.M. Johnston, K. Wolfe, S.T. Purucker, et al. 2020. “Comparison and Evaluation of Gridded Precipitation Datasets in a Kansas Agricultural Watershed Using SWAT.” Journal of the American Water Resources Association 56 (3): 486–506. https://doi.org/10.1111/1752-1688.12819
- Musie, M., S. Sen, and P. Srivastava. 2019. “Comparison and evaluation of gridded precipitation datasets for streamflow simulation in data scarce watersheds of Ethiopia.” Journal of Hydrology 579, 124168. https://doi.org/10.1016/j.jhydrol.2019.124168
- Neitsch, S.L., J.G. Arnold, J.R. Kiniry, and J.R. Williams. 2011. Soil and Water Assessment Tool, Theoretical Documentation, Version 2009. Texas Water Resources Institute, College Station, Texas.
- Pai, D.S., L. Sridhar, M. Rajeevan, O.P. Sreejith, N.S. Satbhai, and B. Mukhopadhyay. 2014. “Development of a new high spatial resolution (0.25° × 0.25°) long period (1901–2010) daily gridded rainfall data set over India and its comparison with existing data sets over the region.” Mausam: Quarterly Journal of Meteorology, Hydrology and Geophysics 65 (1): 1–18.
- Qi, S., A. Lv, G. Wang, and C. Zhang. 2023. “A Multiplicative-exponential function to correct precipitation for distributed hydrological modeling in poorly-gauged basins.” Journal of Hydrology 620, 129393. https://doi.org/10.1016/j.jhydrol.2023.129393
- Rajeevan, M., J. Bhate, J.D. Kale, and B. Lal. 2006. “High resolution daily gridded rainfall data for the Indian Region: Analysis of break and active monsoon spells.” Current Science 91 (3): 296–306.
- Rostamian, R., A. Jaleh, M. Afyuni, S.F. Mousavi, M. Heidarpour, A. Jalalian, and K.C. Abbaspour. 2008. “Application of a SWAT model for estimating runoff and sediment in two mountainous basins in central Iran.” Hydrological Sciences Journal 53 (5): 977–988. https://doi.org/10.1623/hysj.53.5.977
- Sayyad, G., L. Vasel, A. Asghar Besalatpour, B. Gharabaghi, and G. Golmohammadi. 2015. “Modeling Blue and Green Water Resources Availability in an Iranian Data Scarce Watershed Using SWAT.” Journal of Water Management Modeling 23, C391. https://doi.org/10.14796/JWMM.C391
- Shukla, A.K., C.S.P. Ojha, R.P. Singh, L. Pal, and D. Fu. 2019. “Evaluation of TRMM Precipitation Dataset over Himalayan Catchment: The Upper Ganga Basin, India.” Water (Switzerland) 11 (3): 613. https://doi.org/10.3390/w11030613
- Singh, L., and S. Saravanan. 2020. “Evaluation of various spatial rainfall datasets for streamflow simulation using SWAT model of Wunna basin, India.” International Journal of River Basin Management 20 (3): 389–398. https://doi.org/10.1080/15715124.2020.1776305
- Tarek, M., F.P. Brissette, and R. Arsenault. 2020. “Evaluation of the ERA5 reanalysis as a potential reference dataset for hydrological modelling over North America.” Hydrology and Earth System Sciences 24 (5): 2527–2544. https://doi.org/10.5194/hess-24-2527-2020
- Tegegne, G., D.K. Park, and Y.O. Kim. 2017. “Comparison of hydrological models for the assessment of water resources in a data-scarce region, the Upper Blue Nile River Basin.” Journal of Hydrology: Regional Studies 14, 49–66. https://doi.org/10.1016/j.ejrh.2017.10.002
- Thom, V.T., D.N. Khoi, and D.Q. Linh. 2017. “Using gridded rainfall products in simulating streamflow in a tropical catchment – A case study of the Srepok River Catchment, Vietnam.” Journal of Hydrology and Hydromechanics 65 (1): 18–25. https://doi.org/10.1515/johh-2016-0047
- Tran, T.N.D., B.Q. Nguyen, R. Zhang, A. Aryal, M. Grodzka-Łukaszewska, G. Sinicyn, and V. Lakshmi. 2023. “Quantification of Gridded Precipitation Products for the Streamflow Simulation on the Mekong River Basin Using Rainfall Assessment Framework: A Case Study for the Srepok River Subbasin, Central Highland Vietnam.” Remote Sensing 15 (4): 1030. https://doi.org/10.3390/rs15041030
- Venkatesh, K., N.Y. Krakauer, E. Sharifi, and H. Ramesh. 2020. “Evaluating the Performance of Secondary Precipitation Products through Statistical and Hydrological Modeling in a Mountainous Tropical Basin of India.” Advances in Meteorology 2020, 8859185. https://doi.org/10.1155/2020/8859185
- Vu, M.T., S.V. Raghavan, and S.Y. Liong. 2012. “SWAT use of gridded observations for simulating runoff - A Vietnam river basin study.” Hydrology and Earth System Sciences 16 (8): 2801–2811. https://doi.org/10.5194/hess-16-2801-2012
- Winchell, M., R. Srinivasan, M. Di Luzio, and J. Arnold. 2013. ArcSWAT interface for SWAT2012: User’s guide. Blackland Research Center, Texas AgriLife Research, College Station, 1–464.
- Yang, Y., G. Wang, L. Wang, J. Yu, and Z. Xu. 2014. “Evaluation of gridded precipitation data for driving SWAT model in area upstream of three gorges reservoir.” PLoS ONE 9 (11): e112725. https://doi.org/10.1371/journal.pone.0112725
- Yatagai, A., K. Kamiguchi, O. Arakawa, A. Hamada, N. Yasutomi, and A. Kitoh. 2012. “APHRODITE: Constructing a long-term daily gridded precipitation dataset for Asia based on a dense network of rain gauges.” Bulletin of the American Meteorological Society 93 (9): 1401–1415. https://doi.org/10.1175/BAMS-D-11-00122.1
- Zhou, J., Y. Liu, H. Guo, and D. He. 2014. “Combining the SWAT model with sequential uncertainty fitting algorithm for streamflow prediction and uncertainty analysis for the Lake Dianchi Basin, China.” Hydrological Processes 28 (3): 521–533. https://doi.org/10.1002/hyp.9605