TY - JOUR
T1 - A methodology for integrating time-lagged rainfall and river flow data into machine learning models to improve prediction of quality parameters of raw water supplying a treatment plant
AU - Ortiz-Lopez, Christian
AU - Torres, Andres
AU - Bouchard, Christian
AU - Rodriguez, Manuel
N1 - Publisher Copyright: © 2023 The Authors.
PY - 2023/11/1
Y1 - 2023/11/1
AB - Rainfall and increased river flow can deteriorate raw water (RW) quality parameters such as turbidity and UV absorbance at 254 nm. This study aims to develop a methodology for integrating both time-lagged watershed rainfall and river flow data into machine learning models of the quality of RW supplying a drinking water treatment plant (DWTP). Spearman’s rank non-parametric cross-correlation analyses were performed using both river flow and rain in the watershed and RW data from the water intake. Then, RW turbidity and RW UV254 were modelled, using a support vector regression (SVR) and an artificial neural network (ANN) under several prediction scenarios with time-lagged variables. River flow presented a very strong correlation with RW quality, whereas rainfall showed a moderate correlation. Time lags with maximum correlations between flow data and turbidity were a few hours, while for UV254, they were between 2 and 4 days, demonstrating varied time lags and a complex behaviour. The best performing scenario was the one that used time-lagged watershed rainfall and river flow as input data. The ANN performed better for both turbidity and UV254 than SVR. Results from this study suggest the possibility for new modelling strategies and more accurate chemical dosing for the removal of key contaminants.
KW - climatic and hydrological events
KW - cross-correlation coefficient
KW - raw water modelling
KW - source water quality
KW - time-lagged correlations
UR - http://www.scopus.com/inward/record.url?scp=85179096306&partnerID=8YFLogxK
UR - https://www.mendeley.com/catalogue/836aa2b1-1135-340e-b751-0b892f9d1514/
U2 - 10.2166/hydro.2023.122
DO - 10.2166/hydro.2023.122
M3 - Article
AN - SCOPUS:85179096306
SN - 1464-7141
VL - 25
SP - 2406
EP - 2426
JO - Journal of Hydroinformatics
JF - Journal of Hydroinformatics
IS - 6
ER -