A methodology for integrating time-lagged rainfall and river flow data into machine learning models to improve prediction of quality parameters of raw water supplying a treatment plant

Christian Ortiz-Lopez, Andres Torres, Christian Bouchard, Manuel Rodriguez

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

Rainfall and increased river flow can deteriorate raw water (RW) quality parameters such as turbidity and UV absorbance at 254 nm (UV254). This study develops a methodology for integrating time-lagged watershed rainfall and river flow data into machine learning models of the quality of RW supplying a drinking water treatment plant (DWTP). Spearman’s rank (non-parametric) cross-correlation analyses were performed between watershed rainfall and river flow on the one hand and RW quality data from the water intake on the other. RW turbidity and UV254 were then modelled using support vector regression (SVR) and an artificial neural network (ANN) under several prediction scenarios with time-lagged variables. River flow showed a very strong correlation with RW quality, whereas rainfall showed a moderate correlation. The time lags giving maximum correlation between flow and turbidity were a few hours, whereas for UV254 they were between 2 and 4 days, indicating that lags differ by parameter and that the response is complex. The best-performing scenario used both time-lagged watershed rainfall and river flow as input data. The ANN outperformed the SVR for both turbidity and UV254. These results suggest possibilities for new modelling strategies and more accurate chemical dosing for the removal of key contaminants.
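The two steps described in the abstract, a lagged Spearman cross-correlation followed by the assembly of time-lagged model inputs, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`lagged_spearman`, `best_lag`, `lagged_features`), the synthetic `flow`, `rain` and `turbidity` series, and the 3-step lag built into the toy data are all assumptions made for illustration. It uses only the Python standard library, and the SVR and ANN models themselves are not shown.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def lagged_spearman(driver, response, max_lag):
    """Correlate driver[t - lag] with response[t] for lag = 0..max_lag."""
    return {
        lag: spearman(driver[: len(driver) - lag], response[lag:])
        for lag in range(max_lag + 1)
    }


def best_lag(driver, response, max_lag):
    """Lag with the largest absolute rank correlation."""
    corr = lagged_spearman(driver, response, max_lag)
    return max(corr, key=lambda lag: abs(corr[lag]))


def lagged_features(drivers, lags, n):
    """Rows [driver_1[t - lag_1], driver_2[t - lag_2], ...] for t >= max(lags)."""
    start = max(lags)
    return [[d[t - lag] for d, lag in zip(drivers, lags)] for t in range(start, n)]


# Hypothetical toy series (arbitrary units, one value per time step).
flow = [float((7 * t) % 23 + 1) for t in range(40)]
rain = [float((5 * t) % 11) for t in range(40)]
# Assumed response: turbidity follows flow monotonically, 3 steps later.
turbidity = [0.0] * 3 + [f ** 2 for f in flow[:-3]]

flow_lag = best_lag(flow, turbidity, max_lag=10)
print("lag with maximum |rho| for flow vs turbidity:", flow_lag)

# Inputs for a regression model: rain lagged by an assumed 2 steps,
# flow lagged by the selected lag; targets are aligned to the same rows.
X = lagged_features([rain, flow], [2, flow_lag], len(flow))
y = turbidity[max(2, flow_lag):]
```

In this sketch the lag is chosen per driver by maximising the absolute rank correlation, and each input row pairs the response at time t with driver values from earlier time steps. The resulting `X` and `y` could then be passed to any regression model.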

Original language: English
Pages (from-to): 2406-2426
Number of pages: 21
Journal: Journal of Hydroinformatics
Volume: 25
Issue number: 6
State: Published - 01 Nov 2023

Keywords

  • climatic and hydrological events
  • cross-correlation coefficient
  • raw water modelling
  • source water quality
  • time-lagged correlations
