Publication page

A transfer learning-based LSTM strategy for imputing large-scale consecutive missing data and its application in a water quality prediction system

Authors: Chen Z., Xu H., Jiang P., Yu S., Lin G., Bychkov I., Hmelnov A., Ruzhnikov G., Zhu N., Liu Z.

Journal: Journal of Hydrology

Volume: 602


Year: 2021

Reporting year: 2021


Publisher location:


Abstract: In recent years, water quality monitoring has been crucial to improving water resource protection and management. Under the relevant laws and regulations, environmental protection agencies monitor lakes, streams, rivers, and other types of water bodies to assess water quality conditions. The valid, high-quality data generated by these monitoring activities help water resource managers understand existing pollution situations, energy consumption problems, and pollution control needs. However, real-world water quality data inevitably suffer from many problems caused by human mistakes or system failures. One of the most frequent issues is missing data. Although most existing studies have explored classic statistical methods or emerging machine/deep learning methods to fill gaps in data, these methods are not suitable for large-scale consecutive missing data problems. To address this issue, this paper proposes a novel algorithm called TrAdaBoost-LSTM, which integrates state-of-the-art deep learning theory through long short-term memory (LSTM) and instance-based transfer learning through TrAdaBoost. The model inherits the advantages of both the LSTM model and the transfer learning technique, namely the powerful ability to capture long-term dependencies in time series and the flexibility to leverage related knowledge from complete datasets to fill in large-scale consecutive missing data. A case study involving dissolved oxygen concentrations obtained from water quality monitoring stations is conducted to validate the effectiveness and superiority of the proposed method. The results show that the proposed TrAdaBoost-LSTM model not only improves imputation accuracy by 15% to 25% compared with alternative models, based on the obtained performance indicators, but also provides potential ideas for similar future research.

Indexed in WoS: 1

Indexed in Scopus: 1

Indexed in the Russian Science Citation Index (РИНЦ): 1

Publication in press: 0

Added to the system by: