COMPARATIVE ANALYSIS OF MISSING DATA IMPUTATION METHODS FOR FLOOD FEATURES FROM LANGAT RIVER IN SELANGOR, MALAYSIA

Authors

  • Ainaa Hanis Zuhairi Malaysia-Japan International Institute of Technology, Universiti Teknologi Malaysia, 54100 Kuala Lumpur, Malaysia
  • Fitri Yakub Malaysia-Japan International Institute of Technology, Universiti Teknologi Malaysia, 54100 Kuala Lumpur, Malaysia
  • Aizul Nahar Harun Malaysia-Japan International Institute of Technology, Universiti Teknologi Malaysia, 54100 Kuala Lumpur, Malaysia
  • Mas Omar Malaysia-Japan International Institute of Technology, Universiti Teknologi Malaysia, 54100 Kuala Lumpur, Malaysia
  • Muhamad Sharifuddin Malaysia-Japan International Institute of Technology, Universiti Teknologi Malaysia, 54100 Kuala Lumpur, Malaysia
  • Amrul Faruq Department of Electrical Engineering, Universitas Muhammadiyah Malang, 65144 Kota Malang, Indonesia
  • Vijay Sinha Department of Computer Science and Engineering, Chitkara University, 174103 Himachal Pradesh, India
  • Khamarrul Azahari Razak Malaysia-Japan International Institute of Technology, Universiti Teknologi Malaysia, 54100 Kuala Lumpur, Malaysia
  • Shahrum Shah Abdullah Malaysia-Japan International Institute of Technology, Universiti Teknologi Malaysia, 54100 Kuala Lumpur, Malaysia

DOI:

https://doi.org/10.22452/mjcs.vol38no3.2

Keywords:

Flood Forecast, Missing Data Imputation, Last Observation Carried Forward, Next Observation Carried Backward, Linear Interpolation, Cubic Spline Interpolation, K-Nearest Neighbours

Abstract

Flooding poses serious risks to lives, infrastructure, and ecosystems, underscoring the need for accurate forecasting. However, missing values in hydrological datasets, often caused by equipment failure or extreme weather, can compromise forecast reliability. This study evaluates five imputation techniques (Last Observation Carried Forward, Next Observation Carried Backward, Linear Interpolation, Spline Interpolation, and K-Nearest Neighbours) to identify the most effective method for reconstructing missing flood-related data. Using temperature, humidity, and water level records from the Langat River, Selangor, Malaysia, each method's performance was assessed via Root Mean Square Error (RMSE). Results show that Linear Interpolation generally yields the lowest error, while Next Observation Carried Backward performs best when the proportion of missing data is minimal (1.20%).
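
As a rough illustration of the comparison described in the abstract, the sketch below applies four of the five named methods to a short series with artificial gaps and scores each by Root Mean Square Error on the hidden values. It is not the authors' implementation. The water-level numbers are invented for the example and are not Langat River data, the function names are hypothetical, and the K-Nearest Neighbours variant assumes that "nearest" means nearest in time index, which the abstract does not specify. Spline interpolation is left out to keep the sketch free of external libraries.

```python
import math


def locf(values):
    """Last Observation Carried Forward: reuse the most recent observed value."""
    out, last = [], None
    for v in values:
        if v is not None:
            last = v
        out.append(last)  # leading gaps stay None (nothing to carry forward)
    return out


def nocb(values):
    """Next Observation Carried Backward: LOCF applied to the reversed series."""
    return locf(values[::-1])[::-1]


def linear(values):
    """Linear interpolation between the observed points on either side of a gap."""
    out = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for a, b in zip(known, known[1:]):
        for i in range(a + 1, b):
            out[i] = values[a] + (values[b] - values[a]) * (i - a) / (b - a)
    return out  # gaps before the first or after the last observation stay None


def knn(values, k=2):
    """Mean of the k observed values closest in time index (assumed distance)."""
    out = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for i, v in enumerate(values):
        if v is None:
            nearest = sorted(known, key=lambda j: abs(j - i))[:k]
            out[i] = sum(values[j] for j in nearest) / len(nearest)
    return out


def rmse(truth, filled, gaps):
    """Root Mean Square Error evaluated only at the artificially removed positions."""
    return math.sqrt(sum((truth[i] - filled[i]) ** 2 for i in gaps) / len(gaps))


# Invented water levels in metres (illustrative only, not measured data).
truth = [2.0, 2.2, 2.6, 3.0, 2.8, 2.4, 2.1, 2.0]
gaps = [2, 5]  # positions hidden to simulate sensor dropouts
observed = [None if i in gaps else v for i, v in enumerate(truth)]

methods = {"LOCF": locf, "NOCB": nocb, "Linear": linear, "KNN": knn}
scores = {name: rmse(truth, fn(observed), gaps) for name, fn in methods.items()}

for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: RMSE = {score:.4f}")
```

Scoring only the hidden positions keeps the error from being diluted by points that were never missing. Which method ranks best on a toy series like this says nothing about the study's findings; those depend on the real records and the actual proportion of missing data.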

Published

2025-09-30

How to Cite

Zuhairi, A. H., Yakub, F., Harun, A. N., Omar, M., Sharifuddin, M., Faruq, A., Sinha, V., Razak, K. A., & Abdullah, S. S. (2025). COMPARATIVE ANALYSIS OF MISSING DATA IMPUTATION METHODS FOR FLOOD FEATURES FROM LANGAT RIVER IN SELANGOR, MALAYSIA. Malaysian Journal of Computer Science, 38(3), 224–237. https://doi.org/10.22452/mjcs.vol38no3.2