Water Potability Prediction Using Machine Learning
· 2023
· Open Access
· DOI: https://doi.org/10.21203/rs.3.rs-2965961/v1
· OA: W4378221105
Water is indispensable for sustaining human life, and maintaining its quality is essential to human well-being. Contaminated drinking water poses severe health risks, including diarrhea, cholera, and other waterborne diseases, so ensuring safe and clean water is central to public health. Recent findings indicate that approximately 3,575,000 people die each year from water-related illnesses. Accurate prediction of water potability therefore has the potential to substantially reduce the incidence of such diseases. Machine learning algorithms have emerged as powerful tools for predicting water quality, enabling timely and precise monitoring of water resources. This research applies multiple algorithms to predict water potability from the physicochemical properties of water samples in the Drinking Water dataset available on Kaggle. The dataset comprises nine parameters: pH, hardness, solids, chloramines, sulfates, trihalomethanes, organic carbon, conductivity, and turbidity. We use Random Forest, Logistic Regression, SVM, XGBoost, and KNN to determine the potability of drinking water. The XGBoost algorithm outperforms the traditional ML models, achieving an accuracy of 99.5%, precision of 0.99, sensitivity of 0.99, specificity of 1.0, and F1 score of 0.99. The Random Forest algorithm reaches an accuracy of 74%. This research can therefore provide reliable water quality data to researchers, water management personnel, and policymakers, enhancing the effectiveness of water potability monitoring.
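The five evaluation measures reported above (accuracy, precision, sensitivity, specificity, and F1 score) all derive from the confusion matrix of a binary classifier. The following is a minimal sketch of those standard definitions, not the authors' code. It assumes that label 1 means "potable" and label 0 means "not potable"; the names `ConfusionCounts`, `confusion_counts`, and `classification_metrics` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary classifier (hypothetical helper type)."""
    tp: int  # potable samples predicted potable
    fp: int  # non-potable samples predicted potable
    tn: int  # non-potable samples predicted non-potable
    fn: int  # potable samples predicted non-potable


def confusion_counts(y_true, y_pred, positive=1):
    """Tally the confusion matrix; `positive=1` (potable) is an assumption."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def _ratio(numerator, denominator):
    # Return 0.0 instead of raising when a class is absent from the data.
    return numerator / denominator if denominator else 0.0


def classification_metrics(c):
    """Compute the five measures named in the abstract from the counts."""
    precision = _ratio(c.tp, c.tp + c.fp)
    sensitivity = _ratio(c.tp, c.tp + c.fn)  # also called recall
    specificity = _ratio(c.tn, c.tn + c.fp)
    accuracy = _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn)
    f1 = _ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }
```

Calling `classification_metrics(confusion_counts(y_true, y_pred))` on held-out labels and predictions returns all five values in one dictionary. Reporting sensitivity and specificity alongside accuracy matters when the two classes are unevenly represented, because accuracy alone can look high for a model that rarely predicts the minority class.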