A Study on AI-Based Classification and Prediction of Environmental Factors Using Transcriptome Data
This study aims to evaluate the effects of environmental pollutants, such as heavy metals and microplastics, on aquatic ecosystems. To this end, machine learning techniques are applied to transcriptome data from aquatic model organisms such as water fleas, in order to rapidly identify and predict the environmental factors to which the organisms have been exposed.
Transcriptome data for the water flea (Daphnia magna), a standard test organism designated by the OECD for toxicity assessment, were collected from open-source databases, covering environmental factors that may cause problems in aquatic ecosystems. The collected data were quality-checked and trimmed, and genes differentially expressed in response to each environmental factor were identified. Feature selection was then performed to choose the differentially expressed genes that can accurately identify environmental factors, and the selected features were used to train several machine learning algorithms, including Random Forest and XGBoost, to build an optimized model for identifying and predicting environmental factors.
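The feature-selection and classification steps can be illustrated with a minimal sketch. It assumes scikit-learn and NumPy, uses a synthetic expression matrix in place of real Daphnia magna data, and substitutes a univariate F-test (SelectKBest) for the study's differential expression analysis; the condition labels, matrix sizes, and parameter values are hypothetical, and only Random Forest is shown. It is not the authors' pipeline.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline

# Synthetic stand-in for a normalized expression matrix (samples x genes).
# All sizes, labels, and effect sizes here are invented for illustration.
rng = np.random.default_rng(0)
conditions = ["control", "heavy_metal", "microplastic"]  # hypothetical labels
n_per_class, n_genes, n_marker = 20, 500, 10

X_parts, y_parts = [], []
for class_idx, name in enumerate(conditions):
    block = rng.normal(loc=5.0, scale=1.0, size=(n_per_class, n_genes))
    if class_idx > 0:
        # Shift a distinct set of "marker" genes upward for each exposure;
        # the control group is left unshifted.
        start = (class_idx - 1) * n_marker
        block[:, start:start + n_marker] += 3.0
    X_parts.append(block)
    y_parts.extend([name] * n_per_class)

X = np.vstack(X_parts)
y = np.array(y_parts)

# Feature selection sits inside the pipeline so it is refit on each
# training fold; selecting genes on the full data would leak information.
model = Pipeline([
    ("select", SelectKBest(score_func=f_classif, k=30)),
    ("classify", RandomForestClassifier(n_estimators=200, random_state=0)),
])

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
mean_accuracy = scores.mean()
print(f"mean cross-validated accuracy: {mean_accuracy:.2f}")

# Refit on all samples and list the column indices of the selected genes.
model.fit(X, y)
selected_genes = np.flatnonzero(model.named_steps["select"].get_support())
print("selected gene indices:", selected_genes)
```

Keeping the selector inside the cross-validated pipeline is a deliberate choice: with far more genes than samples, choosing features before splitting the data tends to inflate accuracy estimates. A gradient-boosted model could be swapped in for the "classify" step in the same way.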
The results of this research could be used in the field by organizations that monitor aquatic ecosystems, such as the Korea Environment Institute, the Nakdonggang National Institute of Biological Resources, the Water Environment Research Institute, and the National Institute of Environmental Research. The approach is expected to help assess the impact of various environmental pollutants on aquatic ecosystems and to help prevent such impacts.