A study on eDNA-based species identification using machine learning for environmental monitoring and assessment
This study aims to build an eDNA analysis pipeline that applies artificial intelligence to environmental genome information for environmental monitoring and assessment.
Environmental genome information, including eDNA and eRNA metabarcoding and metagenome data, was identified in various water environments. Biological features were extracted from these data and analyzed with machine learning: unsupervised algorithms such as K-means, K-medians, and mean shift were applied in the OTU clustering step, and supervised algorithms such as Random Forest, XGBoost, and CatBoost were applied in the taxonomy assignment step. Within this scope, the research focused on building an eDNA analysis pipeline that applies artificial intelligence specifically to fish and aquatic insects, the indicator organisms most commonly used in environmental monitoring and assessment.
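As a rough illustration of the OTU clustering step described above, the following minimal sketch turns reads into k-mer frequency vectors and groups them with K-means. Everything in it is an assumption made for illustration and not the study's actual pipeline: the function names (`kmer_profile`, `kmeans`), the choice of 2-mer frequencies as features, the deterministic farthest-first initialization, and the toy reads.

```python
from itertools import product


def kmer_profile(seq, k=2):
    """Return normalized k-mer frequencies over the ACGT alphabet."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip k-mers containing ambiguous bases such as N
            counts[kmer] += 1
    total = sum(counts.values()) or 1  # avoid division by zero on short reads
    return [counts[km] / total for km in kmers]


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, n_clusters, n_iter=50):
    """Plain Lloyd's K-means; returns one cluster label per point."""
    # Farthest-first initialization (an assumption, chosen so the
    # sketch is deterministic): start from the first point, then keep
    # adding the point farthest from the centroids chosen so far.
    centroids = [list(points[0])]
    while len(centroids) < n_clusters:
        farthest = max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids),
        )
        centroids.append(list(farthest))

    labels = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(n_clusters),
                key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(n_clusters):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels


# Hypothetical toy reads: three AT-rich and three GC-rich sequences.
reads = [
    "ATATATATAT", "ATATTATATA", "TATATATAAT",
    "GCGCGCGCGC", "GCGGCGCGCC", "CGCGCGCCGC",
]
profiles = [kmer_profile(r) for r in reads]
labels = kmeans(profiles, n_clusters=2)
print(labels)  # the AT-rich and GC-rich reads fall into separate clusters
```

In practice, clustering of this kind would run on quality-filtered, dereplicated amplicon reads with a larger k, and the resulting cluster representatives would then be passed to a supervised classifier for taxonomy assignment.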
The results of this research are expected to be used by the National Institute of Environmental Science, the National Institute of Biological Resources, the National Institute of Ecology, the Maritime Environment Management Corporation, and the National Park Services to monitor and assess various environments, including aquatic ecosystems, more efficiently.