An Ensemble Learning Approach to Predict Iris Species Using Random Forest and XGBoost
Authors:
Nithin Reddy Gadicharla (Elan Technologies)
Sharath Chandra Macha (Independent Researcher, USA)
Abstract

Classifying plant species is vital in botanical research, ecological monitoring, and farming practice. This work classifies Iris flower species with two ensemble machine learning techniques, Random Forest and Extreme Gradient Boosting (XGBoost), with the aim of improving predictive accuracy and robustness over traditional classifiers such as logistic regression and decision trees. We use the well-known Iris dataset, which records sepal length, sepal width, petal length, and petal width, to build and test the models. Our approach involves data preprocessing, exploratory data analysis (EDA), and feature importance analysis to determine the most distinguishing floral attributes. We assess the models with multiple performance metrics (accuracy, precision, recall, F1-score, and the confusion matrix) so that the evaluation covers per-class behaviour and not just overall correctness. Experimental outcomes show that the ensemble models improve classification accuracy and generalise better and more stably than single classifiers. Visualisations of feature importance also indicate which physical traits matter most in distinguishing between species. This work demonstrates the power of ensemble learning on botanical data and proposes its applicability to more complicated plant classification problems, supporting the integration of machine learning into contemporary plant science and agricultural informatics.

Index Terms: Iris dataset, ensemble learning, Random Forest, XGBoost, classification, feature importance, supervised machine learning, botanical datasets, predictive modeling, species identification.
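
The workflow summarised in the abstract (train two ensemble models, score them with several metrics, and inspect feature importance) might be sketched as follows. This is an illustrative sketch only, not the authors' code: the 70/30 stratified split, the seed, the number of trees, and the helper names `make_booster` and `evaluate_ensembles` are all assumptions. If the `xgboost` package is not installed, the sketch falls back to scikit-learn's `GradientBoostingClassifier`, which is a different gradient boosting implementation and only a stand-in for XGBoost.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

try:
    from xgboost import XGBClassifier

    def make_booster(seed):
        # XGBoost, as named in the abstract (hyperparameters are assumed).
        return XGBClassifier(n_estimators=100, random_state=seed)
except ImportError:
    def make_booster(seed):
        # Stand-in when xgboost is unavailable; NOT the same algorithm.
        return GradientBoostingClassifier(n_estimators=100, random_state=seed)


def evaluate_ensembles(seed=42, test_size=0.3):
    """Fit both ensembles on Iris and collect the metrics listed in the abstract."""
    iris = load_iris()
    # Stratified split keeps the three species balanced in train and test sets.
    X_train, X_test, y_train, y_test = train_test_split(
        iris.data, iris.target,
        test_size=test_size, stratify=iris.target, random_state=seed,
    )
    models = {
        "random_forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        "boosting": make_booster(seed),
    }
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        y_pred = model.predict(X_test)
        results[name] = {
            "accuracy": accuracy_score(y_test, y_pred),
            # Per-class precision, recall, and F1-score as a nested dict.
            "report": classification_report(
                y_test, y_pred, target_names=iris.target_names, output_dict=True
            ),
            "confusion_matrix": confusion_matrix(y_test, y_pred),
            # Importance of each of the four sepal/petal measurements.
            "feature_importance": dict(
                zip(iris.feature_names, model.feature_importances_)
            ),
        }
    return results


if __name__ == "__main__":
    for name, res in evaluate_ensembles().items():
        print(name, "accuracy:", round(res["accuracy"], 3))
        print(res["confusion_matrix"])
        print(res["feature_importance"])
```

Reporting the confusion matrix and per-class scores alongside accuracy makes it visible which species pairs, if any, a model confuses, which a single accuracy figure would hide.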

Published in: NCAIDT 2025 Proceedings
DOI: 10.63169/NCAIDT2025.p10
Paper ID: NCAIDT2025-0465