A Predictive Model for Early Asthma Detection
Abstract
Asthma is a respiratory disease that affects millions of people and has become one of the major causes of death worldwide. Early prediction of asthma can help health workers take the necessary precautions to prevent further complications. Traditional approaches to asthma prediction are error-prone and time-consuming. Recent studies have applied machine learning and deep learning to asthma prediction with promising results. However, previous research did not consider datasets from different demographics, which limits the resulting models to a particular population, and it did not include a thorough comparative analysis of ensemble, machine learning, and deep learning models. This research addresses these gaps by developing a comparative multi-source predictive framework built on four algorithms: Random Forest (RF), Support Vector Machine (SVM), Multi-Layer Perceptron (MLP), and a hybrid stacking ensemble (SVM + RF + MLP). The data were collected from Federal Teaching Hospital Lokoja, Specialist Hospital Lokoja, and Kaggle. The dataset was divided with an 80/20 stratified train/test split, and low-variance features were removed using a variance threshold of 0.01. The training set was balanced using SMOTETomek. The three base models (RF, SVM, MLP) were tuned with GridSearchCV hyperparameter optimization (5-fold cross-validation), and the best-performing configurations were then combined in a stacking ensemble (SVM + RF + MLP) with a LogisticRegression meta-learner, leveraging kernel-based, tree-based, and neural network models for improved predictive performance. Model performance was compared using Accuracy, Recall, F1-Score, and AUC-ROC. The hybrid stacking ensemble achieved the highest AUC-ROC (0.9910) while maintaining 99.23% accuracy and an F1-score of 0.9920.
The RF and ensemble models also identified the most important predictors, that is, the variables that best distinguish asthmatic from non-asthmatic patients. The study demonstrates that integrating heterogeneous datasets improves predictive robustness and provides a strong foundation for real-world asthma detection systems.
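The workflow summarized in the abstract can be sketched with scikit-learn and imbalanced-learn. This is a minimal illustration under stated assumptions, not the study's code: the data are synthetic (`make_classification`), the hyperparameter grids, the `roc_auc` tuning score, and the `StandardScaler` step in front of SVM and MLP are all hypothetical choices, and the SMOTETomek step is skipped if imbalanced-learn is not installed.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.feature_selection import VarianceThreshold
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, recall_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic, imbalanced stand-in for the hospital and Kaggle records (assumption).
X, y = make_classification(n_samples=400, n_features=10, n_informative=5,
                           weights=[0.8, 0.2], random_state=0)
# Append a constant column so the variance filter has something to remove.
X = np.hstack([X, np.ones((X.shape[0], 1))])

# 80/20 stratified train/test split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Drop features whose training-set variance is at or below 0.01.
selector = VarianceThreshold(threshold=0.01)
X_train = selector.fit_transform(X_train)
X_test = selector.transform(X_test)

# Balance the training set only; the test set keeps its original distribution.
try:
    from imblearn.combine import SMOTETomek
    X_train, y_train = SMOTETomek(random_state=0).fit_resample(X_train, y_train)
except ImportError:
    pass  # imbalanced-learn not installed: continue without resampling

# Hypothetical (deliberately tiny) grids; the study's grids are not given.
search_space = {
    "rf": (RandomForestClassifier(random_state=0),
           {"n_estimators": [50, 100]}),
    "svm": (make_pipeline(StandardScaler(), SVC(random_state=0)),
            {"svc__C": [0.1, 1.0]}),
    "mlp": (make_pipeline(StandardScaler(),
                          MLPClassifier(max_iter=500, random_state=0)),
            {"mlpclassifier__hidden_layer_sizes": [(16,), (32,)]}),
}

# Tune each base model with 5-fold cross-validated grid search.
tuned = []
for name, (estimator, grid) in search_space.items():
    search = GridSearchCV(estimator, grid, cv=5, scoring="roc_auc")
    search.fit(X_train, y_train)
    tuned.append((name, search.best_estimator_))

# Stack the tuned models under a logistic-regression meta-learner.
stack = StackingClassifier(estimators=tuned,
                           final_estimator=LogisticRegression(), cv=5)
stack.fit(X_train, y_train)

# Evaluate on the held-out test set with the four reported metrics.
pred = stack.predict(X_test)
proba = stack.predict_proba(X_test)[:, 1]
metrics = {
    "accuracy": accuracy_score(y_test, pred),
    "recall": recall_score(y_test, pred),
    "f1": f1_score(y_test, pred),
    "auc_roc": roc_auc_score(y_test, proba),
}
print(metrics)
```

Resampling after the split, and only on the training portion, keeps synthetic minority samples out of the test set, so the reported metrics reflect the original class distribution. Scores from this sketch describe the synthetic data only and say nothing about the figures reported in the study.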
Copyright (c) 2026 Sabdat Ahmed

This work is licensed under a Creative Commons Attribution 4.0 International License.
