Abstract
Accurate classification of medical cases is crucial in disease diagnosis. Traditional machine learning techniques, such as ensemble learning with bagging, have shown promising results in this domain. However, the performance of these methods relies heavily on the quality of the bags, i.e., the instances selected for training the base classifiers. To overcome this drawback, we propose a novel hybrid strategy that optimizes bag composition by combining a genetic algorithm (GA) with teaching-learning-based optimization (TLBO). TLBO optimizes the composition of the bags by iteratively selecting the best bags according to fitness, and in its learning phase the worst-performing bag is improved through hybrid optimization. A dynamic bag size is used to create varied subsets, which reduces overfitting and enhances adaptability. A distinctive fitness function that balances accuracy and diversity is also proposed. A set of base classifiers is trained on the instances within the bags, and the ensemble accuracy is evaluated using a voting scheme. The proposed hybrid approach was evaluated on a real-world disease diagnosis dataset from the UCI repository, and its performance was compared with that of traditional bagging. The proposed approach outperforms the most advanced ensemble model, and statistical evidence indicates that it differs significantly from the baseline models.
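To make the idea of a fitness function that balances accuracy and diversity concrete, the following is a minimal illustrative sketch, not the formulation used in this study. It assumes a weighted sum of majority-vote accuracy and mean pairwise disagreement between base classifiers; the weight `alpha`, the disagreement measure, and all function names are hypothetical choices made only for illustration.

```python
from itertools import combinations


def majority_vote(predictions):
    """Combine per-classifier label lists into one ensemble label per sample."""
    ensemble = []
    for sample_votes in zip(*predictions):
        counts = {}
        for label in sample_votes:
            counts[label] = counts.get(label, 0) + 1
        # Ties go to the smallest label (an arbitrary, illustrative rule).
        ensemble.append(max(sorted(counts), key=lambda lab: counts[lab]))
    return ensemble


def accuracy(predicted, actual):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def pairwise_disagreement(predictions):
    """Mean fraction of samples on which two base classifiers disagree."""
    pairs = list(combinations(predictions, 2))
    if not pairs:
        return 0.0
    total = 0.0
    for first, second in pairs:
        total += sum(x != y for x, y in zip(first, second)) / len(first)
    return total / len(pairs)


def fitness(predictions, actual, alpha=0.7):
    """Assumed form: alpha * ensemble accuracy + (1 - alpha) * diversity."""
    ensemble_accuracy = accuracy(majority_vote(predictions), actual)
    diversity = pairwise_disagreement(predictions)
    return alpha * ensemble_accuracy + (1 - alpha) * diversity


if __name__ == "__main__":
    # Toy labels and predictions from three hypothetical base classifiers.
    y_true = [1, 0, 1, 1]
    preds = [[1, 0, 1, 0], [1, 1, 1, 1], [0, 0, 1, 1]]
    print(majority_vote(preds), fitness(preds, y_true))
```

Under this assumed form, a larger `alpha` favors bags whose classifiers vote accurately as a group, while a smaller `alpha` rewards bags whose classifiers make different errors.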