Enhanced Skin Cancer Classification with an Attention-CNN-Transformer Model
Abstract
Skin cancer is one of the largest health threats globally, and it requires easy-to-use and dependable diagnostic tools to ensure timely detection. This paper introduces a hybrid Attention-CNN-Transformer model for enhanced classification of skin lesion images. The proposed model addresses the limitations of earlier approaches by combining the feature extraction strength of CNNs with the global contextual modelling ability of transformers and attention mechanisms.
Index Terms: Image segmentation, HAM10000 dataset, malignant melanoma, basal cell carcinoma.
1 Introduction
The experiments were conducted on the HAM10000 dataset, which comprises more than 10,000 dermatoscopic images in seven classes, including melanoma and basal cell carcinoma. The dataset was split into training (70%), validation (20%), and test (10%) sets for robust evaluation of model performance. The proposed hybrid model achieved a prediction accuracy of 92.4%, higher than recent hybrid models (R. Sharma et al.: 88.3%; A. Shrestha et al.: 89.7%). The model also achieved a high area under the curve (AUC) of 0.95 for critical classes, indicating excellent discriminatory power, compared with AUCs of 0.90 to 0.93 reported in other studies. Performance was balanced across all classes, with a macro-averaged F1 score of 0.90. Grad-CAM visualizations confirmed that the attention mechanism focuses on the lesion regions of the skin images, which in turn improves the interpretability of the model, a prerequisite for its deployment in clinical healthcare settings. Finally, a comparative evaluation against standard CNNs and transfer learning frameworks, namely VGG16 and InceptionV3, shows that the proposed model outperforms them in accuracy, recall, and precision.
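As a rough illustration of the 70/20/10 split and the macro-averaged F1 score mentioned above, the following sketch uses only the Python standard library. The source does not say whether the split was stratified by class or which tooling was used, so the stratification, the function names `stratified_split` and `macro_f1`, the seed, and the toy class counts are all assumptions made for this example.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.7, 0.2, 0.1), seed=0):
    """Return (train, val, test) index lists that keep class proportions per split."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(fractions[0] * len(indices))
        n_val = round(fractions[1] * len(indices))
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores, so rare classes count equally."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels standing in for HAM10000 diagnoses (class counts are illustrative only).
labels = ["nv"] * 70 + ["mel"] * 20 + ["bcc"] * 10
train_idx, val_idx, test_idx = stratified_split(labels)
```

Because the macro average weights every class equally, a model that neglects a rare class such as melanoma is penalized even when overall accuracy stays high, which is why the metric is informative on an imbalanced dataset.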
The attention mechanism in the proposed system was crucial for focusing the model on important features, while the transformer layers captured contextual dependencies. Together, these components increased the reliability and robustness of the model in classifying skin conditions, making it suitable for clinical applications.
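To make the interplay between CNN features and attention more concrete, here is a minimal NumPy sketch of one common way to pass a convolutional feature map through scaled dot-product self-attention. It is not the authors' architecture, which the text above does not specify: the single attention head, the random projection weights, the tensor sizes, and the name `self_attention_over_feature_map` are all assumptions chosen for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention_over_feature_map(feature_map, w_q, w_k, w_v):
    """feature_map: (C, H, W) CNN output. Returns (pooled (D,), attn (H*W, H*W))."""
    c, h, w = feature_map.shape
    # Flatten the spatial grid: each of the H*W positions becomes one token of size C.
    tokens = feature_map.reshape(c, h * w).T
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    # Every position attends to every other position, which supplies global context.
    scores = q @ k.T / np.sqrt(k.shape[-1])
    attn = softmax(scores, axis=-1)
    context = attn @ v             # (H*W, D) context-aware token features
    pooled = context.mean(axis=0)  # simple average pooling before a classifier head
    return pooled, attn


# Hypothetical sizes: 8 channels on a 4x4 grid, projected to 16 dimensions.
rng = np.random.default_rng(0)
C, H, W, D = 8, 4, 4, 16
feature_map = rng.normal(size=(C, H, W))
w_q, w_k, w_v = (rng.normal(size=(C, D)) * 0.1 for _ in range(3))
pooled, attn = self_attention_over_feature_map(feature_map, w_q, w_k, w_v)
```

Each row of `attn` is a probability distribution over spatial positions, so it can be reshaped to the 4x4 grid and inspected as a coarse map of where the model looks, loosely analogous to the role Grad-CAM plays in the evaluation described above.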