"Empirical Analysis of Transformer Models and Pretrained Convolutional Neural Networks for Medicinal Plant Identification and Classification"


Sheetal S. Patil, Suhas H. Patil, Avinash M. Pawar, Gauri R. Rao, Rohini B. Jadhav, Dharmesh Dhabliya

Abstract

This paper explores the classification of medicinal plants using computer vision, focusing on the effectiveness of pretrained models and vision transformers. The study compares two classification approaches: transfer learning using pretrained models and vision transformers, which leverage self-attention mechanisms. Using a dataset comprising leaf images of 10 distinct medicinal plant species, we conducted an empirical analysis in which the data were split into training and testing sets at an 80:20 ratio. Preprocessing techniques, including class balancing and data augmentation (small adjustments to the images, such as cropping and adding noise), were applied to optimize the dataset.
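As a minimal sketch of the 80:20 split and the augmentations described above (the paper itself does not publish code), the following Keras-style pipeline is one plausible setup; the dataset directory name, image size, crop size, noise level, and seed are illustrative assumptions, not values from the study.

```python
import tensorflow as tf

# Hypothetical dataset folder with 10 class subfolders (one per plant species)
DATA_DIR = "medicinal_leaves/"

# 80:20 train/test split via Keras' built-in validation_split
train_ds = tf.keras.utils.image_dataset_from_directory(
    DATA_DIR, validation_split=0.2, subset="training", seed=42,
    image_size=(224, 224), batch_size=32,
)
test_ds = tf.keras.utils.image_dataset_from_directory(
    DATA_DIR, validation_split=0.2, subset="validation", seed=42,
    image_size=(224, 224), batch_size=32,
)

# Data augmentation: small adjustments such as random cropping and added noise
augment = tf.keras.Sequential([
    tf.keras.layers.RandomCrop(200, 200),     # illustrative crop size
    tf.keras.layers.Resizing(224, 224),
    tf.keras.layers.GaussianNoise(0.05),      # illustrative noise level
])
train_ds = train_ds.map(lambda x, y: (augment(x, training=True), y))
```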


EfficientNetB7, a pretrained Convolutional Neural Network (CNN), was employed and achieved a test accuracy of 98.25% with a learning rate of 1e-5. In contrast, the ViT (Vision Transformer) model outperformed the pretrained CNN, reaching a test accuracy of 98.80% with a learning rate of 1e-4. Despite its higher accuracy, the ViT model required greater computational resources and longer training times.
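The transfer-learning baseline can be sketched as follows; only the backbone (EfficientNetB7), the 10-class output, and the learning rate of 1e-5 come from the abstract, while the head layers, frozen backbone, loss, and optimizer are assumptions about a typical setup rather than the authors' exact configuration.

```python
import tensorflow as tf

# Pretrained ImageNet backbone, used as a frozen feature extractor
base = tf.keras.applications.EfficientNetB7(
    include_top=False, weights="imagenet", input_shape=(224, 224, 3)
)
base.trainable = False

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10, activation="softmax"),  # 10 medicinal plant species
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),  # rate reported for EfficientNetB7
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
# model.fit(train_ds, validation_data=test_ds, epochs=...)
```

A ViT model would be trained analogously but with the higher learning rate of 1e-4 reported above, at the cost of more compute and longer training time.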
