A Comprehensive Pytorch Framework to Benchmark CNN and ViT Models

dc.contributor.author Bangalore Vijayakumar, Shreyas
dc.contributor.department Department of Electrical and Computer Engineering
dc.contributor.majorProfessor Somani, Arun K.
dc.date.accessioned 2024-08-22T20:18:42Z
dc.date.available 2024-08-22T20:18:42Z
dc.date.copyright 2024
dc.date.issued 2024-08
dc.description.abstract Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) have shown remarkable performance in computer vision tasks, including object detection and image recognition. These models have evolved significantly in architecture, efficiency, and versatility. Concurrently, deep learning frameworks have diversified, and differences between their versions often complicate reproducibility and unified benchmarking. We propose ConVision Benchmark, a comprehensive PyTorch framework that standardizes the implementation and evaluation of state-of-the-art CNN and ViT models. The framework addresses common challenges such as version mismatches and inconsistent validation metrics. As a proof of concept, we perform an extensive benchmark analysis on a COVID-19 dataset, encompassing nearly 200 CNN and ViT models, among which DenseNet-161 and MaxViT-Tiny achieve exceptional accuracy, peaking at around 95%. Although we primarily used the COVID-19 dataset for image classification, the framework is adaptable to a variety of datasets, which broadens its applicability across domains. Our methodology includes rigorous performance evaluations, highlighting metrics such as accuracy, precision, recall, F1 score, and computational efficiency (FLOPs, MACs, and CPU and GPU latency). The ConVision Benchmark facilitates a comprehensive understanding of model efficacy, aiding researchers in deploying high-performance models for diverse applications.
dc.identifier.doi https://doi.org/10.31274/cc-20250502-119
dc.identifier.uri https://dr.lib.iastate.edu/handle/20.500.12876/105857
dc.language.iso en_US
dc.rights.holder Shreyas Bangalore Vijayakumar
dc.subject.disciplines DegreeDisciplines::Engineering::Computer Engineering
dc.subject.disciplines DegreeDisciplines::Engineering::Electrical and Computer Engineering
dc.subject.keywords convolutional neural networks
dc.subject.keywords vision transformers
dc.subject.keywords deep-learning framework
dc.subject.keywords PyTorch
dc.subject.keywords COVID-19
dc.subject.keywords ConVision Benchmark
dc.title A Comprehensive Pytorch Framework to Benchmark CNN and ViT Models
dc.type creative component
dc.type.genre creative component
dspace.entity.type Publication
relation.isOrgUnitOfPublication a75a044c-d11e-44cd-af4f-dab1d83339ff
thesis.degree.discipline Computer Engineering
thesis.degree.level Masters
thesis.degree.name Master of Science
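The abstract lists accuracy, precision, recall, F1 score, and latency among the reported metrics. The following is a minimal sketch of how such metrics could be computed from predicted and true class labels. It is not the ConVision Benchmark implementation: the function names `classification_metrics` and `mean_latency_ms` are hypothetical, and macro averaging over classes is an assumption, since the record does not state which averaging the framework uses.

```python
import time
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Macro averaging is an assumption made for this sketch.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but the sample was not p
            fn[t] += 1  # true class t was missed

    def safe_div(num, den):
        return num / den if den else 0.0

    precisions, recalls, f1s = [], [], []
    for c in labels:
        prec = safe_div(tp[c], tp[c] + fp[c])
        rec = safe_div(tp[c], tp[c] + fn[c])
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(safe_div(2 * prec * rec, prec + rec))

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


def mean_latency_ms(fn, warmup=3, runs=10):
    """Average wall-clock time of fn() in milliseconds after warm-up calls."""
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) * 1000.0 / runs


if __name__ == "__main__":
    # Toy labels for illustration only; these are not results from the thesis.
    print(classification_metrics([0, 0, 1, 1], [0, 1, 1, 1]))
    print(mean_latency_ms(lambda: sum(range(1000))))
```

In a PyTorch setting, the predicted labels would typically come from the argmax of the model outputs. For GPU latency, the device should be synchronized (for example with `torch.cuda.synchronize()`) before each clock reading, because CUDA kernels run asynchronously.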
File
Original bundle
Name: Bangalore Vijayakumar, Sh CC 124.pdf
Size: 6.2 MB
Format: Adobe Portable Document Format