A Comprehensive Pytorch Framework to Benchmark CNN and ViT Models

Bangalore Vijayakumar, Shreyas

dc.contributor.author	Bangalore Vijayakumar, Shreyas
dc.contributor.department	Department of Electrical and Computer Engineering
dc.contributor.majorProfessor	Somani, Arun K.
dc.date.accessioned	2024-08-22T20:18:42Z
dc.date.available	2024-08-22T20:18:42Z
dc.date.copyright	2024
dc.date.issued	2024-08
dc.description.abstract	Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) have shown remarkable performance in computer vision tasks, including object detection and image recognition. These models have evolved significantly in architecture, efficiency, and versatility. Concurrently, deep learning frameworks have diversified, and differences between their versions often complicate reproducibility and unified benchmarking. We propose ConVision Benchmark, a comprehensive framework in PyTorch, to standardize the implementation and evaluation of state-of-the-art CNN and ViT models. This framework addresses common challenges such as version mismatches and inconsistent validation metrics. As a proof of concept, we perform an extensive benchmark analysis on a COVID-19 dataset, encompassing nearly 200 CNN and ViT models, among which DenseNet-161 and MaxViT-Tiny achieve exceptional accuracy, peaking at around 95%. Although we primarily used the COVID-19 dataset for image classification, the framework is adaptable to a variety of datasets, enhancing its applicability across different domains. Our methodology includes rigorous performance evaluations, highlighting metrics such as accuracy, precision, recall, F1 score, and computational efficiency (FLOPs, MACs, and CPU and GPU latency). The ConVision Benchmark facilitates a comprehensive understanding of model efficacy, aiding researchers in deploying high-performance models for diverse applications.
dc.identifier.doi	https://doi.org/10.31274/cc-20250502-119
dc.identifier.uri	https://dr.lib.iastate.edu/handle/20.500.12876/105857
dc.language.iso	en_US
dc.rights.holder	Shreyas Bangalore Vijayakumar
dc.subject.disciplines	DegreeDisciplines::Engineering::Computer Engineering
dc.subject.disciplines	DegreeDisciplines::Engineering::Electrical and Computer Engineering
dc.subject.keywords	convolutional neural networks
dc.subject.keywords	vision transformers
dc.subject.keywords	deep-learning framework
dc.subject.keywords	PyTorch
dc.subject.keywords	COVID-19
dc.subject.keywords	ConVision Benchmark
dc.title	A Comprehensive Pytorch Framework to Benchmark CNN and ViT Models
dc.type	creative component
dc.type.genre	creative component
dspace.entity.type	Publication
relation.isOrgUnitOfPublication	a75a044c-d11e-44cd-af4f-dab1d83339ff
thesis.degree.discipline	Computer Engineering
thesis.degree.level	Masters
thesis.degree.name	Master of Science

File

Original bundle

Name: Bangalore Vijayakumar, Sh CC 124.pdf
Size: 6.2 MB
Format: Adobe Portable Document Format

Collections

Creative Components