A Comprehensive Pytorch Framework to Benchmark CNN and ViT Models
dc.contributor.author | Bangalore Vijayakumar, Shreyas
dc.contributor.department | Department of Electrical and Computer Engineering
dc.contributor.majorProfessor | Somani, Arun K.
dc.date.accessioned | 2024-08-22T20:18:42Z
dc.date.available | 2024-08-22T20:18:42Z
dc.date.copyright | 2024
dc.date.issued | 2024-08
dc.description.abstract | Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) have shown remarkable performance in computer vision tasks, including object detection and image recognition. These models have evolved significantly in architecture, efficiency, and versatility. Concurrently, deep learning frameworks have diversified, and differences between their versions often complicate reproducibility and unified benchmarking. We propose ConVision Benchmark, a comprehensive PyTorch framework that standardizes the implementation and evaluation of state-of-the-art CNN and ViT models. The framework addresses common challenges such as version mismatches and inconsistent validation metrics. As a proof of concept, we perform an extensive benchmark analysis on a COVID-19 dataset covering nearly 200 CNN and ViT models, among which DenseNet-161 and MaxViT-Tiny achieve exceptional accuracy, peaking at around 95%. Although we primarily used the COVID-19 dataset for image classification, the framework is adaptable to a variety of datasets, which broadens its applicability across domains. Our methodology includes rigorous performance evaluations that report accuracy, precision, recall, F1 score, and computational efficiency (FLOPs, MACs, and CPU and GPU latency). ConVision Benchmark gives researchers a comprehensive view of model efficacy and helps them deploy high-performance models for diverse applications.
dc.identifier.doi | https://doi.org/10.31274/cc-20250502-119
dc.identifier.uri | https://dr.lib.iastate.edu/handle/20.500.12876/105857
dc.language.iso | en_US
dc.rights.holder | Shreyas Bangalore Vijayakumar
dc.subject.disciplines | DegreeDisciplines::Engineering::Computer Engineering
dc.subject.disciplines | DegreeDisciplines::Engineering::Electrical and Computer Engineering
dc.subject.keywords | convolutional neural networks
dc.subject.keywords | vision transformers
dc.subject.keywords | deep-learning framework
dc.subject.keywords | PyTorch
dc.subject.keywords | COVID-19
dc.subject.keywords | ConVision Benchmark
dc.title | A Comprehensive Pytorch Framework to Benchmark CNN and ViT Models
dc.type | creative component
dc.type.genre | creative component
dspace.entity.type | Publication
relation.isOrgUnitOfPublication | a75a044c-d11e-44cd-af4f-dab1d83339ff
thesis.degree.discipline | Computer Engineering
thesis.degree.level | Masters
thesis.degree.name | Master of Science |
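
The classification metrics named in the abstract (accuracy, precision, recall, F1 score) can be illustrated with a small sketch. This is not code from ConVision Benchmark: the function name `classification_metrics` is hypothetical, and macro averaging over classes is an assumption, since the abstract does not say how per-class scores are combined.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Hypothetical helper for illustration only; macro averaging is an
    assumption, not something stated in the record above.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    classes = set(y_true) | set(y_pred)
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class was wrong
            fn[truth] += 1   # true class was missed

    precisions, recalls, f1s = [], [], []
    for c in classes:
        # Guard against division by zero when a class is never predicted or never present.
        prec = tp[c] / (tp[c] + fp[c]) if (tp[c] + fp[c]) else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if (tp[c] + fn[c]) else 0.0
        f1 = 2 * prec * rec / (prec + rec) if (prec + rec) else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n = len(classes)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }
```

Calling `classification_metrics([0, 0, 1, 1], [0, 1, 1, 1])` scores three of four predictions as correct, so the accuracy is 0.75; the other values depend on the averaging assumption noted above.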
File
Original bundle
- Name: Bangalore Vijayakumar, Sh CC 124.pdf
- Size: 6.2 MB
- Format: Adobe Portable Document Format