Analyzing redundancy in code-trained language models
dc.contributor.advisor | Jannesari, Ali | |
dc.contributor.advisor | Quinn, Christopher J | |
dc.contributor.advisor | Li, Yang | |
dc.contributor.author | Sharma, Arushi | |
dc.contributor.department | Department of Computer Science | |
dc.date.accessioned | 2025-02-11T17:33:05Z | |
dc.date.available | 2025-02-11T17:33:05Z | |
dc.date.issued | 2024-12 | |
dc.date.updated | 2025-02-11T17:33:06Z | |
dc.description.abstract | Code-trained language models have proven to be highly effective for various code intelligence tasks. However, they can be challenging to train and deploy due to computational bottlenecks and memory constraints. Implementing effective strategies to address these issues requires a better understanding of these 'black box' models. In this thesis, I perform a neuron-level analysis of code-trained language models on three software engineering downstream tasks and one high-performance computing downstream task. I identify important neurons within latent representations by eliminating neurons that are highly similar or irrelevant to the given task. This approach reveals which neurons and layers can be eliminated (redundancy analysis) and where important code properties are located within the network (concept analysis). I find that over 95% of the neurons can be eliminated without significant loss in accuracy for the code intelligence tasks studied. I also discover several compositions of neurons that can make predictions with baseline accuracy. Additionally, I explore the traceability and distribution of human-recognizable concepts within latent representations, and I demonstrate the effectiveness of the redundancy approach by creating an efficient transfer learning pipeline. | |
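The abstract describes eliminating neurons whose activations are highly similar to those of neurons already kept. The thesis itself details the method; as a rough illustration only, one common way to operationalize "highly similar" is a greedy pairwise-correlation filter over an activation matrix. The function and threshold below are illustrative assumptions, not the thesis's actual procedure:

```python
import numpy as np

def redundant_neurons(acts: np.ndarray, threshold: float = 0.95) -> list[int]:
    """Greedily flag neurons whose activations have |Pearson r| >= threshold
    with a neuron that has already been kept. `acts` is (samples, neurons)."""
    corr = np.corrcoef(acts, rowvar=False)  # (neurons, neurons) correlation matrix
    kept, dropped = [], []
    for j in range(acts.shape[1]):
        if any(abs(corr[j, k]) >= threshold for k in kept):
            dropped.append(j)  # redundant: near-duplicate of a kept neuron
        else:
            kept.append(j)
    return dropped

# Toy activations: neuron 1 is a scaled copy of neuron 0; neuron 2 is independent.
rng = np.random.default_rng(0)
a = rng.normal(size=(100, 1))
acts = np.hstack([a,
                  2.0 * a + 0.01 * rng.normal(size=(100, 1)),
                  rng.normal(size=(100, 1))])
print(redundant_neurons(acts))  # neuron 1 is flagged as redundant
```

A filter like this only captures the redundancy half of the abstract's criterion; task-irrelevant neurons would additionally require a task-specific importance measure (e.g. probing-classifier weights), which this sketch does not model.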
dc.format.mimetype | ||
dc.identifier.orcid | 0009-0008-2089-356X | |
dc.identifier.uri | https://dr.lib.iastate.edu/handle/20.500.12876/qzoD8W2w | |
dc.language.iso | en | |
dc.language.rfc3066 | en | |
dc.subject.disciplines | Computer science | en_US |
dc.subject.keywords | Interpretability | en_US |
dc.subject.keywords | Neural Networks | en_US |
dc.subject.keywords | Pretrained language models | en_US |
dc.subject.keywords | Redundancy | en_US |
dc.title | Analyzing redundancy in code-trained language models | |
dc.type | thesis | en_US |
dc.type.genre | thesis | en_US |
dspace.entity.type | Publication | |
relation.isOrgUnitOfPublication | f7be4eb9-d1d0-4081-859b-b15cee251456 | |
thesis.degree.discipline | Computer science | en_US |
thesis.degree.grantor | Iowa State University | en_US |
thesis.degree.level | thesis | |
thesis.degree.name | Master of Science | en_US |
File
Original bundle
- Name:
- Sharma_iastate_0097M_21894.pdf
- Size:
- 3.05 MB
- Format:
- Adobe Portable Document Format
- Description:
License bundle
- Name:
- license.txt
- Size:
- 0 B
- Format:
- Item-specific license agreed upon to submission
- Description: