Latent space analysis and alignment for cross-Language code translation

Zhao, Xiaoquan

File

Latent space analysis and alignment for cross-Language code translation_Xiaoquan Zhao.pdf (5.11 MB)

Date

2024-12

Authors

Zhao, Xiaoquan

Major Professor

Mitra, Simanta

Committee Member

Prabhu, Gurpur

Abstract

The motivation for this work is to better understand the latent representations learned by neural networks and how these representations align with human-perceivable concepts. Neural networks often operate as black-box systems, making it challenging to interpret the meaning of their internal activations. This study investigates how these latent representations are organized in the encoder and decoder modules of a language model. Using a code translation task between Java and C#, layer activations are extracted and grouped using K-means clustering. Metrics are then applied to evaluate the semantic alignment and bidirectional consistency of the clusters, as well as the structural similarity between source-language and target-language representations. This approach aims to provide insight into the organization of neural representations, offering a basis for further analysis of their alignment with meaningful, interpretable patterns.
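
The clustering step mentioned in the abstract can be illustrated with a minimal sketch. This is not the author's implementation: the abstract does not state the model, the layers, the number of clusters, or the metrics used. The vectors below are invented two-dimensional stand-ins for layer activations, and the helper names (squared_distance, mean_vector, kmeans) are hypothetical. The code is plain Lloyd's-algorithm K-means written with the Python standard library only.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def kmeans(vectors, k, iterations=50, seed=0):
    """Lloyd's algorithm. Returns (labels, centroids)."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct data points.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, new_labels) if lab == c]
            if members:
                centroids[c] = mean_vector(members)
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels, centroids


# Assumption: toy 2-D "activations"; real layer activations would be
# high-dimensional vectors extracted from the encoder or decoder.
activations = [
    [0.0, 0.1], [0.2, 0.0], [0.1, 0.2],   # first tight group
    [5.0, 5.1], [5.2, 4.9], [4.9, 5.0],   # second tight group
]

labels, centroids = kmeans(activations, k=2)
print(labels)
```

In practice one would run such a clustering separately on source-language and target-language activations and then compare the resulting clusters; how that comparison is scored is specific to the work itself and is not reproduced here.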

Academic or Administrative Unit

Department of Computer Science

Type

creative component

Rights Statement

Attribution 3.0 United States

Copyright

2024