Decomposing deep learning models into modules
Deep learning-based software is widely used in applications such as image classification, autonomous driving, and medical analysis. To build these models, developers collect data, design the model architecture, and finally train the model on the available data. To change a model or build a new one, the most common approach is to train from scratch. This situation resembles traditional software in the pre-modular days, when reusability was limited. With decomposition, however, monolithic software is split into modules, which makes software development and maintenance more manageable, flexible, and comprehensible. With that motivation, we ask: can we decompose deep learning models into modules? Could this decomposition lead to fine-grained reusability and replaceability? To answer these questions, we first show that it is feasible to decompose fully connected neural networks (FCNNs) into modules, one for each output class in the training dataset. Each module takes the same input as the model but acts as a binary classifier. These decomposed modules can be reused to solve a new problem without retraining, and one module can be replaced by another without affecting the others. As the next step, we show that more complex models, namely convolutional neural networks (CNNs), can also be decomposed into modules. Edges in an FCNN have a one-to-one relationship with pairs of nodes in consecutive layers, whereas in a CNN, edges are shared among the input and output nodes, so the previous approach cannot be applied directly to CNNs. The previous approach also does not work for other CNN-related layers, e.g., merge layers. To that end, we apply a decomposition strategy that unshares the shared weights and biases using a mapping-based technique that stores the positions of the nodes that are not part of a module.
We also show how these reusable and replaceable modules perform compared to models trained from scratch to solve similar problems. While these two contributions demonstrate two benefits of decomposition, namely fine-grained reusability and replaceability, we believe that other benefits are attainable, e.g., hiding changes and understanding the logic. We evaluated these decomposition strategies and found that they do not lead to a significant loss of accuracy compared to the original models. We also found that the decomposed modules can be reused and replaced to solve new problems without retraining a model from scratch. To understand how nodes at each hidden layer interact with others, we apply a decomposition approach that splits a DL model into modules connected by three conditional clauses: AND, OR, and NOT. We call this approach structured decomposition of deep learning models. Finally, we show how decomposition-based approaches can confine changes to fewer modules. To do that, we first identify the changes that a trained DL model undergoes by studying GitHub repositories, which yields 8 types of changes. We then evaluate the decomposition approaches and find that 5 of the 8 change types can be hidden by applying them.
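The per-class module interface described above can be pictured with a minimal sketch. It is not the decomposition algorithm from this work, which identifies and removes the nodes and edges that do not contribute to a class. Here each hypothetical module simply wraps the full toy network and exposes a binary view of one class, which is enough to illustrate how modules take the same input as the model and how a subset of them can be recombined without retraining. The network, its random weights, and the names `make_module` and `compose` are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy FCNN: 4 inputs -> 5 hidden units (ReLU) -> 3 classes.
# Random weights stand in for a trained model.
W1, b1 = rng.normal(size=(4, 5)), rng.normal(size=5)
W2, b2 = rng.normal(size=(5, 3)), rng.normal(size=3)
NUM_CLASSES = 3


def model_logits(x):
    """Forward pass of the monolithic model."""
    hidden = np.maximum(x @ W1 + b1, 0.0)
    return hidden @ W2 + b2


def make_module(k):
    """Binary view of class k (illustrative stand-in for a decomposed module).

    The module takes the same input as the model and returns a score for
    class k together with a yes/no decision "is this input class k?".
    """
    other = np.arange(NUM_CLASSES) != k

    def module(x):
        logits = model_logits(x)
        score = logits[..., k]
        is_k = score > logits[..., other].max(axis=-1)
        return score, is_k

    return module


modules = [make_module(k) for k in range(NUM_CLASSES)]


def compose(selected, x):
    """Reuse: classify over a subset of classes using only their modules."""
    scores = np.stack([modules[k](x)[0] for k in selected], axis=-1)
    return np.asarray(selected)[scores.argmax(axis=-1)]


x = rng.normal(size=(8, 4))
print(compose([0, 1, 2], x))  # agrees with the original model's predictions
print(compose([0, 2], x))     # a new two-class problem, no retraining
```

Composing all three modules reproduces the original model's predictions, while composing a subset restricts the answer to the selected classes. In the actual approach, each module would be a pruned sub-network rather than a wrapper, so replacing one module would not require touching the others.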