Scalable Techniques for the Analysis of Large-scale Materials Data
Many physical systems of fundamental and industrial importance are significantly affected by the development of new materials. By establishing process-structure-property relationships, one can design new, tailor-made materials that possess desired properties. Conventional experimental and analytical techniques such as first-principles calculations, though accurate, are extremely tedious and resource-intensive, resulting in a significant gap between the time a new material is discovered and the time it is put into engineering practice. Furthermore, the huge amounts of data produced by these techniques pose a tough challenge in terms of analysis. This thesis addresses the challenges of analyzing huge datasets by leveraging advanced mathematical and computational techniques to establish process-structure-property relationships of materials.
The first of the three parts of this thesis describes the application of dimensionality reduction (DR) techniques to analyze a dataset of apatites described in structural descriptor space. This analysis reveals interesting correlations between structural descriptors, such as ionic radius and covalence, and characteristic properties such as apatite stability, information that is crucial to promoting the use of apatites as an antidote for lead poisoning. The second part of the thesis describes a parallel spectral DR framework that can process thousands of points lying in a million-dimensional space, which is beyond the reach of currently available tools. To further demonstrate the applicability of our framework, we perform dimensionality reduction of 75,000 images representing morphology evolution during the manufacturing of organic solar cells in order to identify the optimal processing parameters. The third part of this thesis applies well-studied graph-theoretic methods to analyze large datasets produced by Atom Probe Tomography (APT) in order to quantify the morphology of precipitates in a solvent material. Together, these three mathematical models and computational strategies were applied to large-scale materials data in order to establish process-structure-property relationships.