Federated Learning with Uncertainty-Based Client Clustering for Fleet-Wide Fault Diagnosis

Date
2023-04-26
Authors
Lu, Hao
Thelen, Adam
Fink, Olga
Hu, Chao
Publisher
arXiv
Abstract
Operators across various industries have been pushing the adoption of wireless sensing nodes for industrial monitoring, and these efforts have produced sizeable condition monitoring datasets that can be used to build diagnosis algorithms capable of warning maintenance engineers of impending failure or identifying current system health conditions. However, a single operator may not have a large enough fleet of systems or component units to collect sufficient data for developing data-driven algorithms. Collecting a satisfactory quantity of fault patterns for safety-critical systems is particularly difficult because faults are rare. One potential solution to the challenge of limited or insufficiently representative datasets is to merge datasets from multiple operators with the same type of assets, which could make the combined dataset large and representative enough. However, directly sharing data across company borders raises privacy concerns. Federated learning (FL) has emerged as a promising solution that leverages datasets from multiple operators to train a decentralized asset fault diagnosis model while maintaining data confidentiality. However, considerable obstacles remain in optimizing the federation strategy without leaking sensitive data and in addressing client dataset heterogeneity. Heterogeneity is particularly prevalent in fault diagnosis applications due to the high diversity of operating conditions and system configurations. To address these two challenges, we propose a novel clustering-based FL algorithm in which clients are clustered for federation based on dataset similarity. To quantify dataset similarity between clients without explicitly sharing data, each client sets aside a local test dataset and evaluates the other clients’ model prediction accuracy and uncertainty on this test dataset. Clients are then clustered for FL based on relative prediction accuracy and uncertainty. Experiments on three bearing fault datasets, two publicly available and one newly collected for this work, show that our algorithm significantly outperforms FedAvg and a cosine similarity-based algorithm by 5.1% and 30.7%, respectively, on average over the three datasets. Further, using a probabilistic classification model has the additional advantage of accurately quantifying its prediction uncertainty, which we show it does exceptionally well.
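The cross-evaluation and clustering idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the uncertainty measure, the similarity rule, or the clustering method, so the mean predictive entropy, the `acc_tol` and `unc_tol` thresholds, the connected-components grouping, and all function names below are assumptions chosen for illustration only.

```python
import numpy as np

def predictive_entropy(probs):
    """Mean Shannon entropy (nats) of per-sample class probabilities.
    Assumption: entropy stands in for the paper's uncertainty measure."""
    p = np.clip(probs, 1e-12, 1.0)
    return float(np.mean(-np.sum(p * np.log(p), axis=1)))

def cross_evaluate(prob_fns, test_sets):
    """Return acc[i, j] and unc[i, j]: model j scored on client i's local
    test set. Only models are exchanged; test data never leaves client i."""
    k = len(prob_fns)
    acc = np.zeros((k, k))
    unc = np.zeros((k, k))
    for i, (x, y) in enumerate(test_sets):
        for j, predict in enumerate(prob_fns):
            probs = predict(x)  # shape (n_samples, n_classes)
            acc[i, j] = np.mean(np.argmax(probs, axis=1) == y)
            unc[i, j] = predictive_entropy(probs)
    return acc, unc

def cluster_clients(acc, unc, acc_tol=0.1, unc_tol=0.2):
    """Group clients whose models transfer to each other's test sets.
    Hypothetical rule: model j is 'similar' to client i if, relative to
    client i's own model, accuracy drops by at most acc_tol and entropy
    rises by at most unc_tol. Clusters are connected components of the
    mutual-similarity graph."""
    k = acc.shape[0]
    acc_drop = np.diag(acc)[:, None] - acc
    unc_rise = unc - np.diag(unc)[:, None]
    ok = (acc_drop <= acc_tol) & (unc_rise <= unc_tol)
    similar = ok & ok.T  # require agreement in both directions
    labels = -np.ones(k, dtype=int)
    current = 0
    for start in range(k):
        if labels[start] >= 0:
            continue
        labels[start] = current
        stack = [start]
        while stack:  # depth-first search over similar clients
            u = stack.pop()
            for v in np.flatnonzero(similar[u]):
                if labels[v] < 0:
                    labels[v] = current
                    stack.append(v)
        current += 1
    return labels

def fedavg(weights, sizes):
    """Sample-size-weighted average of flat parameter vectors,
    applied separately within each cluster."""
    return np.average(np.stack(weights), axis=0,
                      weights=np.asarray(sizes, dtype=float))
```

In this sketch, each cluster would then run its own federated averaging round via `fedavg`, so that clients with dissimilar operating conditions do not pull a shared model in conflicting directions.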
Type
Preprint
Comments
This is a pre-print of the article Lu, Hao, Adam Thelen, Olga Fink, Chao Hu, and Simon Laflamme. "Federated Learning with Uncertainty-Based Client Clustering for Fleet-Wide Fault Diagnosis." arXiv preprint arXiv:2304.13275 (2023). DOI: 10.48550/arXiv.2304.13275. Copyright 2023 The Authors. Posted with permission.