Federated Learning with Uncertainty-Based Client Clustering for Fleet-Wide Fault Diagnosis

Lu, Hao; Thelen, Adam; Fink, Olga; Hu, Chao; Laflamme, Simon

File

2023-LaflammeSimonFederatedLearning.pdf (6.94 MB)

Date

2023-04-26

Authors

Lu, Hao

Thelen, Adam

Fink, Olga

Hu, Chao

Laflamme, Simon

Publisher

arXiv

Abstract

Operators from various industries have been pushing the adoption of wireless sensing nodes for industrial monitoring, and such efforts have produced sizeable condition monitoring datasets that can be used to build diagnosis algorithms capable of warning maintenance engineers of impending failure or identifying current system health conditions. However, single operators may not have sufficiently large fleets of systems or component units to collect sufficient data to develop data-driven algorithms. Collecting a satisfactory quantity of fault patterns for safety-critical systems is particularly difficult due to the rarity of faults. One potential solution to overcome the challenge of having limited or not sufficiently representative datasets is to merge datasets from multiple operators with the same type of assets. This could provide a feasible approach to ensure datasets are large enough and representative enough. However, directly sharing data across company borders raises privacy concerns. Federated learning (FL) has emerged as a promising solution to leverage datasets from multiple operators to train a decentralized asset fault diagnosis model while maintaining data confidentiality. However, there are still considerable obstacles to overcome when it comes to optimizing the federation strategy without leaking sensitive data and addressing the issue of client dataset heterogeneity. Dataset heterogeneity is particularly prevalent in fault diagnosis applications due to the high diversity of operating conditions and system configurations. To address these two challenges, we propose a novel clustering-based FL algorithm where clients are clustered for federating based on dataset similarity. To quantify dataset similarity between clients without explicitly sharing data, each client sets aside a local test dataset and evaluates the other clients’ model prediction accuracy and uncertainty on this test dataset. Clients are then clustered for FL based on relative prediction accuracy and uncertainty. Experiments on three bearing fault datasets, two publicly available and one newly collected for this work, show that our algorithm significantly outperforms FedAvg and a cosine similarity-based algorithm by 5.1% and 30.7%, respectively, on average over the three datasets. Further, using a probabilistic classification model has the additional advantage of accurately quantifying its prediction uncertainty, which we show it does exceptionally well.
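The clustering step described in the abstract can be illustrated with a small sketch. The abstract does not say how accuracy and uncertainty are combined, or which clustering method is used, so everything below is an assumption made for illustration: the uncertainty measure (mean predictive entropy), the combined score (accuracy times one minus normalized entropy), the normalization by each client's own model, and the threshold-based grouping. The function names `accuracy_and_entropy`, `similarity_matrix`, and `cluster_clients` are hypothetical and do not come from the paper.

```python
import numpy as np


def accuracy_and_entropy(probs, labels):
    """Return (accuracy, mean predictive entropy) for predicted class probabilities."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    acc = float(np.mean(np.argmax(probs, axis=1) == labels))
    # Entropy of each predictive distribution; the small constant avoids log(0).
    ent = float(np.mean(-np.sum(probs * np.log(probs + 1e-12), axis=1)))
    return acc, ent


def similarity_matrix(probs, labels):
    """Build a symmetric client-to-client similarity matrix.

    probs[i][j] holds the class probabilities produced by client i's model
    on client j's held-out test set; labels[j] holds that test set's labels.
    Only predictions are exchanged here, never raw data.
    """
    n = len(labels)
    n_classes = np.asarray(probs[0][0]).shape[1]
    max_ent = np.log(n_classes)
    score = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            acc, ent = accuracy_and_entropy(probs[i][j], labels[j])
            # Assumed combination: high accuracy and low uncertainty score well.
            score[i, j] = acc * (1.0 - ent / max_ent)
    # Score relative to what client j's own model achieves on its own test set.
    rel = score / np.maximum(np.diag(score)[None, :], 1e-12)
    # Require that models transfer well in both directions.
    return np.minimum(rel, rel.T)


def cluster_clients(sim, threshold=0.5):
    """Group clients into connected components of the graph sim >= threshold."""
    n = sim.shape[0]
    cluster = [-1] * n
    next_id = 0
    for start in range(n):
        if cluster[start] != -1:
            continue
        cluster[start] = next_id
        stack = [start]
        while stack:
            u = stack.pop()
            for v in range(n):
                if cluster[v] == -1 and sim[u, v] >= threshold:
                    cluster[v] = next_id
                    stack.append(v)
        next_id += 1
    return cluster
```

In this sketch, clients assigned the same cluster index would then be federated together (for example with FedAvg inside each cluster). The threshold value of 0.5 is arbitrary and would need tuning on real data.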

Academic or Administrative Unit

Department of Civil, Construction and Environmental Engineering

Mechanical Engineering

Type

Preprint

Comments

This is a pre-print of the article Lu, Hao, Adam Thelen, Olga Fink, Chao Hu, and Simon Laflamme. "Federated Learning with Uncertainty-Based Client Clustering for Fleet-Wide Fault Diagnosis." arXiv preprint arXiv:2304.13275 (2023). DOI: 10.48550/arXiv.2304.13275. Copyright 2023 The Authors. Posted with permission.