Seeded transfer learning for road roughness regression

Jiang, Hao
Major Professor
Zhang, Wensheng
Ceylan, Halil
Jannesari, Ali
Committee Member
Computer Science
Road roughness affects riding comfort and vehicle operating costs, and to maintain good road quality, federal and state transportation agencies must routinely survey it. Because traditional survey methods require specialized equipment and training, more efficient methods have recently been developed that collect measurement data from the sensors in off-the-shelf mobile devices (such as Android phones and iPhones). Such sensor data, together with labelled road roughness index values, can be used to construct machine-learning models that infer road roughness from the sensor data. Because sensor data generated by different device types/models have different characteristics, in current practice a model is often constructed from the data of a single device type/model or a small set of them, and is used for predictions only on data from the same configuration. This means that, despite the potentially large amount of sensor data available from a wide variety of devices, labelled data may still be lacking when machine learning is applied to construct a model for a specific setting. Transfer learning extracts knowledge from a source domain and applies it to a related target domain, and it has been used to reduce training and labelling costs for a target domain. Building on extensive existing research, in this work we propose a clustering-based seeded transfer learning approach to the road roughness modeling and prediction problem. Specifically, we develop a complete solution for transferring data from a source domain (sensor data collected from devices of type/model A) to a target domain (sensor data collected from devices of type/model B) for model training. Our contributions include a data scaling step, an implementation that matches source clusters to target seeds, and an exploration of clustering methods, the optimal number of clusters, and the seed percentage used in transfer learning.
We evaluate the performance of our approach using both a sensor data set collected in the field and public data sets. The results show that K-means clustering is less stable than hierarchical clustering, that a moderate seed percentage is preferred, and that using a greater number of clusters improves model prediction accuracy.
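The scaling, clustering, and seed-matching steps named in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the thesis: per-device z-score scaling, centroid-linkage agglomerative clustering, nearest-centroid matching of seeds to clusters, and the hypothetical function names `zscore`, `agglomerative`, and `seeded_transfer`.

```python
from statistics import mean, pstdev

def zscore(rows):
    """Standardize each feature column (assumed scaling step, per device)."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]

def centroid(points):
    return [mean(c) for c in zip(*points)]

def sqdist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def agglomerative(points, k):
    """Centroid-linkage hierarchical clustering; returns k lists of indices."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = sqdist(centroid([points[p] for p in clusters[i]]),
                           centroid([points[p] for p in clusters[j]]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = clusters.pop(j)  # j > i, so index i stays valid
        clusters[i].extend(merged)
    return clusters

def seeded_transfer(source_X, target_X, seed_idx, k):
    """Indices of source samples in clusters matched by at least one target seed."""
    src = zscore(source_X)
    tgt = zscore(target_X)  # scaled with all target data, labelled or not
    clusters = agglomerative(src, k)
    cents = [centroid([src[p] for p in c]) for c in clusters]
    matched = {min(range(len(cents)), key=lambda c: sqdist(tgt[s], cents[c]))
               for s in seed_idx}
    return sorted(p for c in matched for p in clusters[c])

# Toy data: device B reports the same signal at 10x the scale of device A.
source_X = [[1, 1], [1.2, 0.9], [0.9, 1.1], [5, 5], [5.1, 4.9], [4.9, 5.2]]
target_X = [[10, 10], [12, 9], [9, 11], [50, 50], [51, 49], [49, 52]]
transferred = seeded_transfer(source_X, target_X, seed_idx=[0], k=2)
```

In this toy case a single labelled seed from the low-roughness group selects only the source samples in the matching cluster. A regressor for the target device would then be trained on those transferred samples together with the labelled seeds.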