Applications of SSM sequence models in robotics, bioinformatics, and speech recognition

Date
2019-01-01
Authors
Sukhoy, Vladimir
Major Professor
Advisor
Alexander Stoytchev
Abstract

Humans can quickly read text in which some letters within words are rearranged. Several psychological theories that tried to explain this transposed-letter effect introduced the concept of an open bigram: an ordered pair of letters within a word. Unlike a regular bigram, the two letters that form an open bigram do not have to be adjacent. The SSM model is a computational representation of sequences inspired by these ideas. It extends the definition of an open bigram so that open bigrams can be counted efficiently by an algorithm. These counters can be arranged in a matrix, which is called the SSM matrix.
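
To make the open bigram idea concrete, the following is a minimal Python sketch that counts plain open bigrams (every ordered pair of letters in a word, adjacent or not) and arranges the counts in a matrix. It is an illustration only: the abstract states that the SSM model uses an extended definition of an open bigram, which is not given here, so this is not the SSM matrix itself. The function names `open_bigram_counts`, `regular_bigrams`, and `counts_to_matrix` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def open_bigram_counts(word):
    """Count ordered letter pairs (a, b) such that a occurs before b in word.

    This is the plain open bigram from the psychological literature,
    NOT the extended definition used by the SSM model (an assumption
    made for illustration).
    """
    return Counter(combinations(word, 2))


def regular_bigrams(word):
    """Return the set of adjacent letter pairs, for comparison."""
    return set(zip(word, word[1:]))


def counts_to_matrix(counts, alphabet):
    """Arrange pair counts in a square matrix indexed by alphabet position."""
    index = {ch: i for i, ch in enumerate(alphabet)}
    matrix = [[0] * len(alphabet) for _ in alphabet]
    for (a, b), n in counts.items():
        matrix[index[a]][index[b]] = n
    return matrix


# A transposed-letter pair: "wrod" is "word" with two letters swapped.
shared_open = set(open_bigram_counts("word")) & set(open_bigram_counts("wrod"))
shared_regular = regular_bigrams("word") & regular_bigrams("wrod")

# Repeated letters yield counts greater than one.
banana = open_bigram_counts("banana")
banana_matrix = counts_to_matrix(banana, sorted(set("banana")))
```

In this sketch, "word" and "wrod" share five of their six open bigrams but none of their regular bigrams, which is the kind of robustness to letter transposition that the theories above aim to capture.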

This dissertation describes three applications of SSMs, one in each of three domains: robotics, bioinformatics, and speech recognition. Each application is a data processing pipeline that combines SSMs with standard algorithms. In robotics, the experiments evaluated three SSM-based systems that recognized household objects from sensorimotor trajectories recorded by a robot while it performed exploratory behaviors on them. The results show that the new approach can improve recognition accuracy compared to previous work. In bioinformatics, this dissertation describes an SSM-based pipeline for inferring the locations of proteins in cells from their amino acid sequences. In this case, the system's accuracy was compared to that of previous work that used counters of regular n-grams. The results show that several hundred SSM-based features combined with histogram counters can be more informative for this problem than hundreds of thousands of regular n-gram counters. In speech recognition, a system for recognizing isolated spoken digits was built by combining SSMs with standard algorithms for acoustic feature extraction and supervised learning. The evaluation used TIDIGITS, probably the most popular benchmark data set for evaluating small-vocabulary systems. The results show that the system achieved state-of-the-art accuracy for this problem on this data set.

Type
dissertation
Copyright
2019-12-01