Applications of SSM sequence models in robotics, bioinformatics, and speech recognition

Sukhoy, Vladimir

Humans can quickly read text in which some letters within words are rearranged. Several psychological theories that tried to explain this transposed-letter effect introduced the concept of an open bigram: an ordered pair of letters within a word. Unlike the letters of a regular bigram, the two letters that form an open bigram do not have to be adjacent. The SSM model is a computational representation of sequences inspired by these ideas. It extends the definition of an open bigram so that open bigrams can be counted efficiently by an algorithm. These counters can be arranged in a matrix, which is called the SSM matrix.
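To make the open bigram idea concrete, the following minimal Python sketch counts plain open bigrams in a word and arranges the counts in a matrix indexed by letter. It is only an illustration of the basic concept: it uses a naive enumeration of all ordered letter pairs, not the extended definition or the efficient counting algorithm of the SSM. The function names `open_bigram_counts` and `counts_to_matrix` are hypothetical and chosen here for illustration.

```python
from collections import Counter
from itertools import combinations


def open_bigram_counts(word):
    """Count every ordered pair of letters (a, b) such that a occurs
    before b in the word. The two letters need not be adjacent.

    Naive illustration only: this enumerates all pairs of positions,
    so it takes time quadratic in the word length.
    """
    # combinations() preserves the left-to-right order of the input,
    # so each tuple is (earlier letter, later letter).
    return Counter(combinations(word, 2))


def counts_to_matrix(counts, alphabet):
    """Arrange open bigram counts in a square matrix.

    Row = first letter, column = second letter, both indexed by their
    position in `alphabet`.
    """
    index = {ch: i for i, ch in enumerate(alphabet)}
    matrix = [[0] * len(alphabet) for _ in alphabet]
    for (first, second), n in counts.items():
        matrix[index[first]][index[second]] = n
    return matrix


if __name__ == "__main__":
    a = open_bigram_counts("form")
    b = open_bigram_counts("from")
    shared = set(a) & set(b)
    print(len(shared), "of", len(a), "open bigrams are shared")
    print(counts_to_matrix(open_bigram_counts("cat"), "act"))
```

In this sketch, "form" and "from" share five of their six open bigrams, although they share none of their regular (adjacent) bigrams. This gives some intuition for why open bigrams were proposed as an explanation of the transposed-letter effect.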

This dissertation describes three applications of SSMs, one in each of three domains: robotics, bioinformatics, and speech recognition. Each application is a data processing pipeline that combines SSMs with standard algorithms. In robotics, the experiments evaluated three SSM-based systems that recognized household objects from sensorimotor trajectories recorded by a robot while it performed exploratory behaviors on them. The results show that the new approach can improve recognition accuracy relative to previous work. In bioinformatics, this dissertation describes an SSM-based pipeline for inferring the locations of proteins in cells from their amino acid sequences. Here, the accuracy of the system was compared to previous work that used counters of regular n-grams. The results show that several hundred SSM-based features combined with histogram counters can be more informative for this problem than hundreds of thousands of regular n-gram counters. In speech recognition, a system for recognizing isolated spoken digits was built by combining SSMs with standard algorithms for acoustic feature extraction and supervised learning. The evaluation used TIDIGITS, arguably the most popular benchmark data set for evaluating small-vocabulary systems. The results show that the system achieved state-of-the-art accuracy for this problem on this data set.

Bioinformatics, Open Bigrams, Robotics, Speech Recognition, SSM Sequence Model