An analysis of reconfigurable memory queues/computing units architecture
The clock speed of modern processors allows them to process huge amounts of data, but the limited amount of on-chip data storage creates situations in which the processors' processing bandwidth overwhelms their memory bandwidth. One solution to this problem is to make portions of the processor dynamically configurable so that they can switch between functioning as processing units and as storage units. Pipelined functional units can offer this functionality with only slight hardware and instruction set modifications. However, the additional register storage they provide takes the form of a circular queue rather than a randomly accessible set of registers. The usefulness of such storage was evaluated by studying operand access patterns and how well they map onto a circular queue. The results demonstrated that while few operand sets map well to circular queues in the general case, targeting specific operations such as matrix multiplication or loop unrolling provides an excellent opportunity to use these queues to reduce register pressure and thereby gain an overall speedup.
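The difference between a circular queue and a randomly accessible register set can be illustrated with a small software model. The sketch below is not the hardware design or the evaluation method of this work. It assumes a hypothetical queue in which reading the head operand recirculates it to the tail, and in which skipping an unwanted head costs one wasted rotation. The function name `extra_rotations` and the operand names are illustrative only.

```python
from collections import deque


def extra_rotations(queue_order, accesses):
    """Count rotations wasted bringing each requested operand to the head.

    Assumed model (hypothetical): reading the head recirculates it to the
    tail, and skipping an unwanted head costs one wasted rotation.
    """
    queue = deque(queue_order)
    wasted = 0
    for name in accesses:
        if name not in queue:
            raise ValueError(f"operand {name!r} is not held in the queue")
        while queue[0] != name:
            queue.rotate(-1)  # skip the head: it moves to the tail
            wasted += 1
        queue.rotate(-1)      # read the head: it recirculates to the tail
    return wasted


# One row of A reused for every column of B in a matrix multiplication:
# the same operands are read in the same order on every pass.
row = ["a0", "a1", "a2", "a3"]
matmul_accesses = row * 3

# An irregular access order over the same four operands.
irregular_accesses = ["a2", "a0", "a3", "a3", "a1", "a0"]

print(extra_rotations(row, matmul_accesses))     # 0
print(extra_rotations(row, irregular_accesses))  # 11
```

Under these assumptions the cyclic matrix multiplication pattern wastes no rotations, while the irregular pattern wastes many, which is the intuition behind targeting specific operations rather than the general case.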