Reliable In-memory Computing with Unreliable Devices and Circuits
With the ever-increasing demands of AI algorithms and high-definition sensors, contemporary microprocessor design faces tremendous challenges in memory bandwidth (i.e., the von Neumann bottleneck), processing speed, and power consumption. Leveraging advances in device technology and design techniques, in-memory computing (IMC) embeds analog deep-learning operations in the memory array, achieving massively parallel computing with high storage density. On the other hand, its performance is still limited by device non-idealities, circuit precision, on-chip interconnection, and algorithm properties. In this talk, we will first review state-of-the-art IMC design techniques, such as those based on resistive random-access memory (RRAM) and SRAM. Then, based on statistical data from a fully integrated 65nm CMOS/RRAM test chip, we will illustrate the bottlenecks of current IMC systems, including RRAM variations, the stability of machine learning models, peripheral circuits, and interconnection. These factors interact with each other, limiting the inference accuracy and the system energy-delay product (EDP). To explore the design space efficiently, we will present a newly developed benchmark simulator, SIAM, which integrates device, circuit, architecture, network-on-chip (NoC), network-on-package (NoP), and DRAM access models to address the bottlenecks in data movement and robustness. Furthermore, we will demonstrate two methods to recover the accuracy loss: training for model stability before mapping to the hardware, and a hybrid SRAM/RRAM architecture for post-mapping recovery. These methods are applied to various datasets as well as a 65nm SRAM/RRAM test chip, shedding light on directions for future IMC research.