New Memory Architecture for Deep Learning Applications

Recently, a number of neural network processors have been developed to process deep neural networks efficiently. Most of these processors include a memory system with large-capacity on-chip SRAMs as well as high-bandwidth off-chip DRAMs. To fully utilize the hardware resources of a neural processor, efficient access to both the on-chip SRAMs and the off-chip DRAMs is essential. This tutorial presents traditional and state-of-the-art techniques for optimizing memory access in neural network applications. First, state-of-the-art neural processors are introduced and the common characteristics of their memory systems are explained. Next, optimization techniques for on-chip SRAM data access are introduced: the scheduling, parallelization, and data allocation of various deep learning algorithms are presented, and the pros and cons of each optimization for a given memory system are explained. Additional data optimizations for efficient off-chip DRAM access are explained next. To this end, the basic characteristics of DRAM organization are introduced, followed by data access scheduling for efficient DRAM access. As the last subject of this tutorial, future memory systems are introduced. Processing-in-Memory (PIM) and Approximate Memory (AM) architectures are briefly introduced, and data optimizations for PIM and AM in deep neural network processing are presented. Storage Class Memory (SCM), a potential new level of the memory hierarchy, is also introduced, along with data access techniques for it.

Hyuk-Jae Lee
