16 August 2019
Analog memory-based techniques for accelerating the training of fully-connected deep neural networks (Conference Presentation)
Abstract
Crossbar arrays of resistive non-volatile memories (NVMs) offer a novel solution for deep learning tasks that are typically implemented on GPUs [1]. The highly parallel structure of these architectures enables fast and energy-efficient multiply-accumulate operations, the workhorse of most deep learning algorithms. More specifically, we are developing analog hardware platforms for the acceleration of large Fully Connected (FC) Deep Neural Networks (DNNs) [1,2], where training is performed using the backpropagation algorithm. This algorithm is a supervised form of learning based on three steps: forward propagation of input data through the network (a.k.a. forward inference); comparison of the inference results with ground-truth labels and backpropagation of the resulting errors from the output layer back to the input layer; and in-situ weight updates. This type of supervised training has been shown to succeed even in the presence of a substantial number of faulty NVM devices, relaxing yield requirements relative to conventional memory, where near-100% yield may be required [2]. We recently surveyed the use of analog memory devices for DNN hardware accelerators based on crossbar array structures and discussed design choices, device and circuit readiness, and the most promising opportunities compared to digital accelerators [3].

In this presentation, we will focus on our implementation of an analog memory cell that combines Phase-Change Memory (PCM) with a 3-Transistor 1-Capacitor (3T1C) structure [4]. Software-equivalent accuracy on several datasets (MNIST, MNIST with noise, CIFAR-10, CIFAR-100) was achieved in a mixed software-hardware demonstration with DNN weights stored as analog conductances in real PCM device arrays. We will discuss how limitations of real-world NVM devices, such as imperfect conductance linearity and device variability, affect DNN training, and how using two pairs of analog weights with varying significance relaxes device requirements [5,6,7]. Finally, we summarize the pieces needed to build an analog accelerator chip [8] and the role lithography plays in the future development of novel NVM devices.
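The parallel multiply-accumulate operation mentioned above can be pictured with a short numerical sketch: each weight is encoded as a pair of conductances, input activations are applied as row voltages, and Ohm's law plus Kirchhoff's current law sum all products along each column in a single step. The NumPy model below is illustrative only (the function name `crossbar_mac` and the random conductance values are hypothetical, not the authors' hardware or code):

```python
import numpy as np

def crossbar_mac(x, G_plus, G_minus):
    """Idealized analog multiply-accumulate on a crossbar array.

    x       : input activations, applied as row voltages   (shape [n_in])
    G_plus  : conductances encoding positive weight parts   (shape [n_in, n_out])
    G_minus : conductances encoding negative weight parts   (shape [n_in, n_out])

    Each device contributes a current x_i * G_ij (Ohm's law); the column
    wires sum these currents (Kirchhoff's current law), so the full
    matrix-vector product is obtained in one parallel analog step.
    """
    return x @ (G_plus - G_minus)

# Tiny example: 4 inputs, 3 outputs, random conductances (arbitrary units).
rng = np.random.default_rng(0)
x = rng.random(4)
Gp, Gm = rng.random((4, 3)), rng.random((4, 3))
print(crossbar_mac(x, Gp, Gm))
```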
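The three training steps map naturally onto the same array: forward inference is a row-to-column multiply-accumulate, error backpropagation is a column-to-row pass through the transposed weights, and the weight update is an outer product of upstream activations and output errors that can be applied in place. A minimal single-layer sketch follows, assuming a squared-error loss and sigmoid activation; the function `train_step` and the learning rate shown are illustrative assumptions, not the demonstrated hardware procedure:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_step(W, x, target, lr=0.1):
    """One backpropagation step for a single fully connected layer."""
    # 1) Forward propagation (forward inference): row-to-column pass.
    y = sigmoid(x @ W)
    # 2) Compare with ground truth and form the output error
    #    (dLoss/dz for squared error with a sigmoid output).
    delta = (target - y) * y * (1.0 - y)
    # 3) In-situ weight update: the outer product of activations and errors,
    #    which a crossbar can apply to all weights in parallel.
    W += lr * np.outer(x, delta)
    return W, y
```

In a multi-layer network the same column-to-row pass propagates `delta` backwards (delta_prev = (W @ delta) * f'(z_prev)) before each layer applies its own outer-product update.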
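For the two-pairs-of-weights scheme [4,7], each weight is split across a more-significant conductance pair and a less-significant pair, W = F*(G+ - G-) + (g+ - g-), where F is a gain factor; frequent small updates go to the low-significance pair and are occasionally transferred into the more significant one. The bookkeeping below is a hedged sketch: the value of F, the transfer threshold, and the function names are illustrative choices, not the exact values or procedure used in the demonstrations:

```python
import numpy as np

F = 3.0  # gain factor between more- and less-significant pairs (illustrative)

def effective_weight(Gp, Gm, gp, gm):
    """W = F*(G+ - G-) + (g+ - g-): two conductance pairs of varying significance."""
    return F * (Gp - Gm) + (gp - gm)

def update_and_transfer(Gp, Gm, gp, gm, dW, g_limit=1.0):
    """Apply a small update to the low-significance pair, then move any
    accumulated value that exceeds its range into the high-significance pair."""
    # Positive updates increment g+, negative updates increment g-.
    gp = gp + np.clip(dW, 0, None)
    gm = gm + np.clip(-dW, 0, None)
    overflow = gp - gm
    mask = np.abs(overflow) > g_limit
    # Transfer the accumulated value (scaled by 1/F) and reset the small pair.
    Gp = np.where(mask & (overflow > 0), Gp + overflow / F, Gp)
    Gm = np.where(mask & (overflow < 0), Gm - overflow / F, Gm)
    gp = np.where(mask, 0.0, gp)
    gm = np.where(mask, 0.0, gm)
    return Gp, Gm, gp, gm
```

The transfer preserves the effective weight while keeping the frequently-updated pair within its limited analog range, which is what relaxes the linearity and endurance requirements on each individual device.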
References:
[1] G. W. Burr et al., "Experimental demonstration and tolerancing of a large-scale neural network (165,000 synapses), using phase-change memory as the synaptic weight element," IEDM Tech. Digest, 29.5 (2014).
[2] G. W. Burr et al., "Experimental demonstration and tolerancing of a large-scale neural network (165,000 synapses) using phase-change memory as the synaptic weight element," IEEE Trans. Elec. Dev., 62(11), 3498 (2015).
[3] H. Tsai et al., "Recent progress in analog memory-based accelerators for deep learning," Journal of Physics D: Applied Physics, 51(28), 283001 (2018).
[4] S. Ambrogio et al., "Equivalent-accuracy accelerated neural network training using analog memory," Nature, 558(7708), 60 (2018).
[5] T. Gokmen et al., "Acceleration of deep neural network training with resistive cross-point devices: design considerations," Frontiers in Neuroscience, 10, 333 (2016).
[6] S. Sidler et al., "Large-scale neural networks implemented with non-volatile memory as the synaptic weight element: impact of conductance response," ESSDERC Proc., 440 (2016).
[7] G. Cristiano et al., "Perspective on training fully connected networks with resistive memories: device requirements for multiple conductances of varying significance," accepted in Journal of Applied Physics (2018).
[8] P. Narayanan et al., "Toward on-chip acceleration of the backpropagation algorithm using nonvolatile memory," IBM J. Res. Dev., 61(4), 1-11 (2017).
© (2019) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Hsinyu Tsai, Stefano Ambrogio, Pritish Narayanan, Robert M. Shelby, Charles Mackin, and Geoffrey W. Burr "Analog memory-based techniques for accelerating the training of fully-connected deep neural networks (Conference Presentation)", Proc. SPIE 10958, Novel Patterning Technologies for Semiconductors, MEMS/NEMS, and MOEMS 2019, 109580U (16 August 2019); https://doi.org/10.1117/12.2515630
KEYWORDS
Analog electronics, Neural networks, Algorithm development, Applied physics, Evolutionary algorithms, Tolerancing, Computer architecture