Title: FUSE: Fusing STT-MRAM into GPUs to Alleviate Off-Chip Memory Access Overheads

Abstract: In this work, we propose FUSE, a novel GPU cache system that integrates spin-transfer torque magnetic random-access memory (STT-MRAM) into the on-chip L1D cache. FUSE can minimize the number of outgoing memory accesses over the interconnection network of the GPU's multiprocessors, which in turn can considerably improve the level of massive computing parallelism in GPUs. Specifically, FUSE predicts the read level of GPU memory accesses by extracting GPU runtime information and places write-once-read-multiple (WORM) data blocks into the STT-MRAM, while accommodating write-multiple data blocks in a small SRAM portion of the L1D cache. To further reduce off-chip memory accesses, FUSE also allows WORM data blocks to be allocated anywhere in the STT-MRAM by approximating the associativity with a limited number of tag comparators and I/O peripherals. Our evaluation results show that, in comparison to a traditional GPU cache, our proposed heterogeneous cache reduces the number of outgoing memory references across the interconnection network by 32%, thereby improving overall performance by 217% and reducing energy cost by 53%.
Subjects: Hardware Architecture (cs.AR)
Journal reference: HPCA 2019
Cite as: arXiv:1903.01776 [cs.AR]
  (or arXiv:1903.01776v2 [cs.AR] for this version)

Submission history

From: Myoungsoo Jung
[v1] Tue, 5 Mar 2019 11:56:32 GMT (887kb)
[v2] Sat, 9 Mar 2019 12:44:11 GMT (887kb)
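
The placement idea described in the abstract (WORM blocks go to STT-MRAM, write-multiple blocks go to SRAM) can be illustrated with a minimal sketch. This is not the paper's predictor or hardware design: the function names, the use of a predicted write count as the predictor output, and the threshold of one write are all assumptions made for illustration, and the example predictions are made up.

```python
from collections import Counter

# Hypothetical labels for the two partitions of the hybrid L1D cache.
STT_MRAM = "STT-MRAM"
SRAM = "SRAM"


def choose_partition(predicted_writes: int) -> str:
    """Route a block by its predicted write count (an assumed predictor output).

    Blocks expected to be written at most once are treated as WORM and go to
    STT-MRAM; blocks expected to be written repeatedly go to SRAM.
    """
    return STT_MRAM if predicted_writes <= 1 else SRAM


def place_blocks(predictions: dict[int, int]) -> dict[int, str]:
    """Map each block address to the partition chosen for it."""
    return {addr: choose_partition(writes) for addr, writes in predictions.items()}


# Made-up predictions: block address -> predicted number of writes.
example_predictions = {0x100: 1, 0x140: 5, 0x180: 0, 0x1C0: 2}
placement = place_blocks(example_predictions)
summary = Counter(placement.values())
print(placement)
print(summary)
```

The sketch only captures the routing decision. How FUSE derives its prediction from GPU runtime information, and how it approximates associativity in the STT-MRAM, is described in the paper itself and is not modeled here.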
