MEMSYS 2023

Welcome to MEMSYS23. We are happy to be back after three years of virtual conferences!

Keynotes:

Keynote: Memory Technologies — Truths, Myths, and Hype

Shekhar Borkar

Ever wondered why SRAM, DRAM, and Flash are the only three successful memory technologies today, despite decades of hype around emerging technologies with outrageous claims and promises? These emerging technologies either failed to deliver, overpromised, or misrepresented their benefits. In this talk we first discuss salient memory attributes such as energy, performance, persistence, and endurance, then describe how to measure these attributes for existing memory technologies, exposing some of the myths and hype. Next, we establish a system-level value metric for comparison, evaluate different memory technologies by measuring their attributes against it, and conclude with what is required for a memory technology to succeed!


Shekhar Borkar is a Sr. Director of Technology at Qualcomm Inc. He started his career with Intel Corp., where he worked on the 8051 family of microcontrollers, supercomputers, and research on high-performance, low-power digital circuits. He has authored over 100 peer-reviewed publications in conferences and journals, over 60 invited papers and keynotes, and five book chapters, and holds more than 60 issued patents. His research interests are low-power, high-performance digital circuits and system-level optimization.

Keynote: Next Steps in 3D Memory Systems

Paul Franzon

Modern machine learning workloads are increasing in scale at a rapid rate. For example, GPT-3 has 175 billion parameters, requiring 570 GB of storage, and GPT-4 is even larger. ML workloads are typically memory bandwidth constrained. For example, a saturated Google TPU can support four HBM channels operating at full capacity, and the Cerebras Wafer Scale Engine, designed for ML workloads, supports 20 PBps of memory bandwidth. The problem lies not just in inference but also in training. The cost of training is a serious issue, especially for large models, and can benefit from acceleration. Compute is made more difficult by the irregular nature of training and Transformer workloads. Processor-near-memory (PnM) paradigms, with high-density memory stacked on logic, are especially attractive options.

We describe multiple memory alternatives that address these issues and the resulting opportunities. All are enabled by 3DIC technology, with aggressive use of high-density Through-Silicon Vias (TSVs) and two-sided hybrid bonding. First, we modified the design of the Tezzaron 64 Gb DiRAM memory to expose over 130 Tbps of peak memory bandwidth. Second, we revisit the DRAM stack using more conventional bank designs and aggressive use of 3DIC technologies. Finally, we explore a mix of non-volatile memory and SRAM to enable a stack that can be built with accessible foundry technologies. In each instance the memory arrays are matched with a network layer and an array of SIMD processors to create a high-performance, high-capacity, power-efficient solution. The resulting solution is very area efficient and has the best power efficiency among programmable solutions.

Paul D. Franzon is currently the Cirrus Logic Distinguished Professor and the Director of Graduate Programs in the Department of Electrical and Computer Engineering at North Carolina State University. He is also a site director for the Center for Advanced Electronics through Machine Learning (CAEML). He earned his Ph.D. from the University of Adelaide, Adelaide, Australia. He has also worked at AT&T Bell Laboratories, DSTO Australia, Australia Telecom, Rambus, and four companies he cofounded: Communica, LightSpin Technologies, Polymer Braille Inc., and Indago Technologies. His current interests include applying machine learning to EDA, building AI accelerators, RFID, advanced packaging, heterogeneous integration, 2.5D and 3D ICs, and secure chip design. He has led several major efforts and published over 300 papers in these areas. He received an NSF Young Investigators Award in 1993, was selected to join the NCSU Academy of Outstanding Teachers in 2001, was named a Distinguished Undergraduate Alumni Professor in 2003, received the Alcoa Research Award in 2005 and the Board of Governors Teaching Award in 2014, and was named Distinguished Graduate Alumni Professor in 2021. He has received faculty awards from Qualcomm, IBM, Synopsys, and Google. He served in the Australian Army Reserve for 13 years as an infantry soldier and officer. He is a Fellow of the IEEE.

Monday, October 2nd

18:00 Welcome Reception

Tuesday, October 3rd

8:00 Breakfast in Restaurant
8:50 Opening Remarks
9:00 Keynote: Memory Technologies: Truths, Myths, and Hype, Shekhar Borkar
10:00 Break
Session 1: Applications
10:20 An Empirical Analysis on Memcached’s Replacement Policies
10:40 Large-scale Graph Processing on Commodity Systems: Understanding and Mitigating the Impact of Swapping
11:00 LLVM Static Analysis for Program Characterization and Memory Reuse Profile Estimation
11:20 Evaluating Gather Scatter Performance on CPUs and GPUs *
11:40 Writeback Modeling: Theory and Application to Zipfian Workloads *
12:00 Lunch
Session 2: Safety and Security
13:00 RAMPART: RowHammer Mitigation and Repair for Server Memory Systems
13:20 ECC-Map: A Resilient Wear-Leveled Memory-Device Architecture with Low Mapping Overhead
13:40 Error Detecting and Correcting Codes for DRAM Functional Safety
14:00 An LPDDR4 Safety Model for Automotive Applications *
14:20 Break
Session 3: Modeling and Simulation
14:40 Memory Workload Synthesis Using Generative AI
15:00 Multifidelity Memory System Simulation in SST
15:20 Modeling and Characterizing Shared and Local Memories of the Ampere GPUs
15:40 PPT-SASMM: Scalable Analytical Shared Memory Model *
16:00 Break
16:30 Spirited Discussion
19:00 TPC Dinner

Wednesday, October 4th

8:00 Breakfast in Restaurant
9:00 Keynote: Next Steps in 3D Memory Systems, Paul Franzon
10:00 Break
Session 4: Architecture
10:20 Efficient Mobility Centric Caching
10:40 Linear-Mark: Locality vs. Accuracy in Mark-Sweep Garbage Collection
11:00 The Feasibility of Utilizing Low-Performance DRAM in Disaggregated Systems
11:20 CLAM: Compiler Lease of Cache Memory *
11:40 Thoughts on Merging the File System with the Virtual Memory System
12:00 Lunch
Session 5: Processing in/near Memory
13:00 An In-Storage Processing Architecture with 3D NAND Heterogeneous Integration for Spectra Open Modification Search
13:20 Sadram: A New Memory Addressing Paradigm
13:40 Streaming Sparse Data on Architectures with Vector Extensions using Near Data Processing
14:00 PEPERONI: Pre-Estimating the Performance of Near-Memory Integration *
14:20 Break
Session 6: The Devil is in the Details
14:40 A Precise Measurement Platform for LPDDR4 Memories
15:00 Extending the Life of Old Systems with More Memory
15:20 Addressing DRAM Performance Analysis Challenges for Network-on-Chip (NoC) Design
15:40 Break
16:30 Spirited Discussion
19:00 Conference Dinner

Thursday, October 5th

8:00 Breakfast in Restaurant
Session 7: Prefetching and Paging
9:00 An Empirical Evaluation of PTE Coalescing
9:20 Building Efficient Neural Prefetcher
9:40 Protean: Resource-efficient Instruction Prefetching
10:00 Break
Session 8: Non-Volatile Memories
10:20 MC-ELMM: Multi-Chip Endurance-Limited Memory Management
10:40 Critical Issues in Advanced ReRAM Development
11:00 ENTS: Flush-and-Fence-Free Failure Atomic Transactions
11:20 Closing Remarks and Award Ceremony

Papers marked with * are pandemic papers, presented at virtual MEMSYS 2020, 2021, or 2022.