Google researchers have found that memory and interconnect, not compute power, are the primary bottlenecks for LLM inference, with memory bandwidth improvements lagging compute gains by 4.7x.
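To see why inference ends up bandwidth-bound rather than compute-bound, a roofline-style back-of-the-envelope helps. The sketch below uses hypothetical numbers (a 70B-parameter model in 16-bit weights, an accelerator with 1000 TFLOP/s and 3 TB/s of HBM bandwidth); none of these figures come from the Google study, they only illustrate the shape of the imbalance.

```python
# Illustrative roofline check of why LLM decode is memory-bound.
# All hardware and model numbers below are assumptions for illustration.

params = 70e9            # hypothetical 70B-parameter model
bytes_per_param = 2      # fp16/bf16 weights

# Each decode step streams every weight once and performs roughly
# 2 FLOPs per weight (multiply + add in the matrix-vector products).
flops_per_token = 2 * params
bytes_per_token = params * bytes_per_param

arithmetic_intensity = flops_per_token / bytes_per_token  # FLOPs per byte
print(f"arithmetic intensity: {arithmetic_intensity:.1f} FLOP/byte")

# Hypothetical accelerator: 1000 TFLOP/s compute, 3 TB/s memory bandwidth.
peak_flops = 1000e12
peak_bw = 3e12
balance_point = peak_flops / peak_bw  # FLOP/byte needed to be compute-bound
print(f"hardware balance point: {balance_point:.0f} FLOP/byte")

# Attainable throughput is the minimum of the compute roof and the
# bandwidth roof (bandwidth times arithmetic intensity).
attainable = min(peak_flops, peak_bw * arithmetic_intensity)
print(f"decode runs at {attainable / peak_flops:.1%} of peak compute")
```

With an arithmetic intensity of about 1 FLOP/byte against a balance point in the hundreds, the decode step leaves almost all of the compute idle, which is why faster memory and interconnect, not more FLOPs, move the needle.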
SK hynix says it has begun mass production of its 321-layer, 2-terabit (Tb) quad-level cell (QLC) NAND flash memory, the first QLC NAND built with more than 300 layers.
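The announced figures translate into per-die capacity with a little unit arithmetic. The sketch below assumes a binary terabit prefix (2^40 bits), which is an assumption about the marketing figure, not something stated in the announcement; QLC's 4 bits per cell is standard.

```python
# Unit arithmetic on the announced 2 Tb QLC die.
# Assumption: "2 Tb" uses the binary prefix (2**40 bits).

capacity_bits = 2 * 2**40   # 2 terabits
bits_per_cell = 4           # QLC stores 4 bits per cell

cells = capacity_bits // bits_per_cell
capacity_gb = capacity_bits / 8 / 2**30   # gigabytes per die

print(f"{cells:,} cells, {capacity_gb:.0f} GB per die")
```

Under that assumption each 2 Tb die works out to 256 GB, which is why QLC density gains matter for high-capacity SSDs.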