
NVIDIA GTC 2026 has just concluded, once again drawing global attention to “AI inference.” During the keynote, CEO Jensen Huang clearly stated that we have entered the era of the “Token Factory,” where computational demand has surged by a millionfold in just two years.
Amid this wave of computation, semiconductor architecture is undergoing an unprecedented transformation. Where the focus was once on raw compute power, memory performance has now become the decisive factor in the responsiveness and operational cost of Agentic AI.
A New Trend in Disaggregated Inference: When GPUs Meet LPUs
Among the discussions surrounding GTC, the most notable technological shift is “disaggregated inference.” NVIDIA has integrated technology from the Groq team to introduce the new Groq LP30 chip. The underlying logic is to overcome the “memory wall,” the bottleneck that arises when compute units sit idle waiting for data to arrive from memory.
The Vera Rubin GPU excels at the “prefill” stage, leveraging massive compute power to process large-scale context. In contrast, the Groq LPU specializes in the “decode” stage, focusing on token generation. Its core design abandons traditional high-bandwidth memory (HBM) in favor of massive on-chip SRAM.
Why SRAM? Because on-chip SRAM offers far lower access latency and higher bandwidth than off-chip memory, at the cost of much smaller capacity. During real-time AI interactions or code generation, data can flow rapidly within the chip without frequent trips to external memory. The capacity and stability of SRAM therefore directly determine the competitiveness of AI chips.
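To see why memory bandwidth dominates the decode stage, consider a rough back-of-the-envelope sketch. It assumes that each generated token requires streaming all model weights once, so the token rate of a single sequence is capped at bandwidth divided by model size. The function name and every number below are illustrative placeholders, not specifications of any NVIDIA or Groq product.

```python
def decode_steps_per_second(weight_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Upper bound on decode steps per second for one sequence.

    Simplifying assumption: every decode step reads all weights exactly once,
    and memory bandwidth (not compute) is the only limit.
    """
    if weight_bytes <= 0 or bandwidth_bytes_per_s <= 0:
        raise ValueError("weight_bytes and bandwidth_bytes_per_s must be positive")
    return bandwidth_bytes_per_s / weight_bytes


if __name__ == "__main__":
    # Hypothetical model: 70 billion parameters stored at 1 byte each.
    MODEL_BYTES = 70e9
    # Hypothetical bandwidth figures, chosen only to show the scaling.
    scenarios = (("off-chip memory", 7e12), ("aggregate on-chip SRAM", 70e12))
    for label, bandwidth in scenarios:
        rate = decode_steps_per_second(MODEL_BYTES, bandwidth)
        print(f"{label}: at most {rate:.0f} tokens/s per sequence")
```

Under these assumptions the bound scales linearly with bandwidth, which is why a design that keeps weights in fast on-chip memory can raise per-user token rates even when its raw compute is unchanged.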
The Unexpected Comeback of 2D NAND: Behind the Surge in MLC Prices
While advanced process nodes aggressively pursue SRAM and HBM, the mature-node memory market is experiencing unexpected disruption. According to the latest industry insights, niche memory segments are seeing a peculiar phenomenon of “price doubling.”
Major players such as Samsung and Micron are accelerating their exit from low-capacity, legacy 2D NAND production lines to reallocate capacity toward higher-margin products for AI applications, such as 300+ layer 3D NAND. As a result, the MLC NAND that networking devices, digital TVs, and set-top boxes depend on is facing severe supply shortages. Supply in 2026 is expected to drop by half, leaving a market gap of 30% to 40%.
This phenomenon underscores that even in non-core AI applications, memory reliability and stable supply remain critical to business survival. In the face of unprecedented NAND price hikes, ensuring the flawless quality of every shipped memory unit has become a key determinant of profitability for IC design houses and manufacturers.
Chip Yield: The Hidden Moat in the AI Era
Whether it is the SRAM-centric LPU highlighted at GTC or the increasingly scarce MLC NAND, both share a common technical challenge: yield and reliability.
As process technologies advance to N3P and beyond, SRAM occupies a growing share of chip area, so a random manufacturing defect is increasingly likely to land in a memory array. If an expensive AI chip is scrapped due to faulty SRAM, the financial loss can be enormous. This is why the investment market has recently shown strong interest in testing technologies. In the pursuit of extreme performance, “test and repair” has evolved from a supporting role into a fundamental safeguard of chip design quality.
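The link between SRAM area, defects, and repair can be illustrated with the classic Poisson yield model, in which the expected number of defects is area times defect density. The sketch below is a simplification: it assumes every defect falls in a repairable memory array and that each spare element fixes exactly one defect. The function name and the sample values are illustrative, not foundry data.

```python
import math


def poisson_yield(area_cm2: float, defect_density_per_cm2: float,
                  repairable_defects: int = 0) -> float:
    """Probability that a die has at most `repairable_defects` defects.

    Defect count is modeled as Poisson with mean = area * defect density.
    With repairable_defects = 0 this reduces to the classic exp(-A * D) yield.
    """
    mean_defects = area_cm2 * defect_density_per_cm2
    return sum(
        math.exp(-mean_defects) * mean_defects ** k / math.factorial(k)
        for k in range(repairable_defects + 1)
    )


if __name__ == "__main__":
    # Hypothetical SRAM area and defect density, for illustration only.
    AREA_CM2 = 2.0
    DEFECT_DENSITY = 0.5
    for spares in (0, 1, 2):
        y = poisson_yield(AREA_CM2, DEFECT_DENSITY, spares)
        print(f"repairable defects = {spares}: yield = {y:.3f}")
```

In this model, growing the SRAM area lowers the unrepaired yield exponentially, while each additional repairable defect recovers part of the loss, which is the economic argument for building repair into the design.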
Intelligent and Flexible Memory Test & Repair Technologies Become Profit Drivers
Amid the surge of AI inference demand and memory shortages, iSTART-TEK has become a key guardian of SRAM yield through its long-established expertise in memory test and repair technologies.
If standard algorithms represent the baseline defense, iSTART-TEK’s User-Defined Algorithm (UDA) platform serves as a flexible R&D environment. Through a graphical interface and proprietary Testing Elements Change (TEC) technology, engineers can design customized test strategies for specific conditions (such as high temperature or low voltage) without writing complex code.
Whether the target is a complex leakage defect or one of the many memory fault types, such as stuck-at faults (SAF) and deceptive read destructive faults (DRDF), UDA enables comprehensive coverage and helps chips operate stably even under harsh conditions.
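For readers unfamiliar with how a memory test algorithm catches such faults, the following is a minimal software sketch of the widely used March C- algorithm run against a simulated bit memory with injected stuck-at faults. It is a generic textbook illustration, not iSTART-TEK's UDA or TEC implementation; the `BitMemory` class and `run_march` function are hypothetical names introduced here.

```python
from typing import Dict, List, Optional, Sequence, Tuple


class BitMemory:
    """Simulated 1-bit-wide memory with optional stuck-at cells (hypothetical model)."""

    def __init__(self, size: int, stuck_at: Optional[Dict[int, int]] = None):
        self.size = size
        self.stuck_at = dict(stuck_at or {})  # address -> forced value (0 or 1)
        self.cells = [0] * size

    def write(self, addr: int, value: int) -> None:
        # A stuck-at cell ignores the written value.
        self.cells[addr] = self.stuck_at.get(addr, value)

    def read(self, addr: int) -> int:
        return self.stuck_at.get(addr, self.cells[addr])


# March C-: each element is (address order, operations applied to every cell).
# "w0"/"w1" write a value; "r0"/"r1" read and expect that value.
MARCH_C_MINUS: Sequence[Tuple[str, List[str]]] = (
    ("up", ["w0"]),
    ("up", ["r0", "w1"]),
    ("up", ["r1", "w0"]),
    ("down", ["r0", "w1"]),
    ("down", ["r1", "w0"]),
    ("up", ["r0"]),
)


def run_march(mem: BitMemory,
              elements: Sequence[Tuple[str, List[str]]] = MARCH_C_MINUS) -> List[int]:
    """Run the march elements and return the sorted addresses that failed a read."""
    failing = set()
    for order, ops in elements:
        addrs = range(mem.size) if order == "up" else range(mem.size - 1, -1, -1)
        for addr in addrs:
            for op in ops:
                value = int(op[1])
                if op[0] == "w":
                    mem.write(addr, value)
                elif mem.read(addr) != value:
                    failing.add(addr)
    return sorted(failing)


if __name__ == "__main__":
    faulty = BitMemory(8, stuck_at={3: 1, 5: 0})
    print("failing addresses:", run_march(faulty))
```

This sketch models only stuck-at faults. A read-disturb fault such as DRDF typically needs additional back-to-back read operations that March C- does not contain, which is one reason the ability to define custom test sequences matters.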
At the same time, to address the growing complexity of SRAM structures in advanced nodes, iSTART-TEK has introduced MART (MBIST Algorithm Recommendation Tool), which incorporates AI into the test flow. This overcomes the limitations of traditional lookup tables and rigid test selection processes.
With simple interactive inputs, the system performs AI-weighted analysis based on factors such as power, area, and yield. This significantly reduces decision-making costs and DPPM (defective parts per million), turning BIST algorithm selection from experience-driven judgment into intelligent, precise recommendation and helping customers secure time-to-market advantages in the highly competitive AI chip landscape.
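As a purely conceptual illustration of weighted multi-factor selection, the sketch below ranks candidate test algorithms by a weighted sum of normalized scores. This is an assumption made for illustration only and does not describe how MART works internally; the function name, the candidate names `algo_a` and `algo_b`, and all scores are made-up placeholders.

```python
from typing import Dict, List


def rank_algorithms(candidates: Dict[str, Dict[str, float]],
                    weights: Dict[str, float]) -> List[str]:
    """Return candidate names sorted by weighted score, best first.

    Each candidate maps factor name -> score in [0, 1], where higher is better
    (for example, a high "power" score means low test power).
    """
    def score(metrics: Dict[str, float]) -> float:
        return sum(weight * metrics[factor] for factor, weight in weights.items())

    return sorted(candidates, key=lambda name: score(candidates[name]), reverse=True)


if __name__ == "__main__":
    # Placeholder scores, not measurements of any real algorithm.
    candidates = {
        "algo_a": {"power": 0.9, "area": 0.9, "yield": 0.7},
        "algo_b": {"power": 0.5, "area": 0.7, "yield": 0.95},
    }
    yield_first = {"power": 0.1, "area": 0.1, "yield": 0.8}
    power_first = {"power": 0.6, "area": 0.2, "yield": 0.2}
    print("yield-weighted ranking:", rank_algorithms(candidates, yield_first))
    print("power-weighted ranking:", rank_algorithms(candidates, power_first))
```

The point of the sketch is only that changing the weights changes the recommendation, so the priorities of a given design (low power versus maximum defect coverage, for instance) can lead to different algorithm choices.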