At the recent Silicon 100 Summit, Intel senior vice president Jim Keller gave a keynote titled “Moore’s Law isn’t dead and if you think that, you’re stupid.” Well, then call me stupid, because I’ve been arguing for several years that Moore’s Law is dying, and making the case that its decline will have a profound impact on semiconductor, system, and data center innovation and will dictate the new development strategies required to succeed. In fact, my #1 prediction for 2019 was that Moore’s Law would be declared officially dead.
But before we go any further, it’s important to define what we mean by Moore’s Law. For that, it’s useful to go back to Gordon Moore’s original article, “Cramming More Components onto Integrated Circuits,” published in 1965 in Electronics magazine. It was in this article that Moore first publicly made his now-famous observation: the number of transistors on a chip had been doubling every twelve months, and he predicted that this trend would continue for at least the next ten years. This observation and prediction proved so prescient that it was eventually accorded “LAW” status. (Later, in 1975, citing the “end of cleverness,” Moore adjusted his Law to state that the doubling would occur every two years.)
Having started my career as a new process development engineer in the wafer fabs, this notion that there was a law that transistor count would keep doubling always irked me, as it seemed to imply that this was some sort of fundamental law of nature, and thus undercut the tremendous R&D and engineering effort required to stay on this exponential path of advancement. Nonetheless, all of us in the fabs struggled mightily to stay on this ever more complicated path.
So, if viewed through a narrow lens, Moore’s Law might be defined as the prediction of chip device density doubling every two years. But Gordon said much more, noting that simple, regular improvements in photolithography were all that was needed to maintain this doubling:
“It is not even necessary to do any fundamental research or to replace present processes. Only the engineering effort is needed.”
So if you can shrink the minimum feature size by about 30% every two years, each linear dimension scales by 0.7, and the result is a 2D chip that is half the size (0.7 * 0.7 ≈ 0.5), or a chip with double the transistor count for a fixed die size.
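The arithmetic behind that shrink can be checked in a few lines. This is only a sketch: the function names are made up for illustration, and the 0.7 linear shrink per node is the round figure used above, not a measured value.

```python
# Illustrative helpers (hypothetical names); 0.7 is the per-node linear
# shrink assumed in the text, not a measured process figure.

def area_scale(linear_shrink: float) -> float:
    """Relative area of a 2D layout after each linear dimension is
    multiplied by linear_shrink."""
    return linear_shrink ** 2


def density_gain(linear_shrink: float) -> float:
    """How many times more devices fit in a fixed die area."""
    return 1.0 / area_scale(linear_shrink)


shrink = 0.7
print(f"relative die area for the same design: {area_scale(shrink):.2f}")
print(f"devices in a fixed die area: {density_gain(shrink):.2f}x")
```

With a 0.7 shrink the same design takes roughly half the area, which is the same thing as fitting about twice the devices into an unchanged die.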
But Gordon went well beyond this, predicting that this doubling would also have the benefits of faster devices that consumed less power:
“In fact, shrinking dimensions on an integrated structure makes it possible to operate the structure at higher speed for the same power per unit area.”
But perhaps most importantly, Moore’s Law also predicted that the reduction in cost per device would follow this same exponential trend. As Gordon himself later said: “Moore’s Law is really about economics,” and he clearly understood the impact that this would have on society, including in his original 1965 article this wonderful figure:
I just love this image because it demonstrates what a truly mind-blowing prediction Gordon made and just how visionary he was. Today we take computers that you can slide into your pocket for granted, but in 1965, when computers were the size of a house and required massive cooling plants to keep them running, the idea of a “handy home computer” was truly fantastic.
In fact, many concurrent innovations are required to achieve each transition to the next process node. Meeting these challenges requires not just advances in photolithography but solutions across the entire range of engineering and physics problems.
The device physics that underpins Moore’s Law was first clearly articulated by Robert Dennard in 1974, in an article where he outlined the various device parameters and how they needed to scale together. He described that as a MOSFET shrinks exponentially it also becomes:
The net takeaway from Dennard scaling is truly remarkable. If you scale things correctly, everything just gets better! Smaller, faster, cheaper. And even though the number of transistors per unit area increases by the square of the scaling factor, the power of each device decreases by precisely the same square-law amount. The net result is that power density stays constant, so you get more devices, each of them faster, in the same area and the same power budget, at the same cost. In short, Dennard scaling explains Moore’s Law at a fundamental level of device physics, but it also points out all the things that need to be scaled simultaneously to remain on track.
And Moore’s Law and the associated Dennard scaling have been wildly successful. Even as transistor counts repeatedly doubled every two years, the semiconductor cost per unit area has remained relatively constant over many decades at about one billion dollars per acre (the only thing I know of that is more expensive than Silicon Valley real estate).
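To make the "constant power density" claim concrete, here is a minimal sketch of the textbook constant-field scaling rules usually associated with Dennard's paper. The class and function names are my own, all values are relative to the previous node, and the first-order relations used (delay ≈ CV/I, power ≈ VI) are simplifications rather than a device model.

```python
from dataclasses import dataclass


@dataclass
class ScaledDevice:
    """Device parameters relative to the previous node (1.0 = unchanged)."""
    dimension: float
    voltage: float
    current: float
    capacitance: float
    delay: float
    power_per_device: float
    devices_per_area: float
    power_density: float


def dennard_scale(k: float) -> ScaledDevice:
    """Apply constant-field scaling by a factor k > 1 (hypothetical helper).

    Dimensions, voltage, current and capacitance are each assumed to
    scale as 1/k; everything else follows from those four.
    """
    dimension = 1.0 / k
    voltage = 1.0 / k
    current = 1.0 / k
    capacitance = 1.0 / k
    delay = capacitance * voltage / current      # ~ C*V/I  -> 1/k
    power_per_device = voltage * current         # ~ V*I    -> 1/k^2
    devices_per_area = 1.0 / dimension ** 2      #          -> k^2
    power_density = power_per_device * devices_per_area  # -> 1
    return ScaledDevice(dimension, voltage, current, capacitance, delay,
                        power_per_device, devices_per_area, power_density)


node = dennard_scale(1 / 0.7)  # one node at the 0.7 linear shrink
print(f"delay per device:  {node.delay:.2f}")
print(f"power per device:  {node.power_per_device:.2f}")
print(f"devices per area:  {node.devices_per_area:.2f}")
print(f"power density:     {node.power_density:.2f}")
```

The per-device power drops by the same factor that the device count rises, so their product stays at 1. If any one of the assumed 1/k knobs stops scaling (voltage is the usual culprit), that cancellation no longer holds.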
It is this larger, economic part of Moore’s Law which has made it the powerful driving force behind the impact technology has made in our lives over the last 50+ years.
Thus, if you define Moore’s Law as simply the doubling of devices every couple of years, achieved by any means imaginable, then you can make a sort of weak argument that it will continue. I will stipulate that technology will continue to progress. Jim is a smart guy and I think that was the point he was making.
But 3D memory processes, multichip packaging, through-silicon vias, die stacking, and new vertical FinFET processes definitely don’t meet the criterion that Gordon originally stated: that no fundamental research or replacement of existing processes is needed. Just ask the thousands of engineers involved in driving these chip technologies forward.
So call me stupid if you will, but Moore’s Law is Dead.
… at least Moore’s Law writ large, in the way Gordon originally articulated it.
Former Intel CTO Justin Rattner admitted as much when talking about the end of the classical phase of Moore’s Law in an interesting conversation at the Computer History Museum back in 2013, saying: “for Intel, Moore’s Law for silicon gate MOS technology ended 3 generations back…”
At some point, one or more of these Dennard knobs hits practical limits, which puts stress on the overall scaling that underpins Moore’s Law. As dimensions shrink and geometries approach those of atomic diameters, second order electrical and even quantum effects become dominant. At some point, these second order effects require fundamental changes in process technology to stay on the Moore’s Law progression. In fact, such roadblocks were encountered in the previous decade, forcing the silicon foundries to abandon the standard silicon gate planar process and the simple “engineering effort” that served the first four decades of chips.
So achieving each process step has become increasingly difficult, and key parts of both Moore’s Law and Dennard Scaling have broken. For example, the doubling of device performance stalled around 2002 and chips simply have not gotten much faster since.
Individual transistors have continued to speed up; however, at the chip level, practical issues like noise, crosstalk, clock distribution and jitter, and on-chip variation have all conspired to limit continued performance improvement. But even as processor clock speeds stalled, parts of Moore’s Law marched along. Indeed, just a few years later, in 2005, AMD introduced its first multi-core processors, followed closely by Intel. So we no longer got faster processors but instead more and more of them.
But other parts of Dennard Scaling have hit a wall too – in particular as vertical tunneling of electrons through gate oxides became significant, constant power density collapsed, and chips consumed more and more power. This in turn required intricate and expensive packaging and cooling technologies at both the chip and system level.
And even allowing for jumping through all sorts of process hoops to maintain the crushing requirements imposed by any exponential law, former Intel CEO Brian Krzanich admitted on Intel’s Q2 2015 earnings call that the Moore’s Law Tick-Tock “cadence today is closer to 2.5 years than two.”
This “BK Brake” was a stunning admission by the Intel CEO, but in fact it turned out to be wildly optimistic about the actual delays that would occur in moving from the 14nm to the 10nm node. It also highlighted the Tick-Tock model, one of the vaunted advantages that Intel pioneered to take advantage of Moore’s Law.
To understand this Tick-Tock model of chip development you first need to understand how chips were developed before Intel invented it. In the time before Tick-Tock, process engineers, CPU architects, and chip designers were all part of one big, fully integrated team and moved in lock step together. A big new program was kicked off that would have a new chip architecture built on a brand-new process node. This created giant leaps in performance with each new product, but had a couple of big flaws. First of all you would only get a new product out every two years as shown in the diagram below:
The second problem is that it requires solving two extraordinarily complex problems at the same time: bringing up a new manufacturing process with good yields in high volume while simultaneously bringing up and debugging a new architecture. Solving two problems concurrently is always much more difficult and riskier than solving just one. It is like the adage of the man who chases one rabbit and has supper versus the man who chases two rabbits and starves.
But once you become fully convinced and committed to the two-year cadence of new process nodes promised by Moore’s Law, a new model becomes possible. This fundamental business model innovation takes advantage of parallel, rather than serial development.
The TICK-TOCK-TICK-TOCK model of chip development simply staggers the process and architecture development by one year. Each development still requires the same amount of time, but because they are staggered it allows a once a year cadence of new product introduction. One year the new chip is a process shrink of an existing architecture. The next year the new chip uses the same process node but introduces a new architecture.
The beauty of this chip development model is that it is a business innovation that takes advantage of a classic pipeline that hides latency to achieve throughput.
By staggering the development of a new architecture by one year from that of a new process, you can pipeline the introduction of new products. That is, you can hide the two-year development cycles of architecture and process by overlapping them. So even though it takes around two years to come up with a new architecture, and roughly two years to develop a new process, you still manage to introduce a new product every year. By adopting this Tick-Tock development pipeline, you not only get twice as many new products in a given time but also reduce the risk by tackling only one of the two new developments with each new product.
In a nutshell, this is the Tick-Tock model of chip development, and it worked extremely well for many, many years. But like all pipelines, its throughput is limited by the slowest stage, and unfortunately one of the stages has slowed to a crawl. And now Moore’s Law is Dead … and with it, the Tick-Tock Model.
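The staggering can be sketched as a toy schedule. This is an illustration only: the function names, the eight-year horizon, and the exact two-year development time with a one-year offset are assumptions taken from the round numbers above, not from any real product roadmap.

```python
# Toy launch schedules (hypothetical helpers). Years are counted from the
# start of the first development program.

def coupled_launches(horizon: int, dev_time: int = 2) -> list[tuple[int, str]]:
    """One combined program: new process and new architecture ship together."""
    return [(year, "new process + new architecture")
            for year in range(dev_time, horizon + 1, dev_time)]


def tick_tock_launches(horizon: int, dev_time: int = 2,
                       offset: int = 1) -> list[tuple[int, str]]:
    """Two tracks of the same length, with architecture offset from process."""
    ticks = [(year, "tick (process shrink)")
             for year in range(dev_time, horizon + 1, dev_time)]
    tocks = [(year, "tock (new architecture)")
             for year in range(dev_time + offset, horizon + 1, dev_time)]
    return sorted(ticks + tocks)


for year, kind in tick_tock_launches(8):
    print(year, kind)
print("coupled launches in 8 years:  ", len(coupled_launches(8)))
print("tick-tock launches in 8 years:", len(tick_tock_launches(8)))
```

Each track still takes two years per program, yet once the pipeline is full a product ships every year, alternating between a shrink and a new architecture.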
That is five successive architectural TOCK products without a new process TICK. Now to be fair this problem is not unique to Intel. All chip companies are experiencing the too much TOCK problem, but its impact on Intel is particularly pronounced. After all Intel originated the TICK-TOCK model of processor development more than a decade ago, and this cadence became a core part of their business strategy.
But all chip companies share the same laws of semiconductor physics, and as we approach the atomic limits of scaling, Moore’s Law has slowed to a crawl for everyone. So with the end of scaling comes the end of Moore’s Law (although I would argue it is the laws of economics at least as much as physics that are contributing to its demise). And for a whole bunch of different reasons, it simply won’t be possible anymore to introduce a new manufacturing process every two years. So even if it is not dead, Moore’s Law is deeply fractured. And unfortunately, the Tick-Tock model has broken along with it: it has been stuck at 14nm, with 4 or 5 new devices now coming out in the same process node on a TOCK-TOCK-TOCK cadence.
So call me stupid, but rather than deny the demise of Moore’s Law, I prefer to embrace this new reality, understand the implications, and develop new strategies to cope with this brave new world. In my next blog I’ll discuss the implications and recommend some strategies to thrive in the post-Moore’s Law world.