Energy efficiency is (if you’ll pardon the pun) a hot topic. Foundries and semiconductor manufacturers now trumpet their power-saving initiatives with the same fervor they once reserved for clock speed and raw performance improvements. AMD is no exception to this trend, and the company has just published a new white paper detailing the work it’s doing as part of its "25x20" project, which aims to improve performance per watt by 25x within five years.
If you’ve followed our discussions on microprocessor trends and general power innovation, much of what the paper lays out will be familiar. The paper steps through hUMA (heterogeneous Uniform Memory Access) and the overall advantages of HSA, as well as the slowing rate of power improvements delivered strictly by foundry process shrinks. The most interesting area for our purposes is the additional information AMD is offering around Adaptive Voltage and Frequency Scaling, or AVFS. Most of these improvements are specific to Carrizo — the Carrizo-L platform doesn’t implement them.
AVFS vs DVFS
There are two primary methods of conserving power in a microprocessor — the aforementioned AVFS, and Dynamic Voltage and Frequency Scaling, or DVFS. Both AMD and Intel have made use of DVFS for over a decade. DVFS uses what’s called open-loop scaling. In this type of system, the CPU vendor determines the optimal voltage for the chip based on the target application and frequency. DVFS is not calibrated to any specific chip — instead, Intel, AMD, and other vendors build a statistical model that predicts what voltage a chip that’s already verified as good will need in order to operate at a given frequency.
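To make the open-loop idea concrete, here is a minimal sketch of a DVFS-style P-state table in Python. The frequency and voltage values are invented for illustration; a real table comes out of the vendor’s statistical model and already includes the guard bands discussed below.

```python
# Minimal sketch of open-loop DVFS: each frequency step gets a voltage fixed
# at design time from a statistical model, not measured on the individual chip.
# These P-state values are illustrative, not AMD's or Intel's actual figures.
DVFS_TABLE = [
    # (frequency_mhz, voltage_v) - voltage already includes worst-case margin
    (800,  0.90),
    (1600, 1.00),
    (2400, 1.10),
    (3200, 1.20),
]

def select_pstate(requested_mhz):
    """Return the lowest table entry that satisfies the requested frequency."""
    for freq, volts in DVFS_TABLE:
        if freq >= requested_mhz:
            return freq, volts
    return DVFS_TABLE[-1]  # clamp to the highest P-state

print(select_pstate(2000))  # -> (2400, 1.1)
```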
DVFS is always designed with a significant amount of voltage margin built in. A CPU’s operating temperature affects its voltage requirements, and since AMD and Intel don’t know whether any given SoC will be operating at 40C or 80C, they tune the DVFS model to ensure the chip won’t destabilize. In practice, this means margins of 10-20% at any given point.
According to AMD’s Sam Naffziger, this margin corresponds to a great deal of waste, because modern CPU power consumption tends to scale with the square of voltage (and closer to the cube once leakage is considered). A 10% increase in voltage is therefore roughly a 21% increase in dynamic power consumption (1.1 squared is 1.21), and more once leakage is factored in.
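A quick back-of-the-envelope calculation makes that cost concrete. It uses the standard dynamic-power approximation (power proportional to voltage squared at a fixed frequency), with a cubic term as a rough stand-in for leakage:

```python
# Why a 10% voltage guard band is expensive: dynamic power scales roughly with
# V^2, and with leakage included the effective exponent moves closer to 3.
v_nominal, v_with_margin = 1.00, 1.10

dynamic_increase = (v_with_margin / v_nominal) ** 2 - 1  # ~0.21 -> ~21% more power
with_leakage = (v_with_margin / v_nominal) ** 3 - 1      # ~0.33 with a cubic model

print(f"{dynamic_increase:.0%}, {with_leakage:.0%}")  # "21%, 33%"
```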
AVFS, in contrast, uses a closed-loop system in which on-die hardware mechanisms manage the voltage — by taking real-time measurements of the junction temperature and current frequency, and adjusting the voltage to match them. This method eliminates the power waste discussed above by eliminating the traditional guard bands that are required to ensure proper operation of every piece of silicon.
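To show the closed-loop idea, here is a toy adjustment step written in Python. Real AVFS runs in on-die hardware against critical-path timing monitors; the sensor values, step size, and margin target below are hypothetical placeholders, not anything AMD has published.

```python
# Toy closed-loop voltage adjustment: read how much timing margin the chip has
# right now and nudge the voltage toward the minimum that keeps it positive.
VOLTAGE_STEP_V = 0.00625        # one regulator step (illustrative value)
MIN_V, MAX_V = 0.70, 1.30       # hard safety limits (illustrative)

def avfs_step(current_v, timing_margin_ps, target_margin_ps=20.0):
    """Return the next voltage setting given the measured timing margin."""
    if timing_margin_ps < target_margin_ps:
        current_v += VOLTAGE_STEP_V      # paths are running too close: raise V
    elif timing_margin_ps > 2 * target_margin_ps:
        current_v -= VOLTAGE_STEP_V      # comfortable margin: shave some V off
    return min(max(current_v, MIN_V), MAX_V)

print(avfs_step(1.00, timing_margin_ps=55.0))  # margin is generous, so step down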
There’s another advantage of AVFS that Naffziger doesn’t mention, though it’s not clear how relevant this is to AMD’s own interests: It can reduce the impact of process variation.
One of the differences between semiconductor manufacturing and more mundane products is that a semiconductor manufacturer doesn’t really “know” what kind of chips it built until it tests them. Every wafer of chips will have its own unique characteristics — some chips will use less voltage than others, some will hit higher clock speeds, and some won’t work at all. A well-known SoC design built on a mature node will suffer much less variation than a cutting-edge chip on brand-new process technology. But there’s always some degree of variance.
This variance is sometimes referred to as the process strength.
AMD illustrates this with a graph showing three types of devices. The “Weak” device runs at 800MHz and draws the least power, but has the lowest available clock headroom; this chip won’t overclock well at all. A “Nominal” device draws slightly more power, but has more frequency headroom, should the manufacturer choose to use it. A “Strong” device has the highest potential frequency headroom and requires less voltage to hit its maximum clock than a “Nominal” device, but its overall power consumption is higher at the same frequency and voltage because it leaks more. Under DVFS, the manufacturer would fix each of these CPUs at 1V — the amount of voltage the weakest chip needs in order to ensure stable operation.
AVFS, on the other hand, offers more options. By measuring and adjusting the voltage in real time, AVFS can determine that only the “Weak” chip needs a full 1V to operate. The “Nominal” chip can run at 0.95V, while the Strong chip can actually run at 0.9V. AVFS thus compensates for variation in process technology to ensure a more uniform product experience, and may improve yields as well.
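Plugging those illustrative voltages into the same voltage-squared approximation for dynamic power gives a rough sense of what per-die tuning buys relative to a fixed 1V DVFS setting:

```python
# Rough estimate of per-die savings versus a one-size-fits-all 1V setting,
# using the illustrative 0.90/0.95/1.00V figures and a simple P ~ V^2 model.
dvfs_voltage = 1.00  # every die gets the worst-case voltage under DVFS

for name, avfs_voltage in [("weak", 1.00), ("nominal", 0.95), ("strong", 0.90)]:
    saving = 1 - (avfs_voltage / dvfs_voltage) ** 2
    print(f"{name:8s} die: ~{saving:.0%} dynamic power saved")
# weak: ~0%, nominal: ~10%, strong: ~19%
```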
Will AVFS give Carrizo a boost?
After years of promises from AMD with precious little to show for the company’s CPU efforts (unless you count the collapse of its core business as a victory), enthusiasts are understandably skeptical about what Carrizo will offer. The good news on this front is that AVFS isn’t just an idea AMD pulled out of its hat. It’s a well-understood tradeoff that adds design complexity to the chip, but it can help compensate for foundry variation and is generally believed to reduce power consumption by at least the 20% figure AMD is claiming. Some published figures claim benefits of up to 45% depending on the workload (bear in mind that these come from very different chips and target markets).
Reducing power consumption by 20% should allow for better battery life, but it’s not clear yet how this power tuning will affect performance. In theory, AMD should be able to hit higher boost frequencies, but the white paper notes that AMD “has designed power management algorithms to optimize for typical use rather than peak computation periods that only occur (briefly) during the most demanding workloads. The result is a number of race-to-idle techniques to put a computer into sleep mode as frequently as possible to reduce average energy use.”
AMD is also baking in new support for the S0i3 idle state with Carrizo. S0i3 idles the chip in a deeper sleep than previous modes, and this should improve overall laptop battery life when the system isn’t in use.
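The race-to-idle logic is easy to see with a small model. All of the numbers below are invented for illustration; the point is simply that when fixed platform power dominates, finishing the work quickly and dropping into a deep idle state like S0i3 can use less total energy than stretching the same work across the whole interval:

```python
# Back-of-the-envelope race-to-idle comparison with made-up numbers: finish a
# fixed amount of work fast and sleep, versus spreading it across the window.
WINDOW_S = 10.0      # length of the interval being measured
WORK_UNITS = 10.0    # computation that must finish within that interval
PLATFORM_W = 3.0     # power burned just by being awake (illustrative)
SLEEP_W = 0.2        # deep-idle power, e.g. an S0i3-like state (illustrative)

def energy_joules(core_w, units_per_s):
    active_s = WORK_UNITS / units_per_s
    return (core_w + PLATFORM_W) * active_s + SLEEP_W * (WINDOW_S - active_s)

print(energy_joules(core_w=6.0, units_per_s=2.0))  # race to idle: 46.0 J
print(energy_joules(core_w=2.5, units_per_s=1.0))  # slow and steady: 55.0 J
```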
It’s an open question, at this point, whether these techniques and strategies are flexible enough to enable higher performance during those high-use periods while still cutting overall power consumption. While a few Carrizo benchmarks have leaked to date, they aren’t very useful, at least without knowing more about the power and frequency bands the leaked chips are targeting — especially since leaked data is based on engineering samples, and may not be representative of final performance.
Here’s what I expect overall: Carrizo includes a number of power management techniques that are generally touted as reducing power consumption. It integrates more components on-die (another power reduction measure) and it offers support for idle power modes that previous AMD chips couldn’t use. When you pack all these improvements together, it’s reasonable to assume Carrizo will offer significant improvements in battery life. Exactly how much will depend on the OEMs themselves — we’ve seen plenty of evidence around Core M to illustrate that the decisions OEMs make around cooling and components have drastic impacts on the devices themselves.
Performance information suggests that Carrizo’s top-end chips may be slightly faster than Kaveri, but AMD’s own guidance suggests that the chip will be strongest in the low power bands below 35W. If Carrizo mimics Kaveri in this regard, it will see significant improvements in the 15-20W range, but may only tie its predecessor’s performance at the 35W band.
After talking to Sam Naffziger, we don’t expect these improvements and capabilities to be a one-off. AMD has yet to reveal any specifics of its plans for Zen in this regard, but there’s every reason to think we’ll see future chips leveraging capabilities like AVFS as well.