Wednesday, September 10, 2014

Apple announces iPhone 6 and iPhone 6 Plus using Apple A8 SoC

As expected, Apple announced several new products on September 9, most prominently the iPhone 6 and iPhone 6 Plus smartphones.

The iPhone 6 has a 4.7" LCD screen with a non-standard 750x1334 resolution (slightly higher than 720p), designed for convenient pixel scaling for existing iOS apps. The iPhone 6 Plus has a 5.5" LCD screen with standard 1080x1920 (1080p) resolution. Because iOS has only limited ability to scale pixels dynamically, apps on the iPhone 6 Plus are, for compatibility, technically rendered at three times scale (from the Apple standard 414x736 to 1242x2208) and then downscaled to 1080p.
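The render-then-downscale arithmetic for the iPhone 6 Plus can be checked with a few lines of Python. This is only a sketch using the figures quoted above; the function and variable names are made up for illustration and do not correspond to any iOS API.

```python
# Figures taken from the text; names are illustrative only.
LOGICAL_SIZE = (414, 736)    # Apple's logical resolution for the iPhone 6 Plus
RENDER_SCALE = 3             # apps are rendered at three times scale
PANEL_SIZE = (1080, 1920)    # physical 1080p panel

def rendered_size(logical, scale):
    """Size of the intermediate image before downscaling."""
    return (logical[0] * scale, logical[1] * scale)

def downscale_factor(rendered, panel):
    """How much the intermediate image shrinks along each axis."""
    return (rendered[0] / panel[0], rendered[1] / panel[1])

rendered = rendered_size(LOGICAL_SIZE, RENDER_SCALE)
factors = downscale_factor(rendered, PANEL_SIZE)

print(rendered)  # (1242, 2208)
print(factors)   # both axes shrink by the same factor of about 1.15
```

The downscale factor is the same on both axes, so the aspect ratio is preserved, but because it is not an integer the rendered pixels do not map one-to-one onto panel pixels.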

Both models are thin smartphones with a thickness of about 7mm. Connectivity has improved, with more LTE bands and support for VoLTE (voice over LTE) and 802.11ac WiFi. LTE is limited to Category 4, unlike certain high-end competitive devices such as the Galaxy Alpha and Galaxy Note 4 that support LTE Category 6.

Apple A8 at 20nm: Modest performance and power improvements


The process improvement from 28nm to 20nm facilitates a transistor count increase (estimated to have doubled from one billion to two billion), resulting in a relatively large chip, even at 20nm, although the reported die size of 89mm² is smaller than the 104mm² of the Apple A7. However, performance improvement compared to the Apple A7 is reported to be relatively moderate. CPU performance improvement quoted by Apple is 25%, with GPU performance (reported to be powered by a PowerVR GX6450 with four clusters) being improved by 50%. Early Geekbench test results show a modest CPU processing improvement (from around 1400 single-core/2500 dual-core on the iPhone 5S to roughly 1630/2920 on the iPhone 6), aided by a relatively minor CPU clock speed increase.
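To put these numbers side by side, the following Python sketch computes the relative Geekbench gains and the implied transistor density. It uses only the figures quoted above; the transistor counts are the rough estimates mentioned in the text (one and two billion), so the density ratio should be read as approximate.

```python
def pct_gain(old, new):
    """Percentage increase from old to new."""
    return (new - old) / old * 100.0

# Early Geekbench scores quoted in the text: (iPhone 5S, iPhone 6)
geekbench = {"single-core": (1400, 1630), "dual-core": (2500, 2920)}
gains = {name: pct_gain(old, new) for name, (old, new) in geekbench.items()}

# Estimated transistor counts (millions) and reported die sizes (mm^2)
density_a7 = 1000 / 104   # Apple A7: ~1 billion transistors on 104 mm^2
density_a8 = 2000 / 89    # Apple A8: ~2 billion transistors on 89 mm^2
density_ratio = density_a8 / density_a7

for name, gain in gains.items():
    print(f"{name}: +{gain:.1f}%")
print(f"A7 density: {density_a7:.1f} M/mm^2, A8 density: {density_a8:.1f} M/mm^2")
print(f"density ratio: {density_ratio:.2f}x")
```

Under these assumptions the measured CPU gains (roughly 16 to 17%) fall short of the 25% quoted by Apple, while the estimated transistor density more than doubles.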

Despite the larger battery (1810 mAh in the iPhone 6 vs 1560 mAh in the iPhone 5S) allowed by the larger physical dimensions, offset somewhat by increased power use caused by the larger screen, Apple quotes only minor improvement in battery life over the iPhone 5S for most applications (slightly better on the iPhone 6 Plus), which is disappointing since the iPhone 5S's battery life has been a major weak point in practice. However, the iPhone 6 Plus contains a significantly larger 2915 mAh battery which should allow it to have improved battery life compared to the iPhone 6 and iPhone 5S.
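The relative battery capacities can be compared with a short Python sketch. It uses only the capacities quoted above and says nothing about actual battery life, which also depends on screen and SoC power draw.

```python
# Battery capacities in mAh, as quoted in the text.
capacity_mah = {"iPhone 5S": 1560, "iPhone 6": 1810, "iPhone 6 Plus": 2915}

baseline = capacity_mah["iPhone 5S"]
relative = {model: mah / baseline for model, mah in capacity_mah.items()}

for model, ratio in relative.items():
    print(f"{model}: {capacity_mah[model]} mAh ({ratio:.2f}x the iPhone 5S)")
```

The iPhone 6 has about 16% more capacity than the iPhone 5S, while the iPhone 6 Plus has about 87% more.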

It is possible that gains in power efficiency are larger in practice, with the official battery-life specifications coming closer to reality relative to the more optimistic specifications for the iPhone 5S. Apple has been quoted as saying that the Apple A8 draws up to 50% less power than the previous chip (which is a statement open to interpretation, because it does not specify the level of improvement with typical use), and focusing only on the CPU logically the transition to the 20nm TSMC process with a similar configuration and clock speed should result in measurable power savings, all things being equal.

Similar CPU, but transistor count significantly increased


However, the large increase in transistor count (from roughly one to two billion) may be associated with lower power saving benefits than would otherwise be expected from applying a more advanced manufacturing process to, for example, similar CPU cores. Exactly what functionality contributes to the transistor count increase is unclear; the GPU and its associated caches should be significantly larger, a requirement because of the increased number of screen pixels, and the L2 CPU cache (1MB in the Apple A7) may have increased performance (although still using the same 1MB size). The Apple A7 has been reported to already contain a large 4MB L3 cache (which uses a lot of transistors/die area) that may have increased in size in the Apple A8 or otherwise have improved performance (CPU caches of a given size can be made better-performing by using more transistors).

AnandTech has reported that L3 cache latency has improved with capacity remaining the same at 4MB, which is consistent with an increase in L3 transistor count, matching the increased proportion of the die used for the L3 cache (roughly half of the chip) on the Apple A8 when compared to the Apple A7 (the L3 cache scales down less than other blocks in the transition to 20nm). What exactly Apple's goal is by including a large L3 cache is unclear, but for applications that have a memory working set that fits entirely into the cache there would be obvious performance and power consumption benefits. Part of the benefit may be allowing most of the display framebuffer to be stored in the L3 cache, reducing memory overhead related to graphics operations and screen refresh, which would be more apparent with higher resolutions such as the iPhone 6 Plus or potentially iPads, although the L3 cache would have to be large enough to hold the actively changing part of the framebuffer (a full 1080p 32-bit framebuffer is 7.9 MB), with the amount of required memory potentially lowered by compression or encoding optimizations.
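The framebuffer sizes involved are easy to work out. The Python sketch below assumes a single uncompressed buffer at 4 bytes per pixel and treats the 4MB L3 cache as 4 x 1024 x 1024 bytes; both are simplifying assumptions, and in practice the cache would be shared with CPU and GPU data.

```python
BYTES_PER_PIXEL = 4              # 32-bit color
L3_BYTES = 4 * 1024 * 1024       # 4MB L3 cache, assumed to mean 4 MiB

# Panel resolutions quoted in the text
panels = {
    "iPhone 6 (750x1334)": (750, 1334),
    "iPhone 6 Plus (1080x1920)": (1080, 1920),
}

def framebuffer_bytes(width, height, bytes_per_pixel=BYTES_PER_PIXEL):
    """Size of one uncompressed full-screen framebuffer in bytes."""
    return width * height * bytes_per_pixel

for name, (width, height) in panels.items():
    size = framebuffer_bytes(width, height)
    fits = size <= L3_BYTES
    print(f"{name}: {size / 2**20:.2f} MiB, fits in L3: {fits}")
```

Under these assumptions a full iPhone 6 framebuffer (about 3.8 MiB) would just fit in the L3 cache, whereas a full 1080p framebuffer (about 7.9 MiB) is roughly twice the cache size, which is why only the actively changing part could be held there.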

Instead of integrating the baseband into the SoC, the Apple A8 continues Apple's practice of using an external Qualcomm baseband modem chip (MDM9625M) in combination with the Apple SoC. The MDM9625M is limited to LTE Cat 4 and does not support LTE Cat 6 (Qualcomm already offers new LTE Cat 6 stand-alone modem chips, used with SoCs like the Snapdragon 805). Because it is a separate chip, Apple may be able to provide updated versions of the iPhone 6 with LTE Cat 6 with a relatively minor redesign during its lifetime.

Although few specific details are yet available, reports suggest that the Apple A8 SoC continues to use a CPU configuration similar to that of the Apple A7, most likely a dual-core performance-oriented CPU with CPU cores very similar to the Cyclone cores used in the Apple A7, which may be closely related to ARM's Cortex-A57. Reports indicate that CPU performance is close to that of the A7, with a modest clock speed increase from 1.3 to 1.4 GHz contributing to somewhat increased performance. Some of the performance increase is likely to be contributed by speed improvement and/or size increase of the L2 and L3 caches. AnandTech has observed a few micro-architectural cycle time improvements, for example for integer multiplication and floating point addition, contributing to increases in synthetic benchmark scores. In terms of die area, the CPU block on the die has been estimated to have decreased from about 17mm² to 12mm² thanks to the 20nm process.
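How much of the benchmark gain is explained by clock speed alone can be estimated roughly. The Python sketch below combines the clock speeds above with the early single-core Geekbench scores quoted earlier (about 1400 for the iPhone 5S and 1630 for the iPhone 6); it assumes the score scales linearly with clock speed, which is a simplification.

```python
# Figures quoted in the text
clock_a7_ghz, clock_a8_ghz = 1.3, 1.4
score_5s, score_6 = 1400, 1630            # early single-core Geekbench results
cpu_area_a7_mm2, cpu_area_a8_mm2 = 17, 12

clock_ratio = clock_a8_ghz / clock_a7_ghz
score_ratio = score_6 / score_5s

# Gain left over after dividing out the clock speed increase
# (assumes performance scales linearly with clock speed).
per_clock_ratio = score_ratio / clock_ratio

cpu_area_reduction = 1 - cpu_area_a8_mm2 / cpu_area_a7_mm2

print(f"clock speed: +{(clock_ratio - 1) * 100:.1f}%")
print(f"single-core score: +{(score_ratio - 1) * 100:.1f}%")
print(f"per-clock gain: +{(per_clock_ratio - 1) * 100:.1f}%")
print(f"CPU block area: -{cpu_area_reduction * 100:.1f}%")
```

On these figures the clock speed increase accounts for a little under 8% and the remaining roughly 8% would come from cache and micro-architectural improvements, while the CPU block shrinks by about 29%.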

Performance scaling of Cortex-A57-class CPUs and other out-of-order, speculative issue superscalar CPUs


The disappointing performance and power-efficiency scaling of high-performance CPU cores such as Cortex-A15, Cortex-A17 and Cortex-A57, and also the processor core used by Apple in the Apple A8, which all implement out-of-order superscalar pipelines with speculative issue, when transitioning to more advanced processes (such as 28 and 20nm) has already been noted in the industry. Relative to the benefits expected from the more advanced 20nm HPM process at TSMC compared to the Samsung 28nm HKMG process used for Apple's previous processor, the performance increase and to some extent the power efficiency improvement are modest, with potential for clock speed increases apparently limited.

In contrast, small, medium-performance and extremely power-efficient in-order pipeline cores such as Cortex-A53 (and earlier Cortex-A7) are showing dramatically better scaling with newer processes (including advanced 28nm and 20nm processes), with the ability to optimize for performance or power-efficiency with "core hardening", with greater increases in clock speed and the ability to use a many-core (such as octa-core) configuration with limited implications for cost (die size) and power consumption. This results in dramatically better performance/Watt characteristics and much lower chip cost even for high-performance applications, with the only disadvantage being somewhat limited single-thread performance and the requirement of more extensive utilization of multi-threading in the operating system and application software.

Apple A8 still not very power efficient, but larger batteries used


As a result, Apple's new SoC seems to be widening Apple's SoC efficiency deficit when compared to SoCs being introduced in competitive devices, with the Apple A8 being relatively high-cost and uneconomical to manufacture, providing only modest performance improvement despite the larger size and higher cost when compared to the Apple A7, and a significant and increasing disadvantage in power consumption and performance/Watt compared to SoCs from competitors, despite Apple's head start with 20 nm process technology. Although high chip cost is not a major issue for Apple due to its large profit margins, the disadvantage in power efficiency will continue to be reflected in the end-user experience.

However, at least for the 5.5" iPhone 6 Plus, Apple has significantly increased battery capacity to 2915 mAh, much larger than the 1810 mAh battery in the 4.7" iPhone 6 and the 1560 mAh of the iPhone 5S. This will result in improved battery life in the iPhone 6 Plus compared to other recent iPhones, as the increase in capacity more than offsets the somewhat larger power requirement for the screen. Testing of the new iPhone models based on automated battery life benchmarks for light browser use shows a competitive score, increased somewhat over the iPhone 5S, and more significantly for the iPhone 6 Plus. However, in this type of benchmark, the iPhone 5S already scored higher than what one would expect based on actual battery life in practice, and more detailed battery life benchmarks have been published that indicate battery life of the iPhone 6 models is still rather average compared to the competition.

Memory subsystem in Apple SoCs favors synthetic benchmarks


There seems to be a tendency of recent iPhones (including the previous generation iPhone 5S) to show benchmark scores often near the top of the chart for CPU performance or even battery life in synthetic benchmarks. However, in practice (in typical every-day use), performance characteristics, especially battery life, tend to be considerably less than would be expected based only on these benchmarks (competitors that perform similarly in synthetic battery life tests tend to have much better practical battery life). Differences in the operating system (iOS versus Android) could be a factor, but characteristics of the memory subsystem of recent iPhones may be the most significant factor affecting battery life.

Large on-chip cache memories (such as the 4MB L3 cache in Apple A7 and A8, unprecedented for a mobile SoC) have long been known in the computing world to result in particularly high scores in certain synthetic benchmarks as long as their memory footprint fits within the cache, positively affecting both CPU performance benchmarks and battery life tests (access of external RAM or flash storage is slower and uses significantly more power). Also well known is the fact that such benefits quickly diminish when a typical memory footprint no longer fits inside the cache, which may not be common for synthetic benchmarks, but may actually be pretty common for every-day use. Moreover, the relatively limited 1GB RAM of the iPhone 6 models results in more frequent flash memory access (e.g. reloading of tabs in Apple's browser, in addition to less visible activity), which can have a strong negative effect on battery life that is not apparent in many synthetic battery life tests.
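The way cache benefits fall off with a growing working set can be illustrated with a deliberately simple model. The Python sketch below assumes uniformly random accesses over the working set in steady state, so the hit rate is simply the fraction of the working set that fits in the cache. The relative access costs (1 for a cache hit, 10 for external RAM) are hypothetical placeholders chosen for illustration, not measured values for any Apple SoC.

```python
CACHE_BYTES = 4 * 1024 * 1024   # 4MB L3 figure from the text

# Hypothetical relative cost per access (time or energy); not measured values.
COST_CACHE = 1.0
COST_DRAM = 10.0

def hit_rate(working_set_bytes, cache_bytes=CACHE_BYTES):
    """Idealized steady-state hit rate for uniformly random accesses."""
    if working_set_bytes <= cache_bytes:
        return 1.0
    return cache_bytes / working_set_bytes

def average_cost(working_set_bytes):
    """Average relative cost per access for a given working set size."""
    h = hit_rate(working_set_bytes)
    return h * COST_CACHE + (1.0 - h) * COST_DRAM

for mb in (2, 4, 8, 16, 64):
    ws = mb * 1024 * 1024
    print(f"{mb:3d} MB working set: hit rate {hit_rate(ws):.2f}, "
          f"relative cost {average_cost(ws):.2f}")
```

In this toy model the average cost stays at the cache level up to a 4 MB working set and then climbs quickly toward the external RAM cost, which is the pattern described above: a benchmark that fits in the cache looks much better than a workload that does not.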

In summary, the memory subsystem (large on-chip caches and limited RAM) of recent iPhones is likely to be associated with relatively high scores in most synthetic performance and battery life benchmarks that are not fully representative of the practical experience (particularly for battery life).

Apple continues strategy of reducing cost of other components


Apart from the SoC, Apple continues to emphasize profit margins by keeping cost down on several of the other hardware components, especially memory size, with some of its technical specifications having already been superseded for some time in competitive high-end devices.

Cost-reducing features include limiting DRAM to 1GB (which can have performance repercussions for common use cases) and internal flash storage to 16GB for the standard models (which will be the primary sellers), a mature LTE modem chip that does not support the latest LTE Cat 6 speeds, a rear-mounted mono speaker and an 8MP camera (although most likely with good performance due to the relatively large sensor pixel size).

Sources: AnandTech (iPhone 6 announcement), Wikipedia (iPhone 6 article), Wikipedia (Apple A8 article), MacRumors, iFixit, Chipworks (Apple A8 analysis), AnandTech (discussion of Chipworks analysis), AnandTech (iPhone 6 review)

Updated October 2, 2014 (Add SoC details from recent full AnandTech review, discussion of impact of memory subsystem on synthetic benchmarks).
Updated October 20, 2014 (Add comment about potential use of L3 cache for framebuffer memory).
