Friday, September 26, 2014

New Amazon Kindle tablets use MediaTek SoC -- but will it help MediaTek?

Amazon has introduced two new low-priced tablets for the US market, the Kindle Fire HD 6 and Kindle Fire HD 7, priced at $99 and $149 respectively. Both tablets are expected to be available in October. The new models are reported to feature an unspecified quad-core MediaTek SoC. Some news articles suggest the use of the high-performance (but somewhat inefficient) MediaTek MT8135 SoC, about which little has been heard since its announcement more than a year ago; this would match reports from last year about Amazon using the MT8135 for future models. However, use of the newer, more cost-effective and more power-efficient MT8127 would make much more sense.

A recent tear-down by iFixit, however, shows that the tablets do use an MT8135V SoC, although the memory interface is limited to a single-channel 32-bit configuration, compared to the dual-channel configuration originally announced for the MT8135. As will be explained below, the MT8135 is relatively expensive (because of its relatively large die area) and not very power-efficient, featuring Cortex-A15 cores and a high-performance PowerVR GPU, and it was originally announced for high-end tablets. Its use in low-budget devices like the new Kindle models does not make economic sense, especially from MediaTek's standpoint; MediaTek's existing MT8127 would have provided clear advantages in cost and power efficiency while still meeting performance goals.

Amazon targeting different segment of the market


The new tablet models are relatively small. The 6" Kindle Fire HD 6 is one of the few tablets of that size, while smartphones of a similar size (sometimes dubbed "phablets") are becoming more popular. Neither tablet model has cellular connectivity; both require a WiFi connection to connect to the internet. The tablets have a very robust design, being considerably thicker than most tablets. There are also versions with a software and accessory package specifically targeted at children.

Amazon uses a customized version of Android KitKat, without access to Google's Play Store and other Google applications, instead focusing on its own Amazon AppStore, with a somewhat different target demographic than higher-priced tablets.

MT8135 Amazon design win reported as early as August 2013, but use of MT8127 would be more economical


As early as August 2013, reports surfaced that Amazon would be using MediaTek's MT8135 in tablets to start shipping in 2014. Amazon has confirmed that a quad-core 1.5GHz MediaTek processor is used in the new models. Current specifications mention two processor cores running at up to 1.5 GHz and two cores at up to 1.2 GHz. The MT8135 was announced more than a year ago as a relatively high-end chip and was originally expected to be commercially available much earlier. It was MediaTek's first chip using ARM's big.LITTLE architecture, using two Cortex-A15 cores clocked up to 1.7GHz and two Cortex-A7 cores.

The MT8127, announced this spring, is based on a proven and efficient quad-core Cortex-A7 CPU configuration and adds a relatively fast GPU (although limited to OpenGL ES 2.0 API support) and is listed with a maximum clock speed of 1.5GHz.

Power efficiency of big.LITTLE MT8135 likely to be problematic


ARM Cortex-A15 cores are notorious for high power consumption, and few Cortex-A15-based SoC designs have been commercially successful for mobile applications (especially smartphones), with problematic heat production and power drain often being reported. Cortex-A15 cores also take up considerably more die area than efficient cores like Cortex-A7 or Cortex-A53, resulting in larger, more costly chips.

Although power consumption and battery life of the Kindle tablets have not yet been tested, battery life specifications by Amazon are the same as for Kindle Fire HD models from previous years. Since the MT8135V is actually used in the new models, maintaining battery life is likely to be a challenge, while if Amazon had chosen the MT8127, the devices would most likely have provided much longer battery life.

The case against the use of the MT8135


Even though it has been established that the new Kindle tablets do use a version of the MT8135, several drawbacks are apparent. Although only two Cortex-A15 cores are used in the MT8135 instead of the four present in most existing big.LITTLE designs, a small form factor tablet would most likely not allow a large battery (the Kindle Fire HD 6 in fact has only a 3400 mAh battery, limited by the form factor) and power consumption could be problematic.

The relatively high-performance PowerVR Series 6 GPU in the MT8135 should also contribute to high power consumption, for example when playing games. It also seems overpowered for the relatively low screen resolution, since it is heavily oriented towards the use of a dual-channel memory interface and a high display resolution.

On the positive side, MediaTek has experience balancing power consumption with its CorePilot technology (for example in octa-core CPUs), although this has not yet been proven for big.LITTLE CPU designs. MediaTek also originally announced its HMP (heterogeneous multi-processing) capability in conjunction with the MT8135, with all four cores being able to run concurrently.

In addition to the relatively large die area of the CPU and GPU (resulting in a relatively large, expensive chip), as well as increased manufacturing cost due to the need to handle potentially high heat production, a hypothetical design using the MT8135 would likely be using a relatively expensive dual-channel memory interface (matching the choice of CPU and GPU), further increasing cost at several levels. However, as it turns out, the new Kindle tablets limit the memory interface to 32 bits in conjunction with the MT8135V SoC used.

Consistent with the cost characteristics of the chip platform, the MT8135 was originally announced as being targeted at the mid-to-high tier of the tablet OEM market. Clearly, this does not match the $99 price of the Kindle Fire HD 6, making the actual use of the MT8135 somewhat silly.

MediaTek already transitioning away from big.LITTLE


MediaTek has also announced a big.LITTLE smartphone platform, the MT6595, using four Cortex-A17 and four Cortex-A7 cores. However, although it provides performance competitive with or surpassing current high-end platforms like Snapdragon 801, the MT6595 platform does not appear to have been widely adopted. This makes sense considering the relatively high power consumption associated with the Cortex-A17 CPU cores and the higher cost of the SoC, which make it stand out compared to other MediaTek SoCs, which tend to be low-cost and power-efficient.

In fact, MediaTek has already announced the MT6795, to be available this year not long after the MT6595, which does away with big.LITTLE and instead uses an efficient octa-core ARM Cortex-A53 configuration, with the other specifications being similar to the MT6595. This provides strong evidence that MediaTek is no longer focusing on big.LITTLE designs, including the MT8135, supporting the case that if the decision had been MediaTek's to make, the new Amazon tablets would not use the MT8135, but instead the newer, much more efficient MT8127.

Good game performance would have been achieved with MT8127 as well


Amazon has demonstrated relatively good performance of the new tablet models, for example when playing games, compared to competitive devices such as certain models from Samsung's Galaxy Tab 4 series. This is not unexpected, since the PowerVR Series 6 GPU in the MT8135 clearly provides high performance.

However, the Mali-450 GPU inside the MT8127 is actually a relatively recent GPU that is significantly faster than the Mali-400 commonly used in entry-level devices, and combined with the modest 1280x800 display resolution of the new Kindle tablets would have given respectable 3D game performance, not far from the performance of the actual MT8135V-equipped models. Although Mali-450 does not support the OpenGL ES 3.x API, OpenGL ES 2.0 continues to dominate, for which Mali-450 provides an efficient implementation (in terms of performance/Watt and performance/dollar).

The MT8127 is clearly a much more cost-effective (and more power-efficient) chip. It is likely to be dramatically more cost-effective than the MT8135, with much lower chip cost, much better battery life, and significantly lower manufacturing cost of the PCB and other manufacturing aspects, altogether a much better fit given the price segment of the new tablets.

Although the single-thread CPU performance of the quad-core Cortex-A7-based MT8127 is significantly lower than that of the Cortex-A15-based MT8135, this is not a critical issue in practice, and Android can already take significant advantage of multi-threading with a quad-core processor, mitigating the impact of single-thread performance bottlenecks.

Large-scale production of MT8135 does not make financial sense, unlike MT8127


Given the high manufacturing cost of the MT8135 (especially when compared to much more cost-effective tablet chips from MediaTek like the MT8127), it is unlikely that MediaTek is making much of a profit on the chip even when selling millions of chips to Amazon.

In fact, because MediaTek is likely to be facing a critical shortage of wafer capacity at its foundry TSMC (being squeezed between juggernauts Apple and Qualcomm buying up capacity), the production of the MT8135, with its low profit margin, has probably cannibalized MediaTek profits as well as revenues, because, for example, for each MT8135 sold MediaTek would have been able to sell two or more much more cost-efficient and higher margin chips such as the MT8127 or MT6582.
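The wafer-capacity argument can be made concrete with a rough die-per-wafer estimate. The sketch below uses a common textbook approximation for gross dies per wafer; the 300 mm wafer diameter and, in particular, the two die areas are purely hypothetical placeholders chosen for illustration, since no die sizes for the MT8135 or MT8127 are given here. Yield and edge exclusion are ignored.

```python
import math

def gross_dies_per_wafer(die_area_mm2, wafer_diameter_mm=300.0):
    """Approximate gross dies per wafer (no yield or edge-exclusion model).

    Uses the common approximation:
        pi * (d/2)^2 / A  -  pi * d / sqrt(2 * A)
    """
    radius = wafer_diameter_mm / 2.0
    whole = math.pi * radius ** 2 / die_area_mm2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2.0 * die_area_mm2)
    return math.floor(whole - edge_loss)

# HYPOTHETICAL die areas, for illustration only (not measured values).
small_die_mm2 = 40.0   # stand-in for a compact quad-core Cortex-A7 SoC
large_die_mm2 = 90.0   # stand-in for a larger big.LITTLE SoC

small_count = gross_dies_per_wafer(small_die_mm2)
large_count = gross_dies_per_wafer(large_die_mm2)

# How many small chips are forgone for each large chip produced.
chips_forgone_per_large_chip = small_count / large_count
print(small_count, large_count, round(chips_forgone_per_large_chip, 2))
```

Under these placeholder areas the ratio comes out above two, which is the kind of trade-off described above; with real die sizes the ratio would of course differ.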

Indeed, for this reason, the use of the MT8127 inside the new Kindle tablets would have been much more logical. A prior commitment with Amazon for producing and shipping the MT8135, as reported previously in 2013, probably left MediaTek with no other options.

Few signs of financial gain from Amazon design win


As described in an earlier post, MediaTek's sequential revenue growth in Q3 is unlikely to be much greater than 10%, already low considering the seasonal increase normally expected in Q3. This provides additional evidence that MediaTek is severely affected by wafer shortages at TSMC, as well as by the late introduction of smartphone SoCs with integrated 4G LTE baseband and general price pressure on its chips. Although MediaTek is probably shipping millions of MT8135V chips to Amazon, this has probably limited shipments of other, higher-margin MediaTek chips to other customers, because of an inability to fulfill demand. Indeed, tablets using the more cost-effective MT8127 have been very slow to appear on the market, suggesting that MediaTek has been prioritizing tablet processor production of the MT8135V for Amazon because of capacity constraints. So while MediaTek has gained prestige from this design win, the financial gain is likely to be limited or even negative.

Strong prospects for new products, clouded by capacity concerns


The performance of MediaTek's upcoming Cortex-A53-based smartphone SoCs is likely to be very competitive, and they have been reported to have gained widespread adoption in China for new designs, while also contributing to MediaTek's increasing competitiveness in high-performance segments. However, recent reports suggest that competition for wafer capacity at TSMC will continue to be intense, calling into question MediaTek's ability to translate any product strength (ranging from new and existing smartphone platforms to tablet chips like the MT8127) into actual sales and profit growth in the near term. If MediaTek continues to be obligated to produce the MT8135V in high volume for Amazon, that will most likely continue to negatively affect MediaTek's sales and profits.

Sources: CNET (Kindle Fire HD 6 and 7 announcement), DigiTimes (2013 MediaTek Amazon Kindle article), MediaTek (MT8127 announcement press release), iFixit (tear-down article)

Updated October 24, 2014 (Update to reflect the fact that the tablets actually do use the MT8135V SoC).
Updated November 2, 2014.

Monday, September 22, 2014

Early test results suggest Cortex-A53 will revolutionize performance, cost and efficiency across all segments

The ARM Cortex-A53 is a very small and power-efficient in-order-pipeline CPU core that is the successor to the similar and very successful ARM Cortex-A7 core. Although Cortex-A53 supports the 64-bit ARMv8 instruction set (as well as having full compatibility with 32-bit ARMv7), it can also take advantage of the 32-bit version of ARMv8 with its architectural improvements, and it has other significant internal architectural improvements leading to increased performance on current leading-edge process nodes compared to Cortex-A7. Although also used as the power-efficient core in combination with ARM's high-performance Cortex-A57 core in a big.LITTLE configuration for high-end designs, Cortex-A53 cores have also been widely adopted as a stand-alone CPU in leading smartphone and other mobile SoC designs, with the first designs currently starting to appear in commercially available devices. Upcoming Cortex-A53-based designs span virtually the whole performance spectrum from entry-level to premium devices.

Early benchmarks show strong performance of the Cortex-A53 core, especially for the latest revisions


Early evidence of the performance of new SoCs exclusively using ARM Cortex-A53 processor cores, based on recent entries in Geekbench's result database, suggests that the performance improvement of Cortex-A53 compared to Cortex-A7 at an equivalent clockspeed, especially when running with the 32-bit ARMv8 machine model as implemented in Android 4.4.4, may be greater than originally expected.

There is evidence that several revisions of the Cortex-A53 core already exist, including the original r0p0, the r0p1 and the r0p2 revision (with r0p3 also being listed on ARM's website). Although these are minor revisions that do not significantly alter the IP blocks, the later revisions seem to be associated with significant performance improvements when compared to earlier revisions, possibly because of the correction of functional or performance bugs in earlier revisions. In particular, r0p0 revision devices such as the first incarnation of Snapdragon 410 (MSM8916) appear to be limited to ARMv7 compatibility mode, while SoCs with later revisions appear to be configured with support for the 32-bit version of the ARMv8 instruction set (AArch32) in association with Android 4.4.4.

Full 64-bit ARMv8 machine model not likely to be of great benefit on mobile devices


The full 64-bit ARMv8 instruction set (AArch64) as supported by Cortex-A5x is not yet supported in Android, and there are reasons to believe that using it might not result in much benefit in today's devices over AArch32. For example, much of the benefit of the new ARMv8 instruction set is already delivered by AArch32, and actual use of 64-bit registers/variables and operations on them is relatively uncommon in program code (this is true of most program code, including typical code executed using the x86_64 instruction set used in PCs and Atom-based mobile devices). Additionally the ARMv7 instruction set (and AArch32) already contain some instructions that operate on 64-bit values, which can be conveniently taken advantage of for these uncommon cases, without requiring the use of the full 64-bit ARMv8 instruction set.
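As a rough illustration of how occasional 64-bit arithmetic can be handled with 32-bit registers, the sketch below emulates a 64-bit addition as two 32-bit additions linked by a carry, which is the pattern behind an add/add-with-carry instruction pair (ADDS followed by ADC) on 32-bit ARM. The function names and the (low, high) register-pair representation are illustrative assumptions, not the output of any particular compiler.

```python
MASK32 = 0xFFFFFFFF

def split64(value):
    """Split a 64-bit value into (low, high) 32-bit halves."""
    return value & MASK32, (value >> 32) & MASK32

def join64(lo, hi):
    """Combine (low, high) 32-bit halves into one 64-bit value."""
    return (hi << 32) | lo

def add64_with_32bit_ops(a_lo, a_hi, b_lo, b_hi):
    """Add two 64-bit values using only 32-bit wide additions."""
    lo_sum = a_lo + b_lo
    carry = lo_sum >> 32                  # 1 if the low-word add overflowed
    lo = lo_sum & MASK32
    hi = (a_hi + b_hi + carry) & MASK32   # wraps around like a 64-bit register
    return lo, hi

a = 0x00000001FFFFFFFF
b = 0x0000000000000001
result = join64(*add64_with_32bit_ops(*split64(a), *split64(b)))
print(hex(result))
```

The cost is one extra instruction per addition, which matters little when such operations are as uncommon as argued above.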

Moreover, in the ARM world, data processing algorithms that might benefit from 64-bit processing are often better served by using ARM's NEON SIMD extension, which is also available on AArch32 and most ARMv7-A devices.

Although AArch64 makes memory management more flexible by extending the addressing space beyond 4 GB, the doubling of the storage size of all pointers (memory addresses) from 32 bits to 64 bits negatively impacts performance because of greater code and data memory usage, to which mobile SoCs, given their relatively small internal SoC buffers, cache memories and RAM, are especially sensitive. LPAE (Large Physical Address Extension, ARM's counterpart of PAE) support already allows 32-bit ARM machine models to take advantage of a larger addressing space, reducing the necessity of switching to a full 64-bit model.
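The effect of pointer size on memory footprint can be sketched with a simple calculation. The example below computes the size of a hypothetical pointer-heavy data structure node (a 4-byte payload plus two pointers, as in a binary tree) under 32-bit and 64-bit pointers, assuming natural alignment. The node layout and the 32 KB cache size are illustrative assumptions only.

```python
def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment

def node_size(pointer_bytes, pointer_count=2, payload_bytes=4):
    """Size in bytes of a hypothetical node: payload followed by pointers.

    Assumes pointers are aligned to their own size and the struct is
    padded to a multiple of the pointer size.
    """
    offset = align_up(payload_bytes, pointer_bytes)
    offset += pointer_count * pointer_bytes
    return align_up(offset, pointer_bytes)

size32 = node_size(4)   # 32-bit pointers
size64 = node_size(8)   # 64-bit pointers

cache_bytes = 32 * 1024  # hypothetical 32 KB L1 data cache
nodes_in_cache_32 = cache_bytes // size32
nodes_in_cache_64 = cache_bytes // size64
print(size32, size64, nodes_in_cache_32, nodes_in_cache_64)
```

For this particular layout, the node doubles in size and half as many nodes fit in the same cache; structures dominated by non-pointer data would be affected much less.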

32-bit version of ARMv8 instruction set brings benefits


Android support for the 32-bit version of ARMv8 is a very recent development, taking advantage of new ARMv8 instructions that improve performance, and probably also the architectural changes in ARMv8 (such as the removal of the optional conditional predication of instructions present in ARMv7-A) that benefit modern CPU cores such as Cortex-A53 and Cortex-A57. Geekbench takes advantage of the new machine model, and the majority of Android applications, largely consisting of device-independent Java code that is translated into machine code on demand, is also likely to benefit. However, to what extent ARMv7-A native code, which is commonly used in applications that require more CPU processing, is affected by the new machine model is unclear.

SoC-specific CPU optimizations are common, but impact power consumption more than speed


Variation between different implementations of Cortex-A53 cores at a similar process node can also occur because of core hardening optimization in the SoC design. This can involve trading performance for power efficiency and vice versa, although it should not in principle affect metrics such as IPC (instructions per cycle) or indeed Geekbench CPU scores as long as they do not depend on factors outside of the CPU core such as a more extensive memory footprint. However, apart from L2 CPU cache memory size, CPU cache latencies may also be configurable through core hardening, and the latter may impact even small memory footprint benchmarks, including CPU tests used in Geekbench.

Geekbench result round-up for smartphone SoCs, including new designs using Cortex-A53


[Table: summary of Geekbench results for smartphone SoCs]

The table above shows a summary of Geekbench results for smartphone models using popular smartphone SoCs, as well as new smartphone SoCs using Cortex-A53 cores. Note that the MSM8939 entry in the table is incorrectly labeled as Snapdragon 610; it actually represents an early version of Snapdragon 615.

The results were gathered by examining the range of benchmark results for a common SoC and CPU clock frequency configuration (which tends to include numerous lower-than-expected scores, probably mostly due to background CPU activity when running the benchmark or the effects of CPU throttling), and choosing a representative result close to the high end of the range, while trying to make sure the result is not an outlier and does not give indications of overclocking. As much as possible, entries using the most recent version of Geekbench (3.2.1 or 3.2.0) and a recent underlying Android version (preferably 4.4.x) were selected.
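The selection was done by hand, but the idea can be approximated in code. The sketch below is one possible automated stand-in, not the procedure actually used: it discards results implausibly far above the median (a crude overclocking filter) and then picks a value near the high end of what remains. The function name, the 0.9 quantile, the 25% cut-off and the sample scores are all hypothetical.

```python
import statistics

def representative_score(scores, quantile=0.9, max_above_median=0.25):
    """Pick a score near the high end of the range, skipping suspect highs.

    quantile and max_above_median are illustrative thresholds only.
    """
    if not scores:
        raise ValueError("no scores given")
    median = statistics.median(scores)
    # Drop results implausibly far above the median (possible overclocking).
    plausible = sorted(s for s in scores if s <= median * (1 + max_above_median))
    # Position of the chosen quantile within the plausible results.
    index = min(len(plausible) - 1, int(quantile * len(plausible)))
    return plausible[index]

# Hypothetical scores for one SoC/clock configuration: a few low results
# (background activity, throttling) and one suspiciously high one.
sample = [400, 520, 610, 640, 650, 655, 660, 665, 670, 990]
chosen = representative_score(sample)
print(chosen)
```

Here the 990 result is rejected as a likely outlier and the chosen value sits at the top of the remaining cluster.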

While the Integer and Float scores reported in the table are likely to be closely tied to the processor core, SoC and the clock frequency used, the memory score and overall score depend on the external memory implementation and speed and other factors related to a particular device model.

Analyzing Geekbench performance of existing SoCs


Looking at previous-generation SoCs with a quad-core Cortex-A7 CPU configuration, Geekbench results show that MediaTek SoCs are very competitive against Qualcomm SoCs long considered mid-range, such as a 1.2 GHz Snapdragon 400. For example, MediaTek's MT6582, despite usually being found in much cheaper (often entry-level) devices than Snapdragon 400, is quite competitive. Samsung's Exynos 3470, used in the Galaxy S5 Mini, appears to be the worst performer in this class in terms of performance per MHz.

Looking at higher-performance SoCs, the octa-core MT6592 holds the middle ground based on strong multi-core CPU performance (with memory performance being a relative bottleneck), while Qualcomm's Snapdragon 801/805 are a clear step up, especially in terms of single-thread and memory performance. Snapdragon 805 appears to be very similar to Snapdragon 801 in terms of CPU architecture, with very similar performance at the same clock speed. Geekbench reports its cores basically as a major version bump of the Krait-400 core used in the Snapdragon 801, although Qualcomm describes the CPU cores inside Snapdragon 805 as Krait-450. Exynos 5430 provides a similar level of performance, but its power efficiency may be in doubt.

The following Geekbench model names associated with entries using existing SoCs were used for performance comparisons. A link to the results page used for each model is provided.
  • Qualcomm MSM8226 (Snapdragon 400) (Cortex-A7r0p3): HTC HTC Desire 610 (Geekbench 3.2.1 ARMv7, Android 4.4.2)
  • Samsung Exynos 3470 (Cortex-A7r0p3): samsung SM-G800F (Geekbench 3.2.1 ARMv7, Android 4.4.2)
  • MediaTek MT6582 (Cortex-A7r0p3): HUAWEI H30-U10 (Geekbench 3.2.1 ARMv7, Android 4.4.2)
  • MediaTek MT6589T (Cortex-A7r0p2): LENOVO Lenovo S960 (Geekbench 3.2.0 ARMv7, Android 4.4.2)
  • Qualcomm MSM8226 (Snapdragon 400) (Cortex-A7r0p3): HTC HTC Desire 816 dual sim (Geekbench 3.2.1 ARMv7, Android 4.4.2)
  • MediaTek MT6592 (Cortex-A7r0p4): LENOVO Lenovo A806 (Geekbench 3.2.1 ARMv7, Android 4.4.2)
  • Qualcomm MSM8974AC (Snapdragon 801): Motorola Moto X (2014) (Geekbench 3.2.1 ARMv7, Android 4.4.4)
  • Samsung Exynos 5430: samsung SM-G850F (Geekbench 3.2.1 ARMv7, Android 4.4.4)
  • Qualcomm APQ8084 (Snapdragon 805): samsung SAMSUNG-SM-N910A (Geekbench 3.2.1, Android 4.4.4)

Performance of new Cortex-A53-based SoCs


Qualcomm's first-generation 1.2 GHz Snapdragon 410 (MSM8916), with four Cortex-A53r0p0 cores, has higher performance than a similarly clocked Snapdragon 400, although not dramatically so. A faster-clocked Snapdragon 410 prototype (with MSM8916_32 SoC) with a later revision of the Cortex-A53 core shows a clear improvement in Geekbench Integer Performance over the previous Snapdragon 410 when adjusting for the clock rate. However, this is in large part due to the availability of the AArch32 instruction set in the newer device, allowing Geekbench to take advantage of new cryptography instructions that greatly speed up certain subtests that are part of the Integer benchmarks.

MediaTek's upcoming MT6752, with an octa-core configuration of the more recent r0p2 revision of the Cortex-A53 core, shows impressive performance, with the caveat that this is based on a single reported benchmark score of a prototype device. Overall integer performance as reported by Geekbench is especially impressive, being comparable to Snapdragon 801 for single-thread performance and blowing past it in terms of multi-core performance. However, the use of AArch32 is likely to inflate the overall Integer scores relative to typical performance in practice because of the relatively large influence of new cryptography instructions available with AArch32 on Geekbench's Integer Performance scores, although other benefits of AArch32 are also apparent. Memory efficiency also appears to be significantly improved when compared to previous-generation Cortex-A7-based devices. Despite relatively high performance, the MT6752 is likely to be power-efficient and very cost-effective, due to the characteristics that the Cortex-A53 core has inherited from Cortex-A7.

The following Geekbench model names associated with entries using a SoC with Cortex-A53 cores were used for performance comparisons. A link to the results page used for each model is provided.
  • Qualcomm MSM8916 (Snapdragon 410) (Cortex-A53r0p0): HTC Desire 510 (Geekbench 3.2.1 ARMv7, Android 4.4.3)
  • Qualcomm MSM8916_32 (Snapdragon 410) (Cortex-A53r0p1): unknown msm8916_32 (Geekbench 3.2.1 AArch32, Android 4.4.4)
  • Qualcomm MSM8939 (Snapdragon 615) (Cortex-A53r0p1): HTC HTC 0PFJ1 (Geekbench 3.2.0 AArch32, Android 4.4.4)
  • MediaTek MT6752 (Cortex-A53r0p2): alps k2v1 (Geekbench 3.2.1 AArch32, Android 4.4.4)
  • Samsung Exynos 5433 (Cortex-A57r1p0 + Cortex-A53): samsung SM-N910C (Geekbench 3.2.0 AArch32, Android 4.4.4) 

Cortex-A53 blows Cortex-A57 away in terms of efficiency


Samsung's new Exynos 5433, the first SoC with publicly disclosed Cortex-A57 cores, sets a new high mark for single-thread performance, being considerably faster than Snapdragon 801, but surprisingly finds itself beaten on multi-core integer performance in early results for the MT6752, a mid-range SoC. Both devices use AArch32, so the relatively heavy weighting of new AArch32 cryptography instructions by Geekbench is not as important as when comparing with previous-generation devices.

Exynos 5433 contains four Cortex-A53 cores in addition to the four Cortex-A57 cores in a big.LITTLE configuration. More detailed examination of the benchmark results (more specifically, primarily CPU-bound subtests such as JPEG Compress) provides evidence that the Cortex-A53 cores do contribute to multi-core performance, with a multi-core performance scaling factor of 4.46 (about 4.0 would be expected when just the Cortex-A57 cores are utilized). This suggests that Global Task Switching (allowing all eight cores to run concurrently) is working, although it does not provide a great boost in overall processing performance; the more significant benefits are in overall power efficiency and CPU scheduler efficiency.
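The scaling factor referred to here is simply the multi-core score divided by the single-core score of the same subtest. The sketch below shows that calculation and a rough way to read the 4.46 figure: under the simplifying assumption that the four Cortex-A57 cores scale perfectly (a factor of 4.0), the remainder can be attributed to the Cortex-A53 cores. The function names and the example scores are hypothetical; only the 4.46 and 4.0 figures come from the text.

```python
def scaling_factor(multi_core_score, single_core_score):
    """Multi-core scaling factor of one benchmark subtest."""
    return multi_core_score / single_core_score

def little_core_contribution(observed_scaling, big_cores=4, little_cores=4):
    """Split observed scaling into the part beyond ideal big-core scaling.

    Assumes the big cores scale perfectly, which is a simplification.
    Returns (total extra scaling, extra scaling per little core), both
    expressed in units of one big core's single-thread throughput.
    """
    extra = observed_scaling - big_cores
    return extra, extra / little_cores

# Hypothetical subtest scores chosen to reproduce the 4.46 factor.
observed = scaling_factor(4460.0, 1000.0)
extra_total, extra_per_little_core = little_core_contribution(observed)
print(round(observed, 2), round(extra_total, 2), round(extra_per_little_core, 3))
```

Read this way, each Cortex-A53 core adds only a little over a tenth of a Cortex-A57 core's throughput in this subtest, which is consistent with the observation that the boost in overall processing performance is modest.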

It has to be noted that the MT6752, which closes in on the performance of a high-end design like Exynos 5433, is a mid-range chip with a cost-effective 32-bit memory interface, and is likely to be considerably cheaper and much more power-efficient than Exynos 5433 and other high-end platforms, dramatically illustrating the great efficiency of Cortex-A53-based SoCs against the relative inefficiency of Cortex-A57. Cortex-A57 provides superior single-thread performance, but compares poorly in terms of performance/dollar and performance/Watt. High performance Cortex-A53 designs such as MediaTek's upcoming octa-core MT6795 (which is targeting a higher clock frequency and has a premium dual-channel memory interface) are likely to make the comparison even more compelling.

Low-power Cortex-A53 has significant advantages related to performance scaling and thermal restrictions


Key to this development is the apparent tendency of in-order pipeline cores such as Cortex-A53 (and previously Cortex-A7) to show much greater performance scaling on new, more advanced process nodes, primarily because of much greater increases in maximum clock speed. For example, clock speed increase has been limited for SoCs with high-performance CPU cores in the same class as Cortex-A57 (generally out-of-order pipeline, speculative issue architectures with a large die size) such as Exynos models with Cortex-A15 and Apple A7/A8 with Cyclone, despite the transition to 20 nm manufacturing.

In addition, practical performance of Cortex-A53 is likely to be much less affected by CPU throttling (periodic reduction of the CPU clock speed because of the temperature increasing beyond a certain threshold in order to maintain stability), thanks to the power efficiency of Cortex-A53, which may aid actual performance in practice more than is apparent from the results of common CPU benchmarks.

Finally, the current comparison of Cortex-A53 with Cortex-A57 as implemented in Exynos 5433 is not apples-to-apples because Exynos 5433 is manufactured at 20 nm, with significant associated performance benefits, while Cortex-A53-based devices (which for the moment are mostly targeted at cost-sensitive applications) are still manufactured at 28 nm. Although there is as yet not much information about how Cortex-A53 will scale on 20 nm, I believe there is potential for additional performance scaling that could be disruptive in terms of performance and efficiency advantages when compared to high-performance cores like Cortex-A57.

Comparison of Cortex-A53 CPU core revisions 


  • Cortex-A53r0p0 (part 3331, variant 0, revision 0 as reported by Geekbench) is the first revision. This appears to be the version used in a quad-core configuration in MSM8916, the first generation of Qualcomm's Snapdragon 410, which is the first Cortex-A53-based SoC to be commercially available, in devices such as HTC Desire 510 and several currently ramping devices, including Samsung Mega 2 (SM-G7508Q) and Samsung Galaxy A5 (SM-A500F). The clock speed is typically set at 1.19 GHz. Devices using this chip appear to be limited to ARMv7, not being able to take advantage of the 32-bit ARMv8 (AArch32) instruction set. As early as July 1, 2014, Qualcomm's Android for MSM Project stopped providing support for this SoC for new Android versions, with the latest supported version being Android 4.4.3.
  • Cortex-A53r0p1 (part 3331, variant 0, revision 1 as reported by Geekbench) is the second revision. It is used in a Qualcomm prototype device result reported as MSM8916 or MSM8916_32 (a chip designation similar to already shipping devices using a Snapdragon 410 with the first revision of Cortex-A53), equivalent to a SoC referred to by Qualcomm as MSM8916_32, running at a higher maximum clock rate (1.54 GHz vs 1.19 GHz) and showing a significant additional performance improvement beyond that expected from the clock speed increase alone. The combined Geekbench integer performance score for r0p1 is about 30% higher for single and multi-core performance than r0p0 at the same clock speed. That is largely the result of the cryptography instructions offered by AArch32, but other improvements are also apparent. Floating point performance remains about the same. Memory performance may also be higher, but that also depends on the speed of the memory used in the tested devices.
  • Cortex-A53r0p1 is also used in Qualcomm's octa-core MSM8939 (Snapdragon 615), which has two clusters of four Cortex-A53 cores, one running at a higher and the other at a lower speed. Geekbench results for an HTC prototype using this chip (running at a maximum CPU speed of 1.34 GHz) are consistent with the performance per MHz found in the r0p1-based MSM8916_32, with gains in multi-core performance over the quad-core chips suggesting that the device supports heterogeneous multi-processing (also called Global Task Switching), allowing all eight processor cores to run simultaneously, although the gain is significantly lower than what would be expected when all CPU cores are fully utilized (even allowing for a relatively low CPU speed of the second cluster, say 0.7 GHz).
  • Cortex-A53r0p2 (part 3331, variant 0, revision 2 in Geekbench) appears to be the latest revision of the Cortex-A53 that has been implemented in SoCs. A benchmark result for a device based on MediaTek's upcoming mid-range octa-core MT6752 SoC provides evidence for the existence of this core. The CPU cores are clocked at 1.69 GHz, and the benchmark results are impressive, helped by the ability of the eight cores to run concurrently at full speed. Integer and floating point performance when corrected for clock speed appears to be further improved slightly over the previous r0p1 revision, based on single-core performance, although this could also be due to characteristics of the SoC.
  • Multi-core performance of the r0p2-based MT6752 is very impressive, although not quite scaling linearly with the doubling of the number of cores. Multi-core performance does appear to be scaling significantly better than with the asymmetrically clocked cores in the Snapdragon 615 (MSM8939) prototype, even when allowing for a very low clock speed of the second cluster of the latter. This is not unexpected because multi-threading, especially in a benchmark, is likely to be significantly more efficient when dealing with equivalently clocked CPU cores.
  • Memory performance of the MT6752 test device is impressive for its class, with a significant increase over the Cortex-A53r0p1-based Qualcomm SoCs, and being dramatically higher than existing designs that also utilize an economical 32-bit memory interface. Although higher-clocked memory is likely a factor, data rate and memory controller improvements in the r0p2 revision of the Cortex-A53 core are likely to be more significant. ARM has alluded to improvements in the memory subsystem and data rates in Cortex-A53, which may be more fully realized in the r0p2 revision and its implementation in the MT6752 SoC.
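The revision comparisons above rely on two simple calculations: normalizing a score for clock speed, and estimating the ideal multi-core throughput of two clusters running at different clocks. A minimal sketch of both follows, assuming performance scales linearly with clock frequency. The clock speeds (1.19, 1.54, 1.34 and the guessed 0.7 GHz) are taken from the text; the function names and the example scores are hypothetical.

```python
def score_per_ghz(score, clock_ghz):
    """Benchmark score normalized for clock speed."""
    return score / clock_ghz

def gain_at_equal_clock(new_score, new_ghz, old_score, old_ghz):
    """Relative gain of the new result after normalizing for clock speed."""
    return score_per_ghz(new_score, new_ghz) / score_per_ghz(old_score, old_ghz) - 1.0

def ideal_two_cluster_scaling(fast_ghz, slow_ghz, cores_per_cluster=4):
    """Ideal multi-core throughput of two clusters, in units of one fast core.

    Assumes throughput is proportional to clock speed and that all cores
    are fully utilized.
    """
    return cores_per_cluster * (fast_ghz + slow_ghz) / fast_ghz

# Hypothetical integer scores for an r0p0 device at 1.19 GHz and an r0p1
# device at 1.54 GHz, chosen so the normalized gain is roughly 30%.
gain = gain_at_equal_clock(1682.0, 1.54, 1000.0, 1.19)

# Ideal scaling for a 1.34 GHz cluster plus a 0.7 GHz cluster.
ideal = ideal_two_cluster_scaling(1.34, 0.7)
print(round(gain, 3), round(ideal, 2))
```

Under these assumptions the two-cluster chip would ideally reach about six times its single-core score, or roughly 1.5 times a quad-core chip at the same top clock, which gives a baseline for judging how far short of full utilization the observed gain falls.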

Other new ARM IP technology contributes to performance and efficiency improvement


The Cortex-A53 has become available together with other IP products from ARM that improve performance and efficiency. These include a faster and more efficient interconnect bus, compression and other data rate reduction techniques such as ARM Frame Buffer Compression (AFBC), Smart Composition, and Transaction Elimination, and new Mali GPU cores (such as Mali-T760 and Mali-T72x) which together have the potential to dramatically improve performance and especially power consumption for graphics-related tasks (including typical device use), while also alleviating the memory bandwidth bottleneck in cost-sensitive devices with a limited memory subsystem, such as the 32-bit external memory interface used in most entry-level to mid-range mobile devices.

Favourable comparison with existing high-performance designs


Judging from these early benchmark results, an octa-core Cortex-A53 can achieve performance rivalling existing high-end platforms such as Snapdragon 801 in several metrics. The test results of the MT6752-based device show a dramatically higher Geekbench multi-core integer performance score when compared to Snapdragon 801, with single-core integer performance being similar. However, the scores are inflated due to the heavy weighting of new cryptography instructions available with MT6752's support for AArch32, although in general AArch32 is likely to bring benefits for most applications. Multi-core floating point performance is also higher. Single-core floating point and memory performance are clearly lower than Snapdragon 801, although not dramatically so. Nevertheless, considering that the MT6752 is supposed to use (and most likely does use) only a 32-bit memory interface, its memory performance is very impressive, being a large improvement over existing devices with a 32-bit memory interface.

The strong "premium level" performance of devices like the MT6752 is associated with a dramatically decreased chip manufacturing cost when compared to existing high-end SoCs such as Snapdragon 801. The Cortex-A53 cores, even in an octa-core configuration, are likely to be significantly smaller than out-of-order high-performance cores such as the Krait-400 cores used in the Snapdragon 801, resulting in chips with a much smaller die size (similar comparisons can be made with ARM's high-performance cores such as Cortex-A1x and Cortex-A57). Power consumption is also likely to be dramatically improved.

Revolution on the cards for performance, cost and power efficiency


Coupled with the cost reductions allowed by the 32-bit memory interface (as compared to the 64-bit or 32-bit dual-channel interfaces of existing high-end devices), with Cortex-A53 a revolution in performance/dollar and performance/Watt for high-performing devices appears to be on the cards. At the same time, lower-end devices (using, for example, a quad-core Cortex-A53r0p2 configuration) will see dramatic performance improvement.

When Cortex-A53 cores are combined with other high-end features such as a wider memory interface and a high-performance GPU (such as implemented in MediaTek's upcoming MT6795), there is potential to further close in on or even surpass the performance of existing premium-level architectures, with greatly increased (power) efficiency and reduced cost. Although single-thread performance is not likely to quite reach the level of existing premium devices, other metrics (including multi-core performance, power consumption and cost) are likely to see a dramatic improvement. Early reports already indicate that SoCs such as the MT6795 will be disruptive in terms of cost and efficiency for high-performance mobile applications.

In conclusion, the emergence of Cortex-A53-based designs and associated IP is likely to revolutionize performance, cost and efficiency in mobile devices, bringing higher performance to cost-sensitive entry-level and mid-range devices, reducing cost for high-end devices while also improving the performance of premium devices with much greater efficiency and reduced cost.

Sources: Geekbench result database, ARM, EE Times (Comments about adoption of MT6795)

Updated September 28, 2014 (Fix revision designations of Cortex-A53 based on feedback; revisions reported by Geekbench are minor revisions of major revision r0, as in r0pN).
Updated September 30, 2014 (Use more representative benchmarks for some SoCs, provide information about Geekbench and Android version as well as a weblink for all tabulated benchmark results, discuss merits of different ARMv8 instruction set models, make note of cryptography instructions in AArch32 inflating Geekbench Integer Performance, and other improvements).
Updated October 3, 2014 (Include early reports about octa-core Cortex-A53 MT6795 adoption for high-performance devices).
Updated November 13, 2014 (Correct information about effectiveness of GTS on Exynos 5433).
Updated December 2014 (MSM8939 is Snapdragon 615, not Snapdragon 610).

To do: Cortex-A53 Geekbench scores are likely to be inflated because of support for 32-bit ARMv8 mode in the most recent versions of Geekbench, which enables the use of cryptography instructions that significantly increase the scores of certain subtests of the Geekbench CPU Integer performance tests, while not accurately reflecting the CPU performance increase for most applications. This will be further investigated in the near future. As I did in subsequent blog posts, concentrating on Geekbench subtests that better represent integer CPU performance, such as the JPEG Compress test, rather than on the overall integer performance scores should give a much better picture.

Thursday, September 18, 2014

Widespread adoption of MediaTek's upcoming 4G solutions, capacity at TSMC remains tight

4G smartphone prices to drop in China, MediaTek to see widespread adoption


In an article from 16 September, DigiTimes quoted a China Mobile Device official saying that the lowest prices of 4G smartphones will drop towards CNY500 ($80) near the end of the year, compared to current prices of about $160 and below. MediaTek is reported to be providing chip solutions for these devices.

Similarly, a DigiTimes article on September 15 reported that MediaTek has won general adoption of its 4G chip solutions from China-based smartphone vendors, pushing Qualcomm to offer the inexpensive Snapdragon 210 in competition. MediaTek's announced integrated 4G chip solutions include the MT6732 and MT6752. Although the MT6732 is targeting the entry-level market, its performance and cost characteristics (associated with the quad-core Cortex-A53 CPU and Mali-T760MP2 GPU) put it somewhat above the lowest-cost part of the 4G smartphone market. It is likely that MediaTek will introduce new very low cost integrated 4G SoCs in the first part of 2015, competing with Qualcomm's new low-end Snapdragon 210 platform which is expected to be shipping around that time.

Early reports also indicate that MediaTek's new high-performance SoC, the MT6795 with an octa-core Cortex-A53 configuration, dual-channel memory interface and high-end GPU, will be disruptive in terms of cost and efficiency for high-performance smartphones, with potential sales volume as high as 30 million already being mentioned for Q1 2015.

Meanwhile, according to DigiTimes demand for 3G smartphones from emerging markets continues to be strong due to the slow pace of 4G network introductions, leading some smartphone companies to again compete with 3G models after earlier focusing more on 4G-enabled models.

MediaTek sales growth respectable, but helped by MStar acquisition


The finalization of the acquisition of MStar Semiconductor has made comparisons of MediaTek's financial results less transparent. Going back to last year, MediaTek reported revenues of NT$39.0 billion for Q3 2013 with strong growth of smartphone chip demand from China and emerging markets. For Q4 2013, MediaTek reported revenues of NT$39.8 billion.

For Q1 2014, MediaTek reported revenues of NT$46.0 billion, but it mentioned that the increase was largely due to the merger with MStar, which became effective on February 1, 2014. For Q2 2014, MediaTek reported revenues of NT$54.1 billion, with the increase coming from the MStar acquisition as well as revenue growth in the smartphone and tablet product lines.

The monthly revenues of NT$19.4 billion for July 2014 and NT$19.7 billion for August, and expectations for a revenue level around NT$20 billion in September, put MediaTek's estimated Q3 2014 revenues at about NT$60 billion. This reflects a sequential revenue growth rate of 11% for the seasonally stronger third quarter, compared to sequential revenue growth of 17% in Q3 2013, when MStar's product lines were not yet included.

MStar's quarterly revenues were about NT$8.5 billion in Q3 2013. Estimating the current revenue rate of MStar-derived product lines is difficult; some MStar products (including touch screen controllers and advanced smart-TV SoCs) may have seen significant sales growth recently, while older products may have declined. Assuming a similar revenue contribution in Q3 2014 (which is only a rough estimation), an apples-to-apples comparison of MediaTek's traditional product lines would suggest year-over-year revenue growth from NT$39 billion in Q3 2013 to roughly NT$52 billion in Q3 2014, a growth rate of 33%, which is respectable, but lower than the apparent 54% growth rate when the MStar acquisition is not accounted for.
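The growth rates quoted above can be reproduced with a few lines of arithmetic. The following is a minimal sketch that uses only the revenue figures mentioned in the text; the variable and function names are my own, and the unchanged NT$8.5 billion MStar contribution is the same rough assumption made above. Note that subtracting NT$8.5 billion from NT$60 billion gives NT$51.5 billion, which yields about 32%; the 33% figure results from rounding to NT$52 billion first.

```python
# All figures are in NT$ billion and are taken from the text above.
Q3_2013_MEDIATEK = 39.0   # reported, MediaTek only (before the merger)
Q2_2014_COMBINED = 54.1   # reported, includes MStar
Q3_2014_COMBINED = 60.0   # estimate based on monthly revenues, includes MStar
MSTAR_QUARTERLY = 8.5     # MStar in Q3 2013; ASSUMED unchanged in Q3 2014


def growth_pct(new: float, old: float) -> float:
    """Return the percentage growth going from old to new."""
    return (new / old - 1.0) * 100.0


# Quarter-over-quarter growth of the combined company.
sequential = growth_pct(Q3_2014_COMBINED, Q2_2014_COMBINED)

# Year-over-year growth that ignores the acquisition.
apparent_yoy = growth_pct(Q3_2014_COMBINED, Q3_2013_MEDIATEK)

# Year-over-year growth after removing the assumed MStar contribution.
adjusted_yoy = growth_pct(Q3_2014_COMBINED - MSTAR_QUARTERLY, Q3_2013_MEDIATEK)

print(f"Sequential growth (Q2 to Q3 2014): {sequential:.1f}%")
print(f"Apparent year-over-year growth:    {apparent_yoy:.1f}%")
print(f"Adjusted year-over-year growth:    {adjusted_yoy:.1f}%")
```

Changing MSTAR_QUARTERLY shows how sensitive the adjusted figure is to the assumed MStar contribution.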

MediaTek's estimated 11% sequential growth rate in Q3 is likely to have been negatively affected by the tightness of production capacity at TSMC, which matches earlier news articles from DigiTimes about MediaTek's wafer production. Without these restrictions, the growth rate may have been significantly higher.

Capacity at TSMC to remain tight


In an article on 16 September, DigiTimes reported that follow-on orders for Apple iPhone 6 devices are extending order visibility at foundry houses such as TSMC to the first quarter of 2015, particularly for 8" wafer production, which is used for chips such as LCD driver ICs, power ICs and other peripheral chips.

The article also mentions that leading-edge 12" capacity utilization (reflecting 28 and 20 nm processes used for advanced SoCs for smartphones and other applications), which was earlier expected to drift downward in Q4 2014, is now expected to remain high, with a rebound of orders from chip suppliers including Qualcomm and Broadcom.

Given the tight capacity situation that has existed in this segment for a while, this comes as no surprise since several companies are likely to have low inventories because they could not get sufficient chips from TSMC in the recent period. The priority that was given to Apple has resulted in a large amount of capacity being reserved for production of the Apple A8 SoC at 20 nm, with other players being squeezed. While Qualcomm has deep pockets and can invest billions of dollars in purchase commitments, some smaller players are likely to have had difficulty obtaining sufficient production capacity. MediaTek is an important player in this segment that is likely to have been affected by the capacity situation.

Sources: DigiTimes, MediaTek

Updated October 5, 2014 (Include early report about MT6795 adoption for high-performance smartphones).

Wednesday, September 17, 2014

High-performance tablet processor designs: Moving away from big.LITTLE and Cortex-A1x?

Rockchip and Allwinner are both Chinese fabless semiconductor companies that for a number of years have seen success supplying SoCs for tablets. During most of the last few years, either Rockchip or Allwinner has had the highest market share for tablet processors worldwide in terms of unit shipments, being used in high volume and low cost tablet models from Chinese manufacturers, often with a substantial lead on this metric over leading high-end tablet SoC providers such as Apple or Samsung.

Rockchip's RK3288 and Allwinner's A80 processor, both ambitious, high-performance application processors at least originally targeted at tablets, seem to have suffered from design and production issues, mostly related to the high power consumption of the ARM Cortex-A12 (or less likely A17) and A15 cores they implement and other hardware-related issues. Instead of appearing in mass-market tablets in volume, the chips are currently pitched mainly towards lower volume applications like media boxes and development boards, which because of their low volumes are unlikely to make the investment that went into the design and mass production of the chips profitable for the respective companies. Even for these niche markets, there appear to be problems related to performance, heat production, hardware design and related issues when using these chips.

Allwinner A80: Not quite ready


In early testing of the A80-based A80 OptimusBoard, benchmark results were disappointing given the CPU specification (quad-core Cortex-A15 up to 1.8 GHz and quad-core 1.2 GHz Cortex-A7 in big.LITTLE configuration), with evidence of either aggressive throttling or low actual clock frequencies (or even non-utilization in many circumstances) of the Cortex-A15 cores. CPU-specific test results are lower than expected, with the Vellamo benchmark's Multicore result being in line with a quad-core 1.2 GHz Cortex-A7 Snapdragon 400, and a particularly low Physics score in the 3DMark Ice Storm Extreme test, with evidence of very conservative CPU scheduling from kernel logs. In fact, these CPU-specific scores would be consistent with only the four Cortex-A7 cores running at about 1.2 GHz (similar to the Snapdragon 400) actually being used. However, the firmware (from July) was relatively early, and preliminary test results of the Phoronix Test Suite in a Linux environment generally show better performance than Cortex-A9 based processors such as the RK3188, despite the A80's Cortex-A15 cores reportedly running at only 1.2 GHz, while still falling short of the performance that would normally be expected from competitive Cortex-A15-based CPUs such as Samsung's Exynos Octa family. All things being equal, a Cortex-A15 core should be significantly faster than a Cortex-A9 core at the same clock speed.

Allwinner announces A83T, an octa-core Cortex-A7 tablet processor


In fact, Allwinner on September 4 already announced a new processor, the A83T, containing an octa-core Cortex-A7 configuration, which is expected to be commercially available in devices before the end of the year. According to the announcement, all cores can run simultaneously at up to 2.0 GHz. The announcement also mentions big.LITTLE, which appears to be an inappropriate term for this type of architecture, although it would be possible to, for example, optimize one four-core cluster for performance and the other for power efficiency. Allwinner prominently advertises the low power consumption of the chip despite high performance, which makes sense given the characteristics of Cortex-A7 and the contrast with the problematic power consumption of Allwinner's A80 (which is a true big.LITTLE chip that includes power-consuming Cortex-A15 cores). The A83T is also advertised as having a PowerVR GPU (probably a Series 6 Rogue GPU, as used by other chip designers including MediaTek and Apple).

The CPU architecture of the A83T appears to be strikingly similar to MediaTek's octa-core Cortex-A7 MT6592, which has already been proven and has been shipping since the end of 2013, and more specifically the MT8392 variant targeting tablets. However, both MediaTek chips incorporate a built-in 3G baseband, and MediaTek currently does not appear to be offering a similar solution for WiFi-only tablets without baseband (probably not intentionally; maybe because it had expected its big.LITTLE MT8135 to be viable). Because of the use of extremely power-efficient Cortex-A7 cores, MediaTek's octa-core chips such as MT6592 have proven to be fast but power-efficient in practical use while remaining relatively cheap and economical to manufacture.

New 28HPC process at TSMC provides cost, power and performance benefits


The Allwinner A83T uses TSMC's new 28 nm HPC (28HPC) process, which is reported to have been widely adopted for new designs by many chip design companies because of performance, power, and cost benefits. According to TSMC, it provides 10% smaller die size and 30% lower power at all levels of speed, or over 20% speed improvement at the same power when compared to its 28LP process. When compared to its currently popular leading-edge 28HPM process, it offers comparable performance but smaller die size.

More evidence of big.LITTLE being superseded by low-power "true" octa-core designs


Allwinner appears to be on a path to quickly replace the big.LITTLE architecture in the A80 (which appears to have limited potential for tablets) with the much more efficient octa-core Cortex-A7 configuration of the A83T for higher-performance tablets. MediaTek has already made a similar move for its upcoming high-end smartphone platforms, with the big.LITTLE MT6595 looking likely to be quickly superseded by the much more efficient MT6795 using Cortex-A53 CPU cores.

Allwinner has plenty of experience with Cortex-A7 cores, being one of the first chip companies to adopt it in its A31 quad-core tablet processor manufactured at 40 nm starting from the end of 2012. Its quad-core A33, which has been announced to be currently ramping, also uses a quad-core Cortex-A7 CPU. Despite this fact, Allwinner, like most of the rest of the industry, seems to have underestimated the potential of octa-core configurations of the Cortex-A7 (and subsequently Cortex-A53), because otherwise the new chip would already have been available and the A80 might not even have existed.

Looking ahead, it seems likely that companies like Allwinner will transition to the efficient 64-bit ARM Cortex-A53 core sooner rather than later (primarily because it is a new and faster successor of the Cortex-A7, rather than because of its support for ARM's 64-bit ARMv8 instruction set, which is not yet important). Cortex-A53 has already been widely adopted for smartphone processors by leading smartphone SoC companies MediaTek and Qualcomm across several segments ranging from entry-level to premium (see my earlier articles).

Based on a reported roadmap of upcoming chips, even Rockchip is finally moving to more cost-effective and less power-consuming chip architectures. Rockchip's RK3126 and RK3218 are low-cost quad-core Cortex-A7-based tablet processors with a Mali GPU, closely matching Allwinner's new A33 chip. Additionally, the future "MayBach" SoC will contain an octa-core ARM Cortex-A53 CPU, the type of configuration that has already started to show impressive results in early benchmarks, probably with relatively low power consumption. This configuration is likely to be much more viable than Cortex-A1x and big.LITTLE for the higher-performance segment, while still allowing the option of a big.LITTLE-like power saving technique because of the ability to optimize a portion (e.g. one cluster) of the processor cores for power consumption rather than performance.

RK3288 appears to have limited potential for tablets


As evidenced by the review of an early RK3288-based TV box at CNXSoft, Rockchip's RK3288 is fast but there still seem to be issues that will affect its viability. Some stability problems were noted, and the device became relatively hot. Such heat production would make the chip problematic for tablets, which are normally the main market generating high sales volumes for Rockchip. Without tablets, it is doubtful that production of this chip will be profitable at all.

Widespread early design activity for alternative devices like TV boxes, with indications of hardware and possibly software issues related to the chip, could point to a relatively large inventory of RK3288 chips with certain hardware defects held by Rockchip, which because of the unviability for tablet production, Rockchip is trying to unload onto the TV box market (which has a greater tolerance for heat production and greater ability to work around hardware problems).

Rockchip not transparent about presence of Cortex-A12 cores


The CPU-Z app for Android was run on the device, and shows fairly conclusive evidence that the CPU cores in the RK3288 are really Cortex-A12 cores. This is not at all surprising, given that Rockchip's foundry GlobalFoundries was the first to adopt the Cortex-A12 core and announced volume production a few months ago (very likely for the RK3288), and the Cortex-A17 core is currently more closely associated with TSMC, although chips using it seem to have had little success ramping to full production so far.

Earlier reports (1) (2) already suggested that the chip in fact contains Cortex-A12 cores, as was originally reported based on information by Rockchip in 2013. It appears that Rockchip at some point made a marketing decision that the chip should be advertised as having Cortex-A17 cores, despite actually having Cortex-A12 cores, because it sounds better, is a higher number and is known to be faster than the Cortex-A12. This behaviour is questionable at best, and raises questions about Rockchip's company structure and culture (for example, many engineers would probably feel uncomfortable with blatantly incorrect marketing information, while a senior manager might insist on it, especially if that manager had earlier embarked on commitments with overstated specifications).

There are some similarities with the behaviour of Actions Semiconductor, a smaller Chinese player in the tablet processor market, at the start of 2013, when the ATM7029, one of the first affordable quad-core tablet processor chips, was for a long time advertised as having more attractive Cortex-A9 cores instead of the actually present, much slower (but power-efficient) Cortex-A5 cores. Actions went as far as to modify the OS kernel to hide the processor core information, although they later became more open about the actual CPU specifications. Hopefully, Rockchip will not go as far.

Despite problems, RK3288 offers high CPU and GPU performance


The early review of a TV box product with a RK3288 shows that the chip has great performance potential, even with Cortex-A12 cores, which are not significantly slower than Cortex-A17. In the TV box, the clock frequency went up to 1.8 GHz, which given the high performance per cycle of the Cortex-A12 (being in the same family as the Cortex-A15 and Cortex-A17) leads to impressive CPU performance. Performance is also helped significantly by the dual-channel memory interface, providing significantly more memory bandwidth when compared to previous Rockchip products.

Performance reported by benchmarks was not entirely consistent, with Antutu 4.x reporting a high overall score (but a relatively low CPU integer score), while the chip shows strongly in the most recent Vellamo benchmark, with a strong Multicore test result and Browser and Metal test results close to those of the fastest Qualcomm Snapdragon 801-based devices.

Although performance in games in practice appeared to be excellent (but with a drop-off when going from 720p to 1080p resolution), the 3DMark Ice Storm Extreme result was disappointing compared to high-end Qualcomm Snapdragon platforms. Since the RK3288 is one of the first chips to be benchmarked with (as far as can be confirmed) a Mali-T764 GPU (which is presumably equivalent to Mali-T760 MP4), this could reflect performance characteristics of the Mali-T760 GPU architecture.

Ice Storm Extreme is an OpenGL ES 2.0 benchmark that renders at 1080p and uses relatively demanding (large) textures and post-processing effects. It is possible that the memory subsystem (including the amount of memory bandwidth and the efficiency of the L2 cache and memory interface) is a bottleneck in the tested RK3288 device, as that would result in significantly lower performance when using high resolutions and large textures. Size and speed of the GPU cache memory (which is usually configurable by the chip design company) could also be involved. That practical game performance appeared to be excellent is probably helped by the fact that OpenGL ES 2.0 games tend to have limited texture size and shader complexity in order to be compatible with a wider range of devices. Ice Storm Unlimited test results (which tests performance offscreen using a fixed 1080p resolution but standard textures) would provide better comparison material.

Sources: CNXSoft, TSMC (28HPC process announcement), Allwinner (A83T announcement), CNXSoft (Rockchip product roadmap)
 
Updated September 30, 2014.

Friday, September 12, 2014

Qualcomm announces entry-level Snapdragon 210 and 208 smartphone SoCs

Qualcomm has announced the Snapdragon 210 and Snapdragon 208 SoCs for entry-level smartphones. The new products fill gaps in Qualcomm's product line in the low end, replacing the relatively unsuccessful and outdated Snapdragon 200, allowing Qualcomm to target cost-sensitive markets without having to resort to using its more costly Snapdragon 400 and 410 platforms. The platforms use 32-bit (ARMv7) ARM Cortex-A7 CPU cores, not the new 64-bit Cortex-A53 cores used in platforms such as Snapdragon 410. The use of Cortex-A7 cores most likely reduces cost compared to Cortex-A53, while remaining suitable for applications that do not put high demands on CPU performance. Both platforms are expected to be shipping in the first part of 2015.

Snapdragon 210: A more cost-effective Snapdragon 400


The Snapdragon 210 uses a quad-core Cortex-A7 CPU up to 1.1 GHz, with proven cost-effectiveness and power efficiency, with a new Adreno 304 GPU (with likely significant cost savings compared to the Adreno 305 GPU in the Snapdragon 400) and integrated 4G LTE Advanced modem, supporting carrier aggregation, with support for dual SIM. The chip targets entry-level smartphones with a display resolution up to 720p, and supports hardware decoding and encoding of 1080p H.264 video and 1080p HEVC (H.265) decoding. Performance is increased relative to Snapdragon 200 by supporting mainstream 533 MHz LPDDR3 memory. The chip is manufactured using a 28nm LP (low power) process. Overall, the specifications for a large part overlap with older quad-core Cortex-A7-based Snapdragon 400 SoCs, but with adjustments for the lower requirements of entry-level devices and significantly reduced chip manufacturing cost.

Snapdragon 208 targets entry-level 3G smartphones


The Snapdragon 208 is a cost-reduced version of the Snapdragon 210, with a dual-core Cortex-A7 CPU and a baseband that limits cellular network support to 3G. It limits video decoding and encoding resolutions to 720p, although H.265 decoding is supported, and has lower maximum resolutions for screen (960x540) and camera, and limits the maximum speed of the memory interface to 400 MHz. This chip competes with MediaTek's MT6572 platform, which has been shipping for some time and has similar features, and the MT6571.

For both platforms, Qualcomm advertises compatibility with its cost-reducing RF chips such as the RF360 and third generation 28nm RF transceiver.

Cost reduction for entry-level makes sense


Based on information such as new models announced at IFA by manufacturers targeting cost-sensitive markets such as Alcatel and Lenovo, Qualcomm is already aggressively targeting the entry-level market (primarily the part of the market that is transitioning to integrated 4G) with SoCs from its Snapdragon 400 and 410 platforms. It is likely that Qualcomm is currently subsidizing the relatively high cost of these chips (in relation to their use in entry-level devices), especially for 3G-only devices, selling them with a relatively small profit margin, which it can easily afford to through the leverage of the very high royalty rates that it strives to enforce on the wholesale price of all 3G/4G smartphones. Because Qualcomm is currently the only provider of viable SoCs with integrated 4G, it can also ask a certain premium for 4G. However, through royalties, Qualcomm is technically able to continue to achieve a high profit margin when selling the Snapdragon 400 or 410 into an entry-level platform, even when making little money on the SoC chip itself based on the nominal selling price adjusted for the entry-level segment. With the Snapdragon 210/208, Qualcomm can improve its profit margins in this segment without having to rely on patent royalties.

Competition: MediaTek


MediaTek has dominated the SoC market for entry-level 3G smartphones with efficient and cost-effective platforms such as MT6572 and MT6582. However, integrated cellular network support in these platforms is limited to 3G, and Qualcomm is currently taking advantage of MediaTek's delayed introduction of cost-effective SoCs with integrated 4G by targeting versions of its Snapdragon 400 and 410 platforms at entry-level manufacturers. Despite the relatively high chip manufacturing cost in relation to the entry-level segment, Qualcomm can afford to subsidize these chips in conjunction with its royalty schemes and financial leverage. In the extremely tight capacity environment in TSMC's 28/20nm fabs, Qualcomm is also likely to have gained a larger proportion of its desired chip capacity than MediaTek, because of its ability to invest billions of dollars into purchase commitments.

MediaTek has announced SoCs with integrated 4G modems, including the MT6732 and several chips targeting higher segments, which are likely to appear on the market very soon. However, the MT6732 uses slightly more performance-oriented Cortex-A53 cores and is not targeted at the lower regions of the entry-level segment like Snapdragon 210 is. Whether MediaTek will be able to effectively target entry-level 4G segments partly depends on the cost and efficiency of its new integrated 4G baseband architecture. Its current stand-alone MT6290 4G modem chip, when combined with existing SoCs like MT6582, clearly does not fit the cost profile for entry-level devices.

Sources: Qualcomm, Ars Technica

Wednesday, September 10, 2014

Apple announces iPhone 6 and iPhone 6 Plus using Apple A8 SoC

As expected, Apple announced several new products on September 9, most prominently the iPhone 6 and iPhone 6 Plus smartphones.

The iPhone 6 has a 4.7" LCD screen with a non-standard 750x1334 resolution (slightly higher than 720p), designed for convenient pixel scaling for existing iOS apps. The iPhone 6 Plus has a 5.5" LCD screen with standard 1080x1920 (1080p) resolution. Because of the limited dynamic pixel scaling ability of iOS, for compatibility apps can technically be scaled by a factor of three (from the Apple standard 414x736 to 1242x2208) and then downscaled to 1080p.
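The scaling path described for the iPhone 6 Plus can be checked with simple arithmetic. This is a minimal sketch using only the resolutions quoted above; the function name and the idea of expressing the final step as a single downscale ratio are my own illustration, not a description of how iOS actually implements it.

```python
def render_size(points_w: int, points_h: int, scale: int) -> tuple[int, int]:
    """Return the pixel size of a layout given in points at an integer scale."""
    return points_w * scale, points_h * scale


# Layout size and integer scale factor quoted in the text.
render_w, render_h = render_size(414, 736, 3)

# Physical panel resolution of the iPhone 6 Plus (1080p, portrait).
panel_w, panel_h = 1080, 1920

# Downscale ratio needed to fit the rendered image onto the panel.
ratio_w = panel_w / render_w
ratio_h = panel_h / render_h

print(f"Rendered size:   {render_w}x{render_h}")
print(f"Downscale ratio: {ratio_w:.4f} (width), {ratio_h:.4f} (height)")
```

Both ratios come out the same (about 0.87), so the downscale preserves the aspect ratio, but because the ratio is not an integer the final image has to be resampled.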

Both models are thin smartphones with a thickness of about 7mm. Connectivity has improved, with more LTE bands and support for VoLTE (voice over LTE) and 802.11ac WiFi. LTE is limited to Category 4, unlike certain high-end competitive devices such as the Galaxy Alpha and Galaxy Note 4 that support LTE Category 6.

Apple A8 at 20nm: Modest performance and power improvements


The process improvement from 28nm to 20nm facilitates a transistor count increase (estimated to have doubled from one billion to two billion), resulting in a relatively large chip, even at 20nm, although the reported die size of 89mm² is smaller than the 104mm² of the Apple A7. However, performance improvement compared to the Apple A7 is reported to be relatively moderate. CPU performance improvement quoted by Apple is 25%, with GPU performance (reported to be powered by a PowerVR GX6450 with four clusters) being improved by 50%. Early Geekbench test results show a modest CPU processing improvement (from around 1400 single-core/2500 dual-core on the iPhone 5S to roughly 1630/2920 on the iPhone 6), aided by a relatively minor CPU clock speed increase.
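To put these numbers side by side, the following minimal sketch computes the Geekbench gains and the implied transistor density from the figures quoted above. The transistor counts are the rough estimates mentioned in the text (one and two billion), so the density ratio is only indicative; all names are my own.

```python
# Approximate Geekbench scores quoted in the text (single-core, multi-core).
IPHONE_5S = (1400, 2500)
IPHONE_6 = (1630, 2920)

# Reported die sizes (mm^2) and ESTIMATED transistor counts (millions).
A7_DIE_MM2, A7_TRANSISTORS_M = 104.0, 1000.0
A8_DIE_MM2, A8_TRANSISTORS_M = 89.0, 2000.0


def gain_pct(new: float, old: float) -> float:
    """Return the percentage improvement going from old to new."""
    return (new / old - 1.0) * 100.0


single_gain = gain_pct(IPHONE_6[0], IPHONE_5S[0])
multi_gain = gain_pct(IPHONE_6[1], IPHONE_5S[1])

# Transistor density in millions of transistors per mm^2.
a7_density = A7_TRANSISTORS_M / A7_DIE_MM2
a8_density = A8_TRANSISTORS_M / A8_DIE_MM2
density_ratio = a8_density / a7_density

print(f"Single-core gain: {single_gain:.1f}%")
print(f"Multi-core gain:  {multi_gain:.1f}%")
print(f"Density: A7 {a7_density:.1f} M/mm2, A8 {a8_density:.1f} M/mm2 "
      f"({density_ratio:.2f}x)")
```

On these figures the measured CPU gains (roughly 16 to 17%) are below the 25% quoted by Apple, while the estimated transistor density more than doubles.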

Despite the larger battery (1810 mAh in the iPhone 6 vs 1560 mAh in the iPhone 5S) allowed by the larger physical dimensions, offset somewhat by increased power use caused by the larger screen, Apple quotes only minor improvement in battery life over the iPhone 5S for most applications (slightly better on the iPhone 6 Plus), which is disappointing since the iPhone 5S's battery life has been a major weak point in practice. However, the iPhone 6 Plus contains a significantly larger 2915 mAh battery which should allow it to have improved battery life compared to the iPhone 6 and iPhone 5S.

It is possible that gains in power efficiency are larger in practice, with the official battery-life specifications coming closer to reality relative to the more optimistic specifications for the iPhone 5S. Apple has been quoted as saying that the Apple A8 draws up to 50% less power than the previous chip (which is a statement open to interpretation, because it does not specify the level of improvement with typical use), and, focusing only on the CPU, the transition to the 20nm TSMC process with a similar configuration and clock speed should logically result in measurable power savings, all things being equal.

Similar CPU, but transistor count significantly increased


However, the large increase in transistor count (from roughly one to two billion) may be associated with lower power saving benefits than would otherwise be expected from applying a more advanced manufacturing process to, for example, similar CPU cores. Exactly what functionality contributes to the transistor count increase is unclear; the GPU and its associated caches should be significantly larger, a requirement because of the increased number of screen pixels, and the L2 CPU cache (1MB in the Apple A7) may have increased performance (although still using the same 1MB size). The Apple A7 has been reported to already contain a large 4MB L3 cache (which uses a lot of transistors/die area) that may have increased in size in the Apple A8 or otherwise have improved performance (CPU caches of a given size can be made better-performing by using more transistors).

AnandTech has reported that L3 cache latency has improved with capacity remaining the same at 4MB, which is consistent with an increase in L3 transistor count, matching the increased proportion of the die used for the L3 cache (roughly half of the chip) on the Apple A8 when compared to the Apple A7 (the L3 cache scales down less than other blocks in the transition to 20nm). What exactly Apple's goal is by including a large L3 cache is unclear, but for applications that have a memory working set that fits entirely into the cache there would be obvious performance and power consumption benefits. Part of the benefit may be allowing most of the display framebuffer to be stored in the L3 cache, reducing memory overhead related to graphics operations and screen refresh, which would be more apparent with higher resolutions such as the iPhone 6 Plus or potentially iPads, although the L3 cache would have to be large enough to hold the actively changing part of the framebuffer (a full 1080p 32-bit framebuffer is 7.9 MB), with the amount of required memory potentially lowered by compression or encoding optimizations.
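The framebuffer arithmetic can be verified with a few lines of code. This is a minimal sketch under stated assumptions: 4 bytes per pixel for 32-bit color, no compression, and the 4MB L3 capacity interpreted as 4 MiB. The names are hypothetical, and whether the cache is actually used this way remains speculation.

```python
MIB = 1024 * 1024
BYTES_PER_PIXEL = 4            # assumption: 32-bit color, uncompressed
L3_CACHE_BYTES = 4 * MIB       # assumption: reported 4MB capacity taken as 4 MiB


def framebuffer_bytes(width, height, bytes_per_pixel=BYTES_PER_PIXEL):
    """Size in bytes of a single uncompressed framebuffer."""
    return width * height * bytes_per_pixel


# Panel resolutions quoted in the text.
displays = {
    "iPhone 6 (750x1334)": (750, 1334),
    "iPhone 6 Plus (1080x1920)": (1080, 1920),
}

for name, (width, height) in displays.items():
    size = framebuffer_bytes(width, height)
    fits = size <= L3_CACHE_BYTES
    print(f"{name}: {size / MIB:.2f} MiB, fits in nominal L3 capacity: {fits}")
```

Under these assumptions a full 1080p framebuffer is about 7.9 MiB, roughly twice the L3 capacity, while a 750x1334 framebuffer is about 3.8 MiB. Since the cache is shared with other data, fitting within the nominal capacity does not by itself mean the whole framebuffer could stay resident.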

Instead of integrating the baseband into the SoC, the Apple A8 continues Apple's practice of using an external Qualcomm baseband modem chip (MDM9625M) in combination with the Apple SoC. The MDM9625M is limited to LTE Cat 4 and does not support LTE Cat 6 (Qualcomm already offers new LTE Cat 6 stand-alone modem chips, used with SoCs like the Snapdragon 805). Because it is a separate chip, Apple may be able to provide updated versions of the iPhone 6 with LTE Cat 6 with a relatively minor redesign during its lifetime.

Although few specific details are yet available, reports suggest that the Apple A8 SoC continues to use a CPU configuration similar to that of the Apple A7, most likely a dual-core performance-oriented CPU with CPU cores very similar to the Cyclone cores used in the Apple A7, which may be closely related to ARM's Cortex-A57. Reports indicate that CPU performance is close to that of the A7, with a modest clock speed increase from 1.3 to 1.4 GHz contributing to somewhat increased performance. Some of the performance increase is likely to be contributed by speed improvement and/or size increase of the L2 and L3 caches. AnandTech has observed a few micro-architectural cycle time improvements, for example for integer multiplication and floating point addition, contributing to increases in synthetic benchmark scores. In terms of die area, the CPU block on the chip has been estimated to have decreased from about 17mm² to 12mm² thanks to the 20nm process.

Performance scaling of Cortex-A57-class CPUs and other out-of-order, speculative issue superscalar CPUs


The disappointing performance and power-efficiency scaling of high-performance CPU cores such as Cortex-A15, Cortex-A17 and Cortex-A57, and also the processor core used by Apple in the Apple A8, which all implement out-of-order superscalar pipelines with speculative issue, when transitioning to more advanced processes (such as 28nm and 20nm) has already been noted in the industry. Relative to the benefits expected from the more advanced 20nm HPM process at TSMC compared to the Samsung 28nm HKMG process used for Apple's previous processor, the performance increase and, to some extent, the power efficiency improvement are modest, with potential for clock speed increases apparently limited.

In contrast, small, medium-performance and extremely power-efficient in-order pipeline cores such as Cortex-A53 (and earlier Cortex-A7) are showing dramatically better scaling with newer processes (including advanced 28nm and 20nm processes), with the ability to optimize for performance or power-efficiency with "core hardening", with greater increases in clock speed and the ability to use a many-core (such as octa-core) configuration with limited implications for cost (die size) and power consumption. This results in dramatically better performance/Watt characteristics and much lower chip cost even for high-performance applications, with the only disadvantage being somewhat limited single-thread performance and the requirement of more extensive utilization of multi-threading in the operating system and application software.

Apple A8 still not very power efficient, but larger batteries used


As a result, Apple's new SoC seems to be widening Apple's SoC efficiency deficit when compared to SoCs being introduced in competitive devices, with the Apple A8 being relatively high-cost and uneconomical to manufacture, providing only modest performance improvement despite the larger size and higher cost when compared to the Apple A7, and a significant and increasing disadvantage in power consumption and performance/Watt compared to SoCs from competitors, despite Apple's head start with 20 nm process technology. Although high chip cost is not a major issue for Apple due to its large profit margins, the disadvantage in power efficiency will continue to be reflected in the end-user experience.

However, at least for the 5.5" iPhone 6 Plus, Apple has significantly increased battery capacity to 2915 mAh, much larger than the 1810 mAh battery in the 4.7" iPhone 6 and the 1560 mAh of the iPhone 5S. This will result in improved battery life in the iPhone 6 Plus compared to other recent iPhones, as the increase in capacity more than offsets the somewhat larger power requirement for the screen. Testing of the new iPhone models with automated battery life benchmarks for light browsing shows a competitive score, increased somewhat over the iPhone 5S, and more significantly for the iPhone 6 Plus. However, in this type of benchmark, the iPhone 5S already scored higher than what one would expect based on actual battery life in practice, and more detailed battery life benchmarks have been published that indicate battery life of the iPhone 6 models is still rather average compared to the competition.
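The claim that the capacity increase more than offsets the larger screen can be illustrated with a rough calculation. This sketch assumes that display power scales with screen area and that both screens have nearly the same aspect ratio, so area scales with the square of the diagonal. Both are simplifying assumptions, not measurements, and the names are hypothetical.

```python
# Battery capacities (mAh) and screen diagonals (inches) quoted in the text.
BATTERY_MAH = {"iPhone 5S": 1560, "iPhone 6": 1810, "iPhone 6 Plus": 2915}
DIAGONAL_INCH = {"iPhone 6": 4.7, "iPhone 6 Plus": 5.5}

# How much more battery capacity the iPhone 6 Plus has than the iPhone 6.
capacity_ratio = BATTERY_MAH["iPhone 6 Plus"] / BATTERY_MAH["iPhone 6"]

# Assumption: same aspect ratio, so screen area scales with diagonal squared.
area_ratio = (DIAGONAL_INCH["iPhone 6 Plus"] / DIAGONAL_INCH["iPhone 6"]) ** 2

print(f"battery capacity ratio (6 Plus / 6): {capacity_ratio:.2f}x")
print(f"screen area ratio (6 Plus / 6):      {area_ratio:.2f}x")
```

With these inputs the battery is about 1.61 times larger while the screen area is about 1.37 times larger, which is consistent with the expectation of longer battery life on the iPhone 6 Plus, at least as far as the display is concerned.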

Memory subsystem in Apple SoCs favors synthetic benchmarks


There seems to be a tendency of recent iPhones (including the previous generation iPhone 5S) to show benchmark scores often near the top of the chart for CPU performance or even battery life in synthetic benchmarks. However, in practice (in typical every-day use), performance characteristics, especially battery life, tend to be considerably less than would be expected based only on these benchmarks (competitors that perform similarly in synthetic battery life tests tend to have much better practical battery life). Differences in the operating system (iOS versus Android) could be a factor, but characteristics of the memory subsystem of recent iPhones may be the most significant factor affecting battery life.

Large on-chip cache memories (such as the 4MB L3 cache in Apple A7 and A8, unprecedented for a mobile SoC) have long been known in the computing world to result in particularly high scores in certain synthetic benchmarks as long as their memory footprint fits within the cache, positively affecting both CPU performance benchmarks and battery life tests (access of external RAM or flash storage is slower and uses significantly more power). Also well known is the fact that such benefits quickly diminish when a typical memory footprint no longer fits inside the cache, which may not be common for synthetic benchmarks, but may actually be pretty common for every-day use. Moreover, the relatively limited 1GB RAM of the iPhone 6 models results in more frequent flash memory access (e.g. reloading of tabs in Apple's browser, in addition to less visible activity), which can have a strong negative effect on battery life that is not apparent in many synthetic battery life tests.

In summary, the memory subsystem (large on-chip caches and limited RAM) of recent iPhones is likely to be associated with relatively high scores in most synthetic performance and battery life benchmarks that are not fully representative of the practical experience (particularly for battery life).

Apple continues strategy of reducing cost of other components


Apart from the SoC, Apple continues to emphasize profit margins by keeping cost down on several of the other hardware components, especially memory size, with some of its technical specifications having already been superseded for some time in competitive high-end devices.

Cost-reducing features include limiting DRAM to 1GB (which can have performance repercussions for common use cases) and internal flash storage to 16GB for the standard models (which will be the primary sellers), a mature LTE modem chip that does not support the latest LTE Cat 6 speeds, a rear-mounted mono speaker and an 8MP camera (although most likely with good performance due to the relatively large sensor pixel size).

Sources: AnandTech (iPhone 6 announcement), Wikipedia (iPhone 6 article), Wikipedia (Apple A8 article), MacRumors, iFixit, ChipWorks (Apple A8 analysis), AnandTech (Discussion of ChipWorks analysis), AnandTech (iPhone 6 review)

Updated October 2, 2014 (Add SoC details from recent full AnandTech review, discussion of impact of memory subsystem synthetic benchmarks).
Updated October 20, 2014 (Add comment about potential use of L3 cache for framebuffer memory).

Tuesday, September 9, 2014

Tablet processor shipment projections for Q3: Rockchip growth too good to be true?

Tablet chip growth in Q3, Rockchip expected to dominate


In a recent update on the Chinese tablet application processor market, DigiTimes reported projected growth of the tablet chip market in Q3 from seasonal demand, while chip companies are rolling out new chips with an improved cost structure.

The article projects that Rockchip's shipments will increase significantly in Q3, while its cooperation with Intel will bear fruit with plans to ship its first processors with integrated baseband processor in Q4 2014. Rockchip already was the number one provider of tablet processors in China in Q2 2014. It also mentions growth for MediaTek, a steep decline in market share for Allwinner, and growing shipments from Intel, which is likely to pass Allwinner and reach the number three position. The article is otherwise devoid of the more specific figures that DigiTimes tends to publish after the end of the quarter, and the article may be based on preliminary expectations and not reflect final numbers.

Rockchip's booming shipments too good to be true?


However, some of the information in the article appears to be based on very optimistic projections from sources very close to Rockchip. Expectations that the Intel collaboration (using an entirely new processor architecture, new and unfamiliar IP blocks, a different foundry, and a significantly more complex integrated SoC chip design than previous Rockchip products) will "bear fruit" in Q4 2014, several months ahead of the planning described in the Intel announcement press release, have to be met with skepticism. Rockchip's track record, which includes severe delays and problems bringing to market significantly less complex chips such as RK3168 and RK3288, casts further doubt on the optimistic projections.

Rockchip is continuing to rely on its existing products RK3188T, RK3168 and RK3026, which are not likely to be very cost-effective in terms of manufacturing cost. Information from the supply chain from sources such as ARMDevices does not show evidence of any new cost-effective chips from Rockchip shipping in new tablets. Only Allwinner (with their new A33 chip) and MediaTek are showing evidence of providing new chips with an improved cost structure.

It is not unlikely that Rockchip's relatively recent low-end RK3026 (a dual core 1.0 GHz Cortex-A9-based SoC manufactured using a trailing-edge 40nm process) is seeing traction in the less visible ultra-low-end market (shipping primarily to cost-sensitive end markets) and is shipping in high volume (taking over from bottom-end chips from companies such as Allwinner and Actions), partly explaining the increase in shipments. However, none of Rockchip's current line-up of chips is likely to be very cost-effective due to their design (especially the use of large, outdated Cortex-A9 CPU cores). The same design choices also make them less power-efficient (with average battery life or worse) than a number of competitive products.

Rockchip addressing cost concerns in upcoming chips


However, Rockchip appears to be finally addressing the high manufacturing cost and limited efficiency of its chips with new, more cost-effective designs that will appear on the market in the near future. The quad-core Cortex-A7 RK3126/RK3128 with Mali-400 MP2 GPU, manufactured at 40nm, have recently appeared on Rockchip's website, closely matching the specifications of the recently announced A33 from Allwinner and, in a more general sense, SoCs that MediaTek has been shipping for some time. A roadmap of upcoming Rockchip SoCs also provides evidence of a high-performance octa-core Cortex-A53 "MayBach" SoC, which is likely to address the issues that make the RK3288 virtually unsuitable for tablets.

Explanations for Rockchip's paradoxical market share dominance in Q3 2014


To explain the current market success of Rockchip in terms of volume shipments and market share despite chip cost concerns, one can speculate about possible explanations:
  • The combination of the utter failure by Allwinner to maintain a competitive product offering (their new A33 may be too late to be significantly reflected in Q3, and may not be free from the concerns that earlier caused the A20 and A23 to be utter failures), and the chip supply shortage faced by MediaTek, which otherwise may have been able to fill much of the gap because of its competitive and cost-effective products, may have conspired to make Rockchip (which has plenty of capacity available at struggling GlobalFoundries) the only source of sufficient numbers of tablet chips in the current quarter. In principle, this would cause a temporary rebound in tablet processor prices, making the cost structure of Rockchip's solutions somewhat less problematic.
  • Rockchip is buying market share, selling large numbers of chips for little profit or even losses, while taking advantage of significant debt financing, using up financing provided by partnership deals or other undisclosed sources of funds. In the Chinese technology world, similar irrational high-risk boom-and-bust schemes executed by a company are a fairly common occurrence. This may include putting in very large orders at foundries with little concern about whether the volume of chips can be sold, whether the chip design is free of defects and issues that could make it unusable, or whether a profit is actually made on sale of the chips (the only thing that counts is ambition and volume).
  • GlobalFoundries has been struggling, with Rockchip being one of the few significant clients for its 28nm HKMG process outside of AMD. It is possible that Rockchip is receiving preferential treatment and significant discounts for the manufacturing of chips such as the RK3188T and RK3168, because otherwise GlobalFoundries would see even more severe underutilization of its fabs. Additionally, tight capacity at most other foundries (especially TSMC) may not only have affected MediaTek, but also many of the other smaller players in the field.
The truth is probably a combination of some of the mentioned explanations and possibly other reasons that may become clearer in the future.

Update (November 2014)


Based on a recent DigiTimes report from November 11 on tablet processor market share in China in Q3 2014, it seems the estimates for Rockchip were indeed much too optimistic, with a unit shipment decline of 12% in Q3 instead of the strong growth implied by the statements from early September. Rockchip's shipments have been affected by lower than expected overall market growth, and also by Intel's penetration of the Chinese tablet market, whose chips to a large extent cover the segment of Rockchip's high-volume mid-range RK3188(T) chip, which is also becoming less competitive from a manufacturing cost standpoint.

Although Rockchip, based on information from DigiTimes, still held the leadership position in terms of unit shipments in China in Q3 2014 (while being passed by MediaTek in terms of worldwide shipments, including global brand-name manufacturers), its market share will come under pressure in Q4 2014: a forecast by DigiTimes says that MediaTek will take over the number one position in the Chinese market in that quarter, probably because of lessened production capacity constraints. Rockchip is transitioning to the more cost-effective quad-core Cortex-A7-based RK3126 for the cost-sensitive market, but it appears that this will not prevent Rockchip from losing its leadership position in China to MediaTek.

Sources: DigiTimes

Updated September 28, 2014 (Add information about upcoming, more efficient Rockchip SoCs).
Updated November 26, 2014 (Update with recent information).

Sunday, September 7, 2014

New smartphone models introduced at IFA, Qualcomm dominates

At the IFA consumer electronics trade show, which took place in Berlin last week, many manufacturers announced new smartphone and tablet models.

Based on preliminary specifications, noticeable trends for smartphones include:
  • Smartphones are gravitating to larger screen sizes, with the sweet spot at 5.0 inch for low-end and larger for mid-range and premium devices.
  • Most manufacturers are making the sensible decision of using a 720p screen in mid-range models, which provides acceptable display quality while preserving optimal performance and battery life. Because of memory bandwidth limitations associated with economical SoCs like the Snapdragon 400 and 410 series, using a 1080p display would seriously degrade the user experience because of the negative impact on performance and battery life.
  • Manufacturers serving cost-sensitive markets such as Alcatel, Lenovo and ZTE are introducing 4G LTE network connectivity in several models, mostly using Snapdragon 400 or 410 platform SoCs with integrated 4G modem, impacting MediaTek which previously supplied chips for a large proportion of smartphone models from these manufacturers, and is left supplying only lower-end 3G-only devices.
  • Several new models with Cortex-A53-based SoCs from Qualcomm have been announced, mainly using the cost-effective Snapdragon 410 (quad-core) platform, and HTC announced a model with Snapdragon 615 (pseudo-big.LITTLE octa-core Cortex-A53). A large number of smartphones from most manufacturers, using various configurations of Cortex-A53 cores, targeting the entire spectrum from entry-level to mid-range to premium, are likely to be introduced within the next six months. Apart from Qualcomm, MediaTek is also introducing a competitive product line with comparable specifications.
  • Qualcomm's Snapdragon 801 continues to be used for new high-end models, including Moto X (2014) and Sony Xperia Z3.
  • Although one model from Lenovo was announced using the MT6595M, which is not likely to be manufactured in significant volume, there was a noticeable absence of any devices featuring MediaTek's upcoming Cortex-A53-based chip family, such as MT6732, MT6752 and MT6795, despite the fact that these platforms are expected to appear in the market fairly soon (before the end of the year). This probably does not mean that models using these chips are not in development, but rather that the competitive environment and the current sensitivities with Qualcomm royalties and licensing make early announcement of any such models problematic.
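The point in the list above about 720p versus 1080p displays on bandwidth-limited SoCs can be illustrated with a rough estimate of display scanout traffic alone. This is a sketch under assumed values (32-bit pixels, 60 Hz refresh, no framebuffer compression); the names are hypothetical, and real memory traffic also includes GPU rendering and composition, so these figures are only a lower bound.

```python
BYTES_PER_PIXEL = 4    # assumption: 32-bit color
REFRESH_HZ = 60        # assumption: typical smartphone refresh rate


def scanout_bytes_per_second(width, height):
    """Memory traffic needed just to read one full frame per refresh."""
    return width * height * BYTES_PER_PIXEL * REFRESH_HZ


bw_720p = scanout_bytes_per_second(1280, 720)
bw_1080p = scanout_bytes_per_second(1920, 1080)

print(f"720p scanout:  {bw_720p / 1e6:.0f} MB/s")
print(f"1080p scanout: {bw_1080p / 1e6:.0f} MB/s")
print(f"ratio: {bw_1080p / bw_720p:.2f}x")
```

A 1080p panel has 2.25 times the pixels of a 720p panel, so every per-pixel cost (scanout, composition, rendering) scales up by roughly that factor on the same memory interface.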

Qualcomm encroaching on entry-level segment


In terms of market share, it appears that Qualcomm is taking advantage of its leadership with integrated 4G basebands to encroach onto the low-end that MediaTek has previously dominated. MediaTek's new chips with integrated 4G baseband have not yet reached the market, and it does not look like MediaTek's temporary stand-alone 4G modem solution (MT6290 used in combination with MT6582 or MT6592) is seeing significant traction.

Despite the backdrop of the complications related to the Chinese investigation into Qualcomm's royalty and licensing practices and alleged monopoly, it seems Qualcomm has reasserted its dominance in the market in the short term. It also appears that, with billions of dollars of capital at hand for purchase commitments, Qualcomm has been able to retain sufficient manufacturing capacity at TSMC, where the large production ramp for Apple (which has made similar large purchase commitments) has exacerbated the tight supply situation since earlier this year, with MediaTek's supply of chips likely to be squeezed as a result of the shortage. In July, DigiTimes reported that MediaTek had negotiated a 10% increase in wafer starts for Q3 at TSMC, and in August it reported that MediaTek was seeking additional 28nm capacity at UMC, both indications of a very tight supply situation.

List of models announced at IFA


Some of the new smartphone models announced at IFA:

Acer
  • Liquid Jade, a compact smartphone with 5.0" screen with MediaTek MT6582.
Alcatel
  • One Touch Hero 2, a 6.0" "phablet" with octa-core MT6592 and LTE modem (probably MT6290).
  • Pop 2 series, with Snapdragon 400 or 410 with 4G.
  • Idol 2, probably with Mediatek MT6582.
  • Idol 2 S with Snapdragon 400 or 410.
  • Idol 2 Mini with a 1.2 GHz quad-core SoC with 3G.
  • Idol 2 Mini S with Snapdragon 400 or 410.
Huawei
  • Ascend Mate 7, a 6.0" 1080p "phablet" with HiSilicon Kirin 925 chip.
  • Ascend G7, 5.5" 720p mid-range with Snapdragon 410.
  • Honor 3C Play, 5.0" value with MediaTek MT6582.
HTC
  • HTC Desire 510, a low-end 4.7" (480p) model with Snapdragon 410 (quad-core Cortex-A53).
  • HTC Desire 820, 5.5" 720p screen, mainstream with Snapdragon 615 (octa-core Cortex-A53).
Lenovo
  • VIBE X2, 5.0" high-end with MediaTek MT6595M (octa-core big.LITTLE). However, this high-end chip is unlikely to be produced in significant volume, being superseded by the MT6795 (octa-core Cortex-A53) in the near future.
  • VIBE Z2, 5.5" 720p mid-range, Snapdragon 410.
LG
  • LG G3 S, 5.0" 720p mid-range with Snapdragon 400.
Motorola
  • Moto G (2014), 5.0" 720p screen, refresh of last year's Moto G, with same Snapdragon 400 SoC with 3G modem and primarily targeted at cost-sensitive markets including India and Brazil. A subsequent model with updated specs (such as better processor and 4G) is likely to be introduced later as a successor to the old Moto G that will more specifically target Western markets.
  • Moto X (2014), 5.2" 1080p screen, high-end with Snapdragon 801.
Nokia
  • Lumia 730/735, 4.7", low/mid-range with Snapdragon 400.
  • Lumia 830, 5.0" with Snapdragon 400.
Samsung
  • Note 4, 5.7" QHD (1440p) high-end with Snapdragon 805 (N910S) or Exynos 5433 (N910C).
  • Note Edge, 5.6" curved screen (high-end) with Snapdragon 805.
Sony
  • Sony Xperia Z3, 5.2" 1080p high-end with Snapdragon 801.
  • Sony Xperia Z3 Compact, 4.6" 720p high-end with Snapdragon 801.
  • Sony Xperia E3, 4.5" low/mid-range with Snapdragon 400.
ZTE
  • Blade Vec 3G, 5.0" low-end with MT6582.
  • Blade Vec 4G, 5.0" low-end with Snapdragon 400.

Sources: GSMArena, Pocketnow, Alcatel, DigiTimes

Updated September 9, 2014.