Monday, September 1, 2014

Apple's 64-bit processor cores: Strong similarities with standard ARM cores?

Cortex-A5x series announced in 2012, but Apple first-to-market with ARMv8 in 2013

ARM announced the Cortex-A57 and Cortex-A53 cores, which support the 64-bit ARMv8 instruction set, in October 2012. Several companies had already licensed the cores by then, as ARM announced at the time. However, ARM stated that chips using the cores were not expected to ship until 2014, a two-year delay that seems unusually long. Normally, early adopters of new ARM cores would be expected to come to market sooner, initially with specific, lower-volume applications that involve less of a learning curve than high-volume production.

In 2013, Apple released the A7 chip used in the iPhone 5S. It uses Apple's 64-bit Cyclone processor cores and was the first chip compatible with ARM's 64-bit ARMv8 instruction set architecture. That Apple was the first to produce an ARMv8-based chip in volume, a year ahead of anyone else, was in itself a bit unusual. Whatever the size of Apple's investment in its internal processor design division, a major processor IP company like ARM, with many years of experience, would normally be expected to be first to market with a core for a new architecture that it designed itself, through early adopters that license the cores from ARM for lower-volume applications. For a new standard processor architecture to make its debut in a very high-volume consumer electronics platform is very unusual.

Cyclone has performance characteristics similar to Cortex-A57

Looking at the Cyclone cores inside the Apple A7 chip, strong similarities with ARM's Cortex-A57 are readily apparent. Although Apple as a matter of policy has disclosed little about its processor core beyond its use in a dual-core ARMv8-compatible CPU configuration, both cores are in a similar ballpark with respect to performance per MHz, performance per watt and die size: both are relatively large, high-performance cores that emphasize performance over power efficiency. Reported details of Cyclone's internal architecture (such as the pipeline type, namely out-of-order with speculative issue, and the pipeline depth) are roughly similar to those of the Cortex-A57. The Apple A7 has also been reported to probably use ARM's TrustZone security technology, which is closely associated with ARM's Cortex processor cores.

Given the above, one could ask whether Cyclone is in fact based on Cortex-A57 or on a large part of its design. That Apple would develop its 64-bit processor core in cooperation with ARM, drawing on ARM's experience developing the Cortex-A5x cores, is entirely logical. It would save Apple much of the tremendous long-term investment required for a completely new architecture and would reduce risk, and I think it is likely that this is actually the case.

Is Cyclone in fact a Cortex-A57?

One could go further and ask whether Cyclone *is* in fact a Cortex-A57 or an early variant of it. On the available evidence, that is certainly a possibility, and it would have given Apple a clear advantage in reaching the time-to-market it achieved with the A7. As part of such an arrangement, Apple may have signed a limited exclusivity agreement with ARM for the Cortex-A57 core, leading to the delayed introduction of ARM's Cortex-A5x series cores by other chip companies.

ARM's webpage on its POP core optimization platform contains the following description: "ARM currently offers the ARM® Cortex™-A57 POP IP products at popular foundries for 28nm geometries. The ARM Cortex-A57 POP technology supports single core, dual core or quad core configurations.  The Cortex-A57 processor is often used as a stand alone multi-core solution for high performance mobile applications." Although cryptic, this description, and especially its last sentence, seems to refer to the use of the Cortex-A57 in Apple's A7 SoC. The A7 has been the only high-performance ARMv8 implementation available in volume, and it matches the description: manufactured at 28nm, in a dual-core configuration, targeting high-performance mobile applications.

There simply have not been any other Cortex-A57 implementations available in the market, for mobile or indeed any other application, until very recently, and no one else has even announced a stand-alone Cortex-A57-only solution for mobile applications. ARM's statement would therefore hardly make sense if the Apple A7 did not actually contain processor cores (Cyclone) that can be described as a Cortex-A57.

Comparison of architectural details between Cyclone and ARM Cortex-A57


The table above lists the known or presumed architectural details of the Cyclone cores in the Apple A7 and A8 SoCs and the ARM Cortex-A57 CPU core. None of the information for Cyclone is based on official specifications; most of the details resulted from testing and informed guesses, primarily by AnandTech. The details about Cortex-A57 are based on information from ARM.

Based on known details, no definite conclusions can be drawn about whether or not Cyclone is very similar or even identical to Cortex-A57. Although the pipeline type is similar (out-of-order, speculative issue), typical of high-performance CPU cores, instruction bandwidth metrics within the pipeline are not really comparable because they seem to be reported in different ways.

The IPC (instructions per cycle), as represented by DMIPS per MHz, is certainly in the same ballpark. However, DMIPS information for Cyclone is hard to come by; the listed figure is based on materials originally from NVIDIA that showed Cyclone with a DMIPS rating 1.3x that of the Cortex-A15r3 cores inside the Tegra K1 SoC, which have a rating of at least 3.5 DMIPS/MHz.
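The estimate described above is a single multiplication, sketched here in Python. Both inputs are the figures quoted in this post, and treating them as DMIPS per MHz is an assumption; the function and constant names are made up for illustration. Since the Cortex-A15 figure is a lower bound, so is the result.

```python
# Rough estimate of Cyclone's DMIPS/MHz from the two figures quoted in the text.
# Neither input is an official Apple specification.
CORTEX_A15_DMIPS_PER_MHZ = 3.5  # "at least 3.5" in the text, so a lower bound
CYCLONE_RELATIVE_TO_A15 = 1.3   # ratio taken from the NVIDIA materials


def estimate_dmips_per_mhz(baseline: float, ratio: float) -> float:
    """Scale a baseline DMIPS/MHz rating by a relative performance ratio."""
    return baseline * ratio


cyclone_estimate = estimate_dmips_per_mhz(
    CORTEX_A15_DMIPS_PER_MHZ, CYCLONE_RELATIVE_TO_A15
)
print(f"Estimated Cyclone rating: at least {cyclone_estimate:.2f} DMIPS/MHz")
```

With these inputs the estimate comes out at about 4.55 DMIPS/MHz, which is only as reliable as the two inputs behind it.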

The L1 cache size, in particular the L1 data cache size of 64 KB as reported for Cyclone, is different from the fixed 32 KB size per core for Cortex-A57. If there really is a difference, that would imply that Cyclone is at least not identical to Cortex-A57. However, it is not entirely clear whether the reported cache size for Cyclone is the actual size for a single core or the combined L1 data cache size of the two processor cores in the Apple SoCs.

Major revision 0 (r0p0 and r0p1) of Cortex-A57 remains confidential

A further hint that Cortex-A57 had a production history before its recent official introduction in chips such as Samsung's Exynos 5433 is that the Cortex-A57 core in the Exynos 5433 has already evolved significantly from the initially qualified versions. Its revision level (r1p0), as shown in early Geekbench results (which report the major revision as "variant" and the minor revision as "revision"), is already at major revision 1. In comparison, cores such as Cortex-A7 that have been used for a while still show a major revision of 0, with progression in major revision level mainly evident in cores with a long history such as Cortex-A9 and Cortex-A15 (both of which have progressed to major revision level 3). If Cyclone is in fact a Cortex-A57 with major revision level 0 (probably r0p0), perhaps the updated core in the Apple A8 reflects major revision level 1, or the first minor revision of revision 0, listed as r0p1 on ARM's website.
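The "variant" and "revision" labels appear to correspond to two fields of the ARM Main ID Register (MIDR): the variant in bits [23:20] gives the major revision (the N in rNpM), and the revision in bits [3:0] gives the minor revision (the M). The Python sketch below decodes these fields. The helper names are invented for illustration, and the example value was constructed from the field layout for a Cortex-A57 at r1p0; it was not read from an actual Exynos 5433.

```python
from typing import NamedTuple


class Midr(NamedTuple):
    implementer: int  # bits [31:24], 0x41 ('A') for ARM Ltd.
    variant: int      # bits [23:20], the major revision (N in rNpM)
    part: int         # bits [15:4], the primary part number
    revision: int     # bits [3:0], the minor revision (M in rNpM)


def decode_midr(midr: int) -> Midr:
    """Split a 32-bit MIDR value into the fields relevant here."""
    return Midr(
        implementer=(midr >> 24) & 0xFF,
        variant=(midr >> 20) & 0xF,
        part=(midr >> 4) & 0xFFF,
        revision=midr & 0xF,
    )


def revision_string(midr: int) -> str:
    """Format the variant and revision fields in ARM's rNpM notation."""
    fields = decode_midr(midr)
    return f"r{fields.variant}p{fields.revision}"


# Illustrative value only: implementer 0x41, variant 1, part 0xD07, revision 0.
example_midr = 0x411FD070
print(hex(decode_midr(example_midr).part), revision_string(example_midr))
```

A core still at major revision 0 would have 0 in the variant field, so a first patch release would read r0p1.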

In fact, ARM's website describes document versions A and B of the ARM® Cortex®-A57 MPCore Processor Technical Reference Manual, which reflect r0p0 and r0p1 (major revision 0), as confidential in the Release Information section. The subsequent versions C to G, which reflect updates for revisions r1p0 to r1p3 of Cortex-A57, are marked non-confidential, like most other ARM Technical Reference Manuals. This seems to imply a special status for major revision 0 (r0) of Cortex-A57.

Apple using Cortex-A5x-based processors makes sense in terms of risk management

Apple's use of an ARM core such as Cortex-A57 would be associated with a tremendous reduction of investment risk, because a new processor core designed from the ground up over many years requires a very large investment and specific expertise, and is inherently associated with risks such as significant delays, failure to meet performance targets when actually implemented in silicon, and other factors that are difficult to foresee. In Apple's prior planning, a planned move to a 64-bit processor architecture in 2013 would be associated with great risks and potential delays if it depended entirely on a new processor core developed internally from the ground up.

Going further, now that the Cortex-A53 core is emerging as a very efficient CPU solution even for higher-end platforms (as a many-core configuration of medium-performance but very power-efficient and cost-effective cores), it would not surprise me if Apple actually uses this core, or a core with similar characteristics, in upcoming products. Although the Apple A8, which has already ramped to high-volume production for new Apple products, has been established to contain processor cores similar to Cyclone, a move to a design that is more efficient in terms of cost and power would make sense. This could involve a move to a big.LITTLE configuration containing both Cortex-A57 and Cortex-A53 cores, similar to Samsung's recent Exynos 5433, or perhaps even a many-core Cortex-A53 design, an approach that is gaining strong traction in upcoming SoCs from other companies.

Sources: ARM (Cortex-A50 series announcement), ARM (POP page), Wikipedia (Apple system on a chip), AnandTech (Apple's Cyclone Microarchitecture Detailed), Wikipedia (List of ARM microarchitectures), AnandTech (The iPhone 6 Review)

Updated October 5, 2014 (Add paragraph about confidential nature of r0 of Cortex-A57; add section comparing architectural specifications).
Updated October 6, 2014.
