Zhaoxin, a China-based CPU developer with an x86 license, has yet to formally introduce its next-generation KaiSheng KH-40000 processors with up to 16 cores for datacenters. Nevertheless, it has already begun to submit benchmark results to the Geekbench 5 database. The new CPUs show noticeable microarchitecture-related performance improvements over their predecessors but can barely catch up with modern CPUs from AMD and Intel.
Mysterious CPUs
Zhaoxin, co-owned by VIA Technologies and the Shanghai Municipal Government, has been steadily leveraging microarchitectures designed by VIA (or rather by Centaur) since the mid-2010s, and its upcoming KaiSheng KH-40000 series processors for datacenters are based on the CentaurHauls microarchitecture that some claim resembles Intel’s Haswell microarchitecture from 2013.
The KaiSheng KH-40000/16 and KaiSheng KH-40000/12 CPUs run at 2.20 GHz, have 16 and 12 cores, and are equipped with 32MB and 24MB of L3 cache, respectively. In addition, the 16-core model seems to feature simultaneous multithreading (SMT) technology, so it can process up to 32 threads concurrently, assuming that Geekbench 5 reads its capabilities correctly. Based on the specifications of Zhaoxin’s KaiSheng KH-40000/16 and KaiSheng KH-40000/12 published in the Geekbench 5 database, these CPUs look similar to Centaur’s never-released CHA processor unearthed earlier this year.
There are variations although: CHA had eight cores, didn’t assist SMT, and was architected for TSMC’s N16 node, whereas KaiSheng KH-40000 has as much as 16 cores, appears to characteristic SMT, and is believed to be designed for TSMC’s N7 fabrication course of. Moreover, processor IDs of each KH-40000 CPUs learn ‘CentaurHauls Household 7 Mannequin 11 Stepping 3’ (1, 2), whereas the processor ID of Centaur’s CHA is ‘CentaurHauls Household 6 Mannequin 71 Stepping 2,’ so the CPUs in query use totally different silicon. Â
What is odd, though, is that both CHA and KH-40000 operate at 2.20 GHz, so if we did not know the CPU IDs, we could speculate that the KH-40000/16 model uses two eight-core CHA dies produced on TSMC’s N16 node and glued together using an interconnect.
Mediocre Performance
For Zhaoxin, CentaurHauls should be a major microarchitectural advancement over its LuJiazui microarchitecture from 2019. Furthermore, the increased core count should make KaiSheng KH-40000 CPUs more competitive in the server market. So, let us take a look at the performance numbers submitted by the CPU developer.
| | Zhaoxin KH-40000/16 | Zhaoxin KH-40000/12 | Centaur CHA | Zhaoxin KX-U6780A | AMD FX-8350 | Core i9-12900K | Ryzen 9 5950X |
|---|---|---|---|---|---|---|---|
| Basic specs | 16C/32T, 2.20 GHz, 32MB L3 | 12C/12T, 2.20 GHz, 24MB L3 | 8C/8T, 2.20 GHz, 16MB L3 | 8C/8T, 2.70 GHz, 8MB L3 | 4C/8T | 8P + 8E, 3.20 ~ 5.10 GHz, 30MB | 16C, 3.40 ~ 5.0 GHz, 64MB |
| Microarchitecture | CentaurHauls | CentaurHauls | CentaurHauls | LuJiaZui | Bulldozer/Piledriver | Golden Cove + Gracemont | Zen 3 |
| OS | UnionTech OS DT 20 Pro | Windows 10 Pro | Windows 10 Pro | Windows 10 Pro | ? | Windows 11 Pro | Windows 10 Pro |
| Single-Core Integer | 450 | 439 | 476 | 366 | 670 | 1830 | 1435 |
| Single-Core Float | 559 | 538 | 541 | 318 | 607 | 2189 | 1881 |
| Single-Core Crypto | 1039 | 934 | 782 | 583 | 1040 | 6064 | 4089 |
| Single-Core Score | 512 | 493 | 511 | 362 | 670 | 2149 | 1702 |
| Multi-Core Integer | 9293 | 3452 | 3307 | 2364 | 3570 | 20631 | 16695 |
| Multi-Core Float | 11875 | 4176 | 3723 | 2089 | 3563 | 23205 | 18695 |
| Multi-Core Crypto | 5233 | 2119 | 4825 | 3390 | 2431 | 17413 | 8145 |
| Multi-Core Score | 9915 | 3603 | 3508 | 2333 | 3511 | 21242 | 16868 |
| Link | https://browser.geekbench.com/v5/cpu/15706425 | https://browser.geekbench.com/v5/cpu/16875254 | https://browser.geekbench.com/v5/cpu/12878360 | https://browser.geekbench.com/v5/cpu/12878360 | https://browser.geekbench.com/v5/cpu/15900997 | https://browser.geekbench.com/v5/cpu/15911328 | https://browser.geekbench.com/v5/cpu/9506672 |
When it comes to single-threaded performance, Zhaoxin’s (or Centaur’s) CentaurHauls microarchitecture significantly outpaces the company’s previous-generation LuJiazui microarchitecture in both integer (by 22%) and floating point (by 75%) workloads, even though the new CPU operates at 2.20 GHz whereas the older one runs at 2.70 GHz. The FPU performance uplift looks quite dramatic, but one should keep in mind that we are dealing with a synthetic benchmark.
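The single-core comparison above can be checked with a few lines of Python. This is only a sketch: the scores and clocks are copied from the table, the helper names are made up for illustration, and the per-GHz figure assumes each CPU actually ran at its listed frequency during the test, which Geekbench 5 results do not guarantee.

```python
# Single-core Geekbench 5 subscores and listed clocks, copied from the table above.
CPUS = {
    "KH-40000/16": {"integer": 450, "float": 559, "ghz": 2.20},
    "KX-U6780A": {"integer": 366, "float": 318, "ghz": 2.70},
}


def uplift_percent(new: float, old: float) -> float:
    """Relative gain of `new` over `old`, in percent."""
    return (new / old - 1.0) * 100.0


def per_ghz(cpu: dict, test: str) -> float:
    """Score divided by the listed clock (assumes the CPU really ran at that clock)."""
    return cpu[test] / cpu["ghz"]


new, old = CPUS["KH-40000/16"], CPUS["KX-U6780A"]
for test in ("integer", "float"):
    raw = uplift_percent(new[test], old[test])
    normalized = uplift_percent(per_ghz(new, test), per_ghz(old, test))
    print(f"{test}: {raw:.1f}% raw uplift, {normalized:.1f}% per GHz")
```

The raw uplift works out to roughly 23% for integer and 76% for floating point, so the percentages quoted in the text appear to be rounded down. Normalizing by the listed clocks widens the gap further, since the newer chip reaches its scores at a lower frequency.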
While the new microarchitecture is significantly better than the old one, KaiSheng KH-40000 CPUs with 12 and 16 cores cannot compete against any modern CPUs. Moreover, their single-threaded performance is even lower than that of AMD’s ill-fated Bulldozer/Piledriver architecture from mid-2012.
As for multi-threaded performance, we see a rather odd advantage that Zhaoxin’s 16-core KaiSheng KH-40000/16 with SMT has over the 12-core KaiSheng KH-40000/12. While, in theory, the 16C/32T chip can process 2.66 times more threads than its 12C/12T sibling (and we have never seen this kind of SMT efficiency from any well-known CPU microarchitecture to date), its actual performance advantage is higher than even the hypothetical 2.66X (2.69X in integer, 2.84X in float). Since one CPU has only four more cores than its rival, yet its performance is almost three times higher, we believe that factors beyond the number of cores are affecting performance.
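The scaling argument can be reproduced from the table with a short sketch. The thread counts, core counts, and scores are taken from the table above; the variable names are illustrative only. Even perfect SMT scaling would cap the gain at the thread-count ratio, so anything above that points to some other factor.

```python
# Thread counts, core counts, and multi-core subscores from the table above.
THREADS = {"KH-40000/16": 32, "KH-40000/12": 12}
CORES = {"KH-40000/16": 16, "KH-40000/12": 12}
MULTI_CORE = {
    "KH-40000/16": {"integer": 9293, "float": 11875},
    "KH-40000/12": {"integer": 3452, "float": 4176},
}

big, small = "KH-40000/16", "KH-40000/12"

# Upper bounds on scaling if every extra core or thread contributed fully.
core_ratio = CORES[big] / CORES[small]
thread_ratio = THREADS[big] / THREADS[small]

# Scaling actually observed in the submitted results.
observed = {
    test: MULTI_CORE[big][test] / MULTI_CORE[small][test]
    for test in ("integer", "float")
}

print(f"core ratio: {core_ratio:.2f}x, thread ratio: {thread_ratio:.2f}x")
for test, value in observed.items():
    verdict = "above" if value > thread_ratio else "at or below"
    print(f"{test}: {value:.2f}x, {verdict} the thread-count ratio")
```

Both observed ratios exceed the thread-count ratio, which is why the results are better explained by something outside the silicon, such as the operating system, than by SMT alone.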
Keeping in mind that the Windows 10/11 scheduler does not always work optimally with unfamiliar multi-core CPUs, we believe that the 12-core KaiSheng KH-40000/12 results obtained on Windows 10 Pro do not reflect its true potential.
Yet, even under Windows 10 Pro and without SMT, CentaurHauls is significantly faster than LuJiazui in multi-threaded integer (by 40%) and multi-threaded floating point (by 78%) workloads. The problem is that the absolute performance numbers demonstrated by both KaiSheng KH-40000 and Centaur CHA CPUs are poor by today’s standards.
Interestingly, the multi-threaded performance numbers demonstrated by Zhaoxin’s 12-core KaiSheng KH-40000/12 under Windows and without SMT are comparable to those of AMD’s FX-8350 processor (four modules, eight threads), which the company once marketed as an eight-core CPU. We can hardly call the performance of a decade-old processor competitive by today’s standards, at least in Geekbench 5, which is not the best benchmark.
Some Thoughts
While 12-core and 16-core configurations seem fine for desktops and entry-level servers, 12 and 16 cores from Zhaoxin do not deliver performance comparable to that of 12-core or 16-core processors from AMD and Intel. Under Windows and judging only by Geekbench 5 scores, Zhaoxin appears to be a decade behind AMD and Intel in terms of performance. Even if Zhaoxin enables SMT on its upcoming CentaurHauls-based CPUs (for client and server applications) and Windows ‘learns’ how to properly use these cores, KaiSheng KH-40000/16 will still be two times slower than 2021 processors from AMD and Intel with the same core count.
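For a rough sense of that gap, the overall multi-core results from the table can be compared directly. This is a minimal sketch using only the submitted Geekbench 5 numbers; the function name is hypothetical, and a single synthetic benchmark says little about real server workloads.

```python
# Overall multi-core Geekbench 5 results copied from the table above.
MULTI_CORE_SCORE = {
    "KH-40000/16": 9915,
    "Core i9-12900K": 21242,
    "Ryzen 9 5950X": 16868,
}


def times_faster(rival: str, baseline: str = "KH-40000/16") -> float:
    """How many times higher the rival's multi-core result is than the baseline's."""
    return MULTI_CORE_SCORE[rival] / MULTI_CORE_SCORE[baseline]


for rival in ("Core i9-12900K", "Ryzen 9 5950X"):
    print(f"{rival}: {times_faster(rival):.2f}x the KH-40000/16 result")
```

By these numbers the 16-core rivals land between roughly 1.7 and 2.1 times the KH-40000/16 result, so "two times slower" is an approximation that fits the Intel part more closely than the AMD one.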