x299 retrospective: a bad platform bought by the wrong customers

“There are no bad products, only bad prices” is missing an important part of the picture: there are a lot of wrong customers out there. I used to hate x299, but in hindsight I would rather blame the customers who bought it for the wrong use cases.

Intel started splitting consumer from server/high end workstation CPUs with Westmere, which brought two main changes over the previous Nehalem i7:
– an integrated GPU, as a separate die on the same package, an approach to packaging that AMD later brought back with huge success with Zen 2 and that Intel then followed with tiles
– a reduction in memory channels from 3 back to 2

Since then, all consumer CPUs kept an iGPU (on die since Sandy Bridge), 2 memory channels, a 16x PCIe link from the CPU usually used for the discrete GPU, and a 4x link to the chipset; and up until 2018 the i7 topped out at 4 cores / 8 threads.

Server silicon, starting with Sandy Bridge-E, got 4 memory channels, no iGPU, core counts growing up to 28 on the Skylake-X XCC die, and, as far as I know, up to 44 PCIe lanes available straight from the CPU.

Through different generations, these chips were derivatives built on the same core architecture as the consumer CPUs, up until Skylake-X.

The main target for these chips was servers, but the same silicon was also sold to consumers, marketed as high end desktop (HEDT) CPUs on the X platforms: first x79, then x99 and finally x299.

Let’s get through the good part first: you could buy a single machine to act as a home server, run many VMs at the same time, and keep massive amounts of storage and expansion cards running straight through PCIe and SATA; video editing, and to a smaller extent even 2D graphics performance, would also benefit from the increased bandwidth.

But in reality, most customers were fooled into thinking they would get a huge leap in performance from the increased core count, while their workloads were actually single threaded and/or latency bound, as music production, gaming and everyday computing tasks are. The fact that Intel kept the 4 core limit on their mainstream desktop silicon did not help, as developers had little incentive to implement multithreading in their codebases. A famous example is the Adobe suite, which even retired some multithreaded code in the early 2010s, since the bugs and hassle were not worth it for the small user base that would actually see real improvements; that code only came back in 2021.

Now, with x79 and x99 the multicore performance was mostly useless, but at least the latency penalty was basically non-existent (limited to the longer physical distances between cores).

Then came x299, based on the Skylake core, which brought very minor improvements over the previous Haswell/Broadwell core, but with two key differences:

– a new interconnect between cores, called mesh, introduced to simplify designing different dies at different core counts, with high bandwidth but much higher, and much more variable, internal latencies compared to the old proven ring bus used in previous generations

– the new AVX-512 instruction set, which massively improved throughput for some mathematical / vector calculations ONLY IF the code was compiled to take advantage of it, while eating a lot of die space, which means longer distances for signals, resulting in higher latencies…
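
To illustrate what “compiled to take advantage of it” means in practice, here is a minimal sketch of a float sum written with AVX-512 intrinsics. This is my own example, not code from any specific application; it needs something like gcc -O2 -mavx512f to build and will only run on CPUs that actually support AVX-512:

```c
#include <immintrin.h>
#include <stddef.h>

/* Sum an array of floats 16 at a time using 512-bit registers.
   For brevity, n is assumed to be a multiple of 16. */
float sum_avx512(const float *a, size_t n) {
    __m512 acc = _mm512_setzero_ps();
    for (size_t i = 0; i < n; i += 16)
        acc = _mm512_add_ps(acc, _mm512_loadu_ps(a + i));
    /* Horizontal reduction of the 16 lanes into one scalar. */
    return _mm512_reduce_add_ps(acc);
}
```

Without intrinsics like these (or an autovectorizing build that targets AVX-512), the same reduction runs on narrower registers and sees none of the throughput gains, which is part of why so little consumer software ever benefited.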

With these huge latency regressions compared to the previous Broadwell-E x99 chips, reviewers on tech channels were quick to highlight the negative impact on many use cases, especially gaming performance.

In my video, I highlight that Intel probably chose to spend their power budget on multicore throughput while keeping the mesh clock down, further exacerbating the issue and resulting in awful realtime performance for the whole family of chips.
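
As an aside, the core-to-core latency these reviews measured is typically captured with a ping-pong test between two pinned threads. Below is a minimal sketch of the idea, assuming Linux and GCC (gcc -O2 -pthread); it is my own simplification, not any reviewer’s actual tool, and the core numbers are arbitrary:

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdio.h>
#include <time.h>

#define ROUNDS 1000000

static _Atomic int flag = 0;

static void pin_to_core(int core) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core, &set);
    pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}

/* Thread B: wait for each odd value, answer with the next even one. */
static void *pong(void *arg) {
    pin_to_core(*(int *)arg);
    for (int i = 0; i < ROUNDS; i++) {
        while (atomic_load_explicit(&flag, memory_order_acquire) != 2 * i + 1)
            ;
        atomic_store_explicit(&flag, 2 * i + 2, memory_order_release);
    }
    return NULL;
}

int main(void) {
    int core_a = 0, core_b = 1;   /* pick any two cores to compare */
    pthread_t t;
    pthread_create(&t, NULL, pong, &core_b);
    pin_to_core(core_a);

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < ROUNDS; i++) {
        atomic_store_explicit(&flag, 2 * i + 1, memory_order_release);
        while (atomic_load_explicit(&flag, memory_order_acquire) != 2 * i + 2)
            ;
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    pthread_join(t, NULL);

    double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
    /* One round trip = two one-way hops through the interconnect. */
    printf("core %d <-> core %d: %.1f ns per round trip\n",
           core_a, core_b, ns / ROUNDS);
    return 0;
}
```

Run over all core pairs, this kind of test is what exposes both the higher average and the much wider spread of the mesh compared to the ring bus.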

Meanwhile, the reborn AMD with their Zen architecture brought 8 cores to the mainstream with the Ryzen 7 1700X, and up to 16 cores with their HEDT Threadripper series, at much lower cost and power. This forced Intel to increase core counts on their own mainstream offerings with the 8700K, 9900K and 10900K.

As a result, sales for x299 were abysmal; still, many uninformed customers bought into the platform purely on the proven track record of the successful x99 before it.

The platform was actually good for a few specific use cases:
– running parallelized code that can take advantage of AVX-512, like the sketch above
– doing many DIFFERENT things at once, thanks to the high core count and memory support: virtual machines or web services, the kind of work the Xeon-branded servers built on this same silicon were meant to run in the first place
– video editing (color correction in particular) on high resolution (4K+) files without proxies, using internal NVMe RAID arrays
– graphic design on very big files, where the slightly lower single core performance compared to the mainstream platform is more than compensated by the huge 4-channel memory bandwidth
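
To make that last point concrete: big-canvas workloads behave like a STREAM-style triad, where performance tracks how many bytes per second the memory controllers can move, and that scales with the number of populated channels. Here is a minimal sketch of such a loop, my own and not a real benchmark suite, assuming GCC with -O2 -fopenmp; the 24 bytes/element accounting deliberately ignores write-allocate traffic:

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N (64 * 1024 * 1024)   /* 3 arrays of 512 MiB of doubles each */

int main(void) {
    double *a = malloc(N * sizeof *a);
    double *b = malloc(N * sizeof *b);
    double *c = malloc(N * sizeof *c);
    if (!a || !b || !c) return 1;
    for (size_t i = 0; i < N; i++) { b[i] = 1.0; c[i] = 2.0; }

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    /* STREAM triad: reads b and c, writes a -> ~24 bytes per element. */
    #pragma omp parallel for
    for (size_t i = 0; i < N; i++)
        a[i] = b[i] + 3.0 * c[i];
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double s = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("triad bandwidth: %.1f GB/s\n", 24.0 * N / s / 1e9);
    return 0;
}
```

On a loop like this, a quad-channel x299 board should report roughly twice the GB/s of a dual-channel mainstream system at the same memory speed, which is exactly the advantage these niche workloads were paying for.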