Tag Archives: Intel

Intel Prepares GCC Compiler Support For BFloat16


Intel developers continue prepping Linux support for the next-generation Intel Xeon “Cooper Lake” processors, particularly around their addition of the new BFloat16 instructions.

BFloat16 is a new floating-point format optimized for machine learning workloads. Besides being found in next-gen Cooper Lake processors, BF16 is also found within Intel’s Nervana neural network processors and FPGAs.

Earlier this month Intel developers added BFloat16 support to GNU Gas (the GNU assembler), and they have now sent out their latest patch enabling BFloat16 support within the GNU Compiler Collection (GCC).

The patch enables the compiler-side work around the new instructions for BFloat16: VCVTNE2PS2BF16, VCVTNEPS2BF16, and VDPBF16PS. These AVX512BF16 instructions respectively convert two packed single-precision vectors into one packed BF16 vector, convert one packed single-precision vector into packed BF16, and perform a dot product of BF16 pairs accumulated into packed single precision.
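To make the conversion and dot-product behavior more concrete, here is a rough software sketch in Python. It assumes BF16 is simply the upper 16 bits of an IEEE-754 single (1 sign, 8 exponent, 7 mantissa bits) and that the conversion rounds to nearest-even, as the "NE" in the mnemonics suggests. The function names are hypothetical, only one 32-bit lane of the dot product is modeled, and hardware details such as denormal handling and intermediate rounding are not reproduced.

```python
import struct


def f32_bits(x):
    """Bit pattern of x after rounding it to IEEE-754 single precision."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_f32(bits):
    """Inverse of f32_bits: interpret a 32-bit pattern as a single."""
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFFFFF))[0]


def f32_to_bf16(x):
    """Convert to a 16-bit BF16 pattern using round-to-nearest-even."""
    bits = f32_bits(x)
    if (bits & 0x7FFFFFFF) > 0x7F800000:
        # NaN: truncate and force a mantissa bit so it stays a NaN.
        return (bits >> 16) | 0x0040
    # Adding 0x7FFF plus the lowest kept bit rounds ties to even.
    bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + bias) >> 16) & 0xFFFF


def bf16_to_f32(h):
    """Widen a BF16 pattern back to a float by zero-filling the low bits."""
    return bits_f32((h & 0xFFFF) << 16)


def dpbf16ps_lane(acc, a_pair, b_pair):
    """Model one lane: acc + a0*b0 + a1*b1, returned as a single.

    Illustrative only: the sum is formed in double precision and rounded
    once at the end, which is not how the hardware rounds.
    """
    total = acc
    for a, b in zip(a_pair, b_pair):
        total += bf16_to_f32(a) * bf16_to_f32(b)
    return bits_f32(f32_bits(total))


if __name__ == "__main__":
    h = f32_to_bf16(3.14159)
    print(hex(h), bf16_to_f32(h))
```

The round trip shows the trade-off of the format: the exponent range of a single is kept, but only about two to three decimal digits of precision survive.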

The patch is now out for review. We’ll see if it manages to slide into trunk for GCC 9, with GCC 9.1’s release being imminent, or whether it will have to wait until next year’s GCC 10 compiler release.

Intel Xeon Cascade Lake Running Even Faster With Clear Linux – Six Linux Operating Systems Benchmarked

Following the initial launch benchmarks earlier this week of the Intel 2nd Gen Xeon Scalable “Cascade Lake” 8280 processors, I proceeded to run some benchmarks of different Linux distributions (operating systems) to ensure the Linux support panned out across the major platforms and, while at it, to compare the performance between these different flavors of GNU/Linux. This powerful Gigabyte server sports dual Xeon Platinum 8280 processors for a combined 56 cores / 112 threads, 12 x 32GB DDR4-2933MHz memory, and Samsung NVMe storage. Ubuntu 18.04.2 LTS, Ubuntu 19.04 Beta, Fedora 29, CentOS 7, Debian 9.8, and Clear Linux were tested on it to look at the performance of the brand new Cascade Lake.

Benchmarks looking at the performance (and compatibility) with the BSDs (namely FreeBSD and DragonFlyBSD) are being worked on for next week as well as seeing how the performance compares to Windows Server 2019, but for your viewing pleasure this Friday are some cross-Linux distribution benchmarks from these six operating systems tested this week. The Gigabyte S451-3R0 server platform has been the basis for our Cascade Lake testing thus far with this 4U chassis providing plenty of ventilation while sporting thirty-six SATA/SAS drive bays and dual 1200 Watt 80 PLUS power supplies. With all of the Linux distributions tested thus far, everything has “just worked” fine without any installation woes or other troubles.

Then again, for many years now Intel hardware, and especially their server/workstation platforms, has been greeted by great launch-day Linux support. This is especially the case with Cascade Lake supporting existing Xeon Scalable motherboards, etc. One exception is on the compiler side: if you are looking forward to “-march=cascadelake” / AVX-512 VNNI targeting, that support only arrives with the soon-to-be-released GCC 9.1. Also, for Intel Optane DC Memory / NVDIMMs, the very latest kernels continue to evolve that support, but at the moment we don’t have any of the new Optane persistent memory modules to verify it. But overall, Linux support shouldn’t be an issue for Cascade Lake with any recent major Linux distribution release.
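For those wanting to confirm that a given system actually exposes AVX-512 VNNI before bothering with compiler targeting, the kernel's CPU flags can be checked. Below is a minimal sketch, assuming the Linux flag names "avx512_vnni" and "avx512_bf16" as they appear in /proc/cpuinfo; the helper name cpu_flags is made up for illustration.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags from /proc/cpuinfo-style text.

    Only the first "flags" line is used, on the assumption that all
    cores in the system report the same feature set.
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            flags = cpu_flags(handle.read())
    except OSError:
        flags = set()  # not Linux, or /proc is unavailable
    for name in ("avx512f", "avx512_vnni", "avx512_bf16"):
        print(name, "yes" if name in flags else "no")
```

On a Cascade Lake system one would expect "avx512_vnni" to be reported but not "avx512_bf16", since BFloat16 is slated for Cooper Lake.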

The operating systems benchmarked using clean installs of each on this powerful Xeon Platinum server included:

Ubuntu 18.04.2 LTS – The latest point release of the Ubuntu Bionic Beaver has the Linux 4.18 kernel, GCC 7.3, EXT4 file-system.

Ubuntu 19.04 Beta – This next release of Ubuntu due out later this month is on the Linux 5.0 kernel, GCC 8.2.0, EXT4 file-system.

Fedora Workstation 29 – Fedora 29 with all available updates has the Linux 5.0 kernel, GCC 8.3.1, EXT4 file-system.

CentOS 7 – The CentOS 7 / EL7 installation with current updates is on its patched Linux 3.10 kernel, the very old GCC 4.8.5 compiler, and XFS by default.

Debian 9.8 – While Debian 9 will be replaced by Debian 10 later this year, this current Debian release has the Linux 4.9 kernel, GCC 6.3, and EXT4.

Clear Linux 28660 – Intel’s rolling-release Linux distribution has the Linux 5.0 kernel, GCC 8.3.1 by default, and EXT4.

All of these Linux distributions were tested out-of-the-box with the same hardware: 2 x Intel Xeon Platinum 8280, Gigabyte S451-3R0, 384GB of RAM, and Samsung 970 PRO 512GB NVMe SSD storage.

All of these Linux benchmarks of Intel Cascade Lake were carried out using the open-source Phoronix Test Suite.

Intel Xeon Scalable “Cascade Lake” Processors Launch – Initial Xeon Platinum 8280 Linux Benchmarks Review

Intel’s 2nd Gen Xeon Scalable Cascade Lake processors are officially launching today! Last month we were briefed out at one of Intel’s campuses in Oregon and have been testing the new Xeon Platinum 8280 processors in recent days. In this article is a look at what’s new with Cascade Lake as well as our preliminary Ubuntu Linux performance figures for the Xeon Platinum 8280 processors.

With Cascade Lake there is now a higher memory frequency (DDR4-2933 rather than DDR4-2666) and an overall capacity of up to 4.5TB of system memory per processor, Intel AVX-512 VNNI / DL BOOST for helping AI workloads and related fields, support for Intel Optane DC Persistent Memory, mitigations for Spectre vulnerabilities, and increased frequency / power efficiency compared to the previous Skylake Xeon Scalable processors.

Intel’s Xeon Scalable Platinum 8200 series offers up to 28 cores / 56 threads while their Platinum 9200 series offers up to 56 cores / 112 threads per package but hitting a 400 Watt TDP. Like the original Xeon Scalable line-up, the CPUs support six memory channels per processor while the native support for DDR4-2933 has the ability of providing a nice speed bump.
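To put a number on that speed bump, here is a back-of-the-envelope calculation of theoretical peak memory bandwidth per socket. It assumes the standard 64-bit (8-byte) DDR4 data channel and uses the nominal transfer rates; the function name is hypothetical, and real-world sustained bandwidth will be lower.

```python
def peak_bandwidth_gbs(mega_transfers_per_s, channels=6, bytes_per_transfer=8):
    """Theoretical peak bandwidth in GB/s (decimal) for one socket.

    Assumes every channel is populated and runs at the nominal rate.
    """
    bytes_per_s = mega_transfers_per_s * 1e6 * bytes_per_transfer * channels
    return bytes_per_s / 1e9


if __name__ == "__main__":
    old = peak_bandwidth_gbs(2666)  # first-gen Xeon Scalable
    new = peak_bandwidth_gbs(2933)  # Cascade Lake
    print(f"DDR4-2666: {old:.1f} GB/s")
    print(f"DDR4-2933: {new:.1f} GB/s")
    print(f"uplift:    {(new / old - 1) * 100:.1f}%")
```

With six channels this works out to roughly 128 GB/s versus roughly 141 GB/s per socket, or about a 10% higher theoretical ceiling.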

The side-channel mitigations in-hardware with Cascade Lake are for Spectre Variants 2, 3, 3a, 4, and L1TF/Foreshadow.

Besides the hardware Spectre/Foreshadow mitigations helping performance, AVX-512 VNNI / DL BOOST has the potential to be a big boost for deep learning workloads by yielding much better throughput and efficiency.

Intel Optane DC Persistent Memory is the long-awaited line of non-volatile DIMMs, allowing 128GB to 512GB of persistent storage per module. The modules can function in a volatile memory mode for delivering greater RAM capacities at lower cost (and slightly lower performance than traditional RAM) or in the “app direct” mode, which exposes the Optane persistent memory as storage.

A look at all the different Cascade Lake SKUs…

We were supplied by Intel with two Xeon Platinum 8280 processors for our initial Cascade Lake testing. The Xeon Platinum 8280 has 28 cores / 56 threads, 2.7GHz base frequency, 4.0GHz turbo frequency, 38.5MB cache, and a 205 Watt TDP. The RCP pricing on the Xeon Platinum 8280 is $10,009 USD per processor. In comparison, the first-gen Xeon Scalable Platinum 8180 has 28 cores but a lower 2.5GHz base frequency with 3.8GHz turbo and DDR4-2666 vs. DDR4-2933 MHz memory frequency while having the same TDP and cache size.

Intel Iris Driver Gets ~5% Performance Boost With Direct3D 9 Support On Gallium Nine


The Gallium Nine state tracker providing Direct3D 9 API support for Windows games/applications running on Linux under Wine will now be a little bit faster when using Intel’s new Iris Gallium3D driver.

Simply having access to Gallium Nine is already a big advantage of the new Intel Iris driver, whereas Intel’s current i965 “classic” Mesa driver isn’t Gallium3D-based and thus doesn’t work with the state tracker. Gallium Nine has been working out well with Iris ever since the state tracker landed NIR support to complement its TGSI IR support, but now it’s going to be even faster.

With Intel’s Iris driver being thread-safe, Gallium Nine’s black-listing no longer blocks Iris/Intel from enabling Command-Stream Multi-Threading (CSMT).

According to Andre Heider’s patch, enabling CSMT helps boost the performance by about 5%. This isn’t to be confused with Wine’s CSMT feature; it is internal multi-threading within Gallium Nine to help performance and is already used by the RadeonSI and R600 Gallium3D drivers with this D3D9 state tracker.

Intel Sends Out Initial Linux Graphics Driver Support For “Elkhart Lake”


It’s busy as ever for the open-source Intel Linux graphics driver developers bringing up support for upcoming hardware like the recently published driver patches for Comet Lake, continuing to tweak the maturing Icelake “Gen 11” graphics, and also plotting the necessary re-engineering of the driver needed to bring-up Intel’s in-development “Xe” discrete graphics. And Intel developers this evening sent out their initial enablement work for Elkhart Lake.

Elkhart Lake, thankfully, isn’t yet another 14nm CPU revision nor based on the long-standing “Gen 9” graphics but is an Icelake offshoot. Elkhart Lake is the SoC successor to Gemini Lake that will be based on Icelake. Public details on Intel’s Elkhart Lake are still light, but the patches out on Wednesday confirm that it’s indeed featuring Gen 11 graphics very similar to what is found in the Icelake processors.

The initial volley of Elkhart Lake Linux support consists of nine patches to Intel’s i915 kernel DRM driver. Patches for the rest of the Intel Linux graphics stack (namely their Mesa drivers) have yet to be published but will likely be out in short order.

The code presents just some basic differences between Elkhart Lake “EHL” and Icelake “ICL”. At least for now there are just four PCI IDs for the Elkhart Lake graphics adapters: 0x4500, 0x4571, 0x4551, and 0x4541. These initial kernel bits for the Elkhart Lake SoC will likely end up being introduced into the Linux 5.2 kernel cycle this summer, still giving plenty of time for the released kernel to work its way into distributions before the SoCs debut.
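For anyone curious whether a given device would match those IDs, here is a small sketch of the kind of lookup involved. It assumes Intel's PCI vendor ID of 0x8086 and the four device IDs listed above, which could still change before the patches are merged; the helper names are hypothetical and this is not how the i915 driver itself is written.

```python
INTEL_VENDOR_ID = 0x8086

# Device IDs as listed in the initial patch series; subject to change.
EHL_DEVICE_IDS = frozenset({0x4500, 0x4571, 0x4551, 0x4541})


def parse_sysfs_id(text):
    """Parse the hex string found in sysfs 'vendor'/'device' files."""
    return int(text.strip(), 16)


def is_elkhart_lake_gpu(vendor_id, device_id):
    """True if the vendor/device pair matches a listed Elkhart Lake ID."""
    return vendor_id == INTEL_VENDOR_ID and device_id in EHL_DEVICE_IDS


if __name__ == "__main__":
    # Example values in the format sysfs uses, e.g. from
    # /sys/bus/pci/devices/<address>/vendor and .../device
    vendor = parse_sysfs_id("0x8086\n")
    device = parse_sysfs_id("0x4571\n")
    print(is_elkhart_lake_gpu(vendor, device))
```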