Tag Archives: Benchmarks

Fedora 30 Wayland vs. X.Org Graphics Benchmarks On GNOME Shell


FEDORA --

In the run-up to the Ubuntu 19.04 release I ran various gaming/graphics benchmarks looking at different desktops and X.Org vs. Wayland sessions. Check that article out if you are interested in the broader situation; this post is just some complementary data I gathered on Fedora Workstation 30 when looking at graphics performance under GNOME Shell’s X.Org and Wayland sessions.

From the Threadripper 2990WX box with Radeon RX Vega 56, I compared the performance of various graphics/gaming tests under (X)Wayland to that of a pure X.Org session.

Since the versions of the key software components are similar to those in Ubuntu 19.04, which I tested just a few weeks ago, these are just some quick weekend graphics tests for reference.

As expected at this point, the Wayland/XWayland performance is largely comparable to a traditional GNOME X.Org session. See all of the benchmark results via this OpenBenchmarking.org result file.


NVIDIA GeForce GTX 1650 Linux Gaming Performance & Benchmarks Review


This week NVIDIA introduced the $149 USD Turing-powered GTX 1650 graphics card. On launch day I picked up the ASUS GeForce GTX 1650 4GB Dual-Fan Edition (Dual-GTX1650-O4G) graphics card for Linux testing, and the initial GTX 1650 Linux performance benchmarks are now out, run under Ubuntu and compared against an assortment of lower-end and older AMD Radeon and NVIDIA GeForce graphics cards.

For $149+ USD, the GeForce GTX 1650 features 896 CUDA cores, a 1485MHz base clock with a 1665MHz boost clock, 4GB of GDDR5 video memory, and Volta-based NVENC video capabilities (not the newer Turing NVENC, but still good enough especially compared to older generations of NVIDIA GPUs). It has just a 75 Watt TDP, meaning no external PCI Express power connector is required.

In the case of the ASUS Dual-GTX1650-O4G, I was able to acquire it on launch day for $160 USD though there were other models indeed hitting the $149 price point. This particular ASUS SKU does use the same 1485MHz base clock but its GPU boost clock can reach 1725MHz compared to the 1665MHz reference clock. There is also an ASUS GPU Boost Clock mode under Windows to reach 1755MHz. No manual overclocking was attempted with this graphics card since you can read about GPU overclocking on plenty of other websites while we focus on the Linux support and performance aspects.

The ASUS GeForce GTX 1650 Dual-Fan Edition features outputs for DVI-D, HDMI 2.0b, and DisplayPort 1.4. The GTX 1650 does support driving three displays simultaneously. This ASUS graphics card with two fans is a standard dual-slot form factor and the card measures in at 20.4 x 11.5 x 3.7 cm.

This GTX 1650 graphics card was working fine under Linux in conjunction with the new NVIDIA 430.09 beta Linux driver. The initial round of tests was from Ubuntu 19.04 x86_64 with the Linux 5.0 kernel. No problems were encountered in the time spent thus far benchmarking a variety of OpenGL and Vulkan Linux games (including Steam Play / DXVK titles) and some OpenCL/CUDA compute workloads.


Intel Xeon Scalable “Cascade Lake” Processors Launch – Initial Xeon Platinum 8280 Linux Benchmarks Review


Intel’s 2nd Gen Xeon Scalable Cascade Lake processors are officially launching today! Last month we were briefed at one of Intel’s campuses in Oregon and have been testing the new Xeon Platinum 8280 processors in recent days. This article looks at what’s new with Cascade Lake as well as our preliminary Ubuntu Linux performance figures for the Xeon Platinum 8280 processors.

With Cascade Lake there is now a higher memory frequency (DDR4-2933 rather than DDR4-2666) and an overall capacity of up to 4.5TB of system memory per processor, Intel AVX-512 VNNI / DL BOOST for helping AI workloads and related fields, support for Intel Optane DC Persistent Memory, mitigations for Spectre vulnerabilities, and increased frequency / power efficiency compared to the previous Skylake Xeon Scalable processors.

Intel’s Xeon Scalable Platinum 8200 series offers up to 28 cores / 56 threads, while the Platinum 9200 series offers up to 56 cores / 112 threads per package but hits a 400 Watt TDP. Like the original Xeon Scalable line-up, the CPUs support six memory channels per processor, and the native support for DDR4-2933 can provide a nice speed bump.
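To put the memory speed bump into numbers, here is a rough sketch of the theoretical peak bandwidth per socket for six channels of DDR4-2666 versus DDR4-2933. It assumes the standard 64-bit (8-byte) DDR4 channel width and ignores ECC bits and real-world efficiency, so these are ceilings and not measured results.

```python
# Theoretical peak DRAM bandwidth per socket:
#   transfers per second * bytes per transfer * number of channels.
BYTES_PER_TRANSFER = 8  # assumption: 64-bit data bus per DDR4 channel, ECC ignored


def peak_bandwidth_gbs(mega_transfers_per_s, channels=6):
    """Return the theoretical peak bandwidth in GB/s (decimal gigabytes)."""
    return mega_transfers_per_s * 1e6 * BYTES_PER_TRANSFER * channels / 1e9


skylake = peak_bandwidth_gbs(2666)       # first-gen Xeon Scalable, DDR4-2666
cascade_lake = peak_bandwidth_gbs(2933)  # Cascade Lake, DDR4-2933

print(f"DDR4-2666 x6 channels: {skylake:.1f} GB/s")
print(f"DDR4-2933 x6 channels: {cascade_lake:.1f} GB/s")
print(f"Theoretical uplift: {(cascade_lake / skylake - 1) * 100:.1f}%")
```

In other words, the frequency change alone is worth roughly a tenth more peak bandwidth on paper; how much of that shows up in a given workload depends on how memory-bound it is.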

The side-channel mitigations in-hardware with Cascade Lake are for Spectre Variants 2, 3, 3a, 4, and L1TF/Foreshadow.

Besides the hardware Spectre/Foreshadow mitigations helping performance, AVX-512 VNNI / DL BOOST has the potential to be a big boost for deep learning workloads by yielding much better throughput and efficiency.

Intel Optane DC Persistent Memory is the long-awaited line of non-volatile DIMMs, allowing 128GB to 512GB of persistent storage per module. The modules can function in a volatile memory mode for delivering greater RAM capacities at lower cost (and slightly lower performance than traditional RAM) or in the “app direct” mode, which exposes the Optane persistent memory as storage.

A look at all the different Cascade Lake SKUs…

We were supplied by Intel with two Xeon Platinum 8280 processors for our initial Cascade Lake testing. The Xeon Platinum 8280 has 28 cores / 56 threads, 2.7GHz base frequency, 4.0GHz turbo frequency, 38.5MB cache, and a 205 Watt TDP. The RCP pricing on the Xeon Platinum 8280 is $10,009 USD per processor. In comparison, the first-gen Xeon Scalable Platinum 8180 has 28 cores but a lower 2.5GHz base frequency with 3.8GHz turbo and DDR4-2666 rather than DDR4-2933 memory support, while having the same TDP and cache size.


Some Additional Chrome vs. Firefox Benchmarks With WebRender, 67 Beta / 68 Alpha


DESKTOP --

A few days ago I posted some Chrome vs. Firefox benchmarks using the latest Linux builds. Some readers suggested Firefox could be more competitive if forcing WebRender usage and/or moving to the latest nightly builds, so here are some complementary data sets looking at such combinations.

In addition to Firefox 66 stable and Chrome 73 stable, here are results when using Firefox 67 Beta 4 and Firefox 68 Alpha 1 as the latest at the time of testing. In addition to testing those two development channels, additional runs were done on each of them after forcing WebRender with the “MOZ_ACCELERATED=1 MOZ_WEBRENDER=1” environment variables.
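For anyone wanting to reproduce the forced-WebRender runs, a minimal sketch of launching Firefox with those two environment variables set is below. The binary name and the about:support URL are assumptions for illustration; point them at whichever Firefox build you are testing.

```python
import os
import subprocess

# The two environment variables named above for forcing WebRender.
WEBRENDER_VARS = {"MOZ_ACCELERATED": "1", "MOZ_WEBRENDER": "1"}


def webrender_env(base=None):
    """Return a copy of the environment with the WebRender variables set."""
    env = dict(os.environ if base is None else base)
    env.update(WEBRENDER_VARS)
    return env


def launch_firefox(binary="firefox", url="about:support"):
    """Start Firefox with WebRender forced.

    The binary name and URL are hypothetical defaults; about:support is
    used here only because it reports which compositor is active.
    """
    return subprocess.Popen([binary, url], env=webrender_env())
```

Calling launch_firefox() starts the browser with the variables applied to that process only, so the rest of the session keeps its default rendering path.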

Here are the benchmark results via the Phoronix Test Suite:

In the case of ARES-6, Firefox 67 Beta 4 is faster than Firefox 66 stable while Firefox 68 was slightly slower. But Firefox still wasn’t competing with Chrome in this benchmark.

In the old Octane browser benchmark, the newer releases came in a little bit slower than Firefox 66 stable.

WebXPRT is the lone test where Firefox beats out Google Chrome 73 and there wasn’t any benefit to the newer releases.

With Basemark, Firefox is still a great deal behind Chrome.

The MotionMark benchmark, being focused on graphics performance, is one where WebRender is stressed and does pay off, albeit it still doesn’t make Firefox as fast as Google Chrome.

There wasn’t much difference out of the Speedometer web browser benchmark.

Lastly is a look at the geometric mean of the benchmarks carried out. Personally, as a devout Firefox user going back to the Firebird/Phoenix days, this is sad to see, albeit I am seeing similar results between Chrome and Firefox on other Linux desktop systems too. If any premium supporters have any other web browser benchmark requests, be sure to let me know.
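For reference, the geometric mean is the n-th root of the product of n results, which keeps one benchmark with large raw numbers from dominating the summary. A small sketch of the calculation follows; the scores in it are placeholders for illustration and are not figures from these tests, and it assumes every score is higher-is-better (lower-is-better results would need inverting first).

```python
import math


def geometric_mean(values):
    """Return the geometric mean of positive numbers via the mean of their logs."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs at least one value, all positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Placeholder scores (higher is better), NOT results from this article.
example_scores = {
    "browser_a": [100.0, 40.0, 250.0],
    "browser_b": [80.0, 50.0, 200.0],
}

for name, scores in example_scores.items():
    print(f"{name}: {geometric_mean(scores):.1f}")
```

Summing logarithms rather than multiplying the raw scores avoids overflow when many large results are combined.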


GCC 9 Compiler Tuning Benchmarks On Intel Skylake AVX-512


Recently I carried out a number of GCC 9 compiler benchmarks on AMD EPYC looking at the performance benefits of “znver1” compiler tuning and varying optimization levels to see when this level of compiler tuning pays off. There was interest from that in seeing some fresh Intel Skylake-X / AVX-512 figures, so here are those benchmarks of GCC 9 with various tuning options and their impact on the performance of the generated binaries.

This round of testing was done with an Intel Core i9 7980XE as the most powerful AVX-512 HEDT CPU I have available for testing. The Core i9 7980XE was running Ubuntu 18.10 with the Linux 4.18 kernel and I had manually built the GCC 9.0.1 2019-02-17 compiler snapshot (the most recent at the time of testing) in its release/optimized form.

The CFLAGS/CXXFLAGS used for these GCC 9 compiler tuning benchmarks were:

-O0

-O1

-O2

-O2 -march=skylake-avx512

-O3

-O3 -march=x86-64

-O3 -march=skylake

-O3 -march=skylake-avx512

-O3 -march=skylake-avx512 -flto

-Ofast -march=skylake-avx512

This offers a look from no GNU Compiler Collection optimizations through all the standard optimization levels, Skylake vs. Skylake-AVX512 tuning, the benefits of link-time optimization on this new compiler, and the “-Ofast” level, which pursues performance more aggressively at the cost of potentially unsafe math.
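As a rough illustration of how such a flag matrix can be driven, the sketch below prepares one build environment per configuration by exporting the same string as both CFLAGS and CXXFLAGS, the variables most build systems read. The helper name is hypothetical and this is not how the Phoronix Test Suite does it internally; it only shows the general shape of the sweep.

```python
import os

# The ten configurations listed above.
FLAG_SETS = [
    "-O0",
    "-O1",
    "-O2",
    "-O2 -march=skylake-avx512",
    "-O3",
    "-O3 -march=x86-64",
    "-O3 -march=skylake",
    "-O3 -march=skylake-avx512",
    "-O3 -march=skylake-avx512 -flto",
    "-Ofast -march=skylake-avx512",
]


def build_env(flags, base=None):
    """Return a copy of the environment with CFLAGS and CXXFLAGS set to flags."""
    env = dict(os.environ if base is None else base)
    env["CFLAGS"] = flags
    env["CXXFLAGS"] = flags
    return env


for flags in FLAG_SETS:
    env = build_env(flags, base={})
    # A real run would pass env to the build and benchmark commands here.
    print(f"CFLAGS={env['CFLAGS']!r} CXXFLAGS={env['CXXFLAGS']!r}")
```

Each environment would then be handed to whatever builds and runs the tests, so every benchmark binary is rebuilt from source under the configuration being measured.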

71 benchmarks were run at each of these optimization levels on the Intel i9-7980XE system. All of these compiler benchmarks were facilitated in a fully-automated and reproducible manner using the open-source Phoronix Test Suite benchmarking software.