Tag Archives: Spectre

Benchmarking AMD FX vs. Intel Sandy/Ivy Bridge CPUs Following Spectre, Meltdown, L1TF, Zombieload


Now with MDS / Zombieload being public and seeing an 8~10% performance hit in the affected workloads as a result of the new mitigations to these Microarchitectural Data Sampling vulnerabilities, what does the overall performance look like now if going back to the days of AMD FX Vishera and Intel Sandybridge/Ivybridge processors? If Spectre, Meltdown, L1TF/Foreshadow, and now Zombieload had come to light years ago, would it have shaken that pivotal point in the industry? Here are benchmarks looking at the performance today with and without the mitigations to the known CPU vulnerabilities to date.

As I’ve already delivered many benchmarks of these mitigations (including MDS/Zombieload) on newer CPUs, for this article we’re looking at older AMD FX CPUs with their relevant Spectre mitigations against Intel Sandybridge and Ivybridge with the Spectre/Meltdown/L1TF/MDS mitigations. Tests were done on Ubuntu 19.04 with the Linux 5.0 kernel while toggling the mitigation levels of off (no coverage) / auto (the default, out-of-the-box mitigations used on all major Linux distributions) / auto,nosmt (the more restricted level that also disables SMT / Hyper Threading). The AMD CPUs were tested only with off/auto, since the “auto,nosmt” mode doesn’t disable any SMT on AMD platforms, where it isn’t deemed insecure.
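For readers wanting to confirm which mitigation level a given boot actually ended up with, the kernel reports the per-vulnerability state through sysfs. Below is a minimal Python sketch, assuming a Linux kernel that exposes /sys/devices/system/cpu/vulnerabilities; the function name read_mitigation_status is illustrative and is not part of the tooling used for these benchmarks.

```python
from pathlib import Path

# Standard sysfs location on kernels that report CPU vulnerability status.
VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def read_mitigation_status(vuln_dir=VULN_DIR):
    """Return {vulnerability_name: status_string} for each file in vuln_dir.

    Returns an empty dict if the directory is missing (for example on
    non-Linux systems or kernels that predate this interface).
    """
    vuln_dir = Path(vuln_dir)
    if not vuln_dir.is_dir():
        return {}
    status = {}
    for entry in sorted(vuln_dir.iterdir()):
        if entry.is_file():
            status[entry.name] = entry.read_text().strip()
    return status


if __name__ == "__main__":
    for name, state in read_mitigation_status().items():
        print(f"{name:20s} {state}")
```

Each entry reads as "Not affected", "Vulnerable", or "Mitigation: ..." followed by the technique in use, so comparing the output between boots is a quick sanity check that the intended level was applied.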

The processors used for testing were the:

– Intel Core i3 2120

– Intel Core i5 2500K

– Intel Core i7 2700K

– Intel Core i7 3770K

– AMD FX-8320E

– AMD FX-8370E

– AMD FX-8370

These were chosen based on what I had available in a still-working state without running into any other issues (motherboard problems prevented the FX-9590 from being tested), as well as time constraints. All of the systems were running with their latest BIOS, 2 x 4GB DDR3 system memory, and SATA 3.0 SSD storage (primarily the Samsung SSD 850). The ASRock Z68 Pro3 was the primary Intel Sandy/Ivy test platform while the MSI 970 GAMING was used on the AMD side.

Ubuntu 19.04 x86_64 with the Linux 5.0.0-15-generic kernel was at play with all available Disco Dingo stable release updates. Via the Phoronix Test Suite a wide range of benchmarks were carried out focused on looking at the impact of the CPU vulnerability mitigations for these aging Intel and AMD desktop platforms.


OpenSUSE’s Spectre Mitigation Approach Is One Of The Reasons For Its Slower Performance


SUSE --

OpenSUSE defaults to IBRS for its Spectre Variant Two mitigations rather than the Retpolines approach and that is one of the reasons for the distribution’s slower out-of-the-box performance compared to other Linux distributions.

A Phoronix reader pointed out this opensuse-factory mailing list thread citing a “huge single-core performance loss” on a Lenovo laptop when using openSUSE. There’s a ~21% performance loss in single-threaded performance around the Spectre Variant Two mitigations, which itself isn’t surprising as we’ve shown time and time again about the performance costs of the Spectre/Meltdown mitigations.

OpenSUSE’s kernel is using IBRS (Indirect Branch Restricted Speculation) with the latest Intel CPU microcode images while most Linux distributions are relying upon Retpolines (return trampolines). IBRS has the potential of incurring more of a performance loss than Retpolines due to the more restricted speculation behavior when paired with the updated Intel CPU microcode.

Switching over to Retpolines for the workload in question restored the performance, per the mailing list discussion.

OpenSUSE users wanting to use that non-default approach can opt for it using the spectre_v2=retpoline,generic kernel command line parameter, which matches the behavior of most other Linux distributions’ kernels.
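To check whether a system was actually booted with that override, the kernel command line can be inspected. This is a minimal Python sketch, assuming the command line is available at /proc/cmdline as on standard Linux systems; the helper name spectre_v2_option is hypothetical, and if the parameter appears more than once the sketch simply returns the last occurrence without modeling how the kernel itself treats duplicates.

```python
def spectre_v2_option(cmdline):
    """Return the value given to spectre_v2= on a kernel command line.

    Returns None when the parameter is absent, in which case the
    kernel (or distribution) default applies.
    """
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key == "spectre_v2" and sep:
            value = val
    return value


if __name__ == "__main__":
    # /proc/cmdline holds the parameters the running kernel was booted with.
    with open("/proc/cmdline") as f:
        print(spectre_v2_option(f.read()))
```

On a system booted with the parameter described above this would print retpoline,generic; a result of None only means no override was passed at boot.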

As for openSUSE changing their defaults, at least from the mailing list discussion it doesn’t appear their kernel engineers have any interest in changing their Spectre mitigation default; they are instead blaming the poor performance on Intel as their problem.

Some have also suggested the openSUSE installer pick up a toggle for asking users about their security vs. performance preferences in order to provide saner, better-informed defaults, but so far we haven’t seen any action taken to make that happen. It would make sense though considering some of openSUSE’s conservative defaults do have performance ramifications compared to most other Linux distributions, which we’ve shown in past benchmarks, albeit just written off by openSUSE as “mostly crap.”

Previously a barrier to Retpolines usage was needing the Retpolines compiler support, but that support has now been available for quite some time. There were also reported Retpolines issues with Skylake in the past, but those appear to have been resolved as well.


NXP PowerPC Processors Finally Being Mitigated Against Spectre V2 With Linux 4.21


SECURITY --

Nearly one year after the Spectre vulnerabilities were first published, Freescale/NXP PowerPC processors are being mitigated against Spectre Variant Two with the in-development Linux 4.21 kernel.

Queued for merging into Linux 4.21 is the Spectre V2 mitigation for these NXP PowerPC Book3E processors. Their approach is to flush the branch predictor on kernel entry and whenever the privilege level changes, protecting against both user-space to user-space attacks and user-space attacks against the kernel. In the case of KVM virtualization, the branch predictor is flushed as well at each KVM entry.

For those that want to forgo this mitigation to avoid the likely performance impact, the code does support a nospectre_v2 kernel command line parameter (the same as on x86-based platforms) that disables this frequent branch predictor flushing.

NXP developers working on this Spectre V2 mitigation haven’t shared any of their expected performance costs of this mitigation.

The mitigation is landing as part of the PowerPC changes. That pull also has POWER DMA code changes, support for generating their system call tables from a text file, fixes to the transactional memory support, and other low-level changes.


Patches For The Better Spectre STIBP Approach Revised – Version 7 Under Review


LINUX KERNEL --

Version 7 of the task property based options to enable Spectre V2 userspace-userspace protection patches, a.k.a. the work offering improved / less regressing approach for STIBP, is now available for testing and code review.

Tim Chen of Intel sent out the seventh revision to these patches on Tuesday night. Besides the Spectre V2 app-to-app protection modes, these patches include the work for disabling STIBP (Single Thread Indirect Branch Predictors) when enhanced IBRS (Indirect Branch Restricted Speculation) is supported/used, and allowing for STIBP to be enabled manually and just by default for non-dumpable tasks.

The STIBP patches will no longer take the “big hammer” approach for cross-hyperthread Spectre Variant Two mitigation, so the performance hit is no longer across the board but is restricted to non-dumpable tasks like OpenSSH rather than applying to every process, as is currently done with Linux 4.20 Git and back-ported series like Linux 4.19.2+.
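A task can ask the kernel about its own indirect-branch speculation state from user space. The Python sketch below assumes the prctl interface found in mainline Linux headers (PR_GET_SPECULATION_CTRL with PR_SPEC_INDIRECT_BRANCH); whether the V7 series under review exposes exactly these constants is an assumption here, and the helper names decode_spec_ctrl and query_indirect_branch_ctrl are illustrative only.

```python
import ctypes
import ctypes.util

# Constants as defined in <linux/prctl.h> in mainline kernels.
PR_GET_SPECULATION_CTRL = 52
PR_SPEC_INDIRECT_BRANCH = 1
PR_SPEC_PRCTL = 1 << 0          # per-task control via prctl is possible
PR_SPEC_ENABLE = 1 << 1         # speculation enabled, mitigation off
PR_SPEC_DISABLE = 1 << 2        # speculation disabled, mitigation on
PR_SPEC_FORCE_DISABLE = 1 << 3  # like DISABLE, but cannot be undone

_FLAG_NAMES = [
    (PR_SPEC_PRCTL, "PRCTL"),
    (PR_SPEC_ENABLE, "ENABLE"),
    (PR_SPEC_DISABLE, "DISABLE"),
    (PR_SPEC_FORCE_DISABLE, "FORCE_DISABLE"),
]


def decode_spec_ctrl(value):
    """Translate a PR_GET_SPECULATION_CTRL result into flag names."""
    if value == 0:
        return ["NOT_AFFECTED"]
    return [name for flag, name in _FLAG_NAMES if value & flag]


def query_indirect_branch_ctrl():
    """Query this task's state; None if unavailable or the call fails."""
    libc_name = ctypes.util.find_library("c")
    if libc_name is None:
        return None
    libc = ctypes.CDLL(libc_name, use_errno=True)
    if not hasattr(libc, "prctl"):
        return None
    result = libc.prctl(
        ctypes.c_ulong(PR_GET_SPECULATION_CTRL),
        ctypes.c_ulong(PR_SPEC_INDIRECT_BRANCH),
        ctypes.c_ulong(0), ctypes.c_ulong(0), ctypes.c_ulong(0),
    )
    return None if result < 0 else result


if __name__ == "__main__":
    state = query_indirect_branch_ctrl()
    print("unavailable" if state is None else decode_spec_ctrl(state))
```

On kernels without this control the call fails and the sketch reports "unavailable" rather than guessing at the state.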

With the new V7 patches there is protection for SECCOMP tasks, bug fixes, boot options updated to align with the other speculation mitigations, SMT code paths disabled when irrelevant for the current system configuration, and other code changes. All the details can be found via this patch series.

While Linus Torvalds a few days ago criticized the current STIBP approach, he stopped short of calling for it to be reverted right away but certainly wants the default behavior to change, which this patch series will do. However, until this patch series is ready for merging, Tim Chen is calling for the current STIBP code to be reverted. He noted, “Since Jiri’s patchset to always turn on STIBP has big performance impact, I think that it should be reverted from 4.20 and stable kernels for now, till this patchset to mitigate its performance impact can be merged with it.”

Greg KH did release Linux 4.19.3 this morning and other stable point releases, but the STIBP code hasn’t been touched with today’s updates. Hopefully it won’t be much longer though until these cleaned up patches are mainlined as the current performance overhead is significant.


How Spectre and Meltdown Impact Data Center Storage


IT news over the last few weeks has been dominated by stories of vulnerabilities found in Intel x86 chips and almost all modern processors. The two exposures, Spectre and Meltdown, are a result of the speculative execution that modern CPUs use to anticipate the flow of execution of code and ensure that internal instruction pipelines are filled as optimally as possible. It’s been reported that Spectre/Meltdown can have an impact on I/O and that means storage products could be affected. So, what are the impacts and what should data center operators and storage pros do?

Speculative execution

Speculative execution is a performance-improvement process used by modern processors where instructions are executed before the processor knows whether they will be needed. Imagine some code that branches as the result of a logic comparison. Without speculative execution, the processor needs to wait for the completion of that logic comparison before continuing to read ahead, resulting in a drop in performance. Speculative execution lets the processor predict the likely outcome and keep executing along that path; if the prediction turns out to be wrong, the speculative results are simply discarded, and either way the processor is kept active.

Both Spectre and Meltdown pose the risk of unauthorized access to data in this speculative execution process. A more detailed breakdown of the problem is available in the two papers covering the vulnerabilities. Vendors have released O/S and BIOS workarounds for the exposures. Meltdown fixes have noticeably impacted performance on systems with high I/O activity due to the extra work needed to isolate user and kernel memory on every transition between user and kernel mode, such as system calls. Reports range from 5% to 50% additional CPU overhead, depending on the specific platform and workload.

Storage repercussions

How could this impact storage appliances and software? Over the last few years, almost all storage appliances and arrays have migrated to the Intel x86 architecture. Many are now built on Linux or Unix kernels and that means they are directly impacted by the processor vulnerabilities, which if patched, result in increased system load and higher latency.

Software-defined storage products are also potentially impacted, as they run on generic operating systems like Linux and Windows. The same applies for virtual storage appliances run in VMs and hyperconverged infrastructure, and of course either public cloud storage instances or high-intensity I/O cloud applications. Quantifying the impact is difficult as it depends on the number of system calls the storage software has to make. Some products may be more affected than others.
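One rough way to see how sensitive a host is to system call cost is to time a large number of very small reads before and after patching. The following is a minimal Python sketch under stated assumptions: a Unix-like system where os.pread is available, and a hypothetical helper name time_small_reads. The figure it reports includes Python interpreter overhead, so only the difference between runs on the same machine is meaningful, not the absolute number.

```python
import os
import tempfile
import time


def time_small_reads(path, count=100_000, size=1):
    """Return the average seconds per os.pread() call on `path`.

    Each os.pread() issues one system call, so with tiny reads of
    cached data the cost of entering and leaving the kernel makes up
    a large share of the measured time.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        start = time.perf_counter()
        for _ in range(count):
            os.pread(fd, size, 0)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return elapsed / count


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"x" * 4096)
        tmp.flush()
        per_call = time_small_reads(tmp.name)
        print(f"{per_call * 1e9:.0f} ns per pread()")
```

This is no substitute for testing with the real storage workload, but a large shift in the per-call figure after patching suggests that syscall-heavy software on that host will feel the mitigations.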

Vendor response

Storage vendors have had mixed responses to the CPU vulnerabilities. For appliances or arrays that are deemed to be “closed systems” and not able to run user code, their stance is that these systems are unaffected and won’t be patched.

Where appliances can run external code like Pure Storage’s FlashArray, which can execute user code via a feature called Purity Run, there will be a need to patch. Similarly, end users running SDS solutions on generic operating systems will need to patch. HCI and hypervisor vendors have already started to make announcements about patching, although the results have been varied. VMware for instance, released a set of patches only to recommend not installing them due to customer issues. Intel’s advisory earlier this week warning of problems with its patches has added to the confusion.

Some vendors such as Dell EMC haven’t made public statements about the impact of the vulnerabilities for all of their products. For example, Dell legacy storage product information is openly available, while information about Dell EMC products is only available behind support firewalls. If you’re a user of those platforms, you will presumably have access; for wider market context, however, it would have been helpful to see a consolidated response in order to assess the risk.

Reliability

So far, the patches released don’t seem to be very stable. Some have been withdrawn, others have crashed machines or made them unbootable. Vendors are in a difficult position, because the details of the vulnerabilities weren’t widely circulated in the community before they were made public. Some storage vendors only found out about the issue when the news broke in the press. This means some of the patches may have been rushed to market without full testing of the impact when they are applied.

To patch or not?

What should end users do? First, it’s worth evaluating the risk and impact of either applying or not applying patches. Computers that are regularly exposed to the internet like desktops and public cloud instances (including virtual storage appliances running in a cloud instance) are likely to be most at risk, whereas storage appliances behind a corporate firewall on a dedicated storage management network are at lowest risk. Measure this risk against the impact of applying the patches and what could go wrong. Applying patches to a storage platform supporting hundreds or thousands of users, for example, is a process that needs thinking through.

Action plan

Start by talking to your storage vendors. Ask them why they believe their platforms are exposed or not. Ask what testing of patching has been performed, from both a stability and performance perspective. If you have a lab environment, do some before/after testing with standard workloads. If you don’t have a lab, ask your vendor for support.

As there are no known exploits in the wild for Spectre/Meltdown, a wise approach is probably to wait a little before applying patches. Let the version 1 fixes be tested in the wild by other folks first. Invariably issues are found that then get corrected by another point release. Waiting a little also gives time for vendors to develop more efficient patches, rather than ones that simply act as a workaround. In any event, your approach will depend on your particular set of circumstances.


