Tag Archives: DragonFlyBSD

DragonFlyBSD Replacing Their 48-Core Opteron Infrastructure With Ryzen 9 3900X CPUs


BSD --

DragonFlyBSD is replacing their 48-core Opteron server named “Monster” with two of the new AMD Ryzen 9 3900X “Zen 2” processors as well as a spare Xeon server. DragonFlyBSD lead developer Matthew Dillon continues to be mighty impressed by AMD’s latest processor offerings.

Last year Matthew Dillon professed his love for the performance of AMD Ryzen Threadripper CPUs while in recent weeks he’s been quick to get Ryzen 3000 CPUs working on DragonFlyBSD and has been impressed by their performance.

With two Ryzen 9 3900X CPUs and a spare dual-Xeon server he had at home, he’s replacing their 48-core Opteron server and blade-based build infrastructure with systems that draw around half the power and deliver significantly better performance. Dillon’s post to their mailing list is quite interesting for its commentary, so it’s included below:

The goal is to clear out a little power budget in the colo and to really beef-up our package-building capabilities to reduce the turn-around time needed to test ports syncs and updates to the binary package system. Currently we use two blades to do most of the building, plus monster sometimes. The blades take almost a week (120 hours+) to do a full synth run and monster takes around 27.5 hours. But we need to do three bulk builds more or less at the same time… one for the release branch, one for the development branch, and one for staging updates. It just takes too long and it’s been gnawing at me for a little while.

Well, Zen 2 to the rescue! These new CPUs can take ECC, there’s actually an IPMI mobo available, and they are fast as hell and cheap for what we get.

The new machines will be two 3900X based servers, plus a dual-xeon system that I already had at home. The 3900X’s can each do a full synth run in 24.5 hours and the Xeon can do it in around 31 hours. Monster will be retired. And the crazy thing about this? Monster burns 1000W going full bore. Each of the 3900X servers burns 160W and the Xeon burns 200W. In other words, we are replacing 1000W with only 520W and getting roughly 6x the performance efficiency in the upgrade. This tells you just how much more power-efficient machines have become in the last 9 years or so.

This upgrade will allow us to do full builds for both release and dev in roughly one day instead of seven days, and do it without interfering with staging work that might be happening at the same time.
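The “roughly 6x” figure can be sanity-checked from the numbers quoted in the post. The sketch below is only one plausible reading: it assumes “performance efficiency” means full synth runs per hour per watt, compares Monster alone against the three replacement machines, and ignores the blades. The machine labels are hypothetical names chosen for illustration.

```python
# Figures quoted in the post: (label, hours per full synth run, watts at full load).
# The labels are hypothetical; only the numbers come from the mailing list post.
OLD = [("monster", 27.5, 1000)]
NEW = [("3900X-a", 24.5, 160), ("3900X-b", 24.5, 160), ("dual-xeon", 31.0, 200)]


def throughput(machines):
    """Full synth runs completed per hour, summed over all machines."""
    return sum(1.0 / hours for _, hours, _ in machines)


def power(machines):
    """Total draw in watts with every machine running at full load."""
    return sum(watts for _, _, watts in machines)


def runs_per_hour_per_watt(machines):
    """Assumed definition of 'performance efficiency'."""
    return throughput(machines) / power(machines)


speedup = throughput(NEW) / throughput(OLD)
efficiency_gain = runs_per_hour_per_watt(NEW) / runs_per_hour_per_watt(OLD)

print(f"total power: {power(OLD)} W -> {power(NEW)} W")
print(f"throughput gain: {speedup:.2f}x")
print(f"throughput-per-watt gain: {efficiency_gain:.2f}x")
```

Under these assumptions the new machines deliver a bit over 3x the combined build throughput at 520W, which works out to about 6x the throughput per watt and matches the claim in the post.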

It’s interesting going with Ryzen CPUs + ECC over waiting for EPYC Zen 2 CPUs expected to debut this quarter, but the Ryzen 9 3900X route certainly provides great value on a budget.


HAMMER vs. HAMMER2 Benchmarks On DragonFlyBSD 5.6



With the newly released DragonFlyBSD 5.6 there are improvements to its home-grown HAMMER2 file-system, to the extent that the installer now selects it as the default file-system for new installations. For those curious how the performance now compares between HAMMER and HAMMER2, here are some initial benchmarks on an NVMe solid-state drive using DragonFlyBSD 5.6.0.

With a 120GB Toshiba NVMe SSD on an Intel Core i7 8700K system, I ran some benchmarks of DragonFlyBSD 5.6.0 freshly installed with HAMMER2 and then again when returning to the original HAMMER file-system that remains available via its installer. No other changes were made to the setup during testing.

All of the benchmarks were carried out using the open-source Phoronix Test Suite benchmarking software.

Within the real-world PostgreSQL database server benchmarks, HAMMER2 is faster than HAMMER. In particular, the write performance is much better on this newer file-system, which has been under development for the past several years.

The BlogBench workload was also seeing much better performance.

In simple workloads like Git on the GTK source repository, the performance didn’t end up being measurably different.

And then for the more synthetic workloads it was just a mix. But overall HAMMER2 was performing well during the initial testing, and it is great to see it continuing to offer noticeable leads in real-world workloads compared to the aging HAMMER file-system. HAMMER2 also offers better clustering, online deduplication, snapshots, compression, encryption, and many other modern file-system features.


DragonFlyBSD Is Seeing Better Performance Following A Big VM Rework



DragonFlyBSD lead developer Matthew Dillon has been reworking the virtual memory (VM) infrastructure within their kernel and it’s leading to measurable performance improvements.

This mailing list post outlines the restructuring of the kernel’s VM pmap code, which can conserve memory, helps processes that share lots of memory, and improves concurrent page fault performance. The performance results are what we’re after, and they appear quite compelling, at least in Dillon’s testing so far on both big (Threadripper) and small (Raven Ridge) AMD test systems:

These changes significantly improve page fault performance, particularly under heavy concurrent loads.

* kernel overhead during the ‘synth everything’ bulk build is now under 15% system time. It used to be over 20%. (system time / (system time + user time)). Tested on the threadripper (32-core/64-thread).

* The heavy use of shared mmap()s across processes no longer multiplies the pv_entry use, saving a lot of memory. This can be particularly important for postgres.

* Concurrent page faults now have essentially no SMP lock contention and only four cache-line bounces for atomic ops per fault (something that we may now also be able to deal with with the new work as a basis).

* Zero-fill fault rate appears to max-out the CPU chip’s internal data busses, though there is still room for improvement. I top out at 6.4M zfod/sec (around 25 GBytes/sec worth of zero-fill faults) on the threadripper and I can’t seem to get it to go higher. Note that obviously there is a little more dynamic ram overhead than that from the executing kernel code, but still…

* Heavy concurrent exec rate on the TR (all 64 threads) for a shared dynamic binary increases from around 6000/sec to 45000/sec. This is actually important, because bulk builds

* Heavy concurrent exec rate on the TR for independent static binaries now caps out at around 450000 execs per second. Which is an insanely high number.

* Single-threaded page fault rate is still a bit wonky but hit 500K-700K faults/sec (2-3 GBytes/sec).
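The bandwidth figures quoted alongside the fault rates follow from multiplying faults per second by the page size. The sketch below assumes each zero-fill fault clears one 4096-byte page, which is the usual x86-64 base page size but is not stated in the post; the function names are illustrative only.

```python
PAGE_SIZE = 4096  # bytes; assumed base page size, not stated in the post


def fault_bandwidth(faults_per_sec, page_size=PAGE_SIZE):
    """Bytes per second touched if every fault fills exactly one page."""
    return faults_per_sec * page_size


def to_gib(nbytes):
    """Convert bytes to binary gigabytes (GiB)."""
    return nbytes / 2**30


# Fault rates quoted in the post for the Threadripper system.
rates = [
    ("concurrent zero-fill peak", 6.4e6),
    ("single-threaded, low end", 500e3),
    ("single-threaded, high end", 700e3),
]

for label, rate in rates:
    bw = fault_bandwidth(rate)
    print(f"{label}: {bw / 1e9:.1f} GB/s ({to_gib(bw):.1f} GiB/s)")

# Shared dynamic binary exec rate on all 64 threads, before and after.
exec_speedup = 45000 / 6000
print(f"concurrent exec speedup: {exec_speedup:.1f}x")
```

With a 4 KiB page, the 6.4M zfod/sec peak corresponds to roughly 26 GB/s (about 24 GiB/s), in line with the “around 25 GBytes/sec” in the post, and 500K-700K faults/sec lands in the quoted 2-3 GBytes/sec range.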

Small system comparison using a Ryzen 2400G (4-core/8-thread), release vs master (this includes other work that has gone into master since the last release, too):

* Single threaded exec rate (shared dynamic binary) – 3180/sec to 3650/sec

* Single threaded exec rate (independent static binary) – 10307/sec to 12443/sec

* Concurrent exec rate (shared dynamic binary x 8) – 15160/sec to 19600/sec

* Concurrent exec rate (independent static binary x 8) – 60800/sec to 78900/sec

* Single threaded zero-fill fault rate – 550K zfod/sec -> 604K zfod/sec

* Concurrent zero-fill fault rate (8 threads) – 1.2M zfod/sec -> 1.7M zfod/sec

* make -j 16 buildkernel test (tmpfs /usr/src, tmpfs /usr/obj):

4.4% improvement in overall time on the first run (6.2% improvement on subsequent runs). system% 15.6% down to 11.2% of total cpu seconds. This is a kernel overhead reduction of 31%. Note that the increased time on release is probably due to inefficient buffer cache recycling.
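To put the Ryzen 2400G numbers on a common scale, the release and master figures above can be turned into percentage gains. The second half of the sketch shows one possible way to arrive at the quoted 31% kernel overhead reduction: it assumes the system% shares are scaled by the 4.4% shorter run before comparing, which is a guess at the method rather than something the post spells out.

```python
# (release, master) pairs copied from the list above.
RESULTS = {
    "exec, single, shared dynamic": (3180, 3650),
    "exec, single, independent static": (10307, 12443),
    "exec, 8x, shared dynamic": (15160, 19600),
    "exec, 8x, independent static": (60800, 78900),
    "zfod, single thread": (550e3, 604e3),
    "zfod, 8 threads": (1.2e6, 1.7e6),
}


def percent_gain(before, after):
    """Relative improvement of a rate, in percent."""
    return (after - before) / before * 100.0


def system_time_reduction(share_before, share_after, runtime_improvement):
    """Percent drop in absolute system time.

    Assumption: total CPU seconds shrink in proportion to the overall
    runtime improvement, so the newer share is scaled down before comparing.
    """
    after = share_after * (1.0 - runtime_improvement)
    return (share_before - after) / share_before * 100.0


for name, (release, master) in RESULTS.items():
    print(f"{name}: +{percent_gain(release, master):.1f}%")

reduction = system_time_reduction(0.156, 0.112, 0.044)
print(f"kernel overhead reduction: {reduction:.0f}%")
```

Comparing the two system% shares directly gives about 28%; folding in the 4.4% shorter first run gives about 31%, which is the figure quoted.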

DragonFlyBSD appears on track for a great 2019 with their other recent accomplishments being prompt handling of the MDS/Zombieload mess, DRM code updates, HAMMER2 improvements, flipping on compiler-based Retpoline support, and FUSE work, among other coding activities.


HAMMER2 File-System Performance On DragonFlyBSD 5.4.1



With the newly released DragonFlyBSD 5.4.1 having a lot of HAMMER2 file-system work on top of all of the changes introduced by DragonFlyBSD 5.4 at the start of December, here is a fresh look at the HAMMER versus HAMMER2 file-system performance on this BSD operating system.

Using an Intel Core i9 7960X test system with an Intel 800p 128GB NVMe SSD, fresh benchmarks were carried out on DragonFlyBSD 5.4.1 when installed with a root HAMMER file-system and again with the latest HAMMER2 file-system option that has matured quite nicely over the DragonFlyBSD 5.x releases.

This quick testing is just looking at the HAMMER vs. HAMMER2 file-system performance. Besides the performance, HAMMER2 offers a lot of features not found in the original HAMMER design. The latest HAMMER2 design information can be found here.

All of these BSD storage benchmarks were carried out using the Phoronix Test Suite.

SQLite was operating much faster with HAMMER2.

BlogBench, which simulates the web server workload of running a blog, was yielding much faster reads on HAMMER2, but writes were faster with the original HAMMER.

The CompileBench compile task was much faster on HAMMER2.

But in the I/O heavier initial create process, the original HAMMER was faster as of DragonFlyBSD 5.4.1.

HAMMER2 was faster for PostgreSQL with both reads and writes.

The FIO synthetic tests didn’t yield much of a difference except for 4K sequential writes being faster.

More tests, including a comparison against FreeBSD with ZFS, coming up as we get ready for more exciting benchmarks in 2019.


DragonFlyBSD 5.4.1 Released With HAMMER2 File-System Updates, New Intel Graphics Support



DragonFlyBSD 5.4, released at the start of December, brought a number of new features and improvements, and now v5.4.1 is available with a few weeks’ worth of fixes.

DragonFlyBSD 5.4, as a six-month update to this popular BSD operating system, delivered GCC 8 as the default compiler, AMD Threadripper 2 CPU support, various NUMA performance improvements, DPorts updates, various kernel tuning, and a lot of work on maturing the project’s home-grown HAMMER2 file-system support.

DragonFlyBSD 5.4.1 is now available and it is predominantly made up of various HAMMER2 fixes/improvements, including better unmounting, refactoring the sync code, and other stabilization work.

DragonFlyBSD 5.4.1 also has the new Intel DRM graphics driver PCI IDs added for Coffeelake / Whiskey Lake / Kabylake support, a keyboard fix, and other kernel fixes.

The small list of patches making up the DragonFlyBSD 5.4.1 release can be found here. Download links and more from DragonFlyBSD.org.

DragonFlyBSD 5.5-DEVELOPMENT meanwhile is the version being worked on in master that will premiere towards the end of the first half of 2019 as what will most likely be called DragonFlyBSD 5.6.