HDX Benchmarking: Known incompatibilities and caveats with third-party benchmarks

HDX uses a number of graphics modes and technologies, including Thinwire, Thinwire Advanced (with H.264), Desktop Composition Redirection (DCR), and Framehawk. Many customers use third-party benchmarks to evaluate the performance and behavior of their deployments under known reference test conditions using standardized workloads.

Some benchmarks synthesize user interaction such as mouse clicks, scrolling, moving windows, and opening applications. Some benchmarks measure properties such as bandwidth, frame rate (that is, frames per second (fps)), CPU usage, and RAM usage. Some benchmarks are even more intrusive and inspect low-level parts of the graphics stack, such as framebuffers.

This article contains information on known issues and behaviors that users may encounter when benchmarking graphical workloads.

Known incompatibility between LoadRunner and thinwire graphics modes

Citrix is aware that our thinwire modes (legacy graphics mode and the enhanced thinwire technologies introduced in XenApp/XenDesktop 7.6 FP3) interact with LoadRunner’s own use of certain buffers, and this may lead to visible graphical corruption when LoadRunner workloads are run. This should never occur with real applications, as they do not access and compare framebuffers in the manner LoadRunner does for benchmarking. Observed corruption includes text boxes pre-filled with content from older frames, or text boxes that are only partially updated.

Framehawk and LoginVSI are incompatible and should not be used together

At the time of writing, the Framehawk technologies in all versions of XenApp and XenDesktop (the latest being XenApp/XenDesktop 7.6 FP3) are wholly incompatible with LoginVSI. LoginVSI will run and produce results, but the manner in which it synthesizes user interaction dramatically changes Framehawk's behavior. As a result, a LoginVSI automated workload will not give meaningful data on how Framehawk behaves in production with the applications in the automated test.

The Framehawk virtual channel adapts its behavior in response to user interaction such as scrolling. Synthetic injection of user input occurs at a different level in the graphics stack from real user interaction, so Framehawk does not receive the same information on which its adaptive behavior depends.

Browsers or applications failing to utilise the GPU

Citrix is aware of various interoperability bugs between certain browsers and certain hypervisors or remoting protocols. Browsers are often a core part of an operating system, optimized beyond normal applications, and sometimes aware that they are running on a virtualized platform. Some browsers may disable GPU acceleration on certain virtualization platforms or under certain remoting protocols, based on various criteria. This typically manifests as the GPU not being utilized when the browser is pointed at a web-based benchmark or graphical test (e.g. Fishbowl or a WebGL site). The user may also find that the option to enable GPU/hardware acceleration is greyed out or unavailable in the browser settings.

Many users benchmark within browsers. A collection of browser-based benchmarks can be found here: Citrix HDX: The BIG LIST of graphical benchmarks, tools and demos!.

It is worth verifying whether it is a specific issue to a specific browser by:

  • running a non-browser based benchmark and verifying the GPU is available to the VM and correctly configured
  • trying a different browser, e.g. if the issue is with Internet Explorer or Firefox, try Chrome
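Before suspecting a specific browser, it can help to confirm that the guest actually sees a GPU at all. A minimal sketch follows; the dxdiag and nvidia-smi invocations shown in the comments are standard tools, and the sample output line is illustrative data rather than a real measurement:

```shell
# Quick check that the VM actually sees the GPU before blaming the browser.
# On Windows guests, dxdiag can dump display adapter details to a file:
#   dxdiag /t dxdiag_report.txt
# On Linux guests with NVIDIA drivers installed, nvidia-smi lists visible GPUs:
#   nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
# The lines below parse a sample line of that CSV output to show its shape;
# "GRID K200, 340.46" is illustrative, not a measurement.
sample_line="GRID K200, 340.46"
gpu_name="${sample_line%%,*}"
echo "Detected GPU: ${gpu_name}"
```

If no adapter appears at this level, the problem is GPU passthrough/vGPU configuration rather than the browser.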

Important: CTX202149 – Known Issues or Configuration Reasons: OpenGL/DirectX/GPU Acceleration Not Used contains details of certain applications and browsers where known issues or misconfigurations can result in the GPU not being used. If you are observing this behavior, check that article for known issues.

Benchmarking NVIDIA GRID and vGPU technologies

Check the framebuffer requirements of the application and the vGPU profile

Take a close look at the Frame Buffer (Megabytes) of the vGPU type. Many benchmarks (including many Unigine benchmarks) stipulate a minimum framebuffer of 512MB; consequently, the two NVIDIA vGPU types with 256MB of framebuffer, the K100 and K200, may behave quite oddly on graphically demanding benchmarks. Too small a framebuffer will often result in frame corruption and visual artifacts.

Performance issues with applications that have very large data sets may arise if the framebuffer is insufficient and memory paging results. This is most likely to cause issues on XenApp, where multiple users share the GPU framebuffer. Advice on how to check framebuffer usage and confirm that the VM is adequately resourced can be found here: Monitoring NVIDIA GPU usage of the framebuffer for vGPU and GPU-passthrough. Applications with very large data sets and framebuffer usage include Esri ArcGIS Pro, Petrel, and some CAD/CAE products.
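As a rough sketch of the kind of headroom check described above, nvidia-smi's standard memory queries can be used. The query flags in the comment are standard nvidia-smi options; the sample line in the snippet stands in for real output so that the arithmetic is visible:

```shell
# Sketch: checking framebuffer headroom with nvidia-smi (assumes NVIDIA drivers):
#   nvidia-smi --query-gpu=memory.used,memory.total --format=csv,noheader,nounits
# The sample line below stands in for real output (values in MiB).
line="180, 256"              # e.g. a 256MB K100/K200 profile under load (illustrative)
used="${line%%,*}"
total="${line##*, }"
free=$(( total - used ))
echo "Framebuffer free: ${free} MiB of ${total} MiB"
```

If the free figure is consistently near zero while a benchmark runs, the workload is a poor fit for that vGPU profile and a larger framebuffer profile should be tested.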

vSync

For many benchmarks it is standard practice to turn off an NVIDIA option called vsync. This makes frame rate numbers higher and more comparable, but can lead to visual "tearing" on very intensive benchmarks, so the numeric result is not the whole story. In practice, many users will find their workloads less intensive and can work in production with vsync off without noticing any effects. Not all GPUs and vendors support vsync and the ability to prevent tearing, so this option is often turned off when evaluating NVIDIA GPUs to ensure a level playing field when benchmarking the GPU's raw capability.
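On Linux guests with NVIDIA drivers, vsync can typically be toggled as sketched below. Both knobs are standard NVIDIA driver controls; the benchmark invocation at the end is a placeholder, not a real binary:

```shell
# Sketch: disabling vsync for an OpenGL benchmark run on NVIDIA Linux drivers.
# Driver-wide, via the settings tool (requires a running X session):
#   nvidia-settings -a SyncToVBlank=0
# Per-process, the __GL_SYNC_TO_VBLANK environment variable overrides the setting:
export __GL_SYNC_TO_VBLANK=0
echo "vsync override for child processes: __GL_SYNC_TO_VBLANK=${__GL_SYNC_TO_VBLANK}"
# ./your_benchmark_binary    # placeholder: the benchmark now runs with vsync disabled
```

On Windows, the equivalent setting ("Vertical sync") lives in the NVIDIA Control Panel under Manage 3D Settings.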

NVIDIA vGPU frame rate limiter

NVIDIA vGPU profiles are subject to a frame rate limiter (FRL) to assist fair sharing among users sharing the GPU. The FRL is sometimes removed for benchmarking to enable comparison with physical systems. In NVIDIA GRID 1.0, the profiles were limited to either 45fps (frames per second) or 60fps.
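On XenServer, NVIDIA documents a per-VM plugin parameter for disabling the FRL. A hedged sketch follows: the VM UUID is a placeholder, the snippet only prints the command rather than executing it, and the limiter should be restored (or the argument removed) once benchmarking is finished:

```shell
# Sketch: disabling the vGPU frame rate limiter on XenServer, for benchmarking only.
# The vgpu_extra_args mechanism comes from NVIDIA's GRID vGPU documentation; re-enable
# the limiter (frame_rate_limiter=1, or remove the arg) after testing.
vm_uuid="<vm-uuid>"          # placeholder: find the real UUID with `xe vm-list`
cmd="xe vm-param-set uuid=${vm_uuid} platform:vgpu_extra_args=frame_rate_limiter=0"
echo "Command to run on the XenServer host: ${cmd}"
```

Remember that FRL-off numbers do not represent the fair-shared performance users will see in production.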

Non-graphical Benchmarks

Some benchmarks can be run without a graphical interface or unattended, e.g. Cadalyst. We would always advise that a human verifies the benchmark and image-quality results, as most technologies contain fall-back mechanisms that may reduce the quality of the frame contents. Benchmarks that evaluate frame rates and similar metrics alone can be misleading.

Comparing images of screen capture to reference images from a previous test run

Citrix HDX graphics modes are adaptive, and between versions we may change the default mode used as we introduce new codecs and graphics modes better suited to newer OSs and hardware. A notable example: XenApp 6.5 relied on the original thinwire graphics mode optimized for legacy Microsoft OSs such as Windows Server 2008 R2, Windows 7, and earlier (this mode was later renamed "legacy graphics mode"). From XenDesktop/XenApp 7.5 onwards, the default changed to H.264 technologies better suited to newer hardware, newer Microsoft OSs, and richer applications, Aero themes, and video. Different graphics modes have different degrees of determinism (repeatability on individual frames) across repeated test runs, and apply different compression techniques depending on network conditions and other factors at the time of the run. As such, we do not recommend that customers analyze static images (screen captures) to evaluate comparable behavior, because:

  • The results may vary, highlighting differences that are not visually discernible to the user and generating many false-positive regressions.
  • Users risk tying themselves to older technologies and configurations in order to obtain “the same results” where in fact “better results” are possible.

