WPF Drawing Performance

Starting with version 3.0, the .NET Framework provides two incompatible and unrelated graphics APIs, both aimed at general GUI application development:

  • Windows Forms wraps the GDI+ API introduced in Windows XP, which in turn extends the GDI (Graphics Device Interface) API that dates back to the first versions of Windows. The original interface languages for GDI and GDI+ are C and C++, respectively.
  • Windows Presentation Foundation (WPF) is based on DirectX and exposed exclusively through managed .NET code.

WPF was developed for Windows Vista, whose new Desktop Window Manager (DWM) is likewise based on DirectX rather than GDI. The DWM is enabled by switching to the Aero desktop theme (the default on most editions) and disabled by switching to the Basic theme, which emulates Windows XP.

Rumor has it that Vista was originally intended to use WPF for its entire GUI, but the performance of the new API was not up to the task. Certainly, developers outside Microsoft have frequently criticized WPF for its sluggish performance, especially compared to GDI/GDI+ on Windows XP.

On this page I attempt to measure the performance of simple drawing operations in both WPF and Windows Forms (i.e. GDI+) under a variety of conditions. For comparison, I implemented the same operations in Java’s AWT (Abstract Window Toolkit). Hopefully the results will prove useful to other developers. The test application and its source code are available for download, so you can run your own tests and modify the test cases as desired.

Before moving on, I’d like to recommend Jeremiah Morrill’s Critical Deep Dive into the WPF Rendering System. This post is largely unrelated to the following discussion, but it’s a fascinating examination of WPF performance at the lowest level.

Measuring WPF Performance

Attempting to measure the time WPF takes to fully render a window is surprisingly difficult. This is because two different threads collaborate in this task. To explain what that means, here’s a quick overview of how WPF shows things on the screen.

  • WPF operates in retained mode (as does JavaFX, not tested here). Calling a drawing method stores the indicated content in some internal format but does not alter the screen. At some unspecified later time, a background thread renders the prepared content to the screen. Rendering happens automatically and repeatedly when necessary, e.g. when an obscured window is uncovered.
  • By contrast, GDI/GDI+ and hence Windows Forms operate in immediate mode (as does Java AWT). Calling a drawing method immediately renders the indicated content to the screen before the method returns. However, this content is never stored, so the application must repeat those same method calls whenever the same content should be re-rendered.

An immediate mode API can simulate a primitive sort of retained mode by drawing to a memory buffer which is later copied to the screen. Such buffering is frequently used for better performance or smoother animations. Our test application exercises both direct and buffered GDI+.
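
The buffering idea can be sketched in a few lines of Java, using AWT since it is one of the tested APIs. This is only an illustration: the class and method names are hypothetical and not taken from the test application, and a second BufferedImage stands in for the screen so that the sketch runs without a display.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

/** Hypothetical sketch of buffered drawing; not taken from the test application. */
final class BufferedDrawingSketch {

    /** Draws all content into an off-screen memory buffer. */
    static BufferedImage renderToBuffer(int width, int height) {
        BufferedImage buffer = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = buffer.createGraphics();
        try {
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, width, height);
            g.setColor(Color.BLUE);
            // One filled triangle stands in for the real drawing workload.
            g.fillPolygon(new int[] {10, 90, 50}, new int[] {90, 90, 10}, 3);
        } finally {
            g.dispose();
        }
        return buffer;
    }

    /** Copies the finished buffer to the target surface in a single operation. */
    static void present(BufferedImage buffer, Graphics2D target) {
        target.drawImage(buffer, 0, 0, null);
    }

    public static void main(String[] args) {
        // A second image stands in for the screen (an assumption made for this sketch).
        BufferedImage screen = new BufferedImage(100, 100, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = screen.createGraphics();
        present(renderToBuffer(100, 100), g);
        g.dispose();
        System.out.printf("pixel inside triangle: %06X%n", screen.getRGB(50, 60) & 0xFFFFFF);
    }
}
```

In a real window, the target Graphics2D would be the one handed to the paint method, so the screen receives only the single copy operation rather than thousands of individual drawing calls.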

Since WPF uses retained mode, all new content passes through two stages before appearing on the screen: first internal preparation, then the actual rendering. WPF implements these stages as follows:

1. Preparation — This includes computing the sizes and layout of all WPF objects to be rendered, as well as recording the actual drawing operations. Any WPF methods that you call explicitly, for example within an OnRender override, are part of this stage.

All preparations are handled synchronously by the message loop running on the (usually single) GUI thread, which is accessible through the Dispatcher property. This is the same mechanism that transmits user input to Windows applications, and it’s the reason why WPF won’t update the display until your topmost event handler has returned. The GUI thread cannot process any drawing operations while it’s in your code – it must return to the message loop first.

(There’s a dangerous trick to get around this, known as DoEvents after the eponymous Windows Forms method, which tells the Dispatcher to immediately work through all pending messages. The drawing test application uses this trick to clear the message queue before the test timer starts.)

2. Rendering — When all preparations are complete, a separate background thread eventually renders the prepared content to the screen. Unfortunately, this thread is completely hidden from user code, and WPF offers no (direct) way to tell when an object has finished rendering. This is a rather big problem for responsive GUI design which, as far as I’m aware, persists in .NET 4.6.

(This mechanism also explains why WPF, an API based on DirectX, doesn’t expose a DirectX interface for user drawings. Only the background render thread interacts with DirectX, so any user-supplied DirectX code would have to somehow insert itself into this thread. It’s difficult to see how that could work without messing up existing functionality.)

Measuring Windows Forms is easy: we start a timer before showing a test window, and stop it at the end of the window’s OnPaint handler. Since Windows Forms operates in immediate mode, the entire window has been fully rendered to the screen at that point. Java’s AWT likewise operates in immediate mode, and can be measured in the same way.
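
As a rough Java illustration of this stopwatch approach: the helper below is hypothetical and simplified. It times only the paint routine itself and draws into an off-screen image, whereas the actual test starts its timer before the window is shown.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.util.function.Consumer;

/** Hypothetical sketch: timing an immediate-mode paint routine. */
final class PaintTimerSketch {

    /** Runs the paint callback once and returns the elapsed time in milliseconds. */
    static double timePaintMillis(Graphics2D g, Consumer<Graphics2D> paint) {
        long start = System.nanoTime();   // timer starts before any drawing
        paint.accept(g);                  // all drawing calls happen inside this call
        return (System.nanoTime() - start) / 1_000_000.0;
    }

    public static void main(String[] args) {
        // Off-screen surface instead of a window (an assumption made for this sketch).
        BufferedImage surface = new BufferedImage(400, 400, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = surface.createGraphics();
        double millis = timePaintMillis(g, gr -> {
            gr.setColor(Color.RED);
            for (int i = 0; i < 10_000; i++) {
                gr.drawPolygon(new int[] {200, 300, 100}, new int[] {100, 300, 300}, 3);
            }
        });
        g.dispose();
        System.out.println("paint took " + millis + " ms");
    }
}
```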

Measuring WPF is more difficult. Once again, we start a timer before showing a test window. But now we need to measure both stages of WPF’s retained mode to find the total time until the window has actually been rendered to screen.

Measuring Preparation

This stage is complete when the UI thread’s message loop has processed all pending messages, which (we assume and hope!) all originated from our drawing operations. WPF exposes a Window.ContentRendered event that is perfect for this purpose. Despite its name, this event fires once all window contents have been prepared for rendering for the first time, not when they have actually reached the screen. We react by setting a flag in our test application that activates measurement of the second stage.

Measuring Rendering

We cannot directly access the rendering thread but WPF does offer one indirect point of access, namely through the CompositionTarget.Rendering event. This event usually fires at the monitor refresh rate (typically 60 times per second), whether there’s any new content to render or not. It is primarily intended for custom animations that need to generate display updates as quickly as the monitor can show them.

However, the Rendering event is tied to the render thread in a way we can exploit: the event is not raised as long as the render thread is busy! It will be raised again at some point during the next refresh interval after the render thread has gone idle. Since we set a flag immediately after the preparations stage was complete, we can now examine that flag in our Rendering handler. If the preparations flag is still set, we know that a test window has just been rendered and we can record the elapsed time.

This trick is not foolproof. Sometimes the Rendering event fires just after a test window has been prepared, but before the render thread has actually started working on it. We circumvent this problem by comparing the event time to the time when the preparations flag was set. If the difference is less than 100 msec (a value tailored to our benchmark), we assume that rendering has not yet happened and wait for the next event to arrive.
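
The flag-and-threshold logic described above can be modeled independently of WPF. The following Java sketch is only such a model, not the test application's code: the class and method names are hypothetical, and the caller is assumed to feed in timestamps from the two WPF events.

```java
import java.util.OptionalLong;

/** Hypothetical model of the two-stage timing logic; not actual WPF code. */
final class RenderTimingSketch {

    /** Rendering events closer than this to the end of preparation are ignored. */
    static final long MIN_GAP_MILLIS = 100;

    private long startedAt;
    private long preparedAt;
    private boolean prepared;   // set when preparation ends, cleared once a result is recorded

    /** Call just before the test window is shown. */
    void start(long nowMillis) {
        startedAt = nowMillis;
        prepared = false;
    }

    /** Call when the preparation stage is reported as complete. */
    void onPrepared(long nowMillis) {
        preparedAt = nowMillis;
        prepared = true;
    }

    /** Call on every rendering event; yields the total time once rendering is detected. */
    OptionalLong onRenderingEvent(long nowMillis) {
        if (!prepared) {
            return OptionalLong.empty();   // no test window is waiting to be rendered
        }
        if (nowMillis - preparedAt < MIN_GAP_MILLIS) {
            return OptionalLong.empty();   // too soon after preparation, wait for the next event
        }
        prepared = false;
        return OptionalLong.of(nowMillis - startedAt);
    }

    public static void main(String[] args) {
        RenderTimingSketch timer = new RenderTimingSketch();
        timer.start(0);
        timer.onPrepared(500);
        System.out.println(timer.onRenderingEvent(516));    // inside the 100 ms window
        System.out.println(timer.onRenderingEvent(1700));   // accepted as the render time
    }
}
```

Because the result is only accepted on an event at least 100 ms after preparation, very fast renders are reported late; the text above notes that this threshold was tailored to the benchmark.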

Drawing Test Application

All results shown below were obtained with a small test application. The download package DrawingTest.zip (31.8 KB, ZIP archive) comprises the precompiled application for the .NET Framework 4.0 and the complete source code for Visual Studio 2010, as well as a version for Java’s AWT library.

The test application draws 10,000 triangles to a window’s client area, sized 400×400 screen pixels (for GDI+ and AWT) or device-independent units (for WPF). Each triangle is rotated 1° clockwise compared to the previous one. Triangles are drawn either as outlines using pens (“Pens Only”), filled shapes using brushes (“Brushes Only”), or both with different colors (“Pens & Brushes”). All colors are solid, with no patterns, shading, or animation effects of any kind.
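
For readers who want to reproduce the workload, here is a minimal Java sketch of how such a triangle series could be generated. The base triangle and the rotation center are my assumptions for illustration; the text above does not specify them, and the helper name is hypothetical.

```java
/** Hypothetical sketch: generating a series of rotated triangles. */
final class TriangleSketch {

    /**
     * Returns count triangles as {x0, y0, x1, y1, x2, y2}. Triangle i is the base
     * triangle rotated by i degrees around (cx, cy).
     */
    static double[][] rotatedTriangles(int count, double cx, double cy, double[] base) {
        double[][] result = new double[count][6];
        for (int i = 0; i < count; i++) {
            double theta = Math.toRadians(i);
            double cos = Math.cos(theta);
            double sin = Math.sin(theta);
            for (int v = 0; v < 3; v++) {
                double dx = base[2 * v] - cx;
                double dy = base[2 * v + 1] - cy;
                // Screen y grows downward, so a positive angle appears clockwise.
                result[i][2 * v] = cx + dx * cos - dy * sin;
                result[i][2 * v + 1] = cy + dx * sin + dy * cos;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed base triangle inside a 400x400 client area.
        double[] base = {300, 200, 150, 287, 150, 113};
        double[][] triangles = rotatedTriangles(10_000, 200, 200, base);
        System.out.println("generated " + triangles.length + " triangles");
    }
}
```

With 10,000 triangles at 1° steps, every orientation is drawn about 28 times, so the test stresses raw drawing throughput rather than geometric variety.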

The test application for Java’s AWT library is located in a separate folder and run from the command line – please see the enclosed ReadMe file for instructions. The test application for GDI+ and WPF provides a GUI with five buttons on the left that start the individual test windows, as follows:

  • GDI+ Direct (Alt+D) — Shows a WinForms window (a.k.a. Form) whose OnPaint handler calls Graphics.DrawPolygon and/or Graphics.FillPolygon to draw the triangles directly to the window. We enable alpha blending (SourceOver) and high-quality compositing to replicate WPF behavior, but testing with SourceCopy and high-speed compositing showed no measurable difference.
  • GDI+ Buffer (Alt+B) — Shows a WinForms window whose OnPaint handler creates a BufferedGraphics object covering the entire client area, then calls Graphics.DrawPolygon and/or Graphics.FillPolygon to draw the triangles to that buffer, and finally renders the buffer to the window. (This is equivalent to setting the DoubleBuffered or ControlStyles.OptimizedDoubleBuffer flag on the Form.)
  • WPF Line (Alt+L) — Shows a WPF window whose OnRender handler calls DrawingContext.DrawLine three times for each triangle. The DrawingContext class does not expose a method to fill arbitrary polygons, so this test supports only the “Pens Only” option.
  • WPF Path (Alt+P) — Shows a WPF window whose OnRender handler creates a PathFigure for each triangle, then a PathGeometry containing the figure, and finally calls DrawingContext.DrawGeometry to draw that geometry.
  • WPF Stream (Alt+S) — Shows a WPF window whose OnRender handler creates a StreamGeometry for each triangle, which is once again drawn by DrawingContext.DrawGeometry.

To minimize interference with the test timer, I recommend that you move the mouse cursor away from the application and test windows, and start all tests with keyboard shortcuts rather than mouse clicks. If you use high DPI mode, you’ll notice that the Windows Forms and AWT windows appear smaller than the WPF windows. This is correct and due to the fact that WPF automatically scales all coordinates by the current DPI setting, whereas Windows Forms and AWT do not.
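
To make the size difference concrete: WPF measures in device-independent units of 1/96 inch, so the number of physical pixels grows with the DPI setting. A tiny Java sketch of that conversion, with a helper name of my own invention:

```java
/** Hypothetical sketch: converting WPF device-independent units to pixels. */
final class DpiScalingSketch {

    /** One WPF device-independent unit is 1/96 inch. */
    static final double WPF_UNITS_PER_INCH = 96.0;

    /** Returns the number of physical pixels covered by the given WPF units. */
    static double unitsToPixels(double units, double dpi) {
        return units * dpi / WPF_UNITS_PER_INCH;
    }

    public static void main(String[] args) {
        for (double dpi : new double[] {96, 120, 192}) {
            System.out.println(dpi + " dpi: 400 units = " + unitsToPixels(400, dpi) + " pixels");
        }
    }
}
```

At the 120 dpi and 192 dpi settings used for the sample results, the 400-unit WPF client area should therefore cover 500 and 800 pixels per side, while the GDI+ and AWT windows stay at 400 pixels. The WPF tests thus fill more pixels on those systems.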

Anti-Aliasing

Anti-aliasing, i.e. smoothing the edges of diagonal lines, turns out to have a huge performance impact on most tests. Anti-aliasing is disabled by default for GDI+ and AWT, and enabled by default for WPF. Use “Anti-Aliasing On/Off” to change this setting, which is implemented as follows:

  • GDI+: Set SmoothingMode on the current Graphics object.
  • WPF: Set RenderOptions.EdgeMode for the current Window.
  • AWT: Set RenderingHints.KEY_ANTIALIASING on the current Graphics2D object.
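
For the AWT case, the switch looks roughly as follows. This Java sketch is mine, not the test application's code: it draws one diagonal line into an off-screen image and counts the distinct pixel colors, which makes the effect of the hint visible without a display.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch: toggling anti-aliasing on an AWT Graphics2D object. */
final class AntiAliasingSketch {

    /** Draws a black diagonal line on white and returns the number of distinct colors. */
    static int countColors(boolean antiAlias) {
        BufferedImage image = new BufferedImage(100, 100, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = image.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_ANTIALIASING,
                    antiAlias ? RenderingHints.VALUE_ANTIALIAS_ON
                              : RenderingHints.VALUE_ANTIALIAS_OFF);
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, 100, 100);
            g.setColor(Color.BLACK);
            g.drawLine(5, 5, 95, 40);
        } finally {
            g.dispose();
        }
        Set<Integer> colors = new HashSet<>();
        for (int y = 0; y < image.getHeight(); y++) {
            for (int x = 0; x < image.getWidth(); x++) {
                colors.add(image.getRGB(x, y));
            }
        }
        return colors.size();
    }

    public static void main(String[] args) {
        System.out.println("colors without AA: " + countColors(false));
        System.out.println("colors with AA:    " + countColors(true));
    }
}
```

Without the hint, every pixel is either fully black or fully white. With it, the edge pixels take on intermediate gray values, and computing that partial coverage is the extra work the timings below reflect.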

On Windows XP, direct (unbuffered) GDI+ does not support AA at all; the corresponding SmoothingMode flag is simply ignored. This may be a limitation of GDI hardware acceleration on that platform.

WPF Freezing

The figures and geometries created by the WPF Path & Stream tests are always frozen. Testing showed that leaving them unfrozen makes no discernible difference. However, freezing the pens and brushes used by the three WPF windows makes a very big difference, so this feature is controlled by one last option. All WPF pens and brushes are initially unfrozen until you click the “Freeze WPF” button, at which point they remain frozen until the application is closed.

Limitations

The application tests exactly one thing: drawing the outlines and/or interiors of many triangles in solid colors. It does not test anything else, including the following:

  • Standard GUI elements, although those ultimately rely on primitive drawing operations such as the ones that are being tested.
  • Advanced features such as patterns, shading, or animation. WPF is much more powerful in this regard than GDI/GDI+.

If you are interested in the performance of some specific drawing operation that is not covered by the application, you should modify its source code to run your own customized tests on your target system. This is ultimately the only way to find reliable answers.

Sample Test Results

This section contains sample test results from a variety of systems, as detailed below. All times are in milliseconds and represent the average of several test runs. In each table, the first three rows were measured with anti-aliasing disabled and the last three rows (“AA +”) with anti-aliasing enabled.

Windows XP & 7

The first group of test results covers Windows XP and Windows 7, both running the .NET Framework 4.0, and was obtained on 27 July 2011. The system comprised an Intel DX58SO motherboard with an Intel Core i7 920 CPU (2.67 GHz), 6 GB RAM (DDR3-1333), and an AMD Radeon HD 6970 GPU (2 GB).

The first table shows the results for Windows XP SP3 (32 bit, 96 dpi, DirectX 9.0c) running in Virtual PC on Windows 7 SP1 (64 bit).

Windows XP           | GDI+          | WPF Unfrozen         | WPF Frozen
                     | Direct Buffer |   Line   Path Stream |   Line   Path Stream
Pens Only            |    160    390 | 10,800  2,050  1,850 |    800    950    750
Brushes Only         |    300  1,120 |      -  1,600  1,380 |      -  1,300  1,100
Pens & Brushes       |    460  1,470 |      -  3,350  3,150 |      -  1,950  1,720
AA + Pens Only       |      -  4,760 | 13,100  4,800  4,500 |  3,150  3,600  3,400
AA + Brushes Only    |      -  3,930 |      -  3,000  2,750 |      -  2,650  2,450
AA + Pens & Brushes  |      -  8,750 |      -  7,400  7,250 |      -  6,000  5,800

The second table shows the results for Windows 7 SP1 (64 bit, 120 dpi) with Desktop Window Manager disabled (Windows 7 Basic scheme).

Windows 7 Basic      | GDI+          | WPF Unfrozen         | WPF Frozen
                     | Direct Buffer |   Line   Path Stream |   Line   Path Stream
Pens Only            |  3,000    360 | 10,300  1,850  1,650 |    500    780    570
Brushes Only         |  3,950    580 |      -    680    480 |      -    400    200
Pens & Brushes       |  6,900    910 |      -  2,250  2,050 |      -    880    680
AA + Pens Only       |  7,400  4,550 | 14,000  6,300  6,100 |  4,050  5,200  5,000
AA + Brushes Only    |  7,400  3,850 |      -    680    480 |      -    400    200
AA + Pens & Brushes  | 14,800  8,450 |      -  6,700  6,480 |      -  5,300  5,100

The third table shows the results for Windows 7 SP1 (64 bit, 120 dpi) with Desktop Window Manager enabled (Windows 7 Aero scheme).

Windows 7 Aero       | GDI+          | WPF Unfrozen         | WPF Frozen
                     | Direct Buffer |   Line   Path Stream |   Line   Path Stream
Pens Only            | 20,000    350 | 10,400  1,890  1,680 |    500    770    560
Brushes Only         | 17,300    580 |      -    700    480 |      -    400    180
Pens & Brushes       | 36,000    920 |      -  2,300  2,080 |      -    880    670
AA + Pens Only       | 25,600  4,500 | 13,800  6,200  6,000 |  4,050  5,150  4,950
AA + Brushes Only    | 27,800  3,800 |      -    680    480 |      -    400    190
AA + Pens & Brushes  | 55,400  8,400 |      -  6,700  6,450 |      -  5,300  5,070

Windows 10

The second group of test results covers Windows 10 and was obtained on 21 September 2015. The system was Windows 10 Pro (64 bit, 192 dpi) with the .NET Framework 4.6, running on a Dell XPS 15 notebook (model 9530) with an Intel Core i7 4712HQ CPU (2.30 GHz) and 16 GB RAM (DDR3-1600).

I used the old executable built with VS2010 for .NET 4.0, which runs fine on .NET 4.6 as expected. The first table shows the results for the Intel HD Graphics 4600 GPU that’s integrated with the Intel Core i7 CPU, and normally used to render desktop applications.

Intel HD 4600        | GDI+          | WPF Unfrozen         | WPF Frozen
                     | Direct Buffer |   Line   Path Stream |   Line   Path Stream
Pens Only            |  9,337    325 |  9,468  1,667  1,498 |    395    645    492
Brushes Only         | 12,867    566 |      -  2,356  2,196 |      -  2,089  1,935
Pens & Brushes       | 20,680    807 |      -  2,835  2,340 |      -  1,207  1,054
AA + Pens Only       | 12,963  4,022 |  9,495  1,563  1,395 |    482    562    422
AA + Brushes Only    | 14,714  2,951 |      -  3,489  3,328 |      -  3,193  3,057
AA + Pens & Brushes  | 27,964  6,963 |      -  4,790  4,625 |      -  3,504  3,348

The second table shows the results for the Nvidia GeForce GT 750M GPU present as a discrete option for full-screen applications, typically games. I manually enabled it for the test executable.

Nvidia GT 750M       | GDI+          | WPF Unfrozen         | WPF Frozen
                     | Direct Buffer |   Line   Path Stream |   Line   Path Stream
Pens Only            | 10,852    327 |  9,451  1,715  1,548 |    449    699    542
Brushes Only         | 17,547    574 |      -    973    698 |      -    678    439
Pens & Brushes       | 25,286    820 |      -  2,102  1,977 |      -    837    666
AA + Pens Only       | 13,180  4,041 |  9,471  1,588  1,424 |    446    557    412
AA + Brushes Only    | 14,956  2,944 |      -    981    704 |      -    582    462
AA + Pens & Brushes  | 28,246  6,981 |      -  2,055  1,813 |      -    757    548

Java AWT

As a point of comparison, the final table shows test results for Java’s standard AWT library. The first column was obtained on the 27 July 2011 system with Oracle Java SE 6u26. The second column was obtained on the 21 September 2015 system with Java SE 8u60.

Java AWT         | Windows 7 Aero on Desktop | Windows 10 on Laptop
                 |  AA Disabled   AA Enabled |  AA Disabled   AA Enabled
Pens Only        |          180        5,500 |          359       63,895
Brushes Only     |          520        4,800 |        1,005       95,349
Pens & Brushes   |          590       10,200 |        1,090      157,884

AWT’s performance is roughly comparable to buffered GDI+ on Windows 7 but completely collapses on Windows 10 when anti-aliasing is enabled. This strikes me as a bug that’s hopefully going to be corrected in a future Java release.

Test Conclusions

I make two assumptions in my following attempt to interpret these results:

  1. My Windows XP results, measured within Virtual PC, are representative of a native Windows XP installation, at least insofar as the relationship between GDI+ and WPF is concerned. I make this assumption because I would expect an additional slowdown from the virtual environment, if anything; but instead many tests run faster than on my native Windows 7 installation.
  2. My Windows 7/10 results are also representative of Windows Vista & 8/8.1. I make this assumption because all these operating systems use the new driver and display architecture that was introduced in Vista. In any case, Vista has seen poor adoption and 8/8.1 systems are rapidly transitioning to Windows 10, so the most popular systems are covered directly.

Once again, I encourage you to download the test application and try it on your own system(s), modifying the test code to your own requirements if necessary. Still, based on my test results as they stand, I’m inclined to draw the following conclusions:

Direct GDI+ is obsolete after Windows XP — The architectural changes between XP and Vista slowed direct GDI+ operations by two orders of magnitude. On Windows XP, unbuffered GDI+ is 3–67× faster than WPF; on Windows 7 Basic, between 3× faster and 37× slower; and finally on Windows 7 Aero & 10, never faster and up to 146× slower!
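
The two extreme factors can be reproduced from the sample tables with simple division. Which cells they were derived from is my inference, not something stated in the text: the Java sketch below assumes "Pens Only" on Windows XP (unfrozen WPF Line against GDI+ Direct) and "AA + Brushes Only" on Windows 7 Aero (GDI+ Direct against frozen WPF Stream).

```java
/** Hypothetical sketch: deriving speed factors from the sample tables. */
final class RatioSketch {

    /** Returns how many times longer the slower measurement took. */
    static double timesSlower(double slowerMillis, double fasterMillis) {
        return slowerMillis / fasterMillis;
    }

    public static void main(String[] args) {
        // Windows XP, Pens Only: WPF Line (unfrozen) 10,800 ms vs. GDI+ Direct 160 ms.
        System.out.println("XP, WPF vs. GDI+:   " + timesSlower(10_800, 160));
        // Windows 7 Aero, AA + Brushes Only: GDI+ Direct 27,800 ms vs. WPF Stream (frozen) 190 ms.
        System.out.println("Aero, GDI+ vs. WPF: " + timesSlower(27_800, 190));
    }
}
```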

Conclusion: Unbuffered GDI+ is a great choice for custom drawing on XP (if you don’t need anti-aliasing), but it’s completely useless on newer systems where WPF is usually faster. Buffered GDI+, on the other hand, delivers consistent and competitive performance across all systems – and also supports anti-aliasing on Windows XP. As we’ll see next, though, WPF can be significantly faster when anti-aliasing is enabled on Windows 10.

Anti-aliasing can be extremely slow — Surprisingly, the fact that WPF enables anti-aliasing by default was originally the single biggest factor in its apparent slowness compared to other APIs. Turning off AA improves performance in most tests by an order of magnitude, and using identical AA settings dramatically shrinks the performance difference between all three APIs.

There are exceptions, however. Using only brushes eliminates the AA penalty on Windows 7, and .NET 4.6 on Windows 10 never exhibits any AA slowdown at all – but only with a discrete GPU. I cannot say whether .NET 4.6 or Windows 10 or perhaps the Nvidia driver is responsible.

Conclusion: Good drawing performance mostly requires disabling anti-aliasing. Once you do that, the choice of API is nearly irrelevant. If you require good performance with AA enabled, however, you’ll need to write raw DirectX or OpenGL code that utilizes your video card’s hardware AA – or deploy only to systems whose software and hardware correctly support WPF’s AA.

Freezing WPF pens & brushes is always a good idea — The basic DrawLine method is highly sensitive to this simple optimization and runs 3–24× faster with frozen pens. One reason for this large speedup is that DrawLine is called three times per triangle, evaluating the current pen each time. The Geometry methods are less sensitive but freezing pens & brushes still yields a speedup of 10–350%, depending on the system and operation.

Conclusion: Always immediately call Freeze on any freezable WPF object that you don’t want to animate or otherwise change in the future.

More complex WPF APIs are not necessarily faster — The complex “low-level” APIs PathGeometry and StreamGeometry beat equivalent DrawLine calls only when using unfrozen pens, and StreamGeometry significantly outperforms PathGeometry only for hardware-accelerated brushes. However, we tested triangles, i.e. very small geometries. Larger collections of geometric primitives should improve the relative performance of the Geometry APIs, especially when reused in multiple drawings.

Conclusion: Don’t expect miracles from complex Geometry APIs. Unless big geometries are reused, disabling anti-aliasing and freezing all possible WPF objects should yield a much greater speedup.

WPF hardware acceleration is inconsistent — In most tests, filling triangles with brushes is about as fast as drawing their outlines, or even slower. The glaring exception is WPF on Windows 7: using the same drawing technique, brushes are always 2–26 times faster than pens. Even more intriguing, the usual anti-aliasing penalty vanishes completely for WPF brushes – but not for pens!

On the other hand, .NET 4.6 on Windows 10 with discrete GPU shows great performance for brushes and pens, even with AA enabled – but only if the pens are frozen. I believe that we observe here the fabled “DirectX acceleration” of WPF, so lamentably unnoticeable in most configurations.

Conclusion: WPF requires a lucky combination of hardware and software for its hardware acceleration to kick in. Freezing pens seems to be a requirement on Windows 10 with .NET 4.6, strangely enough. On other systems, you may even need to use WPF brushes instead of pens where possible. I’m afraid I cannot give any general guidelines here.

Epilogue

Why does WPF have a reputation for being slow? As far as drawing geometric objects is concerned, the apparent reason is that its designers chose two unusual default values: all objects are drawn with anti-aliasing, and most object data is retrieved from expensive mutable dependency properties.

There are good reasons for both choices. Enabling AA by default is necessary since WPF supports automatic display scaling, but its enormous performance impact was virtually unknown and should have been fixed sooner. WPF objects must remain mutable until all properties have been initialized, but most objects are never animated or otherwise changed afterward. Perhaps pens & brushes created by parameterized constructors should be frozen by default – or perhaps WPF would have been better off without the elegant but slow dependency property mechanism.

Fortunately, once these two big performance stumbling blocks are known, they are easy to work around. Calling Freeze on all eligible WPF objects is tedious but trivial, and the single line RenderOptions.SetEdgeMode(this, EdgeMode.Aliased); in a control’s constructor disables anti-aliasing for all its contents. Newer systems might be able to dispense with the latter fix, too.