Starting with version 3.0, the .NET Framework provides two incompatible and unrelated graphics APIs, both aimed at general GUI application development:
- Windows Forms wraps the GDI+ API introduced in Windows XP, which in turn extends the GDI (Graphics Device Interface) API that dates back to the first versions of Windows. The original interface languages for GDI and GDI+ are C and C++, respectively.
- Windows Presentation Foundation (WPF) is based on DirectX and exposed exclusively through managed .NET code.
WPF was developed for Windows Vista, whose new Desktop Window Manager (DWM) is likewise based on DirectX rather than GDI. The DWM is enabled by switching to the Aero desktop theme (the default on most editions) and disabled by switching to the Basic theme, which emulates the Windows XP display path.
Rumor has it that Vista was originally intended to use WPF for its entire GUI, but the performance of the new API was not up to the task. Certainly, developers outside Microsoft have frequently criticized WPF for its sluggish performance, especially compared to GDI/GDI+ on Windows XP.
On this page I attempt to measure the performance of simple drawing operations in both WPF and Windows Forms (i.e. GDI+) under a variety of conditions. For comparison, I implemented the same operations in Java’s AWT (Abstract Window Toolkit). Hopefully the results will prove useful to other developers. The test application and its source code are available for download, so you can run your own tests and modify the test cases as desired.
- Measuring WPF Performance, a rather tricky procedure
- Drawing Test Application with documentation and download
- Sample Test Results for various configurations on my systems
- Test Conclusions drawn from these results
Before moving on, I’d like to recommend Jeremiah Morrill’s Critical Deep Dive into the WPF Rendering System. This post is largely unrelated to the following discussion, but it’s a fascinating examination of WPF performance at the lowest level.
Measuring WPF Performance
Attempting to measure the time WPF takes to fully render a window is surprisingly difficult. This is because two different threads collaborate in this task. To explain what that means, here’s a quick overview of how WPF shows things on the screen.
- WPF operates in retained mode (as does JavaFX, not tested here). Calling a drawing method stores the indicated content in some internal format but does not alter the screen. At some unspecified later time, a background thread renders the prepared content to the screen. Rendering happens automatically and repeatedly when necessary, e.g. when an obscured window is uncovered.
- By contrast, GDI/GDI+ and hence Windows Forms operate in immediate mode (as does Java AWT). Calling a drawing method immediately renders the indicated content to the screen before the method returns. However, this content is never stored, so the application must repeat those same method calls whenever the same content should be re-rendered.
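The difference between the two modes can be sketched in a few lines of Java, the language of the AWT comparison. This is only an illustration under stated assumptions: RetainedScene is a hypothetical class (not part of AWT, WPF, or the test application), and an off-screen BufferedImage stands in for the screen so the sketch runs without a display.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.Shape;
import java.awt.geom.Line2D;
import java.awt.image.BufferedImage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of AWT, WPF, or the test application.
class RetainedScene {
    private final List<Shape> recorded = new ArrayList<>();

    // Retained mode: remember the shape, but touch no pixels yet.
    void record(Shape shape) {
        recorded.add(shape);
    }

    // The later "render pass" replays everything that was recorded.
    void render(Graphics2D g) {
        g.setColor(Color.BLACK);
        for (Shape shape : recorded) {
            g.draw(shape);
        }
    }

    // Creates an all-white off-screen image that stands in for the screen.
    static BufferedImage blankSurface(int width, int height) {
        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = image.createGraphics();
        g.setColor(Color.WHITE);
        g.fillRect(0, 0, width, height);
        g.dispose();
        return image;
    }

    // Counts the pixels that are no longer white.
    static int countDrawnPixels(BufferedImage image) {
        int count = 0;
        for (int y = 0; y < image.getHeight(); y++) {
            for (int x = 0; x < image.getWidth(); x++) {
                if ((image.getRGB(x, y) & 0xFFFFFF) != 0xFFFFFF) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        BufferedImage screen = blankSurface(50, 50);
        RetainedScene scene = new RetainedScene();
        scene.record(new Line2D.Double(0, 10, 49, 10));
        System.out.println("pixels after record: " + countDrawnPixels(screen));

        Graphics2D g = screen.createGraphics();
        scene.render(g); // only now do pixels change
        g.dispose();
        System.out.println("pixels after render: " + countDrawnPixels(screen));
    }
}
```

Because the scene keeps its shapes, render can be called again at any time (for example after the window was uncovered) without the application repeating its drawing calls. An immediate-mode API would instead draw inside record and keep nothing.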
An immediate mode API can simulate a primitive sort of retained mode by drawing to a memory buffer which is later copied to the screen. Such buffering is frequently used for better performance or smoother animations. Our test application exercises both direct and buffered GDI+.
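As a minimal sketch of such buffering, the following Java code draws the same content once directly and once through a memory buffer. The class and method names are made up for illustration, and the "screen" is again an off-screen BufferedImage rather than a real window, so this shows the technique only and says nothing about timing.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

// Hypothetical illustration class; the "screen" is an off-screen image.
class BufferedPaint {
    // The actual drawing calls, identical for both variants.
    static void drawContent(Graphics2D g) {
        g.setColor(Color.BLUE);
        g.fillPolygon(new int[] {10, 90, 50}, new int[] {90, 90, 10}, 3);
    }

    static BufferedImage whiteSurface(int width, int height) {
        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = image.createGraphics();
        g.setColor(Color.WHITE);
        g.fillRect(0, 0, width, height);
        g.dispose();
        return image;
    }

    // Direct: every drawing call goes straight to the target surface.
    static void paintDirect(BufferedImage screen) {
        Graphics2D g = screen.createGraphics();
        drawContent(g);
        g.dispose();
    }

    // Buffered: draw into memory first, then copy the finished image in one step.
    static void paintBuffered(BufferedImage screen) {
        BufferedImage buffer = new BufferedImage(
                screen.getWidth(), screen.getHeight(), BufferedImage.TYPE_INT_ARGB);
        Graphics2D bufferGraphics = buffer.createGraphics();
        drawContent(bufferGraphics);
        bufferGraphics.dispose();

        Graphics2D g = screen.createGraphics();
        g.drawImage(buffer, 0, 0, null); // single copy to the "screen"
        g.dispose();
    }

    static boolean samePixels(BufferedImage a, BufferedImage b) {
        for (int y = 0; y < a.getHeight(); y++) {
            for (int x = 0; x < a.getWidth(); x++) {
                if (a.getRGB(x, y) != b.getRGB(x, y)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        BufferedImage direct = whiteSurface(100, 100);
        BufferedImage buffered = whiteSurface(100, 100);
        paintDirect(direct);
        paintBuffered(buffered);
        System.out.println("same result: " + samePixels(direct, buffered));
    }
}
```

The visible result is the same either way. What changes is how many operations reach the target surface: many small ones in the direct case, one image copy in the buffered case.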
Since WPF uses retained mode, all new content passes through two stages before appearing on the screen: first internal preparation, then the actual rendering. WPF implements these stages as follows:
1. Preparation — This includes computing the sizes and layout of all WPF objects to be rendered, as well as recording the actual drawing operations. Any WPF methods that you call explicitly, for example within an OnRender override, are part of this stage.
All preparations are handled synchronously by the message loop running on the (usually single) GUI thread, which is accessible through the Dispatcher property. This is the same mechanism that transmits user input to Windows applications, and it’s the reason why WPF won’t update the display until your topmost event handler has returned. The GUI thread cannot process any drawing operations while it’s in your code – it must return to the message loop first.
(There’s a dangerous trick to get around this, known as DoEvents after the eponymous Windows Forms method, which tells the Dispatcher to immediately work through all pending messages. The drawing test application uses this trick to clear the message queue before the test timer starts.)
2. Rendering — When all preparations are complete, a separate background thread eventually renders the prepared content to the screen. Unfortunately, this thread is completely hidden from user code, and WPF offers no (direct) way to tell when an object has finished rendering. This is a rather big problem for responsive GUI design which, as far as I’m aware, persists in .NET 4.6.
(This mechanism also explains why WPF, an API based on DirectX, doesn’t expose a DirectX interface for user drawings. Only the background render thread interacts with DirectX, so any user-supplied DirectX code would have to somehow insert itself into this thread. It’s difficult to see how that could work without messing up existing functionality.)
Measuring Windows Forms is easy: we start a timer before showing a test window, and stop it at the end of the window’s OnPaint handler. Since Windows Forms operates in immediate mode, the entire window has been fully rendered to the screen at that point. Java’s AWT likewise operates in immediate mode, and can be measured in the same way.
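The immediate-mode measurement can be sketched as follows. This is not the code of the test application: PaintStopwatch is a hypothetical helper, and the paint method below is a stand-in that draws into an off-screen image instead of being called by a real window.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

// Hypothetical helper: start before showing the window, stop at the end of paint.
class PaintStopwatch {
    private long startNanos;
    private long elapsedNanos;
    private boolean running;
    private boolean finished;

    // Call just before the test window is shown.
    void start() {
        startNanos = System.nanoTime();
        running = true;
        finished = false;
    }

    // Call as the last statement of the paint handler.
    // Only the first paint after start() is recorded; later repaints are ignored.
    void stopAtEndOfPaint() {
        if (running) {
            elapsedNanos = System.nanoTime() - startNanos;
            running = false;
            finished = true;
        }
    }

    boolean isFinished() {
        return finished;
    }

    double elapsedMillis() {
        if (!finished) {
            throw new IllegalStateException("no completed measurement");
        }
        return elapsedNanos / 1_000_000.0;
    }

    // Stand-in for a paint handler: in immediate mode, all drawing is done when it returns.
    static void paint(Graphics2D g, PaintStopwatch stopwatch) {
        g.setColor(Color.BLACK);
        for (int i = 0; i < 1000; i++) {
            g.drawPolygon(new int[] {10, 90, 50}, new int[] {90, 90, 10}, 3);
        }
        stopwatch.stopAtEndOfPaint();
    }

    public static void main(String[] args) {
        BufferedImage surface = new BufferedImage(100, 100, BufferedImage.TYPE_INT_RGB);
        PaintStopwatch stopwatch = new PaintStopwatch();
        stopwatch.start();
        Graphics2D g = surface.createGraphics();
        paint(g, stopwatch);
        g.dispose();
        System.out.println("elapsed ms: " + stopwatch.elapsedMillis());
    }
}
```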
Measuring WPF is more difficult. Once again, we start a timer before showing a test window. But now we need to measure both stages of WPF’s retained mode to find the total time until the window has actually been rendered to screen.
Measuring Preparation
This stage is complete when the UI thread’s message loop has processed all pending messages, which (we assume and hope!) all originated from our drawing operations. WPF exposes a Window.ContentRendered event that is perfect for this purpose. Despite its name, this event fires after all window contents have been prepared for rendering for the first time. We react by setting a flag in our test application that activates measurement of the second stage.
Measuring Rendering
We cannot directly access the rendering thread, but WPF does offer one indirect point of access, namely through the CompositionTarget.Rendering event. This event usually fires at the monitor refresh rate (typically 60 times per second), whether there’s any new content to render or not. It is primarily intended for custom animations that need to generate display updates as quickly as the monitor can show them.
However, the Rendering event is tied to the render thread in a way we can exploit: the event is not raised as long as the render thread is busy! It will be raised again at some point during the next refresh interval after the render thread has gone idle. Since we set a flag immediately after the preparations stage was complete, we can now examine that flag in our Rendering handler. If the preparations flag is still set, we know that a test window has just been rendered and we can record the elapsed time.
This trick is not foolproof. Sometimes the Rendering event fires just after a test window has been prepared, but before the render thread has actually started working on it. We circumvent this problem by comparing the event time to the time when the preparations flag was set. If the difference is less than 100 msec (a value tailored to our benchmark), we assume that rendering has not yet happened and wait for the next event to arrive.
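The decision logic of this two-stage measurement can be modeled independently of WPF. The sketch below is written in Java with explicit timestamps so that it is deterministic. RenderDetector and its method names are hypothetical: in the real application the two callbacks would correspond to the ContentRendered and Rendering event handlers, and the actual code may differ in detail.

```java
import java.util.OptionalLong;

// Hypothetical model of the flag-and-threshold logic; not actual WPF code.
class RenderDetector {
    // Minimum gap between "prepared" and a Rendering event that counts as rendered.
    static final long MIN_GAP_MS = 100;

    private long testStartMs;
    private long preparedAtMs;
    private boolean prepared;

    // The timer starts just before the test window is shown.
    void onTestStarted(long nowMs) {
        testStartMs = nowMs;
        prepared = false;
    }

    // Stage 1 done (the ContentRendered role): set the flag and remember when.
    void onContentPrepared(long nowMs) {
        preparedAtMs = nowMs;
        prepared = true;
    }

    // Called for every Rendering event. Returns the total elapsed time once
    // the prepared content is assumed to have reached the screen.
    OptionalLong onRenderingEvent(long nowMs) {
        if (!prepared) {
            return OptionalLong.empty(); // nothing waiting to be measured
        }
        if (nowMs - preparedAtMs < MIN_GAP_MS) {
            return OptionalLong.empty(); // too early: render thread has not started yet
        }
        prepared = false; // measure each test window only once
        return OptionalLong.of(nowMs - testStartMs);
    }

    public static void main(String[] args) {
        RenderDetector detector = new RenderDetector();
        detector.onTestStarted(0);
        detector.onContentPrepared(400);
        System.out.println(detector.onRenderingEvent(416));  // too early, ignored
        System.out.println(detector.onRenderingEvent(2100)); // render thread idle again
    }
}
```

In this model the first event after preparation is rejected by the threshold, and the next event (which can only arrive once the render thread has gone idle) yields the total time.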
Drawing Test Application
All results shown below were obtained with a small test application. The download package DrawingTest.zip (31.8 KB, ZIP archive) comprises the precompiled application for the .NET Framework 4.0 and the complete source code for Visual Studio 2010, as well as a version for Java’s AWT library.
The test application draws 10,000 triangles to a window’s client area, sized 400×400 screen pixels (for GDI+ and AWT) or device-independent units (for WPF). Each triangle is rotated 1° clockwise compared to the previous one. Triangles are drawn either as outlines using pens (“Pens Only”), filled shapes using brushes (“Brushes Only”), or both with different colors (“Pens & Brushes”). All colors are solid, with no patterns, shading, or animation effects of any kind.
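For readers who want to reproduce the geometry, here is one way to generate such a series of rotated triangles in Java. The triangle shape, center, and radius are assumptions chosen for illustration (an equilateral triangle centered in the 400×400 area); the article does not specify them, and the test application may use different values.

```java
import java.awt.geom.Point2D;

// Hypothetical sketch of the triangle series; shape and size are assumptions.
class TriangleSeries {
    // Returns the three corners of triangle number `index`, rotated `index`
    // degrees clockwise around (cx, cy). In screen coordinates the y axis points
    // down, so an increasing angle turns clockwise on screen.
    static Point2D.Double[] triangle(int index, double cx, double cy, double radius) {
        Point2D.Double[] corners = new Point2D.Double[3];
        for (int k = 0; k < 3; k++) {
            // Corners are 120 degrees apart; corner 0 of triangle 0 points straight up.
            double degrees = -90 + 120 * k + index;
            double angle = Math.toRadians(degrees);
            corners[k] = new Point2D.Double(
                    cx + radius * Math.cos(angle),
                    cy + radius * Math.sin(angle));
        }
        return corners;
    }

    public static void main(String[] args) {
        // 10,000 triangles in a 400x400 area, each rotated 1 degree further.
        for (int i = 0; i < 10_000; i++) {
            Point2D.Double[] corners = triangle(i, 200, 200, 150);
            if (i < 3) {
                System.out.println("triangle " + i + " first corner: " + corners[0]);
            }
        }
    }
}
```

The resulting corner arrays can be passed to whichever drawing call is under test, for example a polygon outline, a polygon fill, or three separate line calls.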
The test application for Java’s AWT library is located in a separate folder and run from the command line – please see the enclosed ReadMe file for instructions. The test application for GDI+ and WPF provides a GUI with five buttons on the left, each of which starts one test window, as follows:
- GDI+ Direct (Alt+D) — Shows a WinForms window (a.k.a. Form) whose OnPaint handler calls Graphics.DrawPolygon and/or Graphics.FillPolygon to draw the triangles directly to the window. We enable alpha blending (SourceOver) and high-quality compositing to replicate WPF behavior, but testing with SourceCopy and high-speed compositing showed no measurable difference.
- GDI+ Buffer (Alt+B) — Shows a WinForms window whose OnPaint handler creates a BufferedGraphics object covering the entire client area, then calls Graphics.DrawPolygon and/or Graphics.FillPolygon to draw the triangles to that buffer, and finally renders the buffer to the window. (This is equivalent to setting the DoubleBuffered or ControlStyles.OptimizedDoubleBuffer flag on the Form.)
- WPF Line (Alt+L) — Shows a WPF window whose OnRender handler calls DrawingContext.DrawLine three times for each triangle. The DrawingContext class does not expose a method to fill arbitrary polygons, so this test supports only the “Pens Only” option.
- WPF Path (Alt+P) — Shows a WPF window whose OnRender handler creates a PathFigure for each triangle, then a PathGeometry containing the figure, and finally calls DrawingContext.DrawGeometry to draw that geometry.
- WPF Stream (Alt+S) — Shows a WPF window whose OnRender handler creates a StreamGeometry for each triangle, which is once again drawn by DrawingContext.DrawGeometry.
To minimize interference with the test timer, I recommend that you move the mouse cursor away from the application and test windows, and start all tests with keyboard shortcuts rather than mouse clicks. If you use high DPI mode, you’ll notice that the Windows Forms and AWT windows appear smaller than the WPF windows. This is correct and due to the fact that WPF automatically scales all coordinates by the current DPI setting, whereas Windows Forms and AWT do not.
Anti-Aliasing
Anti-aliasing, i.e. smoothing the edges of diagonal lines, turns out to have a huge performance impact on most tests. Anti-aliasing is disabled by default for GDI+ and AWT, and enabled by default for WPF. Use “Anti-Aliasing On/Off” to change this setting, which is implemented as follows:
- GDI+ — Set SmoothingMode on the current Graphics object.
- WPF — Set RenderOptions.EdgeMode for the current Window.
- AWT — Set RenderingHints.KEY_ANTIALIASING on the current Graphics2D object.
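For the AWT case, the switch looks roughly like the following sketch. It draws one diagonal line into an off-screen image with the hint on or off and counts the distinct pixel colors; the class name and the line coordinates are arbitrary choices for illustration, not taken from the test application.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class showing the AWT anti-aliasing hint.
class AntiAliasDemo {
    // Draws a black diagonal line on white, with anti-aliasing on or off.
    static BufferedImage drawDiagonal(boolean antiAlias) {
        BufferedImage image = new BufferedImage(64, 64, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = image.createGraphics();
        g.setColor(Color.WHITE);
        g.fillRect(0, 0, 64, 64);
        g.setRenderingHint(
                RenderingHints.KEY_ANTIALIASING,
                antiAlias ? RenderingHints.VALUE_ANTIALIAS_ON
                          : RenderingHints.VALUE_ANTIALIAS_OFF);
        g.setColor(Color.BLACK);
        g.drawLine(3, 5, 60, 40);
        g.dispose();
        return image;
    }

    // Counts how many different colors occur in the image.
    static int distinctColors(BufferedImage image) {
        Set<Integer> colors = new HashSet<>();
        for (int y = 0; y < image.getHeight(); y++) {
            for (int x = 0; x < image.getWidth(); x++) {
                colors.add(image.getRGB(x, y));
            }
        }
        return colors.size();
    }

    public static void main(String[] args) {
        System.out.println("colors without AA: " + distinctColors(drawDiagonal(false)));
        System.out.println("colors with AA:    " + distinctColors(drawDiagonal(true)));
    }
}
```

Without anti-aliasing the image contains only the two colors that were drawn. With anti-aliasing, intermediate shades appear along the edge of the line, and computing that blending is the extra work that shows up in the timings.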
On Windows XP, direct (unbuffered) GDI+ does not support AA at all; the corresponding SmoothingMode flag is simply ignored. This may be a limitation of GDI hardware acceleration on that platform.
WPF Freezing
The figures and geometries created by the WPF Path & Stream tests are always frozen. Testing showed that leaving them unfrozen makes no discernible difference. However, freezing the pens and brushes used by the three WPF windows makes a very big difference, so this feature is controlled by one last option. All WPF pens and brushes are initially unfrozen until you click the “Freeze WPF” button, at which point they remain frozen until the application is closed.
Limitations
The application tests exactly one thing: drawing the outlines and/or interiors of many triangles in solid colors. It does not test anything else, including the following:
- Standard GUI elements, although those ultimately rely on primitive drawing operations such as the ones that are being tested.
- Advanced features such as patterns, shading, or animation. WPF is much more powerful in this regard than GDI/GDI+.
If you are interested in the performance of some specific drawing operation that is not covered by the application, you should modify its source code to run your own customized tests on your target system. This is ultimately the only way to find reliable answers.
Sample Test Results
This section contains sample test results from a variety of systems, as detailed below. All times are in milliseconds and represent the average of several test runs. In each table, the first three rows were measured with anti-aliasing disabled and the last three rows (“AA +”) with anti-aliasing enabled.
Windows XP & 7
The first group of test results covers Windows XP and Windows 7, both running the .NET Framework 4.0, and was obtained on 27 July 2011. The system comprised an Intel DX58SO motherboard with an Intel Core i7 920 CPU (2.67 GHz), 6 GB RAM (DDR3-1333), and an AMD Radeon HD 6970 GPU (2 GB).
The first table shows the results for Windows XP SP3 (32 bit, 96 dpi, DirectX 9.0c) running in Virtual PC on Windows 7 SP1 (64 bit).
Windows XP | GDI+ Direct | GDI+ Buffer | WPF Unfrozen Line | WPF Unfrozen Path | WPF Unfrozen Stream | WPF Frozen Line | WPF Frozen Path | WPF Frozen Stream |
---|---|---|---|---|---|---|---|---|
Pens Only | 160 | 390 | 10,800 | 2,050 | 1,850 | 800 | 950 | 750 |
Brushes Only | 300 | 1,120 | — | 1,600 | 1,380 | — | 1,300 | 1,100 |
Pens & Brushes | 460 | 1,470 | — | 3,350 | 3,150 | — | 1,950 | 1,720 |
AA + Pens Only | — | 4,760 | 13,100 | 4,800 | 4,500 | 3,150 | 3,600 | 3,400 |
AA + Brushes Only | — | 3,930 | — | 3,000 | 2,750 | — | 2,650 | 2,450 |
AA + Pens & Brushes | — | 8,750 | — | 7,400 | 7,250 | — | 6,000 | 5,800 |
The second table shows the results for Windows 7 SP1 (64 bit, 120 dpi) with Desktop Window Manager disabled (Windows 7 Basic scheme).
Windows 7 Basic | GDI+ Direct | GDI+ Buffer | WPF Unfrozen Line | WPF Unfrozen Path | WPF Unfrozen Stream | WPF Frozen Line | WPF Frozen Path | WPF Frozen Stream |
---|---|---|---|---|---|---|---|---|
Pens Only | 3,000 | 360 | 10,300 | 1,850 | 1,650 | 500 | 780 | 570 |
Brushes Only | 3,950 | 580 | — | 680 | 480 | — | 400 | 200 |
Pens & Brushes | 6,900 | 910 | — | 2,250 | 2,050 | — | 880 | 680 |
AA + Pens Only | 7,400 | 4,550 | 14,000 | 6,300 | 6,100 | 4,050 | 5,200 | 5,000 |
AA + Brushes Only | 7,400 | 3,850 | — | 680 | 480 | — | 400 | 200 |
AA + Pens & Brushes | 14,800 | 8,450 | — | 6,700 | 6,480 | — | 5,300 | 5,100 |
The third table shows the results for Windows 7 SP1 (64 bit, 120 dpi) with Desktop Window Manager enabled (Windows 7 Aero scheme).
Windows 7 Aero | GDI+ Direct | GDI+ Buffer | WPF Unfrozen Line | WPF Unfrozen Path | WPF Unfrozen Stream | WPF Frozen Line | WPF Frozen Path | WPF Frozen Stream |
---|---|---|---|---|---|---|---|---|
Pens Only | 20,000 | 350 | 10,400 | 1,890 | 1,680 | 500 | 770 | 560 |
Brushes Only | 17,300 | 580 | — | 700 | 480 | — | 400 | 180 |
Pens & Brushes | 36,000 | 920 | — | 2,300 | 2,080 | — | 880 | 670 |
AA + Pens Only | 25,600 | 4,500 | 13,800 | 6,200 | 6,000 | 4,050 | 5,150 | 4,950 |
AA + Brushes Only | 27,800 | 3,800 | — | 680 | 480 | — | 400 | 190 |
AA + Pens & Brushes | 55,400 | 8,400 | — | 6,700 | 6,450 | — | 5,300 | 5,070 |
Windows 10
The second group of test results covers Windows 10 and was obtained on 21 September 2015. The system was Windows 10 Pro (64 bit, 192 dpi) with the .NET Framework 4.6, running on a Dell XPS 15 notebook (model 9530) with an Intel Core i7 4712HQ CPU (2.30 GHz) and 16 GB RAM (DDR3-1600).
I used the old executable built with VS2010 for .NET 4.0, which, as expected, runs fine on .NET 4.6. The first table shows the results for the Intel HD Graphics 4600 GPU that’s integrated with the Intel Core i7 CPU, and normally used to render desktop applications.
Intel HD 4600 | GDI+ Direct | GDI+ Buffer | WPF Unfrozen Line | WPF Unfrozen Path | WPF Unfrozen Stream | WPF Frozen Line | WPF Frozen Path | WPF Frozen Stream |
---|---|---|---|---|---|---|---|---|
Pens Only | 9,337 | 325 | 9,468 | 1,667 | 1,498 | 395 | 645 | 492 |
Brushes Only | 12,867 | 566 | — | 2,356 | 2,196 | — | 2,089 | 1,935 |
Pens & Brushes | 20,680 | 807 | — | 2,835 | 2,340 | — | 1,207 | 1,054 |
AA + Pens Only | 12,963 | 4,022 | 9,495 | 1,563 | 1,395 | 482 | 562 | 422 |
AA + Brushes Only | 14,714 | 2,951 | — | 3,489 | 3,328 | — | 3,193 | 3,057 |
AA + Pens & Brushes | 27,964 | 6,963 | — | 4,790 | 4,625 | — | 3,504 | 3,348 |
The second table shows the results for the Nvidia GeForce GT 750M GPU present as a discrete option for full-screen applications, typically games. I manually enabled it for the test executable.
Nvidia GT 750M | GDI+ Direct | GDI+ Buffer | WPF Unfrozen Line | WPF Unfrozen Path | WPF Unfrozen Stream | WPF Frozen Line | WPF Frozen Path | WPF Frozen Stream |
---|---|---|---|---|---|---|---|---|
Pens Only | 10,852 | 327 | 9,451 | 1,715 | 1,548 | 449 | 699 | 542 |
Brushes Only | 17,547 | 574 | — | 973 | 698 | — | 678 | 439 |
Pens & Brushes | 25,286 | 820 | — | 2,102 | 1,977 | — | 837 | 666 |
AA + Pens Only | 13,180 | 4,041 | 9,471 | 1,588 | 1,424 | 446 | 557 | 412 |
AA + Brushes Only | 14,956 | 2,944 | — | 981 | 704 | — | 582 | 462 |
AA + Pens & Brushes | 28,246 | 6,981 | — | 2,055 | 1,813 | — | 757 | 548 |
Java AWT
As a point of comparison, the final table shows test results for Java’s standard AWT library. The first two data columns (Windows 7 Aero on Desktop) were obtained on the 27 July 2011 system with Oracle Java SE 6u26. The last two data columns (Windows 10 on Laptop) were obtained on the 21 September 2015 system with Java SE 8u60.
Java AWT | Windows 7 Aero on Desktop, AA Disabled | Windows 7 Aero on Desktop, AA Enabled | Windows 10 on Laptop, AA Disabled | Windows 10 on Laptop, AA Enabled |
---|---|---|---|---|
Pens Only | 180 | 5,500 | 359 | 63,895 |
Brushes Only | 520 | 4,800 | 1,005 | 95,349 |
Pens & Brushes | 590 | 10,200 | 1,090 | 157,884 |
AWT’s performance is roughly comparable to buffered GDI+ on Windows 7 but completely collapses on Windows 10 when anti-aliasing is enabled. This strikes me as a bug that’s hopefully going to be corrected in a future Java release.
Test Conclusions
I make two assumptions in my following attempt to interpret these results:
- My Windows XP results, measured within Virtual PC, are representative of a native Windows XP installation, at least insofar as the relationship between GDI+ and WPF is concerned. I make this assumption because I would expect an additional slowdown from the virtual environment, if anything; but instead many tests run faster than on my native Windows 7 installation.
- My Windows 7/10 results are also representative of Windows Vista & 8/8.1. I make this assumption because all these operating systems use the new driver and display architecture that was introduced in Vista. In any case, Vista has seen poor adoption and 8/8.1 systems are rapidly transitioning to Windows 10, so the most popular systems are covered directly.
Once again, I encourage you to download the test application and try it on your own system(s), modifying the test code to your own requirements if necessary. Still, based on my test results as they stand, I’m inclined to draw the following conclusions:
Direct GDI+ is obsolete after Windows XP — The architectural changes between XP and Vista slowed direct GDI+ operations by two orders of magnitude. On Windows XP, unbuffered GDI+ is 3–67× faster than WPF; on Windows 7 Basic, between 3× faster and 37× slower; and finally on Windows 7 Aero & 10, never faster and up to 146× slower!
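The extreme ratios quoted here can be recomputed from the sample tables. The following Java sketch does the arithmetic for a few cells; the numbers are copied from the tables in this article, while the class and method names are merely illustrative.

```java
import java.util.Locale;

// Illustrative arithmetic on values copied from the sample result tables.
class SpeedRatios {
    // How many times longer the slower measurement took than the faster one.
    static double ratio(double slowerMs, double fasterMs) {
        return slowerMs / fasterMs;
    }

    static void print(String label, double value) {
        System.out.println(String.format(Locale.ROOT, "%-52s %.1fx", label, value));
    }

    public static void main(String[] args) {
        // Windows XP, Pens Only: WPF Unfrozen Line (10,800) vs GDI+ Direct (160)
        print("XP: GDI+ Direct faster than WPF by up to", ratio(10_800, 160));
        // Windows XP, Brushes Only: WPF Frozen Stream (1,100) vs GDI+ Direct (300)
        print("XP: GDI+ Direct faster than WPF by at least", ratio(1_100, 300));
        // Windows 7 Basic, Pens Only: WPF Unfrozen Line (10,300) vs GDI+ Direct (3,000)
        print("Win 7 Basic: GDI+ Direct faster by up to", ratio(10_300, 3_000));
        // Windows 7 Basic, AA + Brushes Only: GDI+ Direct (7,400) vs WPF Frozen Stream (200)
        print("Win 7 Basic: GDI+ Direct slower by up to", ratio(7_400, 200));
        // Windows 7 Aero, AA + Brushes Only: GDI+ Direct (27,800) vs WPF Frozen Stream (190)
        print("Win 7 Aero: GDI+ Direct slower by up to", ratio(27_800, 190));
    }
}
```

The results (about 67.5, 3.7, 3.4, 37, and 146.3) match the rounded figures in the paragraph above.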
Conclusion: Unbuffered GDI+ is a great choice for custom drawing on XP (if you don’t need anti-aliasing), but it’s completely useless on newer systems where WPF is usually faster. Buffered GDI+, on the other hand, delivers consistent and competitive performance across all systems – and also supports anti-aliasing on Windows XP. As we’ll see next, though, WPF can be significantly faster when anti-aliasing is enabled on Windows 10.
Anti-aliasing can be extremely slow — Surprisingly, the fact that WPF enables anti-aliasing by default was originally the single biggest factor in its apparent slowness compared to other APIs. Turning off AA improves performance in most tests by an order of magnitude, and using identical AA settings dramatically shrinks the performance difference between all three APIs.
There are exceptions, however. Using only brushes eliminates the AA penalty on Windows 7, and .NET 4.6 on Windows 10 never exhibits any AA slowdown at all – but only with a discrete GPU. I cannot say whether .NET 4.6 or Windows 10 or perhaps the Nvidia driver is responsible.
Conclusion: Good drawing performance mostly requires disabling anti-aliasing. Once you do that, the choice of API is nearly irrelevant. If you require good performance with AA enabled, however, you’ll need to write raw DirectX or OpenGL code that utilizes your video card’s hardware AA – or deploy only to systems whose software and hardware correctly support WPF’s AA.
Freezing WPF pens & brushes is always a good idea — The basic DrawLine method is highly sensitive to this simple optimization and runs 3–24× faster with frozen pens. One reason for this large speedup is that DrawLine is called three times per triangle, evaluating the current pen each time. The Geometry methods are less sensitive but freezing pens & brushes still yields a speedup of 10–350%, depending on the system and operation.
Conclusion: Always immediately call Freeze on any freezable WPF object that you don’t want to animate or otherwise change in the future.
More complex WPF APIs are not necessarily faster — The complex “low-level” APIs PathGeometry and StreamGeometry beat equivalent DrawLine calls only when using unfrozen pens, and StreamGeometry significantly outperforms PathGeometry only for hardware-accelerated brushes. However, we tested triangles, i.e. very small geometries. Larger collections of geometric primitives should improve the relative performance of the Geometry APIs, especially when reused in multiple drawings.
Conclusion: Don’t expect miracles from complex Geometry APIs. Unless big geometries are reused, disabling anti-aliasing and freezing all possible WPF objects should yield a much greater speedup.
WPF hardware acceleration is inconsistent — In most tests, filling triangles with brushes is about as fast as drawing their outlines, or even slower. The glaring exception is WPF on Windows 7: using the same drawing technique, brushes are always 2–26 times faster than pens. Even more intriguing, the usual anti-aliasing penalty vanishes completely for WPF brushes – but not for pens!
On the other hand, .NET 4.6 on Windows 10 with discrete GPU shows great performance for brushes and pens, even with AA enabled – but only if the pens are frozen. I believe that we observe here the fabled “DirectX acceleration” of WPF, so lamentably unnoticeable in most configurations.
Conclusion: WPF requires a lucky combination of hardware and software for its hardware acceleration to kick in. Freezing pens seems to be a requirement on Windows 10 with .NET 4.6, strangely enough. On other systems, you may even need to use WPF brushes instead of pens where possible. I’m afraid I cannot give any general guidelines here.
Epilogue
Why does WPF have a reputation for being slow? As far as drawing geometric objects is concerned, the apparent reason is that its designers chose two unusual default values: all objects are drawn with anti-aliasing, and most object data is retrieved from expensive mutable dependency properties.
There are good reasons for both choices. Enabling AA by default is necessary since WPF supports automatic display scaling, but its enormous performance impact was virtually unknown and should have been fixed sooner. WPF objects must remain mutable until all properties have been initialized, but most objects are never animated or otherwise changed afterward. Perhaps pens & brushes created by parameterized constructors should be frozen by default – or perhaps WPF would have been better off without the elegant but slow dependency property mechanism.
Fortunately, once these two big performance stumbling blocks are known, they are easy to work around. Calling Freeze on all eligible WPF objects is tedious but trivial, and the single line RenderOptions.SetEdgeMode(this, EdgeMode.Aliased); in a control’s constructor disables anti-aliasing for all its contents. Newer systems might be able to dispense with the latter fix, too.