Dolphin Progress Report: May 2016


This will likely be the last Progress Report before Dolphin 5.0 is released for one simple reason: we're running out of fixes and notable changes! From here on out, unless some huge bugs are discovered, all we're left with is a few minor regressions and prepwork for the release.

For those wondering why so much care is being taken with Dolphin 5.0's release, look back at Dolphin 4.0's release. It shipped with several huge day-one bugs that required two hotfix patches to become what people now see as Dolphin 4.0.2. In order to prevent another debacle, all changes are being carefully checked over and tested before and after being merged. We want the highest possible compatibility for our releases, as they are a benchmark for the next wave of development builds until the next release!

For your viewing pleasure, we have a video of an AI being developed for Dolphin fighting its distant ancestor: the Level 9 AI. It turns out an AI built upon all we've learned about Melee in the past 15 years doesn't exactly feel fair to throw at the original AI.


Smashbot vs Level 9 AI

With that, let's get to what is hopefully the last batch of Notable Changes before the 5.0 release!


Notable Changes


4.0-9273 - Implement Dithering for Video Software by phire

For the most part, no one using Dolphin purely for the enjoyment of their favorite games will care much about this change. Nonetheless, some games do rely on dithering to cover up various artifacts and patterns in textures. In Spyro: A Hero's Tail, it hides banding in various textures. For now this is only implemented in the Software Renderer, but in the future, dithering support could be optionally added to the hardware renderers as well. Why optionally? Dithering will make textures blurrier than before, and even if it is accurate to the console, we're certain that people won't appreciate that when playing their games at higher resolutions! It'd also be optional because it would make the GPU thread even harder to emulate.
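As a rough illustration of why dithering hides banding, here is a minimal sketch of ordered dithering. The 4x4 Bayer matrix and the 6-bit target depth are assumptions chosen for the example, not a description of what the console or the Software Renderer actually does.

```python
# 4x4 Bayer threshold matrix (values 0..15). Assumed for illustration only.
BAYER_4X4 = [
    [0, 8, 2, 10],
    [12, 4, 14, 6],
    [3, 11, 1, 9],
    [15, 7, 13, 5],
]


def quantize(value, x, y, bits=6, dither=True):
    """Reduce an 8-bit channel value to `bits` bits at pixel (x, y)."""
    shift = 8 - bits
    if dither:
        # Add a position-dependent offset smaller than one quantization step.
        threshold = (BAYER_4X4[y % 4][x % 4] << shift) >> 4
        value = min(255, value + threshold)
    return value >> shift


def block_levels(value, dither):
    """Quantize a flat 4x4 block of a single input value."""
    return [quantize(value, x, y, dither=dither) for y in range(4) for x in range(4)]


plain = block_levels(130, dither=False)    # every pixel snaps to the same level
dithered = block_levels(130, dither=True)  # pixels alternate between two levels
```

Without dithering, an in-between shade collapses to one level, and a smooth gradient turns into visible bands. With dithering, neighbouring pixels alternate between the two nearest levels so the block averages out to the original shade, at the cost of the fine noise pattern that makes textures look softer.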


The Last Airbender

Sorry for reminding Avatar fans this exists, but the banding on the menu is very obvious here!

The Last Airbender

With dithering implemented, the banding disappears, including in the fog.



4.0-9308 - Implement RealXFB, Bounding Box, and Perf Queries in D3D12 by stenzek

When the D3D12 backend was originally merged into Dolphin, it was lacking quite a few features. It went in with the hope that those gaps would be filled before Dolphin 5.0, and thankfully stenzek pulled through with this huge batch of features for the young backend!


RealXFB

The GameCube and Wii cannot output directly from the Embedded Frame Buffer (EFB), so the console copies it to main memory (called the External Frame Buffer, or XFB, in Dolphin) and then outputs that to the Video Interface. One of the biggest hacks in Dolphin is that we can ignore all of that and simply display the Embedded Frame Buffer on screen with no real penalty in most games. Some games don't use the GPU at all for rendering certain things, usually FMVs, meaning that RealXFB is required in some cases. It's also required if the CPU needs to see the data copied to the XFB for various manipulations. So, D3D12 lacking RealXFB left many titles unable to display on that backend.


Performance Queries

Performance Queries count the number of rendered pixels. For years, this feature was only used by one game, for one single level. The Shine "Scrubbing Sirena Beach" in Super Mario Sunshine is the only time Dolphin ever needs perf queries, and it's only used to tell how much of the paint covering the map has been cleared by the player. Previously D3D12 would malfunction on this stage of the game because it did not support perf queries.
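To make the idea concrete, here is a minimal sketch of a pixel counter in the spirit of a performance query. The `draw_rect` helper and the tiny depth buffer are hypothetical; the real hardware counters and Dolphin's backends are far more involved.

```python
def draw_rect(depth_buffer, x0, y0, x1, y1, depth):
    """Draw a rectangle with a less-than depth test.

    Returns how many pixels passed the test, which is the kind of
    number a performance query hands back to the game.
    """
    passed = 0
    for y in range(y0, y1):
        for x in range(x0, x1):
            if depth < depth_buffer[y][x]:
                depth_buffer[y][x] = depth
                passed += 1
    return passed


# 8x8 depth buffer cleared to the far plane.
depth_buffer = [[1.0] * 8 for _ in range(8)]
first = draw_rect(depth_buffer, 0, 0, 4, 4, depth=0.5)   # nothing in the way
second = draw_rect(depth_buffer, 2, 2, 6, 6, depth=0.7)  # partly hidden by the first
```

A game can draw something like the remaining goop, ask how many pixels made it to the screen, and decide from that number whether the player has cleaned enough of it.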


Without performance queries, Super Mario Sunshine can't tell how much goop has been cleared, and this level becomes very, very easy.

TimeSplitters: Future Perfect also enables Performance Queries, but there's little actual documentation on what it uses the feature to do. It's assumed that it uses it to detect whether lights (such as sunlight) are rendered, in order to draw things such as coronas and lens flares. If someone who has the game can try disabling Performance Queries in the INI and see what breaks, that'd be much appreciated so it can be documented.


Bounding Box

Bounding Box is a fringe feature only used by roughly a dozen games. This feature evaluates the minimum and maximum of all rendered pixel positions. Expert manipulation of this feature allows for the unique graphical effects present throughout Paper Mario: The Thousand-Year Door and Super Paper Mario. These two titles are the main testing bed for Bounding Box support simply because they do so much with it! Rest assured that the D3D12 backend can now do any of the crazy bounding box effects required to play through these games without visual issues.
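In other words, the feature boils down to tracking a running minimum and maximum over every pixel that gets drawn. A minimal sketch, assuming the pixels arrive as a simple list of `(x, y)` positions (the `bounding_box` function is hypothetical and only illustrates the arithmetic):

```python
def bounding_box(pixels):
    """Return (left, top, right, bottom) covering every drawn (x, y), or None."""
    box = None
    for x, y in pixels:
        if box is None:
            box = [x, y, x, y]
        else:
            box[0] = min(box[0], x)  # left
            box[1] = min(box[1], y)  # top
            box[2] = max(box[2], x)  # right
            box[3] = max(box[3], y)  # bottom
    return tuple(box) if box else None


drawn = [(5, 7), (2, 9), (8, 3)]
box = bounding_box(drawn)
```

On a GPU the same four values have to be updated by many pixels at once, which is why the hardware backends lean on atomic min and max operations to do it.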


Paper Mario: The Thousand-Year Door

How to counter an invisibility spell - ink!

Paper Mario: The Thousand-Year Door

With Bounding Box support, D3D12 can finally render these effects properly.



But there's a little bit more to the D3D12 Bounding Box implementation. Since implementing GPU bounding box, our NVIDIA GPU users have had tons of problems handling Bounding Box titles at playable speed. Even a massive GTX 970 could be brought down to single digit framerates during various bounding box effects! Thankfully, NVIDIA users on Windows 10 won't have to worry about those performance issues any longer on D3D12. It'd sure be nice if we could make OpenGL just as fast for our other NVIDIA users...


4.0-9366 - Improve NVIDIA Performance of Bounding Box on OpenGL by stenzek

While working on D3D12 Bounding Box, stenzek realized that if NVIDIA cards were able to run D3D12 bounding box so quickly, there should be a way to do the same thing on OpenGL. After some investigation he realized that Dolphin wasn't the only one running into performance issues when reading atomics on NVIDIA cards. Atomics are the basis of Bounding Box support, so not using them wasn't an option unless we wanted to go back to a software solution. That left finding a way to read them back faster.

For reading back the shader storage used by atomics, it was discovered that using glGetBufferSubData was substantially faster than glMapBufferRange on most graphics cards. AMD cards, which have always been extremely fast with atomics, actually suffered a performance hit with this switch. So, as of 4.0-9366, if you're running anything but an AMD card, Dolphin uses glGetBufferSubData for atomics, greatly speeding up bounding box on OpenGL. While we cannot know for sure what the driver is doing, it is assumed the reason for these performance issues is that when the buffer is mapped, the NVIDIA GL driver moves it from GPU memory to system memory, which is much slower for the GPU to communicate with. The solution is to simply not map the buffer.
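The vendor-dependent choice described above can be sketched as a small selection function. This only models the decision, not the OpenGL calls themselves; the function name and the vendor strings it matches are assumptions for illustration and are not taken from Dolphin's source.

```python
def pick_readback_method(gl_vendor: str) -> str:
    """Pick which OpenGL call to use for reading back the bounding box buffer.

    `gl_vendor` stands in for the string a driver reports as GL_VENDOR.
    The prefixes checked here are assumed examples of AMD vendor strings.
    """
    vendor = gl_vendor.strip().lower()
    amd_prefixes = ("amd", "ati ", "advanced micro devices")
    if vendor.startswith(amd_prefixes):
        # Mapping the buffer stays the faster path on AMD.
        return "glMapBufferRange"
    # Everyone else avoids mapping the buffer at all.
    return "glGetBufferSubData"


nvidia_choice = pick_readback_method("NVIDIA Corporation")
amd_choice = pick_readback_method("ATI Technologies Inc.")
```

Matching on a prefix rather than a substring is deliberate here: a naive search for "ati" would also match the word "Corporation".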


Bounding Box

D3D12 has a slight advantage in the most strenuous scene in The Thousand-Year Door, but OpenGL is actually faster overall on most bounding box effects on NVIDIA cards.


4.0-9338 - Fix ES_Launch Timing Regressions by magumagu and 4.0-9347 - Audio DMA Timing Improvements by phire

Both of these issues were caused by old timings in Dolphin relying on the broken Core Event timings that were off by up to 40,000 cycles. magumagu, the architect of the modern ES_Launch implementation, returned to fix the ES_Launch timings after it was discovered that the Core Event timing fixes broke ES_Launch. But this bug actually has a very interesting side note. ES_Launch was never working properly in the interpreter at all, since the interpreter wasn't fully afflicted by the timing bug. This means that various things, such as the System Menu, were never working there and no one really noticed! With the new ES_Launch timings, behavior should be returned to normal.

phire's Audio DMA Timing Improvements are similar. A bunch of Namco games rely on there being a delay on DMA timings or else they won't boot. The biggest delay seen was 78 cycles in The Sky Crawlers: The Innocent Aces. Anyone with games that have failed to boot in the past or have been noted to be problematic with DMA changes in the past should check and make sure their favorite games still boot.


4.0-9368 - D3D11: Fix CPU EFB Color Reads when MSAA is Enabled by stenzek

When turning on MSAA, various effects would stop working only in D3D11. While it is easy enough to say "enhancements can break games" and ignore it, there are cases where those enhancements can be tuned and fixed for particular situations. The key to figuring that out for this issue was that OpenGL and D3D12 both worked fine in identical configurations.

In order to figure out this behavior, one eagle-eyed user had to find a very strange behavior in Super Mario Sunshine. They were running into an issue where the turning radius when running sideways was a lot tighter in Dolphin than on console! The game actually reads the screen to see if Mario is visible, and if he's obscured, it makes the turning radius tighter, possibly to allow for easier movement behind trees and other small objects. To check the screen, it uses "EFB Access to CPU", much like how Wind Waker uses it to see the contents of pictographs. Disabling "Skip EFB Access to CPU" fixed this issue on every backend except D3D11, where MSAA would still break the controls.

It turns out this MSAA path for "EFB Access to CPU" was completely unimplemented on D3D11, leading to the different behavior. Depth reads on D3D11 will now return the minimum depth of all samples, instead of the average of all samples. This matches the behavior of D3D12 and OpenGL.
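The difference between the two resolve strategies is easy to see on a single edge pixel. This is a minimal sketch with made-up sample values; smaller numbers are assumed to mean "closer to the camera".

```python
def resolve_depth_average(samples):
    """Collapse MSAA depth samples by averaging them."""
    return sum(samples) / len(samples)


def resolve_depth_minimum(samples):
    """Collapse MSAA depth samples by taking the nearest one."""
    return min(samples)


# A 4x MSAA pixel on the edge of a character: two samples hit the
# character (0.25), two hit the far background (1.0).
edge_pixel = [0.25, 0.25, 1.0, 1.0]

averaged = resolve_depth_average(edge_pixel)
nearest = resolve_depth_minimum(edge_pixel)
```

The averaged value describes a depth where nothing was actually drawn, so a game comparing it against an object's depth can reach the wrong conclusion. Taking the minimum always returns the depth of a real surface.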


4.0-9390 - Add Synchronization to State Changes by Emptychaos

Have you ever been playing a game in Dolphin, gone to click something in the graphical user interface (GUI), and been told that Dolphin has encountered a problem? Even though the game itself is running fine, the GUI has locked up and Dolphin will crash.

This is one of the most common crashes for Dolphin to run into, and it's completely separate from emulation. This change prevents it by making sure the race conditions that set it up cannot happen. And even if the UI manages to lock itself up, there is now a ten-second time-out before it will release, meaning users won't have to lose data to UI crashes any longer. JMC47 alone has run into this particular crash hundreds of times while testing over the past few years. It was most common when trying to hit tab when logging is turned on, but it could happen whenever it felt like it as well. This crash has been a part of Dolphin since at least the 3.0 days, but it wasn't until last year that the bug spread further throughout the GUI, breaking features like frame advance in the process. That turned out to be a blessing in disguise, as it made the race condition readily visible.

The added guards around state changes should prevent the GUI from locking up and fix frame advance and FifoPlayer issues. This change also adds a ten-second time-out on GUI lockups; if the GUI does manage to lock up, don't panic! It'll release after ten seconds and you won't lose your game or data.
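The general pattern is a lock around state changes that gives up after a deadline instead of waiting forever. Here is a minimal sketch of that idea using Python's standard `threading` module; the `StateGuard` class is hypothetical and is not how Dolphin's C++ code is written.

```python
import threading


class StateGuard:
    """Serialize state changes, but never wait longer than the time-out."""

    def __init__(self, timeout_seconds=10.0):
        self._lock = threading.Lock()
        self._timeout = timeout_seconds

    def run(self, action):
        """Run `action` under the lock. Return False if the wait timed out."""
        if not self._lock.acquire(timeout=self._timeout):
            # Someone else is stuck holding the lock: give up instead of hanging.
            return False
        try:
            action()
            return True
        finally:
            self._lock.release()


# A short time-out keeps the demonstration quick.
guard = StateGuard(timeout_seconds=0.05)
calls = []
ok = guard.run(lambda: calls.append("pause"))

guard._lock.acquire()  # simulate another thread stuck in a state change
timed_out = guard.run(lambda: calls.append("resume"))
guard._lock.release()
```

The second call returns after the time-out rather than blocking indefinitely, which is the difference between a brief stall and a frozen window.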


Opening the Graphics Menu as a Speedup

Misconception: "Dolphin has some kind of caching problem where it's slow when it starts up. Opening the graphics menu fixes it, though."

This is probably one of the most common misconceptions in Dolphin, and it's due to confusing design choices.

When you start up Dolphin, most games have an initialization file associated with them that loads various settings. These per-game settings are supposed to turn on various slower features that games need in order to run correctly. For example, Pokémon XD has the following settings:

"EFBScale = -1"
"SafeTextureCacheColorSamples = 0"

Let's say a user running Dolphin at 1.5x Internal Resolution (IR) with the Texture Cache set to Fast opens this game. Dolphin loads the game and checks the INI, seeing these settings. EFBScale set to -1 is a special case that tells Dolphin that non-integer IRs will break the game. So, instead of using 1.5x IR, 2.5x IR, or Auto (Window Size), it will revert to an integral IR (1x, 2x, 3x, Auto (Multiple of 640x528)) to prevent bugs. SafeTextureCacheColorSamples = 0, aka Texture Cache Accuracy set to Safe, simply means we hash the entirety of every texture. Medium and Fast only hash part of each texture, which is faster, but if two different textures end up with the same hash, visual anomalies can happen. In Pokémon XD, we have to hash everything due to collisions in font textures.
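To see how hashing only part of a texture leads to collisions, consider this minimal sketch. The use of CRC32 and an evenly spaced sampling stride are assumptions made for the example; Dolphin's actual hash function and sampling pattern are not described here.

```python
import zlib


def texture_hash(data: bytes, max_samples: int = 0) -> int:
    """Hash a texture's bytes.

    max_samples == 0 hashes every byte (the "Safe" behaviour).
    A non-zero value hashes only that many evenly spaced bytes.
    """
    if max_samples and len(data) > max_samples:
        step = len(data) // max_samples
        data = data[::step][:max_samples]
    return zlib.crc32(data)


# Two 1 KiB "textures" that differ in a single byte, like two glyphs
# of a font that share most of their pixels.
glyph_a = bytes(1024)
glyph_b = bytearray(1024)
glyph_b[5] = 255  # falls between the sampled positions (0, 8, 16, ...)
glyph_b = bytes(glyph_b)

fast_collides = texture_hash(glyph_a, 128) == texture_hash(glyph_b, 128)
safe_differs = texture_hash(glyph_a) != texture_hash(glyph_b)
```

With sampling, the two textures look identical to the cache, so the wrong one gets reused. Font glyphs are a natural victim because they are small and mostly the same apart from a few pixels.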

A typical user wouldn't know all of this, and would see Dolphin being slow and get annoyed. They then open up the Graphics Menu to tweak their settings and boom, the lag is gone. This isn't because of a memory leak or anything else: opening the Graphics Menu causes Dolphin to load the user's prior settings over the INI's settings! So, now the user is happy, and everything looks fine in the intro. Then the bugs start popping up.


Pokemon XD

If your kids are ever feeling cocky about their reading skills, throw them on Dolphin with texture cache set to fast.


Sometimes there can be a balance between accuracy and performance depending on a particular user's computer, but Dolphin itself will, within reason, try to craft settings that give the user as accurate an experience as possible to the original console. All INIs are editable, so if said text issue wasn't a big deal for that particular user, they could open the game properties window, hit "Edit Config" and add their own settings to the blank file. These settings will then override the default settings, but can still be overridden by opening the graphics menu. The entire configuration system is a mess, and cleaning it up is one of the top priorities after the next release.
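The layering described in this section can be modelled as a few dictionaries applied in order. This is a simplified sketch of the behaviour as explained above, not Dolphin's configuration code; the function name is hypothetical, and the value 128 is just a stand-in for a faster Texture Cache setting.

```python
def effective_settings(global_settings, default_game_ini, user_game_ini,
                       graphics_menu_opened=False):
    """Combine the settings layers in the order described in the article."""
    settings = dict(global_settings)
    settings.update(default_game_ini)  # per-game defaults shipped with Dolphin
    settings.update(user_game_ini)     # the user's own per-game overrides
    if graphics_menu_opened:
        # Opening the Graphics Menu reapplies the user's global settings.
        settings.update(global_settings)
    return settings


global_settings = {"SafeTextureCacheColorSamples": 128}  # assumed "faster" value
pokemon_xd_ini = {"SafeTextureCacheColorSamples": 0}     # Safe, from the game INI

at_boot = effective_settings(global_settings, pokemon_xd_ini, {})
after_menu = effective_settings(global_settings, pokemon_xd_ini, {},
                                graphics_menu_opened=True)
```

At boot the game INI wins and the game runs slower but correctly. After the menu is opened, the global value silently wins again, which is exactly the "speedup" users think they have discovered.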

A list of INI settings and what they do can be found on the forums.


You can continue the discussion in the forum thread of this article.
