Booting the Final GameCube Game


Every single GameCube game can at least boot in Dolphin 5.0. Except one. Star Wars: The Clone Wars, with its complex use of the PowerPC Memory Management Unit, has been unplayable in Dolphin up to this day. But as of Dolphin 5.0-540, that challenge has finally been overcome: Dolphin can boot every single GameCube game in the official library.


Star Wars: The Clone Wars Running in Dolphin


So what makes Star Wars: The Clone Wars so special? To truly understand what's going on, you need some knowledge of how the PowerPC processor handles memory management and how Dolphin emulates it.



Memory Management Unit Emulation

The MMU is a part of the GameCube's CPU. Credit: Wikimedia / CC BY-SA 4.0

The Memory Management Unit, or MMU, is responsible for quickly giving games access to data and code. Rather than directly accessing available RAM, games interface with virtual memory which is then translated to physical memory by the MMU. This can be done in two ways, with Block Address Translations (BATs) for huge portions of memory or the page tables for small fine-grained mappings.
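The two translation paths can be pictured with a minimal sketch. This is not Dolphin's code: the `Bat` type, the `translate` function, and the example addresses are illustrative assumptions, and real hardware details such as segment registers, page table hashing, and protection bits are left out.

```python
from typing import Dict, List, NamedTuple, Optional

PAGE_SIZE = 4096  # 4 KiB pages; hashing, segments and protection bits are ignored


class Bat(NamedTuple):
    """One hypothetical block mapping: [virt_base, virt_base + size) -> phys_base."""
    virt_base: int
    phys_base: int
    size: int


def translate(addr: int, bats: List[Bat], page_table: Dict[int, int]) -> Optional[int]:
    """Return the physical address for addr, or None if nothing maps it."""
    # Coarse path: a block mapping covers one large contiguous range.
    for bat in bats:
        if bat.virt_base <= addr < bat.virt_base + bat.size:
            return bat.phys_base + (addr - bat.virt_base)
    # Fine path: look up the 4 KiB page that contains addr.
    phys_page = page_table.get(addr // PAGE_SIZE)
    if phys_page is not None:
        return phys_page * PAGE_SIZE + addr % PAGE_SIZE
    return None  # real hardware would raise an exception here


# Example values chosen for illustration only.
bats = [Bat(virt_base=0x80000000, phys_base=0x00000000, size=0x01000000)]
page_table = {0x7E000: 0x00123}  # virtual page 0x7E000 -> physical page 0x00123
```

A block hit is a single range check no matter how much memory the block covers, while the page path needs one entry per 4 KiB page. That difference is why the coarse mapping is the cheap one.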

Games are given access to virtual memory instead of real memory for several reasons. First of all, it gives the CPU an opportunity to cache accesses, greatly speeding up access to frequently used values. Secondly, the GameCube only has 24MB of RAM (plus some specialized areas) across a 4GB address space, meaning most memory addresses have no RAM backing them! If a game were to access a real memory address with no backing memory, it'd get garbage or even crash! By using virtual memory, the MMU can throw an exception, giving the game a way to handle the situation or providing valuable feedback to developers on what went wrong.

Dolphin is able to emulate the MMU with several degrees of accuracy based on theories of how the game is going to behave.


Theory 1: The Software Only Writes to Valid Memory

As long as the games only write to valid memory, Dolphin doesn't have much work to do. Most games use the GameCube/Wii's default BAT mappings, and as long as they don't try to access memory outside of that, all that has to be done is making sure accesses end up in the right place.


virtualphysmemory.svg


The default mappings have one Block Address Translation (BAT) that translates to all of physical memory. Because BATs are faster than page tables, most games simply stick to this.

Dolphin's way of handling this is to hardcode that Block Address Translation (BAT) from virtual memory and translate the address to physical memory. In layman's terms, the game is asking for one memory address, but the console (and thus Dolphin) gives it another from real memory.
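As a rough sketch of what "hardcoding" a translation means, the function below never looks at any emulated BAT register. The two virtual base addresses are an assumption for illustration (the article does not spell them out), and `hardcoded_translate` is a hypothetical name, not a Dolphin function.

```python
RAM_SIZE = 24 * 1024 * 1024  # 24MB of main RAM, as stated above

# Assumed layout for illustration: two virtual windows, one cached and one
# uncached, both backed by the same physical RAM starting at address 0.
CACHED_BASE = 0x80000000
UNCACHED_BASE = 0xC0000000


def hardcoded_translate(virt_addr: int) -> int:
    """Translate with a fixed mapping, never consulting emulated BAT registers."""
    for base in (CACHED_BASE, UNCACHED_BASE):
        offset = virt_addr - base
        if 0 <= offset < RAM_SIZE:
            return offset
    # Anything else is simply assumed never to happen in this mode.
    raise ValueError(f"unmapped address {virt_addr:#010x}")
```

The appeal of this design is that the mapping is a compile-time constant, so nothing has to be rechecked while the game runs. The cost is that a game which rewrites its BATs cannot be represented at all.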

There are also a few other major pieces mapped out that we'll be keeping out of the scope of this article. The big one is Memory-Mapped Input/Output (MMIO), which is how the CPU interacts with different devices, such as the disc drive, memory cards, etc. The other thing to note is the special boot sector, but since games can't access it there isn't much more to say about it for now.

MMU Off is as barebones as MMU emulation can get while still booting games. It just gives the game some memory to work with and assumes the game will play nice. And amazingly enough, most GameCube games and nearly all Wii games actually do.

Most of the GameCube games that failed with this should have required full MMU emulation, but devious emulator developers managed to outsmart them with no performance issues with the "MMU Speedhack."


Theory 2: The Software is Predictable About Invalid Memory Usage

MMU Speedhack can be best described as "MMU Off+", as it still doesn't do any serious MMU emulation. It maps more RAM than the GameCube has to trick games into thinking things are working without caring about what the games are actually trying to do.


virtualphysmemoryspeedhack.svg


Because the games are using these addresses as extra RAM, simply mapping them as valid memory works! But, you're probably wondering why these games are accessing invalid memory in the first place, right?
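A minimal sketch of the idea, assuming a made-up location and size for the extra window (`FAKE_BASE` and `FAKE_SIZE` are hypothetical, as is the `SpeedhackMemory` class): addresses that would be invalid on console are simply backed by a second buffer.

```python
RAM_BASE = 0x80000000          # assumed virtual base of real main RAM
RAM_SIZE = 24 * 1024 * 1024    # 24MB, as on the console
FAKE_BASE = 0x7E000000         # hypothetical start of the extra window
FAKE_SIZE = 32 * 1024 * 1024   # hypothetical size of the extra window


class SpeedhackMemory:
    """Backs an otherwise invalid range with plain memory instead of faulting."""

    def __init__(self):
        self.ram = bytearray(RAM_SIZE)
        self.fake = bytearray(FAKE_SIZE)  # RAM the console does not have

    def _locate(self, addr):
        if RAM_BASE <= addr < RAM_BASE + RAM_SIZE:
            return self.ram, addr - RAM_BASE
        if FAKE_BASE <= addr < FAKE_BASE + FAKE_SIZE:
            return self.fake, addr - FAKE_BASE
        raise ValueError(f"unmapped address {addr:#010x}")

    def write8(self, addr, value):
        buf, offset = self._locate(addr)
        buf[offset] = value & 0xFF

    def read8(self, addr):
        buf, offset = self._locate(addr)
        return buf[offset]
```

No exception is ever delivered to the game in this model, so the game's own paging code never runs. Whatever it writes to the window is simply still there when it reads it back.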

This is because Nintendo provided a library that allowed games to take advantage of the GameCube's 16MB of auxiliary RAM as extra RAM. Because this auxiliary RAM is linked to the DSP, it's more commonly used as, and referred to as, audio RAM, but it can technically be used for just about anything. Because the CPU can't directly map the auxiliary RAM into the address space due to a missing hardware feature, the game has to read or write to an invalid memory address to invoke an exception handler. This exception handler then uses Direct Memory Access (DMA) to move data from the auxiliary RAM into a game-designated cache in physical memory. It then sets a page table entry to say that this previously invalid memory address now points to this cache location, allowing the game to continue without crashing!
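The fault, copy, and remap sequence described above can be sketched as a tiny demand pager. Everything here is an assumption made for illustration: the `AramPager` class, the `VIRT_BASE` window, the round-robin eviction, and the read-only behavior (a real handler would also have to write modified pages back). It is not Nintendo's library.

```python
PAGE_SIZE = 4096
VIRT_BASE = 0x7E000000  # hypothetical virtual window backed by auxiliary RAM


class PageFault(Exception):
    def __init__(self, addr):
        super().__init__(hex(addr))
        self.addr = addr


class AramPager:
    def __init__(self, aram, cache_pages):
        self.aram = aram                       # stands in for auxiliary RAM
        self.cache_pages = cache_pages
        self.cache = bytearray(cache_pages * PAGE_SIZE)  # cache in main RAM
        self.page_table = {}                   # virtual page -> cache slot
        self.faults = 0

    def _raw_read8(self, addr):
        slot = self.page_table.get(addr // PAGE_SIZE)
        if slot is None:
            raise PageFault(addr)              # the "invalid" access
        return self.cache[slot * PAGE_SIZE + addr % PAGE_SIZE]

    def _handle_fault(self, addr):
        vpage = addr // PAGE_SIZE
        src = vpage * PAGE_SIZE - VIRT_BASE
        if not 0 <= src < len(self.aram):
            raise ValueError(f"address {addr:#010x} is outside the window")
        slot = self.faults % self.cache_pages  # round-robin choice of slot
        self.faults += 1
        # Unmap whichever page previously owned this slot.
        for old_vpage, old_slot in list(self.page_table.items()):
            if old_slot == slot:
                del self.page_table[old_vpage]
        # "DMA" one page from auxiliary RAM into the cache slot.
        start = slot * PAGE_SIZE
        self.cache[start:start + PAGE_SIZE] = self.aram[src:src + PAGE_SIZE]
        self.page_table[vpage] = slot          # the address is now valid

    def read8(self, addr):
        try:
            return self._raw_read8(addr)
        except PageFault as fault:
            self._handle_fault(fault.addr)
            return self._raw_read8(addr)       # retry the faulting access
```

With a two-page cache, touching a third page evicts the first one, and touching a page that is still resident causes no further fault.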

Somehow, Dolphin's hack works for almost all games that use auxiliary RAM for extra memory like this. As ugly a hack as it is, it's painfully efficient and effective.


Theory 3: Invalid Memory Access is not Predictable

While the MMU Speedhack is great for performance, it only really works when a game is following typical memory usage behaviors that have been seen so many times they have been integrated into Dolphin. These hardcoded assumptions allow Dolphin to assume where games are going to read and prevent the need for making sure memory access is actually valid. MMU Speedhack games are so standardized that very simple rules encompass them, but, those rules have limitations.

Several games use their own custom exception handlers and use memory addresses in non-standard ways, breaking the MMU Speedhack. In these cases, Dolphin has to check that a memory address is valid before feeding instructions or data to the emulated CPU, which is much slower than simply assuming things are valid.

Before Fiora's and others' efforts to optimize the JIT and MMU emulation, Enable MMU was a death sentence to the playability of a game.



21 titles are known to require more than the MMU Speedhack.


Handling memory checks (memchecks) is slower because it undermines the performance optimization "fastmem". Fastmem maps the GameCube/Wii address space to host memory and then marks all of the emulated invalid memory as allocated for the host PC. This allows Dolphin to use the host CPU's exception handler to do the dirty work when catching exceptions. When it does catch an exception, Dolphin has to fall back from fastmem to slowmem in order to handle the address, which can be a huge performance hit.
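The shape of that fallback can be sketched as follows. This is a loose analogy rather than how the JIT works: a `KeyError` stands in for the host CPU's access violation, and the `Memory` class, its `fast_pages` table, and the `GuestFault` exception are all hypothetical names.

```python
PAGE_SIZE = 4096


class GuestFault(Exception):
    """Raised when even the slow path finds no mapping (a real guest exception)."""


class Memory:
    def __init__(self, ram_size):
        self.ram = bytearray(ram_size)
        self.fast_pages = {}    # virtual page -> physical page, known-good only
        self.page_table = {}    # virtual page -> physical page, guest-managed
        self.slow_accesses = 0

    def read8(self, addr):
        try:
            # Fast path: one lookup and no validity checks of any kind.
            phys_page = self.fast_pages[addr // PAGE_SIZE]
        except KeyError:
            # Stand-in for the host fault handler taking over.
            return self._slow_read8(addr)
        return self.ram[phys_page * PAGE_SIZE + addr % PAGE_SIZE]

    def _slow_read8(self, addr):
        # Slow path: do the full, checked translation in software.
        self.slow_accesses += 1
        phys_page = self.page_table.get(addr // PAGE_SIZE)
        if phys_page is None:
            raise GuestFault(hex(addr))
        return self.ram[phys_page * PAGE_SIZE + addr % PAGE_SIZE]
```

The design choice is to pay nothing on the common path and push all of the checking cost onto the rare access that misses.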


virtualphysmemorymmu.svg


Memchecks are the core of what Enable MMU does, and they're the key reason why MMU Enabled titles have been so slow in Dolphin. Some cases require falling back to the interpreter, memchecks don't work with fastmem, and they're even slower than normal memory accesses on console!

Despite often failing, it's always worth it to try for fastmem on pretty much any access; a fastmem loadstore takes as little as 2 instructions, whereas the same access in slowmem can take up to 1000 instructions! Because we don't know if a pointer is in valid memory or not, we just always try fastmem due to the huge performance difference it gives whenever it hits.

As said earlier, slowmem actually got a bit faster which made it more viable to use. That's mostly because of moving memchecks into Fiora's Far Code Cache. By optimizing how memchecks were handled and keeping everything ready, MMU games saw huge performance increases. Rogue Squadron 3 saw the biggest effect, going from a meager 4 FPS to nearly 45 FPS from that alone!

Typically, the Far Code Cache along with other JIT optimizations averaged a nearly 100% performance boost in all of the games requiring Enable MMU, even throwing out Rogue Squadron 3 as an outlier. This is why today people on powerful computers can at least stand a chance at the MMU enabled titles.


Theory 3.5 - Factor 5 is Evil

Of course, most people don't know how slow Star Wars Rogue Squadron 3 was before the Far Code Cache was implemented because it didn't boot. At least, the NTSC version didn't. The PAL version sort of booted which is where the performance numbers from before the Far Code Cache were grabbed from.


Rebel Strike Leg

Even if you could get in-game without anything breaking, things could go bad with one vital piece of data stored across pages.


Rebel Strike was the second to last NTSC game to boot for one reason - it had a nasty little MMU trick that evaded emulation for over a decade. Rogue Squadron 3 could actually store its data across pages! During a longer read or write, Rebel Strike could trigger an exception partway through, literally crossing from a valid address to an invalid address in the process! Dolphin would previously incorrectly set the exception address to the start of the read, when it really needed to set it to where the exception actually occurred.
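The difference between the two fault addresses is easy to show in a small sketch. The `read_bytes` function and `DataStorageFault` exception are hypothetical names, and memory is modeled as a plain dictionary purely for illustration.

```python
PAGE_SIZE = 4096


class DataStorageFault(Exception):
    def __init__(self, fault_addr):
        super().__init__(hex(fault_addr))
        self.fault_addr = fault_addr


def read_bytes(addr, length, valid_pages, memory):
    """Read length bytes starting at addr, checking every byte's page."""
    out = bytearray()
    for byte_addr in range(addr, addr + length):
        if byte_addr // PAGE_SIZE not in valid_pages:
            # Report the address where the access actually failed,
            # not the address where the access started.
            raise DataStorageFault(byte_addr)
        out.append(memory.get(byte_addr, 0))
    return bytes(out)
```

For a 4-byte read at 0x10FFE where only page 0x10 is valid, the first two bytes are fine and the fault is reported at 0x11000. Reporting 0x10FFE instead would point the game's handler at a page that is already mapped, so it would never bring in the page that is actually missing.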

While it would have been poetic for the final GameCube title to boot in Dolphin to be a Factor 5 game, there remained but one more. And things wouldn't be a small bug this time around.


Theory 4 - The Game Defines its Own Valid Memory

Now we finally get to the case of Star Wars: The Clone Wars. After all of Dolphin's work to handle BATs and page tables as efficiently as possible, this game decides to throw Dolphin's biggest assumption right out the window: Static BATs.


What was Dolphin Doing?

Dolphin was completely ill-equipped to handle a situation where true BAT emulation was required. No matter what settings you used, things would quickly end up going sour thanks to hardcoded assumptions.

MMU Speedhack/MMU Off - Things go very poorly in this combination. Because Dolphin isn't really emulating the MMU, it just goes ahead and decides to give the processor whatever garbage is mapped to 0x00000000. The game promptly crashes because that was a very foolish thing to do.

MMU Enabled - Assuming you're using a version from before this article, things get slightly further with MMU Enabled. Dolphin does emulate the exception and loads the error handler, but Dolphin simply isn't capable of handling what it does next. Clone Wars takes advantage of the BATs being disabled during exception handling to get finer control over memory management. It then decides that the default BATs and page tables aren't good enough and tries to create its own. Dolphin hardcodes the BATs. They aren't changeable. So, when the error handler returns control to the game... it promptly crashes. But at least Dolphin itself is still running so you can close the game and play something else!

You're probably wondering "Couldn't you make specific hardcoded BATs for this game?" The answer is a resounding no. The game actually changes the BATs multiple times during missions and multiplayer, meaning that proper BAT emulation is required.


What Dolphin Needs to do

Proper BAT emulation means that Dolphin needs to be able to enable and disable BATs based on what the game is actually doing instead of assuming what the game wants. The problem with this is that the foundation of Dolphin's efficient MMU emulation hinges on the fact that it knows where valid virtual memory is going to be, and breaking that assumption broke Dolphin to the core.

Star Wars: The Clone Wars is the only known game to take advantage of the GameCube's four mappable instruction BATs and data BATs to set up its own memory maps. In order to emulate this game, Dolphin's MMU emulation would have to be completely rewritten. Everything gets complicated; fastmem, memchecks, and everything in between.

Unlike the custom exception handlers of the other MMU Enabled titles, The Clone Wars' exception handler actually restructures memory. It unmaps the default BAT and sets up its own multiple times throughout play. While all of the previous graphics showing the BATs have built up more and more complexity as emulation became more complete, The Clone Wars throws it all away!


virtualphysmemorydynamicbats.svg


Dolphin can make no assumptions this time around. The core rewrite to Dolphin's BAT emulation finally allows it to handle this worst case scenario. Dynamic BATs is true BAT emulation that actually allows it to take what the games are asking for and map things out correctly. A huge part of the hardcoded assumptions within Dolphin's MMU emulation are now gone thanks to this gargantuan rewrite.
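One way to picture dynamic BATs is a lookup table that is rebuilt whenever the game rewrites a BAT, so that translation itself stays a single lookup. The sketch below is an assumption-heavy illustration, not magumagu's implementation: the `DynamicBats` class is hypothetical, only one set of four BATs is modeled rather than separate instruction and data sets, and the table granularity assumes the smallest PowerPC BAT block size of 128 KiB.

```python
BLOCK = 128 * 1024  # assumed table granularity: the smallest BAT block size


class DynamicBats:
    def __init__(self):
        self.bats = [None] * 4  # each slot: None or (virt_base, phys_base, size)
        self.table = {}         # virtual block index -> physical block base
        self.icache = {}        # stand-in for cached, already-translated code

    def set_bat(self, slot, virt_base, phys_base, size):
        if virt_base % BLOCK or phys_base % BLOCK or size % BLOCK:
            raise ValueError("BAT fields must be 128 KiB aligned")
        self.bats[slot] = (virt_base, phys_base, size)
        self._rebuild()

    def clear_bat(self, slot):
        self.bats[slot] = None
        self._rebuild()

    def _rebuild(self):
        # Recompute the whole table from the current BAT contents.
        self.table.clear()
        for bat in self.bats:
            if bat is None:
                continue
            virt_base, phys_base, size = bat
            for offset in range(0, size, BLOCK):
                self.table[(virt_base + offset) // BLOCK] = phys_base + offset
        # Code cached under the old mappings can no longer be trusted.
        self.icache.clear()

    def translate(self, addr):
        base = self.table.get(addr // BLOCK)
        return None if base is None else base + addr % BLOCK
```

The rebuild is expensive, but it only happens when a BAT changes, which is rare compared to ordinary memory accesses. That is the kind of trade-off that lets the common path stay fast.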


Flipping the Page Table

Developers on Dolphin have known what was necessary to boot Star Wars: The Clone Wars for years. In ancient times (the 3.0 era) there existed a branch that could even boot The Clone Wars when using the interpreter. That was unreasonably slow and affected the performance of non-MMU titles in an era when performance was a much greater concern.

Even the current implementation (Dynamic BATs) had been sitting as a pull request for over a year, and had been able to boot The Clone Wars for much of that time. It waited so long that it had to be rebased by another developer!

The reason that this feature took so long to implement was that it had to be implemented without completely destroying performance of everything else. And in that respect, magumagu's implementation succeeds with flying colors.


dynamicbatsperf.svg


As you can see, a very high priority has been put on keeping Dolphin fast despite this huge rewrite to the MMU. Non-MMU and MMU Speedhack titles should see performance vary by less than 1%. This is a stark difference from earlier plans that saw non-MMU enabled titles drop by over 30% even with MMU emulation disabled!

One of the engineering feats of Dynamic BATs is that it still works alongside fastmem despite the obvious complications of the valid memory being able to move around! On the negative side, MMU titles are a bit slower right now. While 8% - 15% is significant, it's something that can be won back with optimizations down the road. We felt that some performance loss was worth getting this tremendous boost in accuracy.


Side Effects of a HardCode Rewrite

More accurate MMU emulation means that a lot of strange behaviors in games will now be emulated quite a bit better. While only one game actually requires Dynamic BATs to run, other games will enjoy the benefits of faster access to memchecks.

Users love when games crash, right? There are plenty of Wii games that actually crash on console when you do various unintended actions. Sometimes, doing those actions will even crash Dolphin! Easier access to memchecks means that Dolphin can accurately emulate well known crash glitches in games without Dolphin itself crashing! From issues pertaining to Super Mario Galaxy to Twilight Princess, crashes can now be emulated just like they would on console without a performance penalty!


Twilight Princess Sign Bug


Then again, most users don't want their game to crash. On that side of things, Rayman Raving Rabbids TV Party (and Rabbids Party Collection) no longer crash when loading various minigames. This could be due to slight changes in cache handling, or more likely bugs in the hardcoded BATs/Page Tables. Considering it was a crash that no one had any idea how to solve, it simply disappearing with a rewrite is a welcome surprise.

Cleanups to CPU emulation in general also have two more games working without much of an explanation. We Love Golf!, developed by the people behind Mario Golf, would crash when trying to load a course. Summer Athletics 2009 would simply not boot and spit out some debug information on the screen. While neither game was affected by Dynamic BATs' main implementation, cleanups during the merge process inexplicably fixed both games. Based on what is known about the issues, it's likely related to more instruction and data cache issues.

Another welcome surprise is that a bunch of romhacks should run better in Dolphin. Wiimm's Mario Kart Wii More Fun romhacks will load their track selection screen properly! To a degree, at least.

This is very, very lucky to be working at all. Because the hack doesn't clear the icache and instead relies on the PowerPC processor to handle icache eviction and flushing, it was thought to be impossible to emulate on Dolphin without huge performance penalties. But now Dolphin clears the icache on BAT changes. Unfortunately, users will need to return to the main menu between races. If you want this fixed... don't bother reporting it here. While Dolphin will eventually emulate the instruction and data caches, both combined will raise requirements for CPU emulation by 14x! For hardcore users and those seeking accuracy, this needs to be emulated, but for everyone else you really don't want a game you like to require accurate cache emulation.

If the romhackers want their games to run in Dolphin in the meantime, we unfortunately must ask them to not rely on the icache/dcache behavior of the PowerPC. In the case of Project M, before their final release they did fix their icache/dcache reliances so it should run properly in Dolphin. But, hey, if you don't want people on Dolphin running your romhacks, you know how to defeat it - for now.


In Closing

With this rewrite, Dolphin has taken another big leap in accuracy under the hood. While most users shouldn't see a difference, a few random crashes here and there should be sorted out. It's bittersweet in a way: while it is a momentous occasion to get the last GameCube game booting, it also means that there aren't many huge mysteries remaining. While some games still crash, and there are a lot of issues to still tackle, there are no completely broken games that make zero sense remaining.

Dolphin's MMU emulation should be able to handle any retail game at this point. The only people that could possibly break it would be Factor 5, and it's not like they made any Wii games...


Star Wars: Rogue Leaders on Wii


It turns out there indeed is another Rogue Squadron game in existence for Wii. Developed by Factor 5, it's sure to push the console harder than any of the games before it. The problem is that since it was never released, we can't throw it at Dolphin. Yet, since it does exist, whoever does have it could throw it at Dolphin. Maybe even get whatever crash it inevitably triggers. Or if it does boot, maybe one screenshot? Please?

Ah well. We at least have the Netflix Channel developed by core members of Factor 5 in their new company. And as you probably have guessed, it does not boot in Dolphin.

You can continue the discussion in the forum thread of this article.
