[ARM] Fix the FPR cache so registers no longer have to be dumped after every instruction. Add the mullwox instruction.