[ARM] Implement psq_st. Optimize psq_l and fix all the remaining bugs, except that values are not yet clamped to the maximum range of their type. The missing clamping mostly causes minor visual glitches.
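
For reference, the clamping that is still missing can be sketched in scalar C++ as below. This is only an illustration, not the code in this commit (the JIT emits ARM instructions instead); the helper name `QuantizeClamped`, the truncating conversion, and the NaN handling are all assumptions made for the sketch.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical reference helper (not part of this commit): scale a float by
// 2^scale and saturate it to the range of the integer type T before storing.
template <typename T>
T QuantizeClamped(float value, int scale)
{
    // ldexp computes value * 2^scale exactly (barring overflow/underflow).
    const float scaled = std::ldexp(value, scale);

    // Assumption: NaN is stored as zero; real hardware may behave differently.
    if (std::isnan(scaled))
        return 0;

    const float lo = static_cast<float>(std::numeric_limits<T>::min());
    const float hi = static_cast<float>(std::numeric_limits<T>::max());

    // Clamp first so the float-to-integer cast is always in range.
    // Assumption: the conversion truncates toward zero.
    return static_cast<T>(std::min(std::max(scaled, lo), hi));
}
```

With this helper, `QuantizeClamped<std::int8_t>(100.0f, 2)` saturates to 127 rather than wrapping, which is the behavior a store without clamping gets wrong.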