RY,
You could run into problems if you use only NWRITE packets, because the memory writes take different paths inside the SoC than the register writes. If the write transactions are loaded into the LSU back-to-back, SRIO will fire them as fast as possible.
So if the write to DDR3 is sent, followed immediately by the write to the MPAX register, the register write may pass the earlier transactions because of the SoC internal paths and/or stalls on DDR3. The best way to make sure the DDR3 writes have completed before the MPAX register write is sent is to either use NWRITE_R for the last DDR3 write transaction, or use NWRITE followed by a DOORBELL packet. Either way, the NWRITE_R or DOORBELL response packet has to be received before the LSU will send the MPAX register write packet.
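To illustrate the NWRITE_R option, here is a rough sketch of how the transfer list could be ordered. Please note the descriptor struct, the function name, and the addresses are made up for illustration; this is not the CSL/LLD API, and the 4-byte register write sent as a plain NWRITE is an assumption. Only the FTYPE/TTYPE values (NWRITE = 5/4, NWRITE_R = 5/5) come from the RapidIO spec. In a real application each entry would be translated into the LSU register setup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* RapidIO packet type encodings (write class, FTYPE 5). */
enum { FTYPE_WRITE = 5 };
enum { TTYPE_NWRITE = 4, TTYPE_NWRITE_R = 5 };

/* Hypothetical transfer descriptor, NOT the TI CSL/LLD structure. */
typedef struct {
    uint32_t dst_addr;   /* destination address on the remote device */
    uint32_t byte_count; /* payload size in bytes */
    uint8_t  ftype;      /* RapidIO FTYPE */
    uint8_t  ttype;      /* RapidIO TTYPE */
} srio_xfer_t;

/*
 * Fill 'out' with n_ddr DDR3 writes followed by one MPAX register write.
 * Every DDR3 write is a plain NWRITE except the last one, which is an
 * NWRITE_R so that a response must come back before the register write
 * is sent. Returns the number of entries written, or 0 if 'out' is too
 * small.
 */
size_t build_remap_sequence(srio_xfer_t *out, size_t cap,
                            const uint32_t *ddr_addrs, size_t n_ddr,
                            uint32_t chunk_bytes, uint32_t mpax_reg_addr)
{
    size_t i;

    assert(out != NULL && ddr_addrs != NULL && n_ddr > 0);
    if (cap < n_ddr + 1) {
        return 0;
    }

    for (i = 0; i < n_ddr; i++) {
        out[i].dst_addr   = ddr_addrs[i];
        out[i].byte_count = chunk_bytes;
        out[i].ftype      = FTYPE_WRITE;
        /* Only the final DDR3 write requests a response. */
        out[i].ttype = (i == n_ddr - 1) ? TTYPE_NWRITE_R : TTYPE_NWRITE;
    }

    /* MPAX register write goes last (assumed: one 32-bit register). */
    out[n_ddr].dst_addr   = mpax_reg_addr;
    out[n_ddr].byte_count = 4;
    out[n_ddr].ftype      = FTYPE_WRITE;
    out[n_ddr].ttype      = TTYPE_NWRITE;

    return n_ddr + 1;
}
```

Only the last DDR3 write needs the response, so the earlier writes can stay as NWRITE and you do not pay the response latency on every packet. The DOORBELL option would look the same, except the last DDR3 write stays NWRITE and a DOORBELL (FTYPE 10) entry is placed between it and the register write.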
Best Regards,
Chad