Reality Check
There really is not enough time in cycle 4 for the execution unit to produce the result, write the result to the CRS, and then read it back from the CRS for dispatch to an execution unit again.
Instead, real hardware would activate a forwarding path that muxes the result bus directly to the input of the appropriate execution unit. This is essential if instruction sequences with dependent results are to execute on consecutive clocks.
To make our VHDL modeling easier, though, we will go ahead and write the execution result to the CRS, then read and dispatch it, all in the same clock cycle. Be aware that this is done for modeling purposes only; real hardware would be implemented differently, but the end result would be the same.
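The claim that both approaches give the same end result can be illustrated with a small behavioral sketch. It is written in Python rather than VHDL, purely to show the idea. The register names, the add operation, and the function names are all hypothetical and are not taken from the model described here.

```python
# Hypothetical behavioral sketch: the register names, the add operation,
# and these function names are assumptions made for illustration only.

def execute(op_a, op_b):
    """Stand-in execution unit: a simple add."""
    return op_a + op_b

def dispatch_via_crs(crs, dest, src_a, src_b, next_src):
    """Modeling shortcut: write the result to the CRS, then read it back,
    all within what the model treats as one clock cycle."""
    crs = dict(crs)                              # work on a copy of the register set
    crs[dest] = execute(crs[src_a], crs[src_b])  # write result to the CRS
    return crs[next_src]                         # read operand for the next dispatch

def dispatch_via_forwarding(crs, dest, src_a, src_b, next_src):
    """Hardware-style path: mux the result bus onto the operand input
    when the next instruction reads the register being written."""
    result_bus = execute(crs[src_a], crs[src_b])
    # Forwarding mux: pick the result bus on a match, else the old CRS value.
    return result_bus if next_src == dest else crs[next_src]

crs = {"r1": 5, "r2": 7, "r3": 0}

# Dependent case: the next instruction reads r3, which is being produced now.
via_crs = dispatch_via_crs(crs, "r3", "r1", "r2", "r3")
via_fwd = dispatch_via_forwarding(crs, "r3", "r1", "r2", "r3")
assert via_crs == via_fwd
```

In both functions the dependent instruction receives the freshly computed value. The only difference is whether that value travels through the register set or around it, which is why the modeling shortcut does not change the observable behavior.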