As Tobias mentioned, processes like the one you describe would normally be controlled via coded events or external database variables. You would also have unique events for the triggers you request.
You are absolutely spot on to look at dynamic dialogue. However, I don't think anyone has looked at it for the Cube Demo, as new code is needed.
A suggested solution is to control content via audio busses: simply silence (or delay the triggering of) the content you do not wish to play, so that only the priority content is heard.
If you plan to have lots and lots of content controlled via the busses, it could get complicated and messy very quickly. Plan up front before implementing.
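To make the bus idea a bit more concrete, here is a rough sketch of the priority logic in plain Python. It is only an illustration of the concept, not Wwise or Cube Demo code: the `Bus` and `PriorityMixer` classes, the bus names, and the priority numbers are all made up for the example. In a real project the muting would be done by the mixer setup itself.

```python
from dataclasses import dataclass


@dataclass
class Bus:
    """Hypothetical stand-in for a mixer bus (not a Wwise type)."""
    name: str
    priority: int          # higher number = more important content
    active_sounds: int = 0
    muted: bool = False


class PriorityMixer:
    """Mutes every bus whose priority is below the highest bus currently playing."""

    def __init__(self, buses):
        self.buses = {bus.name: bus for bus in buses}

    def _refresh(self):
        # Find the highest priority among busses that have something playing.
        playing = [b.priority for b in self.buses.values() if b.active_sounds > 0]
        top = max(playing, default=None)
        # Silence anything less important than that; unmute everything else.
        for bus in self.buses.values():
            bus.muted = top is not None and bus.priority < top

    def start(self, bus_name):
        self.buses[bus_name].active_sounds += 1
        self._refresh()

    def stop(self, bus_name):
        bus = self.buses[bus_name]
        bus.active_sounds = max(0, bus.active_sounds - 1)
        self._refresh()

    def audible(self):
        # Names of busses that are playing and not silenced, in sorted order.
        return sorted(
            b.name for b in self.buses.values()
            if b.active_sounds > 0 and not b.muted
        )


if __name__ == "__main__":
    # Assumed example layout: idle chatter is the lowest priority.
    mixer = PriorityMixer([
        Bus("idle_vocals", 0),
        Bus("combat_barks", 1),
        Bus("story_dialogue", 2),
    ])
    mixer.start("idle_vocals")
    mixer.start("story_dialogue")
    print(mixer.audible())   # idle vocals are silenced while story dialogue plays
    mixer.stop("story_dialogue")
    print(mixer.audible())   # idle vocals are heard again
```

Even in this tiny version every new bus adds another priority relationship to keep track of, which is why it pays to write the bus layout down before building it.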
For information on how to set up the mixer busses for this suggested solution, please see the video here:
Also, the vocal idle video will be helpful:
I hope the above helps your progress.