I suspect that the problem you describe is due to an incorrect implementation of the QS trace output in your idle processing. You say that you see QS data during system initialization; these are most likely the "dictionary" trace records, which are sent out in-line through the QS::onFlush() callback. After this initial transient, however, all remaining trace records are output from the idle task, and that is where your problem most likely is.
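To illustrate the difference between the two output paths, here is a minimal, self-contained sketch in C++. It is not your BSP and not the actual QS API: TraceBuffer, MockUart, TRACE_EOD, and the two free functions onIdle() and onFlush() are hypothetical stand-ins I made up for the example. The flush path drains the whole buffer and busy-waits on the transmitter, while the idle path must send at most one byte per pass, and only when the transmitter is ready.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical "end of data" marker returned when the trace buffer is empty.
constexpr std::uint16_t TRACE_EOD = 0xFFFFU;

// Hypothetical stand-in for the trace ring buffer (not the real QS buffer).
class TraceBuffer {
public:
    // Producer side: store one trace byte, dropping it if the buffer is full.
    void put(std::uint8_t b) {
        std::size_t next = (head_ + 1U) % SIZE;
        if (next != tail_) {
            buf_[head_] = b;
            head_ = next;
        }
    }
    // Consumer side: return the next byte, or TRACE_EOD if nothing is pending.
    std::uint16_t getByte() {
        if (tail_ == head_) {
            return TRACE_EOD;
        }
        std::uint8_t b = buf_[tail_];
        tail_ = (tail_ + 1U) % SIZE;
        return b;
    }
private:
    static constexpr std::size_t SIZE = 64U;
    std::uint8_t buf_[SIZE] = {};
    std::size_t head_ = 0U;
    std::size_t tail_ = 0U;
};

// Mock transmitter with a one-byte transmit register.
struct MockUart {
    bool txReady = true;
    std::vector<std::uint8_t> sent;
    void write(std::uint8_t b) {
        assert(txReady);       // writing while busy would lose the byte
        sent.push_back(b);
        txReady = false;       // busy until the "hardware" finishes
    }
    void txComplete() { txReady = true; } // simulates the byte leaving the wire
};

// Idle path: called over and over from the idle loop.
// Sends at most ONE byte per call and never blocks.
void onIdle(TraceBuffer &trace, MockUart &uart) {
    if (uart.txReady) {
        std::uint16_t b = trace.getByte();
        if (b != TRACE_EOD) {
            uart.write(static_cast<std::uint8_t>(b));
        }
    }
}

// Flush path: called in-line; drains everything and busy-waits on the UART.
void onFlush(TraceBuffer &trace, MockUart &uart) {
    std::uint16_t b;
    while ((b = trace.getByte()) != TRACE_EOD) {
        while (!uart.txReady) {
            uart.txComplete(); // mock only; real code would poll a status flag
        }
        uart.write(static_cast<std::uint8_t>(b));
    }
}
```

If the flush path works on your board but the idle path is missing, never reached, or does not check the transmitter status correctly, you would see exactly the symptom you describe: the initial records arrive and then the output stops.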
I would need some more information to give you more specific help. What CPU/OS are you running? What kind of output are you using to dump the QS data to the host?