Basically, the reason it is recommended to turn it off for these functions is this:
Many routers ship with a dedicated offloading chip. When hardware acceleration is enabled, packet forwarding is handed to this chip rather than done directly on the CPU.
This leaves more headroom for the CPU, and speeds are faster.
But while this sounds exceptionally good, it isn't always good.
There are situations where the offloading chip is overloaded, and because the traffic does not go through the CPU, packets can be dropped. In some cases this can even trigger false STP problems or packet corruption.
I won't say this is the case with these functions. The issue is rather that the switch part listens on the CPU and not on the offload chip. The firewall also listens on the CPU and not on the offload chip, and the offload chip bypasses the CPU.
So when packets get lost in transit between the offload chip and the CPU, you will see very unexpected results.
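To make the bypass idea concrete, here is a small toy model in Python. It is only a sketch and not how any real router firmware works. The names (`Router`, `forward`, `run`) are made up for illustration, and it assumes the common flow-offload behaviour where the first packet of a connection goes through the CPU and later packets of the same flow are handled by the offload chip alone.

```python
from dataclasses import dataclass, field


@dataclass
class Router:
    """Hypothetical toy router: counts which path each packet takes."""
    offload_enabled: bool
    offloaded_flows: set = field(default_factory=set)
    cpu_seen: int = 0   # packets the CPU-side hooks (firewall, DPI, SQM) observed
    fast_path: int = 0  # packets forwarded by the offload chip alone

    def forward(self, flow_id: str) -> str:
        # A flow that is already offloaded never reaches the CPU again.
        if self.offload_enabled and flow_id in self.offloaded_flows:
            self.fast_path += 1
            return "offload"
        # Otherwise the CPU handles the packet and its hooks can see it.
        self.cpu_seen += 1
        if self.offload_enabled:
            # Assumption: the flow is handed to the chip after the first packet.
            self.offloaded_flows.add(flow_id)
        return "cpu"


def run(offload_enabled: bool, packets: int = 5):
    """Send `packets` packets of one flow; return (cpu_seen, fast_path)."""
    router = Router(offload_enabled)
    for _ in range(packets):
        router.forward("lan->wan:443")
    return router.cpu_seen, router.fast_path


if __name__ == "__main__":
    print("offload off:", run(False))
    print("offload on: ", run(True))
```

With offloading off, every packet is counted by the CPU side. With it on, only the first packet is, and anything that depends on seeing every packet on the CPU is working with an incomplete picture.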
This is why for some things like SQM it is recommended to turn it off: SQM requires a very fixed order of execution directly on the CPU, and an offloading chip will not follow that principle and can even drop packets.
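A rough sketch of why this hurts SQM, again as a toy model in Python and not real SQM code. The function name `egress_per_interval` and the "packets per interval" limit are assumptions for illustration only. The point is that a shaper running on the CPU can only limit the packets it actually sees.

```python
from typing import Sequence


def egress_per_interval(arrivals: Sequence[bool], shaper_limit: int) -> int:
    """Count packets leaving the router in one interval.

    arrivals: one entry per packet, True if the packet took the offload path.
    shaper_limit: packets the (hypothetical) CPU-side shaper lets out per interval.
    """
    shaped = 0  # packets the CPU-side shaper has released so far
    sent = 0    # packets that actually left the router
    for offloaded in arrivals:
        if offloaded:
            # Bypasses the CPU queue, so the shaper never counts or delays it.
            sent += 1
        elif shaped < shaper_limit:
            shaped += 1
            sent += 1
        # Otherwise the shaper holds the packet back for this interval.
    return sent


if __name__ == "__main__":
    print(egress_per_interval([False] * 10, shaper_limit=3))
    print(egress_per_interval([False, True] * 5, shaper_limit=3))
```

When every packet goes through the CPU, the limit holds. As soon as part of the traffic takes the offload path, more leaves than the shaper intended, which is the kind of thing that makes SQM results unpredictable.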
The same can happen with DPI: it could lose track of some packets, and some traffic that should be detected might even pass through.
For such systems, which carefully listen to the firewall with a lot of calls, there is no awareness between the offload chip and the CPU, and that concept can fail when packets get lost.
It is not that the firewall skips packets or that you leak open access to the WAN, but packets can get lost in transit or cause what looks like a layer 2 problem but is actually corruption.