Minimal reproducer to test += vs. = assignment in gridoperator.h:153.
On CUDA, the tests fail if = is used; in the rv-test, they fail if += is used.
We currently assume that declaring the yl_tmp vector on the CPU initializes every element to 0.0.
Maybe a minimal reproducer can be built to check this behaviour?
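
A possible starting point for such a reproducer is sketched below. It is not the code from gridoperator.h: apply_update and variants_agree are hypothetical names, and the sketch assumes yl_tmp behaves like a std::vector<double>. The idea is that += and = can only give the same result when the target starts at 0.0, so filling the target with a sentinel value emulates uninitialized memory without actually reading indeterminate values.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the update at gridoperator.h:153.
// accumulate == true  -> dst[i] += src[i]
// accumulate == false -> dst[i]  = src[i]
void apply_update(std::vector<double>& dst,
                  const std::vector<double>& src,
                  bool accumulate) {
    assert(dst.size() == src.size());
    for (std::size_t i = 0; i < dst.size(); ++i) {
        if (accumulate) {
            dst[i] += src[i];
        } else {
            dst[i] = src[i];
        }
    }
}

// Returns true if both variants produce the same result when the
// target vector starts out filled with `initial`.
bool variants_agree(double initial) {
    const std::vector<double> src{1.0, 2.0, 3.0};
    std::vector<double> with_plus_equal(src.size(), initial);
    std::vector<double> with_assign(src.size(), initial);
    apply_update(with_plus_equal, src, true);
    apply_update(with_assign, src, false);
    return with_plus_equal == with_assign;
}
```

Calling variants_agree(0.0) should return true, and variants_agree(42.0) should return false. Note that std::vector<double> v(n) value-initializes its elements to 0.0, whereas a local double array, new double[n], or memory obtained from cudaMalloc is not zeroed. If yl_tmp is one of the latter, the assumption above does not hold and += would accumulate onto whatever the memory happened to contain.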