refine CI stages

Currently in !10, we have images -> cpp -> headercheck -> python. I would propose something like

images -> cpp-make-all
   |           |
   |           +--------> cpp-make-test -> cpp-run-tests
   |           |
   |           +--------> cpp-headercheck
   |           |
   |           +--------> python-make-bindings -> python-run-tests
   |
   +--------------------> wheels-build -> wheels-test
          

to allow for as much parallelism as possible. I am putting wheels independently of bindings here, since the former runs in a manylinux image, while the latter runs with all our usual C++ compilers. One could even think about branching wheels-build off directly after images already, but I guess that's a bit much. What do you think, @r_milk01?
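
For illustration, a rough sketch of how the graph above could be written in `.gitlab-ci.yml` using the `needs:` keyword, which lets a job start as soon as the jobs it lists have finished instead of waiting for the whole previous stage. The job names are taken from the diagram. The stage names (`build`, `check`, `run`), the single `images` job, and all `script:` lines are placeholders I made up, not what !10 actually contains.

```yaml
# Sketch only: stage names and script lines are hypothetical placeholders.
stages:
  - images
  - build
  - check
  - run

images:
  stage: images
  script: echo "build CI images"          # placeholder

cpp-make-all:
  stage: build
  needs: ["images"]
  script: echo "make all"                 # placeholder

cpp-make-test:
  stage: check
  needs: ["cpp-make-all"]
  script: echo "build tests"              # placeholder

cpp-headercheck:
  stage: check
  needs: ["cpp-make-all"]
  script: echo "run headercheck"          # placeholder

python-make-bindings:
  stage: check
  needs: ["cpp-make-all"]
  script: echo "build bindings"           # placeholder

cpp-run-tests:
  stage: run
  needs: ["cpp-make-test"]
  script: echo "run C++ tests"            # placeholder

python-run-tests:
  stage: run
  needs: ["python-make-bindings"]
  script: echo "run Python tests"         # placeholder

# Wheels branch: depends only on images, independent of the C++ jobs.
wheels-build:
  stage: build
  needs: ["images"]
  script: echo "build wheels"             # placeholder

wheels-test:
  stage: check
  needs: ["wheels-build"]
  script: echo "test wheels"              # placeholder
```

With `needs:` in place, the stage list mostly serves for grouping in the pipeline view. Each job is scheduled from its own dependencies, so wheels-test would not have to wait for cpp-make-all.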

Edited by Felix Schindler