[walker] use parallel_for instead of parallel_reduce
Currently, the walker uses parallel_reduce with an empty join function, which, if I am not mistaken, is equivalent to a parallel_for. This MR therefore changes the walker to use parallel_for directly. It probably does not matter much in terms of performance, though.

Any objections, @ag-ohlberger/dune-community?
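
For reviewers who want to check the reasoning, here is a minimal sketch of why an empty join should make the reduce degenerate into a for. It assumes the calls follow a TBB-style body interface (splitting constructor plus join for the reduce, plain copy for the for). Everything in it is hypothetical and not the actual walker code: `split_tag`, `SquareBody`, `toy_reduce`, `toy_for` and `empty_join_matches_for` are made-up stand-ins, and the two halves run sequentially rather than on threads, since only the call pattern matters here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a split marker; only tags the splitting constructor.
struct split_tag {};

// Walker-like body: writes one result per element and accumulates nothing,
// so there are no partial results to merge in join().
struct SquareBody {
  std::vector<long>* out;

  explicit SquareBody(std::vector<long>& o) : out(&o) {}
  SquareBody(SquareBody& other, split_tag) : out(other.out) {}  // reduce-style split
  SquareBody(const SquareBody&) = default;                      // for-style copy

  void operator()(std::size_t begin, std::size_t end) const {
    for (std::size_t i = begin; i < end; ++i)
      (*out)[i] = static_cast<long>(i) * static_cast<long>(i);
  }

  void join(SquareBody&) {}  // empty: nothing to combine
};

// Toy reduce pattern: split the body, apply both halves, then join.
template <class Body>
void toy_reduce(std::size_t begin, std::size_t end, Body& body) {
  assert(begin <= end);
  const std::size_t mid = begin + (end - begin) / 2;
  Body right(body, split_tag{});
  body(begin, mid);
  right(mid, end);
  body.join(right);  // a no-op for SquareBody
}

// Toy for pattern: copy the body and apply both halves, with no join step.
template <class Body>
void toy_for(std::size_t begin, std::size_t end, const Body& body) {
  assert(begin <= end);
  const std::size_t mid = begin + (end - begin) / 2;
  Body right(body);
  body(begin, mid);
  right(mid, end);
}

// Runs both patterns on separate buffers and reports whether they agree.
bool empty_join_matches_for(std::size_t n) {
  std::vector<long> via_reduce(n, -1), via_for(n, -1);
  SquareBody reduce_body(via_reduce);
  toy_reduce(0, n, reduce_body);
  toy_for(0, n, SquareBody(via_for));
  return via_reduce == via_for;
}
```

The equivalence only holds as long as the body keeps no per-copy state that would need merging. If the walker's functors ever accumulate something locally, the join would have to come back.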