yl_tmp (re)initialization
Instead of creating a new yl_tmp on the host and uploading it to the device in every iteration, add a kernel that iterates over all vertices and sets dp_yl[vertex_id] = 0.0.
Launch it before the main kernel in gridoperator.h, so the buffer is reset on the device and no host-to-device transfer is needed.
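
A minimal sketch of such a reset kernel, assuming the device code is CUDA and that dp_yl is a device array of double with one entry per vertex. The names reset_yl, launch_reset_yl, reset_yl_self_check and num_vertices are hypothetical and do not come from gridoperator.h; the block size of 256 is an arbitrary choice.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: one thread per vertex, writes 0.0 into dp_yl.
__global__ void reset_yl(double* dp_yl, int num_vertices)
{
    int vertex_id = blockIdx.x * blockDim.x + threadIdx.x;
    if (vertex_id < num_vertices) {   // guard the partially filled last block
        dp_yl[vertex_id] = 0.0;
    }
}

// Hypothetical launcher: call this right before the main kernel launch.
void launch_reset_yl(double* dp_yl, int num_vertices)
{
    if (num_vertices <= 0) return;    // a grid with zero blocks is invalid
    const int threads = 256;          // assumed block size
    const int blocks = (num_vertices + threads - 1) / threads;
    reset_yl<<<blocks, threads>>>(dp_yl, num_vertices);
}

// Self-check: fill a device buffer with 1.0, reset it, copy it back and
// verify that every entry is 0.0.
bool reset_yl_self_check(int num_vertices)
{
    std::vector<double> host(num_vertices, 1.0);
    const size_t bytes = host.size() * sizeof(double);
    double* dp_yl = nullptr;
    if (cudaMalloc(&dp_yl, bytes) != cudaSuccess) return false;
    cudaMemcpy(dp_yl, host.data(), bytes, cudaMemcpyHostToDevice);
    launch_reset_yl(dp_yl, num_vertices);
    cudaError_t err = cudaDeviceSynchronize();
    cudaMemcpy(host.data(), dp_yl, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dp_yl);
    if (err != cudaSuccess) return false;
    for (double v : host) {
        if (v != 0.0) return false;
    }
    return true;
}
```

Under the same assumptions, cudaMemset(dp_yl, 0, num_vertices * sizeof(double)) would give the same result without a custom kernel, because an IEEE 754 double with all bytes zero is 0.0. A dedicated kernel may still be preferable if the reset is later fused with other per-vertex setup.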