Optimization version 3
Try to reduce the number of uploads per iteration and change the order of execution for the scatter kernel.
Edited by Alexander Gerwing