Split linearsolver into matrix-based and matrix-free parts
In particular, device executables no longer need to include amg and bcrsmatrix.