pacxx-projectseminar-2019 merge requests
https://zivgitlab.uni-muenster.de/kucher/pacxx-projectseminar-2019/-/merge_requests
2019-09-29T17:28:33+02:00

Include fix
https://zivgitlab.uni-muenster.de/kucher/pacxx-projectseminar-2019/-/merge_requests/103
2019-09-29T17:28:33+02:00
Alexander Gerwing

Follow move of pacxx-docker to hpc2se namespace
https://zivgitlab.uni-muenster.de/kucher/pacxx-projectseminar-2019/-/merge_requests/102
2019-08-21T12:03:38+02:00
Dr. Jorrit Fahlke

Addresses: HPC2SE-Project/pacxx-ci#8

WIP: Scatterkernelreorder optimization
https://zivgitlab.uni-muenster.de/kucher/pacxx-projectseminar-2019/-/merge_requests/101
2019-09-29T17:38:54+02:00
Alexander Gerwing

Some smaller optimizations to scatterkernelreorder.

Resolve "Figure out why RV times out with accumulation of results on host"
https://zivgitlab.uni-muenster.de/kucher/pacxx-projectseminar-2019/-/merge_requests/100
2019-08-06T17:21:11+02:00
Alexander Gerwing

Closes #70 and #83

Bucket colouring
https://zivgitlab.uni-muenster.de/kucher/pacxx-projectseminar-2019/-/merge_requests/99
2019-08-08T13:37:44+02:00
Alexander Gerwing

Fix variant visitation
https://zivgitlab.uni-muenster.de/kucher/pacxx-projectseminar-2019/-/merge_requests/98
2019-07-23T15:27:15+02:00
Dr. Jorrit Fahlke

So, as it turns out, pacxx supports C++17 on agamemnon, probably due to Ubuntu
18.04 (compared to Ubuntu 16.04 in the CI). This means `Dune::Std::variant`
is just `std::variant`, rather than using Dune's fallback implementation.
Now, with the fallback implementation, it was impossible to find `visit()` via
ADL for some reason, so I had been using the member function `visit()`, which
isn't part of `std::variant`. To satisfy both, try `using Dune::Std::visit`
before ADL.

[Timing] Synchronize, report from rank 0 only
https://zivgitlab.uni-muenster.de/kucher/pacxx-projectseminar-2019/-/merge_requests/97
2019-07-23T14:36:25+02:00
Dr. Jorrit Fahlke

The synchronization is done by taking the max over the ranks. The reasoning
for taking the max is that the other ranks will have to wait for the rank that
takes the longest. The reasoning for not summing up the timings is that we
aren't doing that for the individual threads running on the device either.
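
The max-over-ranks reduction described above can be sketched without MPI as follows. This is only an illustration: the function names, the label, and the timing values are made up, and in the actual MPI code the maximum would presumably come from a reduction with `MPI_MAX` rather than from a Python list.

```python
def synchronized_time(per_rank_seconds):
    """Combine per-rank timings by taking the maximum (not the sum)."""
    return max(per_rank_seconds)


def report_lines(label, per_rank_seconds):
    """Return what each rank would print: only rank 0 reports, the others stay silent."""
    total = synchronized_time(per_rank_seconds)
    return [f"{label}: {total:.3f} s" if rank == 0 else None
            for rank in range(len(per_rank_seconds))]


# Hypothetical wall-clock timings of one code section on four ranks.
timings = [1.20, 1.35, 0.98, 1.31]
print(report_lines("apply", timings))  # ['apply: 1.350 s', None, None, None]
```

The slowest rank (1.35 s here) determines the reported time, because every other rank has to wait for it at the next synchronization point.
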
Addresses: #88

Human-friendly category/mpi mode printing
https://zivgitlab.uni-muenster.de/kucher/pacxx-projectseminar-2019/-/merge_requests/96
2019-07-23T11:54:51+02:00
Dr. Jorrit Fahlke

Enable MPI for device
https://zivgitlab.uni-muenster.de/kucher/pacxx-projectseminar-2019/-/merge_requests/95
2019-07-23T12:26:31+02:00
Dr. Jorrit Fahlke

Addresses: #88

Somewhat simplify test definition for device tests
https://zivgitlab.uni-muenster.de/kucher/pacxx-projectseminar-2019/-/merge_requests/94
2019-07-23T11:23:37+02:00
Dr. Jorrit Fahlke

Compute and check L2 error
https://zivgitlab.uni-muenster.de/kucher/pacxx-projectseminar-2019/-/merge_requests/93
2019-07-18T10:50:00+02:00
Dr. Jorrit Fahlke

Addresses: #88
This is needed for MPI-parallel computation. In a parallel setting the partitioning is not always deterministic, so it becomes difficult to generate reference output files to compare against. This introduces a check on the [$`L^2`$ error norm](https://en.wikipedia.org/wiki/Lp_space#Lp_spaces) of the computed solution with respect to the known analytical solution. In particular, for a given computation with refinement level $`l`$ we check that the following holds:
```math
\|x_l-x_\text{ref}\|_{L^2} < C'·h_l^2 = C·2^{-2·l}
```
- $`x_l`$ is the computed solution at refinement level $`l`$
- $`x_\text{ref}`$ is the analytic reference solution
- $`h_l`$ is the size of the mesh elements at refinement level $`l`$. For the structured refinement we are using, we have $`h_l=2^{-l}·h_0`$. The exact definition of "size of the element" isn't that important; what matters is that it halves with every refinement step.
- $`C`$ is a parameter that needs to be determined experimentally, such that the above holds for all refinement levels $`l`$ we are interested in. It is usually something like the $`L^2`$ error norm at refinement level 0. But it can happen that the error at refinement level 0 is "too good": the above inequality only makes a statement about the upper limit for the error, not the lower limit. In that case $`C`$ needs to be enlarged artificially so that the inequality also holds for the other levels we are interested in.
- $`C'`$ is just $`\frac{C}{h_0^2}`$ (this follows from $`h_l=2^{-l}·h_0`$); it is only used to write the right-hand side of the inequality in a more familiar form that might be found in a textbook.
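
The check and the choice of $`C`$ described in the list above can be sketched in a few lines of Python. This is an illustration only, not the test code of this merge request: the function names and the error values are made up. The sketch uses the fact that the inequality is equivalent to $`C > \|x_l-x_\text{ref}\|_{L^2}·2^{2·l}`$ for every level $`l`$.

```python
def error_bound(C, level):
    """Right-hand side of the check: C * 2**(-2 * level)."""
    return C * 2.0 ** (-2 * level)


def violated_levels(errors, C):
    """Return the refinement levels whose error is NOT strictly below the bound."""
    return [l for l in sorted(errors) if not errors[l] < error_bound(C, l)]


def minimal_C(errors):
    """Lower limit for admissible C: we need C > e_l * 2**(2 * l) for every level l."""
    return max(e * 2.0 ** (2 * l) for l, e in errors.items())


# Made-up L2 errors per refinement level; level 0 is "too good" here.
errors = {0: 0.05, 1: 0.02, 2: 0.0052}

print(violated_levels(errors, C=0.06))  # C chosen near the level-0 error: [1, 2]
print(minimal_C(errors))                # any C above this value passes all levels
print(violated_levels(errors, C=0.1))   # enlarged C: []
```

With these numbers, a $`C`$ taken from the level-0 error alone rejects levels 1 and 2 even though the errors shrink at roughly the expected rate, which is the "too good" situation described above.
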
The square in $`h_l^2`$ (or equivalently, the 2 in the exponent in $`2^{-2·l}`$) is actually a property of the finite element scheme we are using. (For Q1 ansatz functions you usually have this 2, for Q2 you would have 3, etc. This is the reason why people bother with higher-order ansatz functions: they allow for much coarser meshes while still keeping the error below a certain level.)

Zero the interior of the initial solution
https://zivgitlab.uni-muenster.de/kucher/pacxx-projectseminar-2019/-/merge_requests/92
2019-07-17T15:47:00+02:00
Dr. Jorrit Fahlke

Closes: #92

Add mpi support
https://zivgitlab.uni-muenster.de/kucher/pacxx-projectseminar-2019/-/merge_requests/91
2019-07-24T09:07:32+02:00
Dr. Jorrit Fahlke

Implement MPI parallelism for the matrix-free host code. Leaving parallel device code for other MRs.
Addresses: #88
WIP:
- [x] Make sure the CI runs the MPI tests

Resolve "device-v2 flavor is not included in the default flavours in the bench script"
https://zivgitlab.uni-muenster.de/kucher/pacxx-projectseminar-2019/-/merge_requests/90
2019-06-24T12:41:27+02:00
Alexander Gerwing

Closes #87

Optimization version 3
https://zivgitlab.uni-muenster.de/kucher/pacxx-projectseminar-2019/-/merge_requests/89
2019-08-06T14:36:03+02:00
Alexander Gerwing

Try to reduce the amount of uploads per iteration, change the order of execution for the scatter kernel.

Resolve "yl_tmp (re)initialization"
https://zivgitlab.uni-muenster.de/kucher/pacxx-projectseminar-2019/-/merge_requests/88
2019-06-24T18:03:54+02:00
Alexander Gerwing

Closes #82

First optimization attempts
https://zivgitlab.uni-muenster.de/kucher/pacxx-projectseminar-2019/-/merge_requests/87
2019-06-20T12:57:39+02:00
Alexander Gerwing

Closes #55

Resolve "minimal reproducer to test += vs. = assignment in gridoperator.h:153"
https://zivgitlab.uni-muenster.de/kucher/pacxx-projectseminar-2019/-/merge_requests/86
2019-06-19T15:41:42+02:00
Alexander Gerwing

Closes #83

Gridoperator device v2
https://zivgitlab.uni-muenster.de/kucher/pacxx-projectseminar-2019/-/merge_requests/85
2019-06-24T12:25:27+02:00
Alexander Gerwing

Introduced a second version of the device gridoperator.
Optimizations can now be applied to the PPS::v2:: classes, without affecting the original brute-force version.

CRTP-less way to semi-automatically determine timer names
https://zivgitlab.uni-muenster.de/kucher/pacxx-projectseminar-2019/-/merge_requests/84
2019-06-17T20:37:32+02:00
Dr. Jorrit Fahlke

Just make the `trafo` argument to `nonlinear_jacobian_apply` a proper functor, not a lambda, that demangles to something benign.
Closes: #81