Micro element matrix * vector: measured performance significantly higher than expected
The measured performance for the Micro Elementwise Stiffness Matrix is significantly higher (2x) than expected. It is currently unclear to me why that is the case.
The following measured metrics line up with the expected ones:
- OI (higher, but not significantly)
- overall FLOP (close once the precomputations are taken into account)
- overall memory transfers
Not lining up:
- (kernel) runtime
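
To make the comparison above concrete, here is a minimal sketch of how an expected kernel runtime could be derived from the FLOP and memory-transfer counts, assuming a simple roofline model. The function name `roofline_runtime` and all numbers are hypothetical placeholders, not measurements from this issue, and the model actually used for the expectation may differ.

```python
def roofline_runtime(flop, bytes_moved, peak_flops, peak_bandwidth):
    """Return (OI, attainable FLOP/s, expected runtime in s) for a simple roofline model."""
    oi = flop / bytes_moved                            # operational intensity in FLOP/byte
    attainable = min(peak_flops, oi * peak_bandwidth)  # roofline bound in FLOP/s
    return oi, attainable, flop / attainable


# Placeholder inputs (assumptions, not data from this issue).
flop = 2.0e9            # overall FLOP of the kernel
bytes_moved = 4.0e9     # overall memory transfers in bytes
peak_flops = 1.0e12     # assumed peak compute throughput in FLOP/s
peak_bandwidth = 1.0e11 # assumed peak memory bandwidth in bytes/s

oi, attainable, t_expected = roofline_runtime(flop, bytes_moved, peak_flops, peak_bandwidth)

# Hypothetical measured runtime that is 2x shorter than the model predicts.
t_measured = 0.5 * t_expected
speedup = t_expected / t_measured
achieved = flop / t_measured  # achieved FLOP/s implied by the measured runtime

print(f"OI = {oi} FLOP/byte, bound = {attainable:.3e} FLOP/s")
print(f"expected {t_expected:.3e} s, measured {t_measured:.3e} s, factor {speedup}")
print(f"achieved {achieved:.3e} FLOP/s vs. bound {attainable:.3e} FLOP/s")
```

In a model like this, matching FLOP and memory transfers combined with a shorter runtime means the achieved FLOP/s exceed the modeled bound. That might point at the assumed peak values (for example the bandwidth figure) rather than at the counts, but this is only one possible reading.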