Add assembleGradient() method to FE assemblers
This adds a method to compute only the gradient.
On the pro side, it saves a lot of time if only the gradient is needed (e.g. for first-order methods).
On the con side, it introduces some duplicated code (but only for hyperdual numbers).
A test case is added.
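
A minimal sketch of why a gradient-only path can be cheaper, assuming the energy is differentiated by automatic differentiation: a first-order dual number carries a single derivative part, which is enough for the gradient, while second-order parts are only needed for the Hessian. The names `Dual`, `toyEnergy` and `toyGradient` are hypothetical and are not taken from the assemblers in this change.

```cpp
#include <array>
#include <cstddef>

// Hypothetical first-order dual number: a value plus one derivative part.
struct Dual {
    double val;
    double eps;
};

inline Dual operator+(Dual a, Dual b) { return {a.val + b.val, a.eps + b.eps}; }
inline Dual operator-(Dual a, Dual b) { return {a.val - b.val, a.eps - b.eps}; }
// Product rule for the derivative part.
inline Dual operator*(Dual a, Dual b) {
    return {a.val * b.val, a.val * b.eps + a.eps * b.val};
}

// Toy "element energy" E(u) = (u1 - u0)^2 + u0 * u1 (illustrative only),
// templated on the scalar type so it works with double and Dual.
template <class T>
T toyEnergy(const std::array<T, 2>& u) {
    T d = u[1] - u[0];
    return d * d + u[0] * u[1];
}

// Gradient only: one evaluation per degree of freedom, seeding the
// derivative part of that degree of freedom with 1.
std::array<double, 2> toyGradient(const std::array<double, 2>& u) {
    std::array<double, 2> grad{};
    for (std::size_t i = 0; i < 2; ++i) {
        std::array<Dual, 2> x{Dual{u[0], 0.0}, Dual{u[1], 0.0}};
        x[i].eps = 1.0;
        grad[i] = toyEnergy(x).eps;
    }
    return grad;
}
```

In this sketch the gradient of a 2-DOF energy takes two evaluations with a two-component scalar; no second-order information is computed or stored.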