[draft][AD] Add some automated differentiation utilities
This provides the ADValue<T,maxOrder,dim> class as a primitive for automatic differentiation. It encapsulates a scalar type T (e.g. T=double) and provides arithmetic operations.
Each ADValue is considered as an intermediate result in the evaluation of a scalar dim-variate function. It stores not only the value, but the jet of all derivatives up to order maxOrder. Currently only maxOrder<=2 is implemented.
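To illustrate what "storing the jet up to order 2" means, here is a minimal sketch of such a type. The names Jet2, constant, variable and evaluateExample are hypothetical and chosen for illustration only; this is not the ADValue implementation from this merge request, and it ignores any storage optimizations (e.g. exploiting the symmetry of the Hessian).

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical illustration type, NOT the ADValue class of this MR.
// Holds value, gradient and Hessian of an intermediate result with
// respect to `dim` independent variables.
template<class T, std::size_t dim>
struct Jet2
{
  T value{};
  std::array<T, dim> grad{};                   // first-order partials
  std::array<std::array<T, dim>, dim> hess{};  // second-order partials

  // A constant: all derivatives vanish.
  static Jet2 constant(T v)
  {
    Jet2 r;
    r.value = v;
    return r;
  }

  // The k-th independent variable: its k-th partial derivative is 1.
  static Jet2 variable(T v, std::size_t k)
  {
    Jet2 r;
    r.value = v;
    r.grad[k] = T(1);
    return r;
  }
};

// Sum rule: all components add up.
template<class T, std::size_t dim>
Jet2<T, dim> operator+(const Jet2<T, dim>& a, const Jet2<T, dim>& b)
{
  Jet2<T, dim> r;
  r.value = a.value + b.value;
  for (std::size_t i = 0; i < dim; ++i) {
    r.grad[i] = a.grad[i] + b.grad[i];
    for (std::size_t j = 0; j < dim; ++j)
      r.hess[i][j] = a.hess[i][j] + b.hess[i][j];
  }
  return r;
}

// Product rule up to second order.
template<class T, std::size_t dim>
Jet2<T, dim> operator*(const Jet2<T, dim>& a, const Jet2<T, dim>& b)
{
  Jet2<T, dim> r;
  r.value = a.value * b.value;
  for (std::size_t i = 0; i < dim; ++i) {
    r.grad[i] = a.grad[i] * b.value + a.value * b.grad[i];
    for (std::size_t j = 0; j < dim; ++j)
      r.hess[i][j] = a.hess[i][j] * b.value + a.grad[i] * b.grad[j]
                   + a.grad[j] * b.grad[i] + a.value * b.hess[i][j];
  }
  return r;
}

// Chain rule for sin up to second order.
template<class T, std::size_t dim>
Jet2<T, dim> sin(const Jet2<T, dim>& a)
{
  const T s = std::sin(a.value);
  const T c = std::cos(a.value);
  Jet2<T, dim> r;
  r.value = s;
  for (std::size_t i = 0; i < dim; ++i) {
    r.grad[i] = c * a.grad[i];
    for (std::size_t j = 0; j < dim; ++j)
      r.hess[i][j] = -s * a.grad[i] * a.grad[j] + c * a.hess[i][j];
  }
  return r;
}

// Evaluate f(x0,x1) = sin(x0*x1) + x0 together with its derivatives.
inline Jet2<double, 2> evaluateExample(double a, double b)
{
  auto x0 = Jet2<double, 2>::variable(a, 0);
  auto x1 = Jet2<double, 2>::variable(b, 1);
  return sin(x0 * x1) + x0;
}
```

Every arithmetic operation propagates the value and all partial derivatives at once, so no tape and no global state is needed.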
Additionally, this provides some convenience utilities for using ADValue:
- Functions for easy definition of AD-aware nonlinear differentiable functions.
- Overloads for a few basic functions (abs, sin, cos, log, exp, sqrt, pow).
- Some glue code to implement dune-functions differentiable functions (for a callback f simply use ADFunction<f>).
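For reference, overloads of univariate functions such as sin or exp presumably boil down to the standard chain rule up to second order. The formulas below are textbook identities, written for a generic smooth function g and an intermediate result u; they are not taken from the code of this merge request.

```latex
y = g(u), \qquad
\partial_i y = g'(u)\,\partial_i u, \qquad
\partial_i \partial_j y = g''(u)\,\partial_i u\,\partial_j u + g'(u)\,\partial_i \partial_j u .
```

For example, g = exp gives g' = g'' = exp, so all three components can reuse a single evaluation of exp(u).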
Usage example:
using std::sin;
using std::exp;
using std::pow;
using namespace Dune::Indices;
// Define and initialize the arguments of a tri-variate,
// twice differentiable function.
auto x0 = ADValue<double,2,3>(23., 0);
auto x1 = ADValue<double,2,3>(42., 1);
auto x2 = ADValue<double,2,3>(13., 2);
// Evaluate expression
auto y = pow(2, exp(sin(x0*x1)*sin(x2) + 3));
// Extract value of function, 1st and 2nd order partial derivatives
// (i and j denote indices of the function arguments)
y.partial();
y.partial(i);
y.partial(i,j);
Why propose this despite the fact we have AdolC bindings?
- Disadvantages of AdolC tape-based mode:
  - It relies on global variables and is not thread safe.
  - It is much slower. For simple expressions I measured that it is between 10 and 100 times slower than ADValue. When using AD for the derivatives of the energy with a Newton method for the minimal surface equation, assembly takes about 20 times as long with AdolC compared to ADValue. The latter is essentially as fast as with manually implemented derivatives and moreover allows for multi-threading (not used in the comparison).
- AFAIK ADValue can be characterized as tapeless forward mode Taylor polynomial AD.
- AdolC also has a tape-less forward mode, which is probably similar to ADValue but has some conceptual restrictions:
  - Only 1st order derivatives are available, so this can't be used for Newton's method.
  - You have to decide in advance on the maximal domain dimension and set it using a global variable. This influences the memory consumption of all AD-values stored later on.
  - There's also no guarantee about thread-safety.
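To make the point about Newton's method concrete: minimizing an energy with Newton's method needs both first and second derivatives, which a second-order jet delivers in a single evaluation. The following is a minimal 1D sketch under that assumption. Jet1D, energy and newtonMinimize are hypothetical names for illustration; the sketch uses neither ADValue nor AdolC, and the energy E(x) = exp(x) - 2x is a toy example unrelated to the minimal surface equation.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical 1D second-order jet: value, first and second derivative.
struct Jet1D
{
  double v;   // value
  double d1;  // first derivative
  double d2;  // second derivative
};

inline Jet1D operator-(const Jet1D& a, const Jet1D& b)
{
  return {a.v - b.v, a.d1 - b.d1, a.d2 - b.d2};
}

inline Jet1D operator*(double c, const Jet1D& a)
{
  return {c * a.v, c * a.d1, c * a.d2};
}

// Chain rule for exp up to second order.
inline Jet1D exp(const Jet1D& a)
{
  const double e = std::exp(a.v);
  return {e, e * a.d1, e * (a.d1 * a.d1 + a.d2)};
}

// Toy energy E(x) = exp(x) - 2x, strictly convex with minimizer log(2).
// Written generically so it can be evaluated with a jet argument.
template<class X>
X energy(const X& x)
{
  return exp(x) - 2.0 * x;
}

// Newton's method for E'(x) = 0. Each iteration evaluates the energy
// once on a jet and reads off E' and E'' from the result.
inline double newtonMinimize(double x, int maxIter = 50, double tol = 1e-14)
{
  for (int k = 0; k < maxIter; ++k) {
    const Jet1D e = energy(Jet1D{x, 1.0, 0.0});  // seed dx/dx = 1
    const double step = e.d1 / e.d2;
    x -= step;
    if (std::abs(step) < tol)
      break;
  }
  return x;
}
```

Calling newtonMinimize(0.0) should converge to log(2) within a few iterations. With first-order derivatives only, the Hessian would have to be coded by hand or approximated.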
Edited by Carsten Gräser