Commit a8748828 by Praetorius, Simon

### Expression templates chapter updated

parent aca58695
... ...
@@ -10,30 +10,37 @@ class: center, middle
 - Evaluation of a complex expression happens pairwise.
 - Multiple copy operations and loops involved.
 --
 ## Performance problem
 - Multiple allocations and deallocations cost time (penalty for small vectors)
 - Multiple loops cost time (penalty for large vectors)
 ---
 # Expression Templates
 ## Pairwise evaluation problem
 ```c++
-Vector a{N}, b{N}, c{N}, erg{N};
-erg = a + b + c;
+Vector a{N}, b{N}, c{N}, sum{N};
+sum = a + b + c;
 ```
-generate code like this:
+generates code like this:
 ```c++
 double* tmp1 = new double[N];   // allocation of temporary
-for (int i = 0; i < N; ++i)     // copy
+for (int i = 0; i < N; ++i)     // copy loop 1
   tmp1[i] = a[i];
-for (int i = 0; i < N; ++i)     // operator+=
+for (int i = 0; i < N; ++i)     // operator+= loop 2
   tmp1[i] += b[i];
 double* tmp2 = new double[N];   // allocation of temporary
-for (int i = 0; i < N; ++i)     // copy
+for (int i = 0; i < N; ++i)     // copy loop 3
   tmp2[i] = tmp1[i];
-for (int i = 0; i < N; ++i)     // operator+=
-  tmp2 += c[i];
+for (int i = 0; i < N; ++i)     // operator+= loop 4
+  tmp2[i] += c[i];
 delete[] tmp1;                  // destroy temporary
-for (int i = 0; i < N; ++i)     // copy
-  erg[i] = tmp2[i];
+for (int i = 0; i < N; ++i)     // copy loop 5
+  sum[i] = tmp2[i];
 delete[] tmp2;                  // destroy temporary
 ```
... ...
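The pairwise evaluation described in this hunk can be observed with a small instrumented class. The following is only a sketch under stated assumptions: `NaiveVector` and the counter `temporaries_created` are hypothetical names that do not appear in the slides, and `std::vector` stands in for the raw `new[]`/`delete[]` buffers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical counter (not part of the slides): number of buffers created by copying.
int temporaries_created = 0;

// Hypothetical vector class with a classical, eagerly evaluating operator+.
struct NaiveVector {
  std::vector<double> data;

  NaiveVector(std::size_t n, double value) : data(n, value) {}

  // Copying allocates a new buffer and runs one copy loop; count it.
  NaiveVector(NaiveVector const& other) : data(other.data) { ++temporaries_created; }
  NaiveVector(NaiveVector&&) noexcept = default;
  NaiveVector& operator=(NaiveVector const&) = default;
  NaiveVector& operator=(NaiveVector&&) noexcept = default;

  // Elementwise in-place addition: one more loop per '+'.
  NaiveVector& operator+=(NaiveVector const& rhs) {
    assert(data.size() == rhs.data.size());
    for (std::size_t i = 0; i < data.size(); ++i)
      data[i] += rhs.data[i];
    return *this;
  }
};

// Pairwise evaluation: every '+' copies its left operand, then adds in place.
NaiveVector operator+(NaiveVector const& a, NaiveVector const& b) {
  NaiveVector tmp(a);
  tmp += b;
  return tmp;
}
```

With three operands, `sum = a + b + c;` should increment the counter twice, once for each `+`, which corresponds to the two temporaries `tmp1` and `tmp2` in the hand-written loops.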
@@ -42,6 +49,56 @@ delete[] tmp2; // destroy temporary
 # Expression Templates
 ## Pairwise evaluation problem
+```c++
+Vector a{N}, b{N}, c{N}, sum{N};
+sum = a + b + c;
+```
+generates code like this: (with copy elision)
+```c++
+double* tmp1 = new double[N];   // allocation of temporary
+for (int i = 0; i < N; ++i)     // copy loop 1
+  tmp1[i] = a[i];
+for (int i = 0; i < N; ++i)     // operator+= loop 2
+  tmp1[i] += b[i];
+// double* tmp2 = new double[N];
+for (int i = 0; i < N; ++i)     // copy loop 3
+  sum[i] = tmp1[i];
+for (int i = 0; i < N; ++i)     // operator+= loop 4
+  sum[i] += c[i];
+delete[] tmp1;                  // destroy temporary
+// delete[] tmp2;
+```
+---
+# Expression Templates
+## Pairwise evaluation problem
+```c++
+Vector a{N}, b{N}, c{N}, sum{N};
+sum = a + b + c;
+```
+generates code like this: (with *more* copy elision - not automatically)
+```c++
+// double* tmp1 = new double[N];
+for (int i = 0; i < N; ++i)     // copy loop 1
+  sum[i] = a[i];
+for (int i = 0; i < N; ++i)     // operator+= loop 2
+  sum[i] += b[i];
+// double* tmp2 = new double[N];
+for (int i = 0; i < N; ++i)     // operator+= loop 3
+  sum[i] += c[i];
+// delete[] tmp1;
+// delete[] tmp2;
+```
+---
+# Expression Templates
+## Pairwise evaluation problem
 ```c++
 Vector a{N}, b{N}, c{N}, erg{N};
 erg = a + b + c;
... ...
@@ -49,19 +106,20 @@ erg = a + b + c;
 Goal: "automatically" generate function that uses only 1 loop and 0 allocations/copies:
 ```c++
-void sum3 (Vector const& a, Vector const& b, Vector const& c, Vector& erg)
+void vector_sum (Vector const& a, Vector const& b, Vector const& c, Vector& sum)
 {
   for (int i = 0; i < N; ++i)
-    erg[i] = a[i] + b[i] + c[i];
+    sum[i] = a[i] + b[i] + c[i];
 }
 ...
-sum3(a, b, c, erg); // problem: no mathematical notation
+vector_sum(a, b, c, sum); // problem: no mathematical notation
 ```
 ---
 # Domain-Specific (Embedded) Languages
-- Description and solution of a problem to be can be expressed in the idiom and at the level of
+## Domain-Specific Language (DSL)
+- Description and solution of a problem can be expressed in the idiom and at the level of
   abstraction of the problem domain.
 - Use operators and symbols with a specific meaning in the specific application domain
... ...
@@ -91,7 +149,7 @@ sum3(a, b, c, erg); // problem: no mathematical notation
 # Expression templates: encoding operations
 - Use operator overloading to express an operation
 - Do not perform the operation in the operator, but return an object that "encodes" that operation
-- An assignment of the expression to a variable, perform the encoded operation.
+- An assignment of the expression to a variable performs the encoded operation.
 ## Example
 Assume, we have a way to (automatically) create a functor
... ...
@@ -132,7 +190,9 @@ public:
   VectorPlusExpr (Vector const& a, Vector const& b)
     : a_(a)
     , b_(b)
-  {}
+  {
+    assert(a_.size() == b_.size());
+  }
   // element access operator
   double operator[] (int i) const { return a_[i] + b_[i]; }
... ...
@@ -157,7 +217,9 @@ public:
   PlusExpr (A const& a, A const& b)
     : a_(a)
     , b_(b)
-  {}
+  {
+    assert(a_.size() == b_.size());
+  }
   // element access operator
   double operator[] (int i) const { return a_[i] + b_[i]; }
... ...
@@ -173,8 +235,11 @@ public:
 The expression `a + b + c` can now be encoded by composition of `PlusExpr`:
 ```c++
-a + b + c -> PlusExpr<PlusExpr<Vector,Vector>, Vector>
+(a + b) + c -> PlusExpr<PlusExpr<Vector,Vector>, Vector>
 ```
+--
 Thus, the `operator+` can simply return the binary expression `PlusExpr`:
 ```c++
 template
... ...
@@ -205,7 +270,7 @@ public:
     : f_(f), a_(a), b_(b) {}
   // access the i'th element of the expression
-  double operator[](int i) const { return f_( a_[i], b_[i] ); }
+  double operator[] (int i) const { return f_( a_[i], b_[i] ); }
   // size information
   int size () const { return a_.size(); }
... ...
@@ -226,7 +291,7 @@ struct Plus {
 we can write the `PlusExpr` as
 ```c++
-a + b + c -> BinaryExpr<Plus, BinaryExpr<Plus,Vector,Vector>, Vector>
+(a + b) + c -> BinaryExpr<Plus, BinaryExpr<Plus,Vector,Vector>, Vector>
 ```
 --
... ...
@@ -252,11 +317,140 @@ auto operator+ (auto const& a, auto const& b)
 }
 ```
 ---
 # Expression templates
+## Unary expressions
+- Binary expressions can handle `+`, `-`
+- Unary expressions can handle negation `-`, but also multiplication with a scalar:
+```c++
+template <class UnaryFunctor, class A>
+class UnaryExpr
+{
+  UnaryFunctor f_;
+  A const& a_;
+public:
+  UnaryExpr (UnaryFunctor f, A const& a)
+    : f_(f), a_(a)
+  {}
+  // access the i'th element of the expression
+  double operator[] (int i) const { return f_( a_[i] ); }
+  // size information
+  int size () const { return a_.size(); }
+};
+```
+---
+# Expression templates
+## Unary expressions
+### Example 1: negation
+To negate all the elements in a vector, apply the negate operator elementwise:
+```c++
+template <class A>
+auto operator- (A const& a)
+{
+  return UnaryExpr{[](auto ai) { return -ai; }, a};
+}
+```
+---
+# Expression templates
+## Unary expressions
+### Example 2: scaling
+Scale the elements of a container from the left (or from the right)
+```c++
+// multiplication from the left with factor
+template <class A>
+auto operator* (double factor, A const& a)
+{
+  return UnaryExpr{[factor](auto ai) { return factor*ai; }, a};
+}
+// multiplication from the right with factor
+template <class A>
+auto operator* (A const& a, double factor)
+{
+  return UnaryExpr{[factor](auto ai) { return ai*factor; }, a};
+}
+```
+---
+# Expression templates
+## Generator expressions
+- Similar to the `std::generate` algorithm with a generator function parameter, generator expressions
+  can be used to generate values without storing them explicitly
+```c++
+template <class Generator>
+class GeneratorExpr
+{
+  Generator g_;
+  int size_;
+  // need size information explicitly
+public:
+  GeneratorExpr (Generator g, int size)
+    : g_(g), size_(size)
+  {}
+  // generate the i'th element
+  double operator[] (int i) const { return g_( i ); }
+  // size information
+  int size () const { return size_; }
+};
+```
+**Note:** The generator is not a generator in the classical sense, i.e., a functor with empty
+parameter list, but a functor that just receives an index.
+---
+# Expression templates
+## Generator expressions
+### Example 1: Unit vector
+The Euclidean unit vectors \(\mathbf{e}_i\in\mathbb{R}^n\)
+```c++
+template <int i>
+auto e (int size)
+{
+  return GeneratorExpr{[](int j) { return i==j ? 1.0 : 0.0; }, size};
+}
+int main()
+{
+  const int i = 1;
+  int n = 3;
+  auto unit_vector = e<i>(n);
+}
+```
+---
+# Expression templates
+## Generator expressions
+### Example 2: Zero vector
+The vector representing the origin \((0,0,\ldots,0)^T\):
+```c++
+auto zero (int size)
+{
+  return GeneratorExpr{[](int /*j*/) { return 0.0; }, size};
+}
+int main()
+{
+  auto zero_vector = zero(3);
+}
+```
+---
+# Expression templates
 ## Terminal Operations
-- A terminal operation is an operation when the expression is actually evaluated.
+- A terminal operation is an operation that executes the expression, using `operator[]`.
 - It might be that the expression can be evaluated only once.
 ### Examples:
... ...
@@ -269,8 +463,7 @@ auto operator+ (auto const& a, auto const& b)
 # Expression templates
-## Terminal Operations
-### Vector with Constructor from Expression
+## Terminal Operations: Vector with Constructor from Expression
 ```c++
 struct Vector {
 ...
... ...
@@ -287,8 +480,7 @@ struct Vector {
 ---
 # Expression templates
-## Terminal Operations
-### Vector with assignment operators
+## Terminal Operations: Vector with assignment operators
 ```c++
 ...
 template
... ...
@@ -297,14 +489,250 @@ struct Vector {
     assert(size_ == expr.size());
     for (int i = 0; i < size_; ++i) // Evaluate the expression elementwise
       data_[i] = expr[i];
     return *this;
   }
+  template <class Functor, class A, class B>
+  Vector& operator+= (BinaryExpr<Functor,A,B> const& expr)
+  {
+    assert(size_ == expr.size());
+    for (int i = 0; i < size_; ++i) // Evaluate the expression elementwise
+      data_[i] += expr[i];
+    return *this;
+  }
 };
 ```
+---
+# Expression templates
+## Terminal Operations: Inner product
+Evaluate expressions in a loop.
+```c++
+template <class A, class B>
+double dot (A const& a, B const& b)
+{
+  assert(a.size() == b.size());
+  double res = 0.0;
+  for (int i = 0; i < a.size(); ++i)
+    res += a[i] * b[i];
+  return res;
+}
+```
+---
+# Expression templates
+## Expression templates in action
+```c++
+Vector a{10}, b{10}, c{10};
+Vector sum = a + b + c;
+// sum = BinaryExpr{Plus{}, BinaryExpr{Plus{}, a, b}, c};
+Vector sum2 = -a + e<3>(10) + zero(10);
+// sum2 = BinaryExpr{Plus{},
+//          BinaryExpr{Plus{},
+//            UnaryExpr{Negate{},a},
+//            GeneratorExpr{Unit<3>{},10} },
+//          GeneratorExpr{Zero{},10} };
+double res = dot(a + b, e<3>(10) - c);
+// dot(BinaryExpr{Plus{}, a, b},
+//     BinaryExpr{Minus{}, GeneratorExpr{Unit<3>{},10}, c})
+```
+---
+# Expression templates
+## Some Design Problems
+- Operators, e.g., `operator+`, are implemented generically, thus accept everything
+  \(\to\) how can we restrict the arguments?
+- When implementing `UnaryExpr`, `BinaryExpr`, maybe also `GeneratorExpr`, then the number
+  of constructors and assignment operators in the `Vector` class grows.
+- Inventing a new type of expression means we have to extend the `Vector`.
+- Everything is stored by reference. This might lead to dangling references.
+--
+## Different Solution ideas
+- curiously recurring template pattern (CRTP)
+- C++20 concepts
+---
+# Curiously recurring template pattern (CRTP)
+An idiom in C++ in which a class `D` derives from a class template instantiation `B<D>` using `D`
+itself as a template argument.
+```c++
+// The Curiously Recurring Template Pattern (CRTP)
+template <class Derived>
+struct Base
+{
+  // methods within Base can use template to access members of Derived
+};
+struct Derived : public Base<Derived>
+{
+  // ...
+};
+```
+- Class `Derived` inherits all members from `Base<Derived>`, its base class.
+- Class `Base` cannot directly access any member from `Derived`
+---
+# Curiously recurring template pattern (CRTP)
+An idiom in C++ in which a class `D` derives from a class template instantiation `B<D>` using `D`
+itself as a template argument.
+```c++
+// The Curiously Recurring Template Pattern (CRTP)
+template <class Derived>
+struct Base
+{
+  void interface ()
+  {
+    static_cast<Derived*>(this)->implementation(); // cast the pointer to the class instance
+  }                                                // to the derived class type.
+protected:
+  Base () {}  // constructor is protected. Can be called
+};            // by derived class only.
+struct Derived : public Base<Derived>
+{
+  void implementation ()
+  {
+    // the actual implementation
+  }
+};
+```
+---
+# Curiously recurring template pattern (CRTP)
+An idiom in C++ in which a class `D` derives from a class template instantiation `B<D>` using `D`
+itself as a template argument.
+```c++
+template <class Derived>
+void foo (Base<Derived>& base)
+{
+  base.interface();
+}
+int main ()
+{
+  Derived derived;
+  foo(derived);
+}
+```
+`foo()` can be called, since `Derived` *is-a* `Base<Derived>`.
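The three CRTP slides above can be combined into one compilable sketch. Assumptions that go beyond the slides: `interface()` returns a `std::string` so that the static dispatch becomes observable, and the two derived classes `DerivedA` and `DerivedB` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <string>

// CRTP base: knows the derived type at compile time, no virtual functions involved.
template <class Derived>
struct Base {
  // Forward the call to the derived class via a static_cast of *this.
  std::string interface() const {
    return "Base calls " + static_cast<Derived const&>(*this).implementation();
  }

protected:
  Base() = default;  // only derived classes may construct the base
};

// Two hypothetical derived classes, each passing itself as template argument.
struct DerivedA : Base<DerivedA> {
  std::string implementation() const { return "A"; }
};

struct DerivedB : Base<DerivedB> {
  std::string implementation() const { return "B"; }
};

// Generic function: accepts any class that derives from Base<Derived>.
template <class Derived>
std::string foo(Base<Derived> const& base) {
  return base.interface();
}
```

Calling `foo` with a `DerivedA` or a `DerivedB` object deduces `Derived` from the base class and selects the matching `implementation()` at compile time, while an argument that does not derive from `Base` is rejected by template argument deduction.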
+---
+# Expression Templates - CRTP
+## A Base Expression
+- Use the CRT pattern to implement a common base class for all expressions:
+```c++
+template <class Derived>
+struct Expr
+{
+  // Element access
+  auto operator[] (auto const i) const
+  {
+    return static_cast<Derived const&>(*this).access_impl(i);
+  }
+  // Length of the vector expression
+  auto size () const
+  {
+    return static_cast<Derived const&>(*this).size_impl();
+  }
+protected:
+  Expr () = default;
+};
+```
+---
+# Expression Templates - CRTP
+## All expressions derived from `Expr`
+```c++
+template <class Functor, class A, class B>
+class BinaryExpr
+    : public Expr< BinaryExpr<Functor,A,B> >
+{
+  Functor f_;
+  A const& a_;
+  B const& b_;
+public:
+  BinaryExpr (Functor f, A const& a, B const& b)
+    : f_(f), a_(a), b_(b)
+  {
+    assert(a_.size() == b_.size());
+  }
+  // implementation of element access
+  double access_impl (int i) const { return f_( a_[i], b_[i] ); }
+  // implementation of size information
+  int size_impl () const { return a_.size(); }
+};
+```
+---
+# Expression Templates - CRTP
+## Generic argument `Expr<E>`
+In all functions expecting an expression and combining expressions, use `Expr<E>` instead:
+```c++
+template <class A, class B>
+auto operator+ (Expr<A> const& a, Expr<B> const& b)
+{
+  return BinaryExpr{Plus{}, a, b};
+}
+```
+**Note:** The class `Vector` must also derive from `Expr<Vector>`.
+```c++
+class Vector
+    : public Expr<Vector>
+{
+  template <class E>
+  Vector (Expr<E> const& expr); // copy-construct the vector from a generic expression
+  ...
+  double access_impl (int i) const { return data_[i]; }
+  int size_impl () const { return size_; }
+};
+```
+---
+# Expression Templates - CRTP
+## Example: inner product
+```c++
+template <class E1, class E2>
+auto dot (Expr<E1> const& expr1, Expr<E2> const& expr2)
+{
+  assert(expr1.size() == expr2.size());
+  using T = decltype(expr1[0]*expr2[0]); // type of the elements
+  using S = decltype(expr1.size());      // type of the indices
+  T res = 0.0;
+  for (S i = 0; i < expr1.size(); ++i)
+    res += expr1[i] * expr2[i];
+  return res;
+}
+```
+---
+# Expression Templates
+## Design problems with CRTP
+- It is intrusive, i.e., you need to derive from a base class
+- You need to implement a specific "impl" function.
+- Increases compile times due to additional templates and indirection
+## Possible Solution
+- C++20 Concepts
\ No newline at end of file
 #include #include #include #include #include template struct Expr { // Element access auto operator[] (std::size_t const i) const