An advantage of generic programming is that we can reuse the same code for multiple types/arguments/parameters. Although this gives us a lot of flexibility,
sometimes we want to be more specific, \ie some algorithms require special knowledge about the data types used to allow an efficient implementation. Some
types have additional guarantees (\eg contiguous storage of their data elements) or allow more (efficient) operations (\eg containers with an optimized sorting
algorithm implemented directly instead of relying on the generic \cpp{std::sort} function). In such cases it would be advantageous to provide a more specialized implementation
of the same algorithm or data structure without losing the generic version for the other types.
\begin{example}
Consider the function \texttt{distance()} returning the Euclidean distance of coordinate vectors.
%
\begin{minted}{c++}
#include <cmath>   // std::sqrt, std::abs
#include <cstddef> // std::size_t

template <typename Point>
double distance(Point const& a, Point const& b) { // primary template
  double result = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i)
    result += (a[i] - b[i]) * (a[i] - b[i]);
  return std::sqrt(result);
}
\end{minted}
%
If we also interpret scalar types as one-dimensional coordinate vectors, we need to provide a specialization of that function,
since the floating-point types provide neither an \cpp{operator[]} nor a member function \cpp{size()}:
%
\begin{minted}{c++}
template <> // full specialization for the floating-point type double
double distance<double>(double const& a, double const& b) {
  return std::abs(a - b);
}
\end{minted}
%
Calling (instantiating) the function \texttt{distance} with arguments of type \cpp{double} then selects the more specialized
function, which just returns the absolute difference. This might be more efficient to evaluate than the square root of the squared difference.
\end{example}
\begin{defn}
The first complete (non-specialized) template definition is called the \emph{primary template}. It has a special meaning in the context of overload resolution.
\end{defn}
Specializations of function and class templates always start with the keyword \cpp{template <...>} and list the specialized type(s) in angle brackets
directly after the function name or class name.
Thereby, all template parameters that are specialized are moved from the template parameter list in \cpp{template <...>} to the specialization parameter list
\cpp{[function|class]Name<...>}. If no template parameters remain in the template parameter list, the specialization is called a \emph{full specialization},
otherwise a \emph{partial specialization}. Function templates cannot be partially specialized; they can only be fully specialized or overloaded.
The \emph{primary template} specifies that a function or class is a template. It is not specialized but allows any template parameter. It is required to be
declared (not necessarily defined) before any specialization.
Some examples:
\begin{minted}{c++}
// primary template
template <class T, class S>
class MyPoint; // Type T... element type, Type S... index type
// specialization for element type `double` and index type `int`
template <>
class MyPoint<double, int> { ... }; // (a)
// specialization for any element type but index type `long`
template <class T>
class MyPoint<T, long> { ... }; // (b)
// specialization for element type `MyPoint<T,int>` where `T` could be any type
// and fixed index type `int`
template <class T>
class MyPoint< MyPoint<T,int>, int > { ... }; // (c)
\end{minted}
Here, (a) is a full specialization, whereas (b) and (c) are partial specializations.
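To observe which of these definitions the compiler selects, the following minimal sketch gives every variant a marker constant. The namespace \cpp{sketch}, the member \cpp{kind}, and the fact that the primary template is defined rather than only declared are assumptions made for illustration.

\begin{minted}{c++}
namespace sketch {

// primary template (defined here so that unspecialized uses compile)
template <class T, class S>
struct MyPoint { static constexpr char kind = 'p'; };

// (a) full specialization: element type double, index type int
template <>
struct MyPoint<double, int> { static constexpr char kind = 'a'; };

// (b) partial specialization: any element type, index type long
template <class T>
struct MyPoint<T, long> { static constexpr char kind = 'b'; };

// (c) partial specialization: element type MyPoint<T,int>, index type int
template <class T>
struct MyPoint<MyPoint<T, int>, int> { static constexpr char kind = 'c'; };

} // namespace sketch
\end{minted}

With these definitions, \cpp{sketch::MyPoint<float, int>::kind} comes from the primary template, whereas \cpp{sketch::MyPoint<sketch::MyPoint<float, int>, int>::kind} comes from (c).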
Combining classical function overloading and template specialization raises the question of which function is more specialized (more constrained) than the
other one and thus which function is called.
Three examples illustrate that combining template specialization and overloading is delicate and complicated and should be avoided:
\begin{example}
Which function is called in the \cpp{main()}?
\begin{minted}{c++}
template <class T> void foo(T); /* (a) */
template <class T> void foo(T*); /* (b) */
template <> void foo<int>(int*); /* (c) */
int main() { int* p = nullptr; foo(p); }
\end{minted}
Here, (b) is an overload of (a) and (c) is a template specialization of (b).
General procedure:
\begin{enumerate}
\item Which of the \emph{primary templates} is the most specialized one? $\Rightarrow$ (a) and (b) are primary templates and (b) is more specialized.
\item If there are specializations of the selected primary template, choose the most specialized viable candidate among all specializations of this
primary template. $\Rightarrow$ (c) is a specialization of (b), thus choose (c).
\end{enumerate}
\end{example}
\begin{example}
Which function is called in the \cpp{main()}?
\begin{minted}{c++}
template <class T> void foo(T); /* (a) */
template <class T> void foo(T*); /* (b) */
template <> void foo<int*>(int*);/* (c) */
int main() { int* p = nullptr; foo(p); }
\end{minted}
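The situation is similar to the previous example but not identical. Here, (c) is a specialization of (a): the explicit template argument \cpp{int*} matches (a) with \cpp{T = int*}, whereas (b) would then require a parameter of type \cpp{int**}.
\begin{enumerate}
\item (b) is the most specialized primary template.
\item Since there is no template specialization of (b), (b) itself is chosen!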
\end{enumerate}
\end{example}
Exact matching non-template functions are always preferred over function templates!
\begin{example}
Which function is called in the \cpp{main()}?
\begin{minted}{c++}
template <class T> void foo(T); /* (a) */
template <class T> void foo(T*); /* (b) */
void foo(int*); /* (c) */
int main() { int* p = nullptr; foo(p); }
\end{minted}
There is a non-template free function (c). It is an exact match for the argument and is therefore chosen first!
\end{example}
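The three examples can be reproduced in a single translation unit. The following is a minimal sketch under two assumptions that are not part of the examples: each case lives in its own hypothetical namespace (\cpp{case1}, \cpp{case2}, \cpp{case3}), and the functions return a letter instead of \cpp{void} so that the selected candidate becomes observable.

\begin{minted}{c++}
namespace case1 {
template <class T> char foo(T)  { return 'a'; }      // (a)
template <class T> char foo(T*) { return 'b'; }      // (b)
template <> char foo<int>(int*) { return 'c'; }      // (c) specializes (b)
}

namespace case2 {
template <class T> char foo(T)  { return 'a'; }      // (a)
template <class T> char foo(T*) { return 'b'; }      // (b)
template <> char foo<int*>(int*) { return 'c'; }     // (c) specializes (a)
}

namespace case3 {
template <class T> char foo(T)  { return 'a'; }      // (a)
template <class T> char foo(T*) { return 'b'; }      // (b)
inline char foo(int*) { return 'c'; }                // (c) non-template overload
}
\end{minted}

Called with an \cpp{int*} argument, \cpp{case1::foo} and \cpp{case3::foo} end up in (c), whereas \cpp{case2::foo} ends up in (b), because the specialization of (a) is never considered once (b) has won the overload resolution.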
\begin{guideline}{Guideline}
Do not mix template specialization and function overloading. Prefer function overloading in general.
\end{guideline}
\begin{rem}
A nice overview and a more detailed explanation of the interaction of function template specialization, overloading, argument type deduction, and
argument-dependent lookup can be found in the article
\end{rem}
Overload resolution for a function call proceeds roughly as follows:
\begin{enumerate}
\item \emph{Name lookup}: all visible names are collected, including those found by \emph{argument-dependent lookup} (ADL) in the namespaces of the arguments.
\item All non-viable functions are erased from that list:
\begin{itemize}
\item Number of parameters must match (involving default function parameters)
\item There must be a sequence of implicit conversions to transform the passed arguments into the function parameter types.
\item For function templates all template parameters must be deducible (\emph{argument type deduction} -- ATD)
\item If the replacement of a template parameter by the type derived from ATD would lead to a direct error, this raises a \emph{substitution failure}.
Candidates involving a substitution failure are simply ignored (a \emph{substitution failure is not an error} -- SFINAE)
\end{itemize}
\item A non-template function that is an exact match is the best-fitting candidate.
\item Otherwise, among all viable primary templates, the most specialized one is selected (see below).
\item If there are template specializations of the best-fitting primary template, the most specialized (matching) one is selected.
\end{enumerate}
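The rule that a non-template function wins only if it is an \emph{exact} match can be illustrated with a minimal sketch. The namespace \cpp{sketch} and the function name \cpp{pick} are hypothetical; the functions return a number so that the selected candidate is visible.

\begin{minted}{c++}
namespace sketch {

// non-template overload
constexpr int pick(long) { return 1; }

// function template
template <class T>
constexpr int pick(T) { return 2; }

} // namespace sketch
\end{minted}

For the argument \cpp{5L} both candidates are exact matches and the non-template function is preferred. For the argument \cpp{5} the non-template function would need the conversion from \cpp{int} to \cpp{long}, so the template instantiated with \cpp{T = int} is the better match.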
What does it mean to be the \textit{most specialized} template? Of two function templates, the one whose parameter types can always be used as arguments for the other template, but not vice versa, is more specialized, \eg
%
\begin{minted}{c++}
template <class S> void foo(S); // (1)
template <class T> void foo(T*); // (2)
\end{minted}
%
Every pointer \cpp{T*} is also an arbitrary (non-constrained) type \cpp{S}, but not all types \cpp{S} are pointer types. Thus, (2) is more specialized than (1).
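This ordering can be checked directly. The following is a minimal sketch in which the two templates return a number identifying themselves; the namespace \cpp{sketch} and the return values are assumptions for illustration.

\begin{minted}{c++}
namespace sketch {

template <class S> constexpr int foo(S)  { return 1; } // (1)
template <class T> constexpr int foo(T*) { return 2; } // (2)

} // namespace sketch
\end{minted}

A call with an \cpp{int} argument can only use (1), since an \cpp{int} is not a pointer. A call with an \cpp{int*} argument matches both templates, and (2) is selected because it is more specialized.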
\section{SFINAE}
\label{sec:sfinae}
We have learned that functions can be specialized for different
parameters either by function overloading or, in certain cases, by
template specialization.
\begin{example}
We consider again the case of a mathematical vector and the special
case of scalars.
While it was relatively easy to specialize \cpp{distance} for vector
types defined over \cpp{double} and for the scalar type \cpp{double},
generalizing it to other scalar types is significantly more
complex. The declaration and the specialization would look like
\begin{minted}{c++}
template <typename Point>
auto distance(Point const& a, Point const& b);
template <typename Scalar>
auto distance<Scalar>(Scalar const& a, Scalar const& b);
\end{minted}
\emph{This example will not work!}
The first declaration is intended to work for any vector, and the
return type depends on the type returned by \cpp{a[i]},
e.g. \cpp{double}, \cpp{float}, etc.
The second declaration should work for scalars, but it is a partial
specialization of a function template, which C++ does not allow. Written
as a plain overload instead, its signature would be the same as that of
the first declaration, so we can't use specialization here.
\end{example}
An alternative based on function overloads is the SFINAE technique.
SFINAE stands for \emph{Substitution Failure Is Not An Error}: an ill-formed
signature that results from substituting template parameters is not a
hard compile error but is only treated as a deduction failure.
\subsection{The overload set}
The compiler holds a list of different function overloads for a
function name. If function templates are available, the compiler first
substitutes the template parameters and then adds the signature of this specific
instantiation to the list of overloads.
If this substitution is not well-formed, SFINAE states that this is
not a compile error, but the function is simply not part of the
overload set.
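The effect on the overload set can be made visible with a minimal sketch. All names (\cpp{sketch}, \cpp{has_size}, \cpp{has_size_impl}, \cpp{WithSize}) are hypothetical. The first candidate mentions \cpp{t.size()} in its return type, so it silently disappears from the overload set for types without such a member.

\begin{minted}{c++}
namespace sketch {

// Candidate 1: only viable if `t.size()` is a well-formed expression.
// Otherwise the substitution fails and the candidate is dropped.
template <class T>
constexpr auto has_size_impl(T const& t, int) -> decltype(void(t.size()), true)
{ return true; }

// Candidate 2: always viable, but a worse match for the argument `0`,
// since it needs the conversion int -> long.
template <class T>
constexpr bool has_size_impl(T const&, long)
{ return false; }

template <class T>
constexpr bool has_size(T const& t) { return has_size_impl(t, 0); }

// hypothetical test type with a size() member
struct WithSize { constexpr int size() const { return 3; } };

} // namespace sketch
\end{minted}

For \cpp{sketch::WithSize} both candidates are in the overload set and the first one is the better match. For \cpp{double} the first candidate is removed by SFINAE and only the fallback remains, without any compile error.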
\begin{example}
We now use the \cpp{distance} example and use SFINAE to manage the
overload set and make sure that the first version only works for
types with a bracket operator:
\begin{minted}{c++}
template <typename Point>
auto distance(Point const& a, Point const& b)
  -> typename std::decay<decltype(a[0])>::type
{
  using Field = typename std::decay<decltype(a[0])>::type;
  Field result = 0;
  for (std::size_t i = 0; i < a.size(); ++i)
    result += (a[i] - b[i]) * (a[i] - b[i]);
  return std::sqrt(result);
}
\end{minted}
\begin{itemize}
\item \cpp{decltype(a[0])} yields the type of the expression
\cpp{a[0]}.
\item If the template parameter does not have a bracket operator,
the substitution fails and the function is removed from the overload
set.
\item If the bracket operator is available,
e.g. for \cpp{std::vector<double>}, \cpp{decltype(a[0])} yields a
const reference to the scalar type, e.g. \cpp{const double&}. The
type trait (see section \ref{sec:type-traits}) \cpp{std::decay}
strips the reference and the const qualifier, i.e. the typedef
\cpp{std::decay<...>::type} is now \cpp{double}.
\end{itemize}
With this we have guarded the function definition for vectors. We
still have to do the same for the variant for scalar types.
\end{example}
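A guarded scalar variant could be sketched as follows. This is only one possible formulation: the guard via \cpp{std::enable_if} and \cpp{std::is_floating_point}, the namespace \cpp{sketch}, and the helper \cpp{demo()} are assumptions made for illustration and are not taken from the text above.

\begin{minted}{c++}
#include <cmath>        // std::sqrt, std::abs
#include <cstddef>      // std::size_t
#include <type_traits>  // std::decay, std::enable_if, std::is_floating_point
#include <vector>

namespace sketch {

// Vector variant: only viable if `a[0]` is well-formed.
template <typename Point>
auto distance(Point const& a, Point const& b)
  -> typename std::decay<decltype(a[0])>::type
{
  using Field = typename std::decay<decltype(a[0])>::type;
  Field result = 0;
  for (std::size_t i = 0; i < a.size(); ++i)
    result += (a[i] - b[i]) * (a[i] - b[i]);
  return std::sqrt(result);
}

// Scalar variant: only viable for floating-point types (assumed guard).
template <typename Scalar>
auto distance(Scalar const& a, Scalar const& b)
  -> typename std::enable_if<std::is_floating_point<Scalar>::value, Scalar>::type
{
  return std::abs(a - b);
}

// Usage: the qualified calls avoid finding std::distance via ADL.
inline double demo()
{
  std::vector<double> p{0.0, 0.0}, q{3.0, 4.0};
  return sketch::distance(p, q)       // vector variant
       + sketch::distance(3.0, 1.0);  // scalar variant
}

} // namespace sketch
\end{minted}

For each call exactly one of the two templates survives the substitution, so the overload set never contains both candidates at the same time and no ambiguity arises.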
Template metaprogramming techniques are purely based on templates, template specialization and static type information like static constants. Since
templates are instantiated by the compiler at compile-time, the resulting non-template class or function must have all information resolved. So, the
calling syntax for template metaprograms is just template instantiation. The template metaprogramming sub-language is Turing complete.
\begin{example}
In 1994, the developer Erwin Unruh from Siemens-Nixdorf presented his prime-number program to the C++ standard committee, probably the most famous
C++ program that does not compile. The error messages of this program contain the result of the computation: the first 30 prime numbers. This side effect
of the compilation process clearly showed that the compiler can perform computations:
\begin{verbatim}
error: no suitable constructor exists to convert from "int" to "D<17>"
error: no suitable constructor exists to convert from "int" to "D<13>"
error: no suitable constructor exists to convert from "int" to "D<11>"
error: no suitable constructor exists to convert from "int" to "D<7>"
error: no suitable constructor exists to convert from "int" to "D<5>"
error: no suitable constructor exists to convert from "int" to "D<3>"
error: no suitable constructor exists to convert from "int" to "D<2>"
\end{verbatim}
\end{example}
We distinguish two types of template metaprogramming:
\begin{enumerate}
\item Templates that calculate values (metafunctions)
\item Templates that define or transform types (type-traits)
\end{enumerate}
Classical functions get their inputs as values in the function parameters and return values either in the return statement or in an output
function parameter. In contrast, metafunctions get their input as template parameters and provide their results either as a typedef (type alias)
or as a static constant. Metafunctions are therefore typically class templates.
Since class templates need to be instantiated, the evaluation of the metafunction happens at the template instantiation and thus at compile-time.
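Both kinds of results can be illustrated with a minimal sketch: one metafunction provides a type, the other a value. The names \cpp{sketch}, \cpp{strip_pointer}, and \cpp{is_pointer_like} are hypothetical and merely mimic what comparable traits in \cpp{<type_traits>} do.

\begin{minted}{c++}
namespace sketch {

// Metafunction with a type result: removes one pointer level.
template <class T>
struct strip_pointer { using type = T; };      // general case: nothing to remove

template <class T>
struct strip_pointer<T*> { using type = T; };  // pointer case

// Metafunction with a value result: is T a pointer type?
template <class T>
struct is_pointer_like { static constexpr bool value = false; };

template <class T>
struct is_pointer_like<T*> { static constexpr bool value = true; };

} // namespace sketch
\end{minted}

The ``call'' is a template instantiation followed by an access to the nested name, \eg \cpp{typename sketch::strip_pointer<int*>::type} or \cpp{sketch::is_pointer_like<int*>::value}, and both are evaluated at compile time.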
We have seen two types of template parameters: type parameters and non-type parameters. In order to pass values we will first introduce a type
that represents a value:
%
\begin{minted}{c++}
template <class T, T v>
struct integral_constant
{
using type = T;
static constexpr type value = v;
};
\end{minted}
%
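A type of this form is provided by the standard library in \cpp{<type_traits>} as \cpp{std::integral_constant} (there, the alias for \cpp{T} is named \cpp{value_type}, while \cpp{type} refers to the \cpp{integral_constant} specialization itself). The value that the type represents is first passed as a template parameter
and then stored in a static \cpp{constexpr} variable.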
\begin{guideline}{Guideline}
In classes representing a numeric value, name the static constant ``value''. In classes representing another type, name the typedef (type alias) simply ``type''.
\end{guideline}
\begin{rem}
Static \cpp{constexpr} constants are one way to define a value in a class. The other way is to use ``enums'':
\cppline{ enum : T { value = v };}
\end{rem}
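The enum variant from the remark could look as follows in a complete class. This is a minimal sketch; the names \cpp{sketch} and \cpp{enum_constant} are hypothetical, and \cpp{T} is assumed to be an integral type, since only those are allowed as underlying types of an enum.

\begin{minted}{c++}
namespace sketch {

template <class T, T v>
struct enum_constant
{
  using type = T;
  enum : T { value = v };  // unnamed enum with underlying type T
};

} // namespace sketch
\end{minted}

Note that \cpp{value} then has the type of the unnamed enum and not \cpp{T} itself; it converts implicitly to \cpp{T} when used in arithmetic expressions or comparisons.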
In order to access the value, we have to instantiate the template and use the scope-resolution operator \cpp{::}:
\begin{minted}{c++}
using V = integral_constant<int, 42>;
std::cout << V::value << std::endl; // prints 42
\end{minted}
\subsubsection{Direct calculations with template parameters}
An \cpp{integral_constant} just stores a value. But the same technique can be used to do simple calculations.
To this end, you either pass the values directly as non-type template parameters:
\begin{minted}{c++}
template <int a, int b>
struct plus
{
static constexpr int value = a + b;
};
std::cout << plus<13, 29>::value << std::endl; // prints 42
\end{minted}
Or you pass the values as integral constants:
\begin{minted}{c++}
template <class A, class B>
struct plus
{
using type = typename A::type;
static constexpr type value = A::value + B::value;
// or static constexpr auto value = A::value + B::value;
};
using A = integral_constant<int,13>;
using B = integral_constant<int,29>;
std::cout << plus<A, B>::value << std::endl; // prints 42
\end{minted}
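A variation of the second form lets the result itself be an integral constant again, so that results can be fed into further metafunctions. The following minimal sketch uses \cpp{std::integral_constant} from \cpp{<type_traits>}; the namespace \cpp{sketch} and the choice to derive from the constant are assumptions for illustration.

\begin{minted}{c++}
#include <type_traits>  // std::integral_constant

namespace sketch {

// The sum is itself an integral constant: it inherits `value`,
// `value_type` and `type` from std::integral_constant.
template <class A, class B>
struct plus
  : std::integral_constant<decltype(A::value + B::value), A::value + B::value>
{};

} // namespace sketch
\end{minted}

Since \cpp{sketch::plus<A, B>} again provides a member \cpp{value}, instantiations can be nested, \eg \cpp{sketch::plus<sketch::plus<A, B>, C>}.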
\subsubsection{Recursive programming}
C++ is a statically typed language: the type of a variable or an alias cannot be changed once it is set, and everything must
have a type. In addition, everything a template metaprogram computes is a compile-time constant that cannot be modified afterwards. This prevents us from implementing something like a loop that updates a counter during iteration and makes
programming with templates more difficult: everything has to be implemented using recursion instead of iteration.
In order to illustrate a recursive algorithm implemented using templates, we consider the factorial computation.
\[
\operatorname{factorial}(n) :=\left\{\begin{array}{ll}1&\text{if }n =0\\ n \cdot\operatorname{factorial}(n-1)&\text{otherwise}\end{array}\right.
\]
In a classical function, we would write
\begin{minted}{c++}
int factorial(int n) {
  return n <= 0 ? 1 : n * factorial(n - 1);
}
int main() {
  int x = factorial(3); // = 3*2*1 = 6
  int y = factorial(0); // = 1
}
\end{minted}
Compiling this program generates code that can be executed at runtime.
Running \texttt{g++ -S factorial.cc} generates the following assembler code:
\begin{verbatim}
_Z3factoriali:
.LFB0:
// ...
movl %edi, -4(%rbp) // n := m
cmpl $0, -4(%rbp)
je .L2// n ==0 ? jump to .L2 : continue
movl -4(%rbp), %eax // \
subl $1, %eax // } m := n-1
movl %eax, %edi // /
call _Z3factoriali // factorial(m)
//...
.L2:
movl $1, %eax // return_value = 1
.L3:
leave // return
// ...
main:
.LFB1:
//...
movl $3, %edi // m := 3
call _Z3factoriali // factorial(m)
// ...
\end{verbatim}
Now, the same program implemented using templates, static constants, recursive instantiation and template specialization for the break condition
looks as follows:
\begin{minted}{c++}
template <int N>
struct factorial_meta // recursion
{
  static constexpr int value = N * factorial_meta<N-1>::value;
};
template <>
struct factorial_meta<0> // break condition
{
  static constexpr int value = 1;
};
int main() {
  int x = factorial_meta<3>::value;
  int y = factorial_meta<0>::value;
}
\end{minted}
and the corresponding assembler code:
\begin{verbatim}
main:
.LFB0:
// ...
movl $6, -8(%rbp) // explicit value
movl $1, -4(%rbp)
// ...
\end{verbatim}
\begin{rem}
When writing an expression involving template instantiations, like
\cppline{ N <= 0 ? 1 : factorial_meta<N-1>::value;}
all templates are instantiated first, and only then is the arithmetic expression evaluated. This means that even for the case \cpp{N == 0},
\cpp{factorial_meta<N-1>}, \ie \cpp{factorial_meta<-1>}, gets instantiated, so we would obtain an infinite recursion of template instantiations.
This can only be overcome by providing another specialization of the template that kicks in instead of the recursive call.
\end{rem}
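If a recursive metafunction has more than one template parameter, the break condition is typically a \emph{partial} specialization that fixes only the recursion parameter. The following minimal sketch computes an integer power; the names \cpp{sketch} and \cpp{power_meta} and the result type \cpp{long long} are assumptions for illustration.

\begin{minted}{c++}
namespace sketch {

template <int Base, unsigned Exp>
struct power_meta // recursion
{
  static constexpr long long value = Base * power_meta<Base, Exp - 1>::value;
};

template <int Base>
struct power_meta<Base, 0> // break condition for any Base
{
  static constexpr long long value = 1;
};

} // namespace sketch
\end{minted}

The instantiation \cpp{sketch::power_meta<2, 10>} triggers the chain of instantiations down to \cpp{sketch::power_meta<2, 0>}, where the partial specialization stops the recursion.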
\subsubsection{Value aliases}
Instead of accessing the result of a computation in a template metafunction by the \cpp{value} member, it is common practice to
introduce a variable template for this purpose that simplifies the calls and makes them look very similar to regular function calls:
%
\begin{minted}{c++}
template <int N>
constexpr int factorial_v = factorial_meta<N>::value;
\end{minted}
%
With this, you can simply evaluate \cpp{factorial_v<7>} in your code without the \cpp{::value} access. The suffix \cpp{_v} is commonly used
and was also introduced for several metafunctions in the standard library with \cxx{17}.
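Since a variable template of this kind is a constant expression, it can be used wherever the language requires a compile-time constant. The following minimal sketch repeats the definitions in a hypothetical namespace \cpp{sketch} and uses the value as an array size; the array \cpp{table} is an assumption for illustration. Variable templates require at least C++14.

\begin{minted}{c++}
namespace sketch {

template <int N>
struct factorial_meta
{
  static constexpr int value = N * factorial_meta<N-1>::value;
};

template <>
struct factorial_meta<0>
{
  static constexpr int value = 1;
};

// value alias as a variable template
template <int N>
constexpr int factorial_v = factorial_meta<N>::value;

// usable as a compile-time constant, e.g. as an array size
inline int table_size()
{
  int table[factorial_v<3>] = {};
  return static_cast<int>(sizeof(table) / sizeof(table[0]));
}

} // namespace sketch
\end{minted}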