Information-Theoretic Entropy Approximants

Shannon [1] introduced the concept of entropy in information theory, with a view toward its applications in communication theory. The general form of the informational entropy (the Shannon-Jaynes or relative entropy functional) is [2,3,4]:

$\displaystyle H(p,m) = - \sum_{i=1}^n p_i \ln \left( \dfrac {p_i}{m_i} \right) \quad \textrm{or} \quad H(p,m) = - \int p(x) \ln \left( \dfrac {p(x)}{m(x)} \right) dx,$ (1)

where $ m$ is a $ p$-estimate (prior distribution). The quantity $ D(p\Vert m) = -H(p,m)$ is also referred to as the Kullback-Leibler (KL) distance. As a means for least-biased statistical inference in the presence of testable constraints, Jaynes used the Shannon entropy to propose the principle of maximum entropy [5]; if the KL distance is adopted as the objective functional, the variational principle is known as the principle of minimum relative entropy [4].
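As a small illustration of Eq. (1) in its discrete form, the Python sketch below evaluates the Shannon-Jaynes entropy, and hence the KL distance, for given distributions; the function name and the test distributions are hypothetical and are not part of the library.

```python
import numpy as np

def shannon_jaynes_entropy(p, m):
    """Discrete Shannon-Jaynes (relative) entropy H(p, m) of Eq. (1).

    p : probability distribution (nonnegative, sums to 1)
    m : prior estimate of p (assumed positive)
    Returns H(p, m) = -sum_i p_i ln(p_i / m_i); the KL distance is -H(p, m).
    """
    p = np.asarray(p, dtype=float)
    m = np.asarray(m, dtype=float)
    nz = p > 0.0                      # terms with p_i = 0 contribute zero
    return -np.sum(p[nz] * np.log(p[nz] / m[nz]))

# H(p, m) = 0 (zero KL distance) exactly when p coincides with the prior m.
p = np.array([0.5, 0.25, 0.25])
m = np.array([1.0 / 3.0] * 3)
print(shannon_jaynes_entropy(p, m))   # negative; -H is the KL distance >= 0
print(shannon_jaynes_entropy(m, m))   # 0.0
```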

Consider a set of distinct nodes in $ {\rm {I\!R}}^d$ that are located at $ {x}^i$ ( $ i = 1, 2, \ldots, n$), with $ D = \mathop{\rm con}({x}^1, \dots, {x}^n) \subset {\rm {I\!R}}^d$ denoting the convex hull of the nodal set. For a real-valued function $ u({x}): D \rightarrow {\rm {I\!R}}$, the numerical approximation for $ u({x})$ is:

$\displaystyle u^h({x}) = \sum_{i=1}^n \phi_i({x}) u_i,$ (2)

where $ x \in D$, $ \phi_i({x})$ is the basis function associated with node $ i$, and $ u_i$ are coefficients. The use of basis functions that are constructed independently of an underlying mesh has become popular in the past decade; meshfree Galerkin methods are a common target application for such approximation schemes [6,7,8,9,10]. The construction of basis functions using information-theoretic variational principles is a new development [11,12,13,14]; see Reference [14] for a recent review of meshfree basis functions. To obtain basis functions using the maximum-entropy formalism, the Shannon entropy functional (uniform prior) and a modified entropy functional (Gaussian prior) were introduced in References [11] and [12], respectively; this approach was later generalized by adopting the Shannon-Jaynes entropy functional (arbitrary prior) [14]. These basis functions have been implemented, and this manual describes a Fortran 90 library for computing maximum-entropy (max-ent) basis functions and their first and second derivatives for any prior weight function.
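For orientation, at a fixed point $ x$ Eq. (2) reduces to a dot product between the basis function values $ \phi_i({x})$ and the nodal coefficients $ u_i$. The minimal Python sketch below makes this explicit; the names are illustrative and do not reflect the library's interface.

```python
import numpy as np

def interpolate(phi, u):
    """Evaluate u^h(x) = sum_i phi_i(x) u_i of Eq. (2) at a single point.

    phi : basis function values phi_i(x) at the evaluation point, shape (n,)
    u   : nodal coefficients u_i, shape (n,)
    """
    return float(np.dot(phi, u))

# For a partition of unity (sum_i phi_i = 1), a constant field is reproduced exactly.
phi = np.array([0.2, 0.5, 0.3])
u = np.array([4.0, 4.0, 4.0])
print(interpolate(phi, u))   # 4.0
```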

We use the relative entropy functional given in Eq. (1) to construct max-ent basis functions. The variational formulation for maximum-entropy approximants is: find $ {x}\mapsto {\phi}({x}): D \to {\rm {I\!R}}_+^n$ as the solution of the following constrained optimization problem (a convex minimization or, equivalently, a concave maximization):

$\displaystyle \min_{{\phi} \in {\rm {I\!R}}_+^n} - f({x};{\phi}) = \min_{{\phi} \in {\rm {I\!R}}_+^n} \sum_{i=1}^n \phi_i({x}) \ln \left( \dfrac {\phi_i({x})}{w_i({x})} \right),$ (3a)

subject to

$\displaystyle \sum_{i=1}^n \phi_i({x}) = 1,$ (3b)

$\displaystyle \sum_{i=1}^n \phi_i({x}) \left( {x}^i - {x} \right) = {0},$ (3c)

where $ {\rm {I\!R}}_+^n$ is the non-negative orthant, $ w_i({x}): D \to {\rm {I\!R}}_+$ is a non-negative weight function (the prior estimate for $ \phi_i$), and the linear constraints form an under-determined system. On using the method of Lagrange multipliers, the solution of the variational problem is [14]:

$\displaystyle \phi_i({x}) = \dfrac{Z_i({x};{\lambda})}{Z({x};{\lambda})}, \quad Z_i({x};{\lambda}) = w_i({x})\exp (- {\lambda} \cdot \tilde x^i ) ,$ (4)

where $ \tilde x^i = {x}^i - {x}$ ( $ {x},{x}^i \in {\rm {I\!R}}^d$) are shifted nodal coordinates, $ {\lambda} \in {\rm {I\!R}}^d$ are the $ d$ Lagrange multipliers (implicitly dependent on the point $ {x}$) associated with the constraints in Eq. (3c), and $ Z({x};{\lambda}) = \sum_j Z_j({x}; {\lambda})$ is known as the partition function in statistical mechanics. The smoothness of maximum-entropy basis functions for the Gaussian prior was established in Reference [12]; continuity for any $ C^k$ ($ k\ge 0$) prior was proved in Reference [15].
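To illustrate how Eq. (4) is evaluated in practice, the Python sketch below computes the basis functions at a single point by solving the dual problem for the Lagrange multipliers with Newton's method, here with a Gaussian prior. It is a minimal illustration of the underlying algorithm under these assumptions, not the Fortran 90 library's interface; all names (`maxent_basis`, `beta`, and so on) are hypothetical.

```python
import numpy as np

def maxent_basis(x, nodes, weights, tol=1e-12, max_iter=50):
    """Max-ent basis functions phi_i(x) of Eq. (4) at a single point x.

    The Lagrange multipliers lambda minimize ln Z(x; lambda); at the minimum
    the linear constraints of Eq. (3c) are satisfied.

    x       : evaluation point, shape (d,)
    nodes   : nodal coordinates x^i, shape (n, d)
    weights : prior weight function values w_i(x), shape (n,)
    """
    xt = nodes - x                           # shifted coordinates x~^i = x^i - x
    lam = np.zeros(x.shape[0])               # initial guess for the multipliers
    for _ in range(max_iter):
        Zi = weights * np.exp(-xt @ lam)     # Z_i(x; lambda) = w_i exp(-lambda . x~^i)
        phi = Zi / Zi.sum()                  # phi_i = Z_i / Z
        r = phi @ xt                         # residual of Eq. (3c): sum_i phi_i x~^i
        if np.linalg.norm(r) < tol:
            break
        # Hessian of ln Z: sum_i phi_i x~^i (x~^i)^T - r r^T
        H = (xt.T * phi) @ xt - np.outer(r, r)
        lam = lam + np.linalg.solve(H, r)    # Newton update
    return phi

# Example: 1D nodes at 0, 0.5, 1 with a Gaussian prior w_i = exp(-beta |x - x^i|^2).
nodes = np.array([[0.0], [0.5], [1.0]])
x = np.array([0.3])
beta = 4.0
w = np.exp(-beta * np.sum((nodes - x) ** 2, axis=1))
phi = maxent_basis(x, nodes, w)
print(phi, phi.sum(), phi @ nodes[:, 0])     # phi >= 0, sums to 1, reproduces x = 0.3
```

The printed checks verify the constraints of Eqs. (3b) and (3c): the basis functions are non-negative, form a partition of unity, and reproduce the coordinates of the evaluation point.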

N. Sukumar
Copyright © 2008