next up previous contents Next: Analyzing trigonometric equations Up: ALIAS-C++ Previous: Solving with Interval Analysis



    Analyzing systems of equations

    Introduction

    The purpose of this chapter is to present some tools for analyzing systems of equations, i.e. for obtaining a priori information on the roots without solving the systems. These tools allow one to determine that there is a unique solution in a given box and provide a means to calculate it in a safe way. Hence they are essential for providing a guaranteed solution of a system of equations.

    Note that a generic analyzer based on the ALIAS parser has been developed and will be presented in the chapter devoted to the parser. This generic analyzer can handle almost any type of system in which at least one equation is algebraic in at least one of the unknowns.


    Moore theorem

    Mathematical background

    Consider a system of $n$ equations in $n$ unknowns $F=\{ F_i(x_1,\ldots,x_n)=0, i\in [1,n]\}$, each $F_i$ being at least $C^1$. Let ${\bf X}$ be a range vector for $\{x_1,\ldots,x_n\}$, $y$ a point inside ${\bf X}$ and $Y$ an arbitrary nonsingular real matrix. Define $K$ as:

    \begin{displaymath}
K({\bf X})=y-YF(y)+\{I-YF^\prime({\bf X})\}({\bf X}-y)
\end{displaymath}

    and let $r_0$ be the norm of the matrix $I-YF^\prime({\bf X})$. If

    \begin{displaymath}
K({\bf X}) \subseteq {\bf X} \quad {\rm and} \quad r_0<1
\end{displaymath}

    then $F$ has a unique solution in ${\bf X}$ [16]. This unique solution can be found using the Krawczyk solving method (see section 2.10).

    Implementation

    The previous test is implemented for $y=Mid({\bf X})$ and $Y=F^\prime(y)^{-1}$. The procedure is implemented as:

     
    int Krawczyk_Analyzer(int m,int n,
            INTERVAL_VECTOR (* IntervalFunction)(int,int,INTERVAL_VECTOR &), 
            INTERVAL_MATRIX (* J)(int, int, INTERVAL_VECTOR &),INTERVAL_VECTOR &Input)
    
    This procedure returns 1 if there is a unique solution of $F$ in Input.


    Kantorovitch theorem

    Mathematical background

    Consider a system of $n$ equations in $n$ unknowns:

    \begin{displaymath}
F=\{ F_i(x_1,\ldots,x_n)=0, i\in [1,n]\}
\end{displaymath}

    each $F_i$ being at least $C^2$. Let ${\bf x_0}$ be a point and $U=\{ {\bf x} \;/\; \vert\vert{\bf x} -{\bf x_0}\vert\vert\le 2B_0\}$, the norm being $\vert\vert A\vert\vert= {\rm Max}_i \sum_j \vert a_{ij}\vert$. Assume that ${\bf x_0}$ is such that:
    1. the Jacobian matrix of the system has an inverse $\Gamma_0$ at ${\bf x_0}$ such that $\vert\vert\Gamma_0\vert\vert \le A_0$
    2. $\vert\vert\Gamma_0 F({\bf x_0})\vert\vert \le 2B_0$
    3. $\sum_{k=1}^{n}\vert\frac{\partial^2F_i({\bf x})}{\partial x_j\partial x_k}\vert \le C$ for $i,j=1,\ldots,n$ and ${\bf x} \in U$
    4. the constants $A_0, B_0, C$ satisfy $2nA_0B_0C \le 1$
    Then there is a unique solution of $F=0$ in $U$, and the Newton method started at ${\bf x_0}$ will converge toward this solution [3]. An interesting use of Kantorovitch theorem can be found in section 2.5.

    Implementation

     
    int Kantorovitch(int m,VECTOR (* TheFunction)(VECTOR &),MATRIX (* Gradient)(VECTOR &), 
            INTERVAL_MATRIX (* Hessian)(int, int, INTERVAL_VECTOR &),VECTOR &Input,double *eps)
    
    Another implementation, consistent with the procedure used in the general solving algorithm (see section 2.5) is:
     
    int Kantorovitch(int m,
            INTERVAL_VECTOR (* TheIntervalFunction)(int,int,INTERVAL_VECTOR &), 
            INTERVAL_MATRIX (* Gradient)(int, int, INTERVAL_VECTOR &), 
            INTERVAL_MATRIX (* Hessian)(int, int, INTERVAL_VECTOR &), VECTOR &Input,double *eps)
    
    There is also an implementation of Kantorovitch theorem for univariate polynomials, see section 5.2.12.

    Return code

    This procedure returns an integer $k$:


    Rouche theorem

    Mathematical background

    Consider a system of $n$ equations in $n$ unknowns:

    \begin{displaymath}
F=\{ F_i(x_1,\ldots,x_n)=0, i\in [1,n]\}
\end{displaymath}

    We will denote by $F^{(k)}$ the matrix of the derivatives of order $k$ of $F$ with respect to the variables. Let us consider a point $X_0$ and define $\gamma$ as

    \begin{displaymath}
\gamma = \sup_{k \ge 2} \frac{\vert\vert(F^{(1)}(X_0))^{-1}F^{(k)}(X_0)\vert\vert^{1/(k-1)}}{k!}
\end{displaymath}


    \begin{displaymath}
\beta =\vert\vert (F^{(1)}(X_0))^{-1}F(X_0)\vert\vert
\end{displaymath}

    and $\alpha = \beta \gamma$. If $\alpha$ is strictly lower than $3-2\sqrt{2}$, then $F$ has a single root in the ball centered at $X_0$ with radius $(1+\alpha-\sqrt{\alpha^2-6\alpha+1})/(4\gamma)$, and the Newton scheme with initial guess $X_0$ will converge to this solution.

    The most difficult part of using this theorem is determining $\gamma$. For algebraic equations it is easy to determine a value $k_1$, which we will call the order of Rouche theorem, such that $F^{(k_1)}=0$; consequently $\gamma$ may be obtained by computing

    \begin{displaymath}
S_k=\frac{\vert\vert(F^{(1)}(X_0))^{-1}F^{(k)}(X_0)\vert\vert^{1/(k-1)}}{k!}
\end{displaymath}

    for all $k$ in $[2,k_1-1]$ and taking $\gamma$ as the supremum of the $S_k$.

    For non-algebraic systems, finding $\gamma$ requires an analysis of the system.

    Rouche theorem may be more efficient than the Moore or Kantorovitch theorems. For example, when combined with a polynomial deflation (see section 5.9.6) it allows one to solve Wilkinson polynomials of order up to 18 with the C++ arithmetic on a PC, while the standard solving procedure fails for order 13.

    Implementation

    The syntax of the procedure implementing Rouche theorem is:
     
    int Rouche(int DimensionEq,int DimVar,int order,
               INTERVAL_VECTOR (* TheIntervalFunction)(int,int,INTERVAL_VECTOR &),
               INTERVAL_VECTOR (* Jacobian)(int, int, INTERVAL_VECTOR &), 
               INTERVAL_MATRIX (* Gradient)(int, int, INTERVAL_VECTOR &), 
               INTERVAL_VECTOR (* OtherDerivatives)(int, int, INTERVAL_VECTOR &), 
               double Accuracy,
               int MaxIter,
               INTERVAL_VECTOR &Input,
               INTERVAL_VECTOR &UnicityBox)
    
    The parameter Accuracy is used in the Newton scheme to determine whether Newton has converged, i.e. whether the residues are lower than Accuracy. At most MaxIter iterations are performed. The solution found by Newton is stored in ALIAS_Simp_Sol_Newton_Numerique, while a copy of the unicity box is available in ALIAS_Simp_Sol_Newton.

    If a ball with a single solution has been found it will be returned in UnicityBox and the procedure returns 1, otherwise it returns 0.

    If the flag ALIAS_Always_Use_Inflation is set to 1, then an inflation procedure is used to try to enlarge the box up to the accuracy ALIAS_Eps_Inflation.


    Interval Newton

    The classical interval Newton method is embedded in the procedures GradientSolve and HessianSolve but may also be useful in other procedures. This method relies on the product $J^{-1}(X_0)J(X)$, where $J$ is the Jacobian of the system of equations and $J^{-1}(X_0)$ the inverse of $J$ computed at some particular point $X_0$. In the classical method this product is computed numerically, which does not take into account that the elements of $J$ are functions of the same parameters. For example, if the first column of $J$ is $(x,x)$, where $x$ is some parameter with interval value, the first element of $J^{-1}(X_0)J(X)$ will be computed as

    \begin{displaymath}
a_{11}x+ a_{12}x
\end{displaymath}

    where $(a_{11}, a_{12})$ are the elements of $J^{-1}(X_0)$. Clearly the double occurrence of $x$ in the numerical evaluation may lead to an overestimation of this element: it should be written as $x(a_{11}+a_{12})$, which is optimal in terms of interval evaluation. Furthermore it may also be interesting to have the derivatives of each element of the product in order to improve the interval evaluation of the matrix product. Indeed the interval evaluation of $J^{-1}(X_0)J(X)$ plays a very important role in the interval Newton method, either for filtering a box of possible solutions or for determining that a box includes a solution of the system.

    The procedure IntervalNewton is a sophisticated interval Newton algorithm that allows one to introduce knowledge on the product $K=J^{-1}(X_0)J(X)$ in the classical scheme. Its syntax is:

     
    int IntervalNewton(int Dim,INTERVAL_VECTOR &P,INTERVAL_VECTOR &FDIM,
                       INTERVAL_MATRIX &Grad,MATRIX &GradMid,
                       MATRIX &InvGradMid,
                       int hasBGrad,
                       INTERVAL_VECTOR (* BgradFunc)(int,int,INTERVAL_VECTOR &), 
                       INTERVAL_MATRIX (* BgradJFunc)(int, int,INTERVAL_VECTOR &), 
                       int grad1,
                       int grad3B1)
    
    The procedure returns -1 if no solution of the system exists in P, 1 if P has been improved, and 0 otherwise. Note that the procedures BgradFunc and BgradJFunc may require the availability of the mid-matrix GradMid: therefore a global MATRIX should be made available and initialized with GradMid.

    Various variants of IntervalNewton are available:

     
    int IntervalNewton(int Dim,INTERVAL_VECTOR &P,INTERVAL_VECTOR &FMID,
                      INTERVAL_MATRIX &Grad,MATRIX &GradMid,MATRIX &InvGradMid)
    
    is the classical interval Newton method with hasBGrad=grad1=grad3B1=0.

     
    int IntervalNewton(int Dim,INTERVAL_VECTOR &P,int DimVar,int DimEq,
              int TypeGradMid,MATRIX &InvGradMid,
              INTERVAL_VECTOR (*TheIntervalFunction)(int,int,INTERVAL_VECTOR &),
              INTERVAL_MATRIX (* Gradient)(int, int, INTERVAL_VECTOR &))
    
    is also the classical interval Newton method for a system having DimVar unknowns and DimEq equations (DimVar and DimEq are not required to have the same value: only the first Dim equations will be considered). The flag TypeGradMid determines how the mid-Jacobian matrix is calculated: if 0, this matrix is calculated at the mid-point of P; if 1, it is taken as the mid-matrix of the interval Jacobian calculated for P.

     
    int IntervalNewton(int Dim,INTERVAL_VECTOR &P,int DimEq,int DimVar,
                       int has_BGrad,
                       INTERVAL_VECTOR (* BgradFunc)(int,int,INTERVAL_VECTOR &), 
                       INTERVAL_MATRIX (* BgradJFunc)(int, int,INTERVAL_VECTOR &), 
                       int grad1,int grad3B1,
                       int TypeGradMid,
                       MATRIX &GradFuncMid,
                       MATRIX &InvGradFuncMid,
                       INTERVAL_VECTOR (* TheIntervalFunction)(int,int,INTERVAL_VECTOR &), 
                       INTERVAL_MATRIX (* Gradient)(int, int, INTERVAL_VECTOR &))
    
    Here the mid-Jacobian GradFuncMid and its inverse InvGradFuncMid will be computed by the procedure.


    Miranda theorem

    Miranda theorem provides a simple way to determine whether there is one or more solutions of a system of equations in a given box. It has the advantage of not requiring the derivatives of the equations, but the drawback of not providing a proof of the existence of a single solution in the box.

    Mathematical background

    Let a system $F(X)=0$ with $X=\{x_1,\ldots,x_n\}$. Let us consider a ball ${\cal B}=\{[\underline{x_1},\overline{x_1}],\ldots,[\underline{x_n},\overline{x_n}]\}$ for $X$ and define

    \begin{eqnarray*}
&&[X]_j^+ = \{X \in {\cal B} ~{\rm such~that}~ x_j=\overline{x_j}\}\\
&&[X]_j^- = \{X \in {\cal B} ~{\rm such~that}~ x_j=\underline{x_j}\}
\end{eqnarray*}

    for $j$ in $[1,n]$. If

    \begin{displaymath}
F_j(X)\ge 0 ~{\rm and}~ F_j(Y)\le 0~~\forall~X\in [X]_j^+,
\forall~Y\in [X]_j^-, j=1,\ldots,n
\end{displaymath}

    or if

    \begin{displaymath}
F_j(X)\le 0 ~{\rm and}~ F_j(Y)\ge 0~~\forall~X\in [X]_j^+,
\forall~Y\in [X]_j^-, j=1,\ldots,n
\end{displaymath}

    then $F$ has at least one zero in ${\cal B}$ [15].

    Implementation

    The simplest implementation of the Miranda theorem is

     
    int Miranda(int Dim,INTERVAL_VECTOR (* F)(int,int,INTERVAL_VECTOR &),
               INTERVAL_VECTOR &Input)
    
    where Dim is the number of equations, Input is a ball for the variables and F is a procedure in MakeF format that computes an interval evaluation of the equations. This procedure returns 1 if Miranda theorem is satisfied for Input, and 0 otherwise. This implementation is embedded in the Solve_General_Interval solving algorithm.

    Another implementation uses the derivatives for improving the interval evaluation:

     
    int Miranda(int Dim,
               INTERVAL_VECTOR (* F)(int,int,INTERVAL_VECTOR &),
               INTERVAL_MATRIX (* J)(int,int,INTERVAL_VECTOR &),
               INTERVAL_VECTOR &Input)
    
    where J is a procedure in MakeJ format that computes the derivatives of the equations.


    Inflation

    Mathematical background

    Let a system $F(X)=0$, $J$ the Jacobian matrix of this system and $X0$ a solution of the system. The purpose of the inflation method is to build a box that contains only this solution. Let $B(X0)$ be a ball centered at $X0$: if $J$ is nonsingular at every point of $B(X0)$, then the ball contains only this solution of the system.

    The problem is now to determine a ball such that the Jacobian is regular at every point of the ball. Let $H$ be the interval matrix $J(X0)^{-1}J(B)$. Let $u$ be the diagonal element of $H$ having the lowest absolute value, let $v_i$ be the maximum of the absolute value of the sum of the elements of row $i$ of $H$, discarding the diagonal element of the row, and let $v$ be the maximum of the $v_i$'s. If $u>v$, the matrix is said to be diagonally dominant and all the matrices in $J(B)$ are regular [19].

    Let $\epsilon$ be a small constant: we build the ball $B$ incrementally by using the iterative scheme defined as:

    \begin{eqnarray*}
&&B_0=X0\\
&&B_n=[B_{n-1}-\epsilon,B_{n-1}+\epsilon]
\end{eqnarray*}

    which is repeated until $B_n$ is no longer diagonally dominant. Note that in some cases (see for example section 2.16) it is possible to calculate directly the largest possible $\epsilon$ such that all the matrices in $J([B_0-\epsilon,B_0+\epsilon])$ are regular, without relying on the iterative scheme.

    Implementation

     
    int ALIAS_Epsilon_Inflation(int Dimension,int Dimension_Eq,
           INTERVAL_VECTOR (* TheIntervalFunction)(int,int,INTERVAL_VECTOR &), 
           INTERVAL_MATRIX (* Gradient)(int, int, INTERVAL_VECTOR &), 
           INTERVAL_MATRIX (* Hessian)(int, int, INTERVAL_VECTOR &), 
           VECTOR &X0,
           INTERVAL_VECTOR &B)
    
    Note that the Hessian argument is not used in this procedure. This routine returns 1 if the inflation has succeeded. The value of $\epsilon$ can be found in the global variable ALIAS_Eps_Inflation.



    jean-pierre merlet
    2018-07-25