


    Solving with Interval Analysis

    Introduction

    The purpose of this chapter is to describe the methods based on interval analysis available in the ALIAS library for the determination of the real roots of systems of equations and inequalities.


    Interval Analysis

    Mathematical background

    This section is freely inspired by the book [5]. An interval number is a real, closed interval $(\underline{x},\overline{x})$. Arithmetic rules exist for interval numbers. For example, let $X=(\underline{x},\overline{x})$, $Y=(\underline{y},\overline{y})$ be two interval numbers; then:

    \begin{eqnarray*}
&&X+Y= [\underline{x}+\underline{y},\overline{x}+\overline{y}]\\
&&X-Y= [\underline{x}-\overline{y},\overline{x}-\underline{y}]
\end{eqnarray*}

    An interval function is an interval-valued function of one or more interval arguments. An interval function $F$ is said to be inclusion monotonic if $X_i \subset Y_i$ for $i$ in $[1,n]$ implies:

    \begin{displaymath}
F(X_1,\ldots,X_n) \subset F(Y_1,\ldots,Y_n)
\end{displaymath}

    A fundamental theorem is that any rational interval function evaluated with a fixed sequence of operations involving only addition, subtraction, multiplication and division is inclusion monotonic. In practice this means that the interval evaluation of a function gives bounds (very often overestimated) for the value of the function: for any specific values of the unknowns within their ranges, the value of the function for these values is included in the interval evaluation of the function. A very interesting point is that the above statement remains true even taking into account numerical errors. For example the number 1/3, which has no exact representation in a computer, is represented by an interval (whose lower bound is the largest floating point number less than 1/3 and whose upper bound is the smallest floating point number greater than 1/3) in such a way that the multiplication of this interval by 3 will include the value 1. A straightforward consequence is that if the interval evaluation of a function does not include 0, then the function has no root within the ranges for the unknowns.

    In all the following sections an interval for the variable $x$ will be denoted by $(\underline{x},\overline{x})$. The width or diameter of an interval $(\underline{x},\overline{x})$ is the positive difference $\overline{x}-\underline{x}$. The mid-point of an interval is defined as $(\overline{x}+\underline{x})/2$.

    A box is a set of intervals. The width of a box is the largest width of the intervals in the set and the center of the box is the vector constituted by the mid-points of all the intervals in the set.


    Implementation

    All the procedures described in the following sections use the free interval analysis package BIAS/Profil, in which the basic operations of interval analysis are implemented.

    This package uses fixed-precision arithmetic with an accuracy of roughly $10^{-16}$. Different types of data structures are implemented in this package. For fixed-value numbers:

     
    BOOL, BOOL_VECTOR, BOOL_MATRIX,
    INT, INTEGER_VECTOR, INTEGER_MATRIX, REAL, VECTOR, MATRIX, COMPLEX
    
    for intervals:
     
    INTERVAL, INTERVAL_VECTOR, INTERVAL_MATRIX
    
    All basic arithmetic operations can be used on interval-valued data with the same notation as for fixed numbers. Note that for vectors and matrices the indices start at 1: $V(1,1)$ for an interval matrix represents the interval at the first row and first column of the interval matrix $V$. The type INTERVAL_VECTOR will be used to implement the box concept, while the type INTERVAL_MATRIX will be used to implement the concept of a list of boxes.

    For the evaluation of more complex interval-valued functions there are also equivalent functions in BIAS/Profil, whose names are usually obtained from their C language equivalents by capitalizing the first letter: for example the evaluation of $\sin(X)$ where $X$ is an interval is obtained by calling the function $Sin(X)$. We have also introduced in ALIAS some other mathematical operators whose names are derived from their Maple implementation: ceil, floor, round.

    Table 2.1 indicates the substitution for the most used functions.


    Table 2.1: Equivalent interval-valued functions
    C function      Substitution        C function      Substitution
    sin             Sin                 cos             Cos
    tan             Tan                 arcsin          ArcSin
    arccos          ArcCos              arctan          ArcTan
    sinh            Sinh                cosh            Cosh
    tanh            Tanh                arcsinh         ArcSinh
    arccosh         ArcCosh             arctanh         ArcTanh
    exp             Exp                 log             Log
    log10           Log10               $x^2$           Sqr
    sqrt            Sqrt(x)             $\sqrt[i]{x}$   Root(x,i)
    $x^i$           Power(x,i)          $x^y$           Power(x,y)
    $\vert x\vert$  IAbs(x)             ceil($x$)       ALIAS_Ceil($x$)
    floor($x$)      ALIAS_Floor($x$)    rint($x$)       ALIAS_Round($x$)


    Note also that the mathematical operators $\cot$, ${\rm arccot}$, ${\rm arccoth}$ exist under the names Cot, ArcCot, ArcCoth.

    A special operator is defined in the procedure ALIAS_Signum: it implements the signum operator of Maple, defined as

    \begin{displaymath}
{\rm signum}(x)=\frac{x}{\vert x\vert}
\end{displaymath}

    which is not defined at $x=0$. In our implementation, for an interval $x =[\underline{x},\overline{x}]$, ALIAS_Signum($x$) will return:

      • 1 if $\underline{x}>0$
      • -1 if $\overline{x}<0$
      • 1 if $\vert x\vert$ is lower than ALIAS_Value_Sign_Signum and ALIAS_Sign_Signum is positive
      • -1 if $\vert x\vert$ is lower than ALIAS_Value_Sign_Signum and ALIAS_Sign_Signum is negative
      • [-1,1] otherwise

    The default values for ALIAS_Value_Sign_Signum and ALIAS_Sign_Signum are respectively 1e-6 and 0.

    The derivative of ALIAS_Signum is defined in the procedure ALIAS_Diff_Signum. Formally this derivative is 0 for any $x$ not equal to 0. In our implementation ALIAS_Diff_Signum($x$) will return 0 except if $\vert x\vert$ is lower than ALIAS_Value_Sign_Signum in which case the procedure returns [-1e11,1e11].

    The derivative of the absolute value is defined in the procedure ALIAS_Diff_Abs. If the interval X includes 0 the procedure returns [-1e11,1e11] otherwise it returns ALIAS_Signum(X).

    Using the above procedures, a user who has to write an interval-valued function simply converts its C source code using the substitutions defined above. For example, if a function is written in C as:

    \begin{displaymath}
double x,y,z,f;    
f=sin(x)*cos(y)+sqrt(z);
\end{displaymath}

    then its equivalent interval-valued function is

    \begin{displaymath}
INTERVAL  x,y,z,f;    
f=Sin(x)*Cos(y)+Sqrt(z);
\end{displaymath}

    Special care has to be taken when transforming an equation into its interval equivalent, as the formulation may play a role in the efficiency. You should avoid as much as possible multiple occurrences of the same variable, as this will usually lead to an overestimation of the interval evaluation. For example $x^2+2x+1$ is better written as $(x+1)^2$. You should also avoid as much as possible repeating the same evaluation: if an expression is involved several times in the equation(s), it is better to assign the interval evaluation of this expression once to a temporary interval variable and to use this variable in the calculation. Transforming equations into their interval equivalents is a tedious process that may be automated using the ALIAS-Maple procedure MakeF, which produces the C++ code automatically and applies heuristics to obtain the most efficient form. Another procedure, MakeJ, automatically produces the code for the gradient matrix, being given the equations and variables.

    Note that all the even powers of an interval are better managed with the Sqr and Power procedures. Indeed, consider the interval $X=[-1,1]$: the interval product $X\times X$ leads to the interval [-1,1] while Sqr($X$) leads to [0,1]. For an interval $X=(\underline{x},\overline{x})$ the width of the interval is obtained with the procedure Diam($X$), while $\underline{x}=$Inf($X$) and $\overline{x}=$Sup($X$).

    We will denote by box a set of intervals which defines the possible values of the unknowns. By extension, and according to the context, boxes may also denote a set of such sets. The function intervals will denote the interval values of a set of functions for a given box, while the solution intervals will denote the boxes which are considered to be solutions of a system of functions.


    Problems with the interval-valuation of an expression

    An important point is that not all expressions can be evaluated using interval arithmetic. Namely, the constraints that prohibit the interval evaluation of an expression are:

    If such a situation occurs a fatal error will be generated at run time. Hence such special cases have to be dealt with carefully. ALIAS-Maple offers the possibility of dealing with this problem. For example the procedure Problem_Expression allows one to determine what constraints should be satisfied by the unknowns so that each equation can be interval-evaluated; see the ALIAS-Maple documentation.

    If you use your own evaluation procedure, are aware of evaluation problems and modify the returned values when such a case occurs, it is a good policy to set the C++ flags ALIAS_ChangeF, ALIAS_ChangeJ to 1 (default value 0) when a change occurs. Currently the interval Newton scheme that is embedded in some of the solving procedures of ALIAS will not be used if one of these flags is set to 1 during the calculation.

    Dealing with infinity

    In some specific cases we may have to deal with intervals involving infinity. These quantities are represented using the BIAS convention, BiasNegInf representing negative infinity and BiasPosInf positive infinity.


    Non 0-dimensional system

    Although the solving procedures of ALIAS are mostly devoted to 0-dimensional systems (i.e. systems having a finite number of solutions), most of them can still be used for non 0-dimensional systems. In that case the result will be a set of boxes which approximates the solution. When dealing with such a system it is necessary to set the global variable ALIAS_ND to 1 (its default value is 0) and to define a name in the character string ALIAS_ND_File. The solution boxes of the system will be stored in a file with the given name. The quality of the approximation may be estimated with the flags ALIAS_Volume_In, ALIAS_Volume_Neglected, which give respectively the total volume of the solution boxes and the total volume of the neglected boxes (i.e. the boxes for which the algorithm has not been able to determine whether or not they are a solution of the system). Note that there are special procedures for 1-dimensional systems, see chapter 9.


    General purpose solving algorithm

    This algorithm enables one to determine approximately the solutions of a system of $n>0$ equations and inequalities in $m>0$ unknowns. Hence this method may be used to solve a system composed of $n_1$ equations $F_1(X)=0,\ldots,F_{n_1}(X)=0$, $n_2$ inequalities $G_1(X) \ge 0,\ldots,G_{n_2}(X) \ge 0$ and $n_3$ inequalities $H_1(X)\le 0, \ldots,H_{n_3}(X)\le 0$.

    Mathematical background

    Principle

    Let $x_1,\ldots,x_m$ be the set of unknowns and let ${\cal I}_1=\{(\underline{x_1^1},\overline{x_1^1}),
\ldots (\underline{x_m^1},\overline{x_m^1})\} $ be the set of $m$ intervals in which you are searching the solutions of the $n$ equations ${\cal F}_1(x_1,\ldots,x_m)=0, \ldots {\cal
F}_n(x_1,\ldots,x_m)=0$ (for the sake of simplicity we don't consider inequalities but the extension to inequalities is straightforward).

    We will denote by $F_i$ the interval value of ${\cal F}_i$ when this function is evaluated for the box $(\underline{x_1},\overline{x_1}),\ldots,(\underline{x_m},\overline{x_m})$ of the unknowns while $F({\cal I}_j)$ will denote the $n$-dimensional interval vector constituted of the $F_i$ when the unknowns have the interval value defined by the set ${\cal I}_j$.

    The algorithm will use a list of boxes ${\cal I}$ whose maximal size $M$ is an input of the program. This list is initialized with ${\cal I}_1$. The number of boxes currently in the list is $N$ and therefore at the start of the program $N=1$. The algorithm will also use an accuracy $\epsilon$ on the variables and an accuracy $\epsilon_f$ on the functions. The norm of a box ${\cal I}_j$ is defined as:

    \begin{displaymath}
\Vert{\cal I}_j\Vert={\rm Max}(\overline{x_k^j}-\underline{x_k^j}) \quad {\rm for}\ k \in [1,m]
\end{displaymath}

    The norm of the interval vector $F({\cal I}_j)$ is defined as:

    \begin{displaymath}
\uparrow F({\cal I}_j)\uparrow = {\rm Max}(\overline{F_k({\cal I}_j)}-\underline{F_k({\cal I}_j)}) \quad {\rm for}\ k \in [1,n]
\end{displaymath}

    The algorithm uses an index $i$ and the result is a set ${\cal S}$ of interval vectors ${\cal S}_k$ for the unknowns, whose size is $S$. We assume that there is no $F_k$ with $k$ in $[1,n]$ such that $\overline{F_k({\cal I}_1)}<0$ or $\underline{F_k({\cal I}_1)}>0$, otherwise the equations have no solution in ${\cal I}_1$. Two lists of interval vectors ${\cal V}, {\cal W}$ whose size is $2^m$ will also be used. The algorithm is initialized with $i = 1, S=1$ and proceeds along the following steps:
    1. if $i = N+1$ return $S-1$ and ${\cal S}$ and exit
    2. bisect ${\cal I}_i$, which produces $2^m$ new interval vectors ${\cal V}_l$, and set $j=1$
    3. for $l=1,\ldots,2^m$
      1. evaluate $F({\cal V}_l)$
      2. if there exists $F_k$ with $k$ in $[1,n]$ such that $\overline{F_k({\cal V}_l)}<0$ or $\underline{F_k({\cal V}_l)}>0$, then $l = l+1$ and go to step 3
      3. if $\Vert{{\cal V}_l}\Vert < \epsilon$ or $\uparrow{F({\cal
V}_l)}\uparrow <\epsilon_f $, then store ${\cal V}_l$ in ${\cal
S}_S$, increment $S$ and go to step 3
      4. store ${\cal V}_l$ in ${\cal W}_j$, increment $j$ and go to step 3
    4. if $j=1$ increment $i$ and go to step 1
    5. if $N+j-2 >M$ return a failure code as there is no space available to store the new intervals
    6. if $j>1$ store one of the ${\cal W}$ in ${\cal I}_i$, the other $j-2$ at the end of ${\cal I}$, starting at position $N+1$. Add $j-2$ to $N$ and go to step 1
    Basically the algorithm just bisects the boxes until either their width is lower than $\epsilon$ or the width of the interval function is lower than $\epsilon_f$ (provided that there is enough space in the list to store the intervals). Then if all the interval functions contain 0 we get a new solution; if one of them does not contain 0 there is no solution of the equations within the current box. A special case occurs when all the components of the box are reduced to a point, in which case a solution is obtained if the absolute value of the interval evaluation of the function is lower than $\epsilon_f$.

    Now three problems have to be dealt with:

    1. how to choose the ${\cal W}$ which will be put in place of ${\cal I}_i$, and in which order to store the other ${\cal W}$ at the end of the list?
    2. can we improve the management of the bisection process in order to complete the algorithm with a limited list size $M$?
    3. how do we distinguish distinct solutions?
    The first two problems will be addressed in the next section.


    Managing the bisection and ordering

    The second problem is solved in the following way: assume that at some step of the algorithm the bisection process leads to the creation of $k$ ${\cal W}$ such that $N+k-2>M$. As we have previously considered the $i-1$ elements of ${\cal I}$ we may use them as storage space. This means that we will store ${\cal I}_j, j\ge i$ at ${\cal I}_{j-i+1}$ thereby freeing $i-1$ elements. In that case the procedure will fail only if $N+k-i+1>M$.

    Now we have to manage the ordering of the ${\cal W}$. We have defined two types of order for a given set of boxes ${\cal I}$:

    1. maximum equation ordering: the boxes are ordered according to the value of $C={\rm Max}(\vert\overline{F_k({\cal I})}\vert, \vert\underline{F_k({\cal I})}\vert)$ for all $k$ in [1,$n$]. The first box will have the lowest $C$.
    2. maximum middle-point equation ordering: let $C_i$ be the vector whose components are the middle points of the intervals of ${\cal I}$. The boxes are ordered according to the value of $C={\rm Max}(\vert\overline{F_k(C_i)}\vert, \vert\underline{F_k(C_i)}\vert)$ for all $k$ in [1,$n$]. The first box will have the lowest $C$.
    When adding the ${\cal W}$ we will substitute ${\cal I}_i$ by the ${\cal W}$ having the lowest $C$, while the other ${\cal W}$ will be added to the list ${\cal I}$ by increasing order of $C$. The purpose of these orderings is to consider first the boxes having the highest probability of containing a solution. This ordering may have an importance in the determination of the solution intervals (see for example section 2.3.5.2).

    This method of managing the bisection is called the Direct Storage mode and is the default mode in ALIAS. But there is another mode, called the Reverse Storage mode. In this mode we still substitute ${\cal I}_i$ by the ${\cal W}$ having the lowest $C$, but instead of adding the remaining $n$ ${\cal W}$ at the end of the list ${\cal I}$ we shift the boxes in the list by $n$, thereby freeing the storage of ${\cal I}_{i+1},\ldots, {\cal I}_{i+1+n}$, which is used to store the remaining $n$ ${\cal W}$. In other words we may consider the solving procedure as finding the leaves of a tree which are solutions of the problem: in the Direct Storage mode we may jump during the bisection from one branch of the tree to another, while in the Reverse Storage mode we examine all the leaves issued from a branch of the tree before examining the other branches. If we are looking for all the solutions, the storage mode has no influence on the total number of boxes that will be examined. But the Reverse Storage mode may have two advantages:

    To switch the storage mode see section 2.3.4.5. We may also define a mixed strategy which starts in the direct mode and then switches to the reverse mode when storage becomes a problem (see section 8.3).


    An alternative: the single bisection

    A possibility to reduce the combinatorial explosion of the previous algorithm is to bisect not all the variables (the full bisection mode) but only one of them (it must be noted that the algorithms in ALIAS will not accept the full bisection mode if the number of unknowns exceeds 10). This may reduce the computation time as the number of function evaluations may be reduced. But the problem is to determine which variable should be bisected. All the solving algorithms of ALIAS may manage this single bisection by setting the flag Single_Bisection to a value different from 0. The value of this global variable selects among various bisection modes. Although the behavior of a mode may change according to the algorithm, here are the possible modes for the general solving algorithm and the corresponding values for Single_Bisection: For all general purpose solving procedures the number of the variable that has been bisected is available in the integer ALIAS_Selected_For_Bisection.

    There is another mode called the mixed bisection: among the $n$ variables we will bisect $m_1 <n$ variables, which will lead to $2^{m_1}$ new boxes. This mode is obtained by setting the global integer variable
    ALIAS_Mixed_Bisection to $m_1$. Whatever the value of Single_Bisection, we order the variables according to their width and select the $m_1$ variables having the largest width.

    Solutions and Distinct solutions

    An interval will be considered as a solution for a function of the system in the following cases:

    A solution of the system is defined as a box such that the above conditions hold for each function of the system. Note that for systems having interval coefficients (which are indicated by setting the flag ALIAS_Func_Has_Interval to 1) a solution of a system will be obtained only if the inequalities are strictly verified.

    Assume that two solutions ${\cal S_1}, {\cal S_2}$ have been found with the algorithm. We will first consider the case where we have to solve a system of $n$ equations in $n$ unknowns, possibly with additional inequality constraints. First we will check with the Miranda theorem (see section 3.1.5) whether ${\cal S_1}, {\cal S_2}$ include one (or more) solution(s). If both solutions are Miranda, then they will be kept as solutions. If one of them is Miranda and the other one is not, we will consider the distance between the mid-points of ${\cal S_1}, {\cal S_2}$: if this distance is lower than a given threshold we will keep as solution only the Miranda one. If neither ${\cal S_1}$ nor ${\cal S_2}$ is Miranda we keep both solutions, provided that their distance is greater than the threshold. Note that in that case these solutions may disappear if a Miranda solution is found later on such that the distance between these solutions and the Miranda one is lower than the threshold.

    In the other case the solutions will be ranked according to the chosen order, and if a solution is at a distance lower than the threshold from a solution with a better ranking, then this solution will be discarded.


    The 3B method

    In addition to the classical bisection process all the solving algorithms in the ALIAS library may make use of another method called the 3B-consistency approach [2].

    Although its principle is fairly simple it is usually very efficient (but not always, see section 2.4.3.1). In this method we consider each variable $x_i$ in turn and its range $[\underline{x_i},\overline{x_i}]$. Let $x^m_i$ be the middle point of this range. We will first calculate the interval evaluation of the functions in the system with the full ranges for all the variables except for the variable $i$, whose range will be $[\underline{x_i},x^m_i]$. Clearly if one of the equations is not satisfied (i.e. its interval evaluation does not contain 0), then we may reduce the range of the variable $i$ to $[x^m_i,\overline{x_i}]$. If this is not the case we will define a new $x^m_i$ as the middle point of the interval $[\underline{x_i},x^m_i]$ and repeat the process until either we have found an equation that is not satisfied (in which case the interval for the variable $i$ will be reduced to $[x^m_i,\overline{x_i}]$) or the width of the interval $[\underline{x_i},x^m_i]$ is lower than a given threshold $\delta$. Using this process we will reduce the range for the variable $i$ on the left side, and we may clearly use a similar procedure to reduce it on the right side. The 3B procedure will be repeated if:

    For all the algorithms of ALIAS this method may be used by setting the flag ALIAS_Use3B to 1 or 2. In addition you will have to indicate for each variable a threshold $\delta$ and a maximal width for the range (if the width of the range is greater than this maximal value the method is not used). This is done through the VECTOR variables ALIAS_Delta3B and ALIAS_Max3B. The difference of behavior of the method if ALIAS_Use3B is set to 1 or 2 is the following:

    A typical example for a problem with 25 unknowns will be:
     
       ALIAS_Use3B=1;
       Resize(ALIAS_Delta3B,25);Resize(ALIAS_Max3B,25);
       for(i=1;i<=25;i++)
         {
           ALIAS_Delta3B(i)=0.1;ALIAS_Max3B(i)=7;
         }
    
    which indicates that we start using the 3B method as soon as the width of a range is lower than 7 and stop it when we cannot improve the range by at least 0.1.

    A drawback of the 3B method is that it may imply a large number of calls to the evaluation of the functions. The largest number of evaluations is obtained by setting ALIAS_Use3B to 2 and ALIAS_Full3B to 1, while the lowest number is obtained if these values are 1 and 0. It is possible to specify that only a subset of the functions (the simplest ones) will be checked in the process. This is done with the global variable ALIAS_SubEq3B, an integer array whose size should be set to the number of functions and in which a value of 1 at position $i$ indicates that function $i$ will be used in the 3B process, while a value of 0 indicates that it will not. For example:

     
    Resize(ALIAS_SubEq3B,10);
    Clear(ALIAS_SubEq3B);
    ALIAS_SubEq3B(1)=1;
    ALIAS_SubEq3B(2)=1;
    
    indicates that only the first two functions will be used in the 3B process. If you are using your own solving procedure, then it is necessary to indicate that only part of the equations are used by setting the flag ALIAS_Use_SubEq3B to 1.

    In some cases it may be interesting to try to use at least once the 3B method even if the width of the range is larger than ALIAS_Max3B. If the flag ALIAS_Always3B is set to 1, then the 3B will be used once to try to remove the left or right half interval of the variables.

    If you are also using a simplification procedure (see section 2.3.3) you may avoid using this simplification procedure within the 3B method by setting the flag ALIAS_Use_Simp_In_3B to 0. You may also adapt the simplification procedure when it is called within the 3B method. For that purpose the flag ALIAS_Simp_3B is set to 1 instead of 0 when the simplification procedure is called within the 3B method. For some procedures, if ALIAS_Use_Simp_In_3B is set to 2 then ALIAS_Simp_3B is set to 1 when the whole input is checked. But if ALIAS_Use_Simp_In_3B is set to a value larger than 2 then ALIAS_Simp_3B is set to 0.

    Some methods allow starting the 3B method not with a small increment that is progressively increased, but with a large increment (half the width of the interval) that is decreased if it does not work. This is done by setting the flag ALIAS_Switch_3B to a value between 0 and 1: if the width of the current interval is lower than the width of the initial search domain multiplied by this flag, then a small increment is used, otherwise a large increment is used.

    When the routine that evaluates the expressions uses the derivatives of the expressions, we may avoid using these derivatives if the widths of the ranges in the box are too large. This is obtained by setting the size of the vector ALIAS_Func_Grad to the number of unknowns and assigning to the components of this vector the maximal width for the ranges of the variables above which the derivatives will not be used: if there is a range with a width larger than its limit, then no derivatives will be used.

    Note also that 3B-consistency is not the only consistency that can be used: see for example the ALIAS-Maple manual, which implements another consistency test for equations, called 2B-consistency or Hull-consistency, in the procedure HullConsistency (similarly HullIConsistency implements it for inequalities). See also section 2.17 for an ALIAS-C++ implementation of the 2B method and section 11.4 for detailed calls to the 3B procedures.


    Simplification procedure

    Most of the procedures in ALIAS will accept as optional last argument the name of a simplification procedure: a user-supplied procedure that takes as input the current box and proceeds to some further reduction of the width of the box, or even determines that there is no solution in this box, in which case it should return -1. Such a procedure must be implemented as:

     
    int Simp_Proc(INTERVAL_VECTOR & P)
    
    where P is the current box. This procedure must return -1 if it has determined that the box contains no solution, and any other integer otherwise. If a reduction of an interval is done within this procedure, then P must be updated accordingly.

    This type of procedure allows the user to add information to the algorithm without having to add additional equations. The simplification procedure is applied on a box before the bisection and is used within the 3B method if this heuristic is applied.

    Note that the Maple package associated to ALIAS allows in some cases to produce automatically the code for such procedure (see the ALIAS-Maple manual) and that section 2.17 presents a standard simplification procedure that may be used for almost any system of equations.

    Implementation

    The algorithm is implemented as:
     
    int Solve_General_Interval(int m,int n,
         INTEGER_VECTOR Type_Eq,
         INTERVAL_VECTOR (* IntervalFunction)(int,int,INTERVAL_VECTOR &), 
         INTERVAL_VECTOR & TheDomain, 
         int Order,int M,int Stop,
         double epsilon,double epsilonf,double Dist,
         INTERVAL_MATRIX & Solution,int Nb,
         int (* Simp_Proc)(INTERVAL_VECTOR &))
    
    the arguments being:

    Note that the following arguments may be omitted:


    Number of unknowns and functions

    The only constraint on n,m is that they should be strictly positive. So the algorithm is able to deal with under-constrained or over-constrained systems.


    Type of the functions

    The i-th value in the array of $n$ integers Type_Eq enables one to indicate whether function $F_i$ is an equation or an inequality: For all the general solving algorithms the integer ALIAS_Pure_Equation is set to the number of equations in the constraints list.


    Interval Function

    The user must provide a function which will compute the function intervals of the functions for a given box. When designing ALIAS we have determined that, to be efficient, we need a procedure that allows calculating the interval evaluation of either all the functions or only a subgroup of them, in order to avoid unnecessary calculations. Hence the syntax of this procedure is:

     
    INTERVAL_VECTOR IntervalFunction (int l1,int l2,INTERVAL_VECTOR & x)
    
    This function should be written using the BIAS/Profil rules. If you have equations and inequalities in the system you must define first the equations and then the inequalities.

    The efficiency of the algorithm is heavily dependent on the way this procedure is written. Two factors are to be considered:

    Efficiency is needed to decrease the computation time of the evaluation. Let us consider for example the following system:

    \begin{eqnarray*}
&& {x}^{2}+{y}^{2}-50=0\\
&& {x}^{2}-20x+8x\cos(\theta)+90-80\cos(\theta)+{y}^{2}+8y\sin(\theta)=0\\
&& {x}^{2}-6x+4x\cos(\theta)-4x\sin(\theta)+92-52\cos(\theta)-28\sin(\theta)\\
&& \qquad +{y}^{2}-20y+4y\sin(\theta)+4y\cos(\theta)=0
\end{eqnarray*}

    The evaluation function may be written as:
     
    e1 = Sqr(x)+Sqr(y)-50.0 ;
    e2 =Sqr(x)-20.0*x+8.0*x*Cos(teta)+90.0-80.0*Cos(teta)+Sqr(y)+8.0*y*Sin(teta);
    e3 =Sqr(x)-6.0*x+4.0*x*Cos(teta)-4.0*x*Sin(teta)+92.0-52.0*Cos(teta)-28.0*
          Sin(teta)+Sqr(y)-20.0*y+4.0*y*Sin(teta)+4.0*y*Cos(teta);
    
    or, using temporary variables:
     
    t1 = Sqr(x);
    t2 = Sqr(y);
    t5 = Cos(teta);
    t6 = x*t5;
    t9 = Sin(teta);
    t10 = y*t9;
    e1 = t1+t2-50.0;
    e2 = t1-20.0*x+8.0*t6+90.0-80.0*t5+t2+8.0*t10;
    e3 = t1-6.0*x+4.0*t6-4.0*x*t9+92.0-52.0*t5-28.0*t9+t2-20.0*y+4.0*t10+4.0*y*t5;
    
    the second manner is more efficient as the intervals $\sin\theta,
\cos\theta, x^2, y^2$, $x\cos\theta$, $y \sin\theta$ are evaluated only once instead of two or three times as in the first version. Note also that to speed up the computation it may be interesting to declare the variables t1, t2, t5, t6, t9, t10 as global, to avoid creating a new interval data structure at each call of the evaluation function.

    The second point is the sharpness of the evaluation. Let us consider the polynomial $x^2-x$. If the variable lies in the interval [0,1] the evaluation leads to the interval [-1,1]. The same polynomial may be written in Horner form as $x(x-1)$, the function being then evaluated as [-1,0]. Now suppose that $x$ lies in [0.8,1.1]. The initial polynomial will be evaluated as [-0.46,0.41] while the Horner form leads to [-0.22,0.11]. But this polynomial may also be written as $(x-1)^2+x-1$ (the centered form at 1), whose evaluation leads to [-0.2,0.14], which has a sharper lower bound than the Horner form (note that the Horner form is very efficient for the evaluation of a polynomial but does not always lead to the sharpest bounds on the polynomial, although this is sometimes claimed in the literature). Unfortunately there is no known method that determines the best way to express a given function in order to get the sharpest possible bounds. For complex expressions you may use the procedures MinimalCout or Code of ALIAS-Maple, which try to produce the least costly formulation of a given expression.

Another problem is the cost of the tests which are necessary to determine whether the interval evaluation of one of the functions excludes 0. Assume that we have 40 equations and 7 unknowns and that we are considering a box for which all the function intervals contain 0. When testing the functions we may either evaluate all of them with one procedure call (at the risk of performing useless evaluations, e.g. if the interval evaluation of the first equation already does not contain 0) or evaluate them one after the other (at the cost of 40 procedure calls but avoiding useless equation evaluations). The best choice balances the cost of the procedure calls against the cost of the equation evaluations. By default all the functions are evaluated in one step, but by setting the variable Interval_Evaluate_Equation_Alone to 1 the program will evaluate the functions one after the other.

A last problem is the interval evaluation of the equations. You may remember that some expressions cannot be evaluated for some ranges of the unknowns (see section 2.1.1.3). If such a problem may occur, a solution is to include in this procedure, before each expression evaluation, a test that verifies whether the expression is interval-valuable. If it is not, two cases may occur:

In the first case the interval evaluation of the expression should be set to an interval which does not include 0 (hence the algorithm will discard the box). In the second case the best strategy seems to be to set the interval evaluation of the expression to a very large interval that includes 0 (e.g. [-1e8,1e8]). Note that a filtering strategy such as the one described in section 2.17 may help to avoid some of these problems (in the example this strategy would have determined that Sqrt(x) is interval-valuable only for x in [0,10] and would have automatically set the range for x to this value).


    The order

Basically the algorithm just bisects the box TestDomain until one of the criteria described in 2.3.4.6 is satisfied. The boxes resulting from the bisection process are stored in a list and treated sequentially. If we are looking for only one solution of the system or for the first Nb solutions (see the Stop variable) it is important to store the new boxes in the list in an order which ensures that we treat first the boxes having the highest probability of containing a solution. Two types of ordering may be used, see section 2.3.1.2, indicated by the flag MAX_FUNCTION_ORDER or MAX_MIDDLE_FUNCTION_ORDER.

Note that even if we are looking for all the solutions of the system the order still matters: although all the boxes of the list will be treated, the order defines how close solution intervals are distinguished (see for example section 2.3.5.2).


    Storage

    The boxes generated by the bisection process are stored in an interval matrix:

     
    Box_Solve_General_Interval(M,m)
    
The algorithm tries to manage the storage so as to solve the problem with the given number M. As seen in section 2.3.1.2 two storage modes are available, the Direct Storage and the Reverse Storage modes, obtained by setting the global variable Reverse_Storage to 0 (the default value) or to at least the number of unknowns plus 1. See also section 8.3 for a mixed strategy between the direct and reverse modes.

For both modes the algorithm will first run until the bisection of the current box leads to a total number of boxes which exceeds the allowed total number. It will then delete the boxes in the list which have already been bisected, thereby freeing some storage space (usually more for the reverse mode than for the direct mode), and will start again.

If this is not sufficient the algorithm will consider each box in the list and determine whether the bisection process applied to the box creates any new boxes; if not, the box is deleted from the list. Note that this procedure is computationally intensive and constitutes a "last ditch" effort to free some storage space. You can disable this feature by setting the integer variable Enable_Delete_Fast_Interval to 0. If the storage space freed by this method is not sufficient the algorithm will exit with a failure return.

    If epsilonf=0, epsilon=$\epsilon$ and $f$ is the largest width of the intervals in TestDomain, then the number of boxes that will be considered in the direct mode is $M$ with, in the worst case:

    \begin{displaymath}
M=2^{{\rm Sup}(\frac{f}{2\epsilon})+1}-1
\end{displaymath} (2.1)

    where ${\rm Sup}(\frac{f}{2\epsilon})$ is the largest integer greater than $f/2\epsilon$. In the direct storage mode the storage space $N$ will be in the worst case:
    \begin{displaymath}
N=2^{{\rm Sup}(\frac{f}{2\epsilon})}-1
\end{displaymath} (2.2)

    In the reverse storage mode the storage space is only:
    \begin{displaymath}
N=\frac{\log(\frac{f}{2\epsilon})}{\log(2)}+1
\end{displaymath} (2.3)

    Note that with the reverse storage mode, storage is not really a problem. For example if the width of the initial box is 1000 and the accuracy $10^{-10}$, then the necessary storage space is only 44. Thus only the computation time or the conditioning of the functions may lead to a failure of the algorithm. If epsilonf is not equal to 0 the size of the storage cannot be estimated.

If the procedure has to be used more than once it is possible to speed up the computation by allocating the storage space before calling the procedure. You then indicate that the storage space has been allocated beforehand by giving a negative value for M, the number of boxes being given by the absolute value of M.

Note also that the bisection process applied to only one variable may lead to a better estimation of the roots of the system if the algorithm stops when the required accuracy on the variables is reached: indeed, compared to the standard algorithm, one (or more) of the variables may have been individually split before reaching the step where a full bisection would lead to a solution (see the example in section 2.4.3.2).

Note also a specific use of ALIAS_RANDG: if this integer is not set to 0, then every ALIAS_RANDG iterations the algorithm will take the box having the largest width as the current box, except if the number of boxes remaining to be processed is greater than half the total number of available boxes.


    Accuracy

Two criteria are used to determine whether a box possibly includes a solution of the system:

If we use only the first criterion (i.e. we set epsilonf=0) the largest width of the solution intervals will be epsilon. A consequence is that the unknowns should be normalized so that all the intervals in the TestDomain have roughly the same width.

If we use only the second criterion the width of the solution intervals cannot be determined and the functions should be roughly normalized (see the example in section 15.1.3 for the importance of the conditioning).


    Distinct solutions

Two solution intervals will be assumed to contain distinct solutions if the minimal distance between the middle points of their intervals is greater than the threshold Dist.

    Return code

    The procedure will return an integer $k$


    Debugging

If the algorithm fails, some debugging options are provided. The integer variable
Debug_Level_Solve_General_Interval indicates which debug level is used:

    Examples and Troubleshooting


    Example 1

We first present a deliberately simple example of a system in the three unknowns $x(1),x(2),x(3)$:

    \begin{eqnarray*}
&& x(1)^2+2 x(1)+1=0 \\
&& x(2)^2+2 x(2)+1=0 \\
&& x(3)^2+2 x(3)+1=0
\end{eqnarray*}

Clearly this system has the unique solution (-1,-1,-1). We choose to define the TestDomain as the interval [-2,0] for all three unknowns. So we define a function which specifies the TestDomain:
     
    VOID SetTestDomain (INTERVAL_VECTOR & x)
    {
      Resize (x, 3);
      x(1) = Hull (-2.0,0.0);
      x(2) = Hull (-2.0,0.0);
      x(3) = Hull (-2.0,0.0);
    }
    
Then we have to define the IntervalFunction:
     
    INTERVAL_VECTOR IntervalTestFunction (int l1,int l2,INTERVAL_VECTOR & x)
    // interval valued functions. The input are intervals for the
    //variables and the output is intervals on the functions
    //x are the input variables and xx the function intervals
    {
    INTERVAL_VECTOR xx(3);
    
    if(l1==1)  xx(1)=x(1)*(x(1)+2)+1;
    if(l1<=2 && l2>=2)  xx(2)=x(2)*(x(2)+2)+1;
    if(l2==3)  xx(3)=x(3)*(x(3)+2)+1;
      return xx;
    }
    
This function returns the interval vector xx which contains the values of the functions from l1 to l2 for the box x. Note that the functions have been written in Horner form (or "nested" form), which may lead to a sharper estimation of the function intervals.

    The main program may be written as:

     
    INT main()
    {
    int Num; //number of solution
    INTERVAL_MATRIX SolutionList(1,3);//the list of solutions
    INTERVAL_VECTOR TestDomain;//the input intervals for the variable
    
    //We set the value of the variable intervals
    SetTestDomain (TestDomain);
    
    //let's solve....
    Num=Solve_General_Interval(3,3,IntervalTestFunction,TestDomain,
          MAX_FUNCTION_ORDER,50000,1,0.001,0.0,0.1,SolutionList,1);
    //too much intervals have been created, this is a failure
    if(Num== -1)cout << "The procedure has failed (too many iterations)"<<endl;
    return 0;
    }
    
This main program will stop as soon as one solution has been found. We set epsilonf to 0 and epsilon to 0.001, which means that the maximal width of the solution intervals will be lower than 0.001. The chosen order is the maximum equation ordering. The number of boxes should not be larger than 50000 and the distance between distinct solutions must be greater than 0.1.

Running this program provides the solution interval [-1.000977,-1] for all three variables and uses 71 boxes. The result would have been similar had we chosen the maximum middle-point equation ordering.

On the other hand if we set epsilon to 0 and epsilonf to 0.001 the algorithm finds the solution interval [-1.00048828125,-1] and uses 78 boxes.

Now let's look at a more complete test program which enables testing the various options of the procedure.

     
    INT main()
    {
    int Iterations;//maximal size of the storage
    int Dimension,Dimension_Eq; // size of the system
    int Num,i,j,order,precision,Stop;
    // accuracy of the solution either on the function  or on the variable
    double Accuracy,Accuracy_Variable,Diff_Sol; 
    INTERVAL_MATRIX SolutionList(200,3);//the list of solutions
    INTERVAL_VECTOR TestDomain;//the input intervals for the variable
    INTERVAL_VECTOR F(3),P(3);
    
    //We set the number of equations and unknowns and the value of the variable intervals
    Dimension_Eq=Dimension=3;
    SetTestDomain (TestDomain);
    
    cerr << "Number of iteration = "; cin >> Iterations;
    cerr << "Accuracy on Function = "; cin >> Accuracy;
    cerr << "Accuracy on Variable = "; cin >> Accuracy_Variable;cerr << "Order (0,1)"; cin >>order;
cerr << "Stop at first solutions (0,1,2):";cin>>Stop;
    cerr << "Separation between distincts solutions:";cin>> Diff_Sol;
    
    //let's solve....
    Num=Solve_General_Interval(Dimension,Dimension_Eq,IntervalTestFunction,TestDomain,order,Iterations,Stop,
          Accuracy_Variable,Accuracy,Diff_Sol,SolutionList,1);
    //too much intervals have been created, this is a failure
    if(Num== -1){cout<<"Procedure has failed (too many iterations)"<<endl;return -1;}
    cout << Num << " solution(s)" << endl;
    
    for(i=1;i<=Num;i++)
      {
        cout << "solution " << i <<endl;
        cout << "x(1)=" << SolutionList(i,1) << endl;
        cout << "x(2)=" << SolutionList(i,2) << endl;
        cout << "x(3)=" << SolutionList(i,3) << endl;
        cout << "Function value at this point" <<endl;
        for(j=1;j<=3;j++)F(j)=SolutionList(i,j);
        cout << IntervalTestFunction(1,Dimension_Eq,F) <<endl;
        cout << "Function value at middle interval" <<endl;
        for(j=1;j<=3;j++)P(j)=Mid(SolutionList(i,j));
        F=IntervalTestFunction(1,Dimension_Eq,P);
        for(j=1;j<=3;j++)cout << Sup(F(j)) << endl;
      }
      return 0;
    }
    
This program is basically similar to the previous one except that it enables defining interactively the order, M, Stop, epsilon, epsilonf and Dist. Then it prints each solution together with the function intervals for the solution interval and the values of the functions at the middle point of the solution interval. Let's test the algorithm to find all the solutions (Stop=0) with epsilonf=0 and epsilon=0.001. The algorithm will fail: indeed let's compute the maximal storage that we may need using the formula (2.1). We end up with the number $2^{6000}$, which is indeed a very large number.... But (2.1) is only an upper bound on the storage. For example if we use epsilon=0.01 in the previous formula we find M=$2^{600}$, although the algorithm converges toward [-1,-0.992188] while using only 5816 boxes.

Although academic, this system shows several properties of interval analysis. If we set epsilon=epsilonf=1e-6, which is the standard setting of ALIAS-Maple, the algorithm will run for a very long time before finding the solution of the system. Even using the 3B method (section 2.3.2) and the 2B method (section 2.17), the computation time, although improved, will still be very large. Now if we write the system as

    \begin{eqnarray*}
&& (x(1)+1)^2=0 \\
&& (x(2)+1)^2=0 \\
&& (x(3)+1)^2=0
\end{eqnarray*}

    the system will be solved with only 8 boxes, with 2 boxes if we use the 3B method and without any bisection if we use the 2B method. This shows clearly the importance of writing the equations in the most compact form.


    Example 2

    The problem we want to solve is presented in section 15.1.1. We consider a system of three equations in the unknowns $x,y,\theta$:

\begin{eqnarray*}
&& {x}^{2}+{y}^{2}-50=0\\
&& {x}^{2}-20x+8x\cos(\theta)+90-80\cos(\theta)+{y}^{2}+8y\sin(\theta)=0\\
&& {x}^{2}-6x+4x\cos(\theta)-4x\sin(\theta)+92-52\cos(\theta)-28\sin(\theta)\\
&& \qquad +{y}^{2}-20y+4y\sin(\theta)+4y\cos(\theta)=0
\end{eqnarray*}

    which admit the two solutions:

    \begin{displaymath}
(5,5,0)              (3.369707132, 6.216516219, -0.806783438)
\end{displaymath}

    By looking at the geometry of the problem it is easy to establish a rough TestDomain:
     
    VOID SetTestDomain (INTERVAL_VECTOR & x)
    {
      Resize (x, 3);
      x(1) = Hull (0.9,7.1);
      x(2) = Hull (2.1,7.1);
      x(3) = Hull (-Constant::Pi,Constant::Pi);
    }
    
and to determine that the maximum number of real solutions is 6. The IntervalFunction is written as:
     
    INTERVAL_VECTOR IntervalTestFunction (int l1,int l2,INTERVAL_VECTOR & in)
    {
    INTERVAL_VECTOR xx(3);
    if(l1==1)xx(1)=in(1)*in(1)+in(2)*in(2)-50.0;
    if(l1<=2 && l2>=2)
      xx(2)=-80.0*Cos(in(3))+90.0+(8.0*Sin(in(3))+in(2))*in(2)+(-20.0+8.0*Cos(in(3))+in(1))*in(1);
    if(l2==3)
     xx(3)=92.0-52.0*Cos(in(3))-28.0*Sin(in(3))+(-20.0+4.0*Sin(in(3))+
           4.0*Cos(in(3))+in(2))*in(2)+(-4.0*Sin(in(3))-6.0+4.0*Cos(in(3))+in(1))*in(1);
      return xx;
    }
    
    and we may use the same main program as in the previous example (the name of this program is
    Test_Solve_General1).

Let's assume that we set epsilonf to 0 and epsilon to 0.01 while looking for all the solutions (Stop=0), using the maximum equation ordering and setting Dist to 0.1. The algorithm provides the following solutions, using 684 boxes:

\begin{eqnarray*}
&&x=[4.99297,4.99902]\quad y=[5.00527,5.01016]\quad \theta=[-0.00\ldots]\\
&&x=[\ldots,3.37031]\quad y=[6.21133,6.21621]\quad \theta=[-0.809942,-0.803806]
\end{eqnarray*}

We notice that indeed none of the roots is contained in the solution intervals. If we use the maximum middle-point equation ordering the algorithm provides the following solution intervals, using 684 boxes:

\begin{eqnarray*}
&&x=[3.36426,3.37031]\quad y=[6.21621,6.22109]\quad \theta=[\ldots]\\
&&x=[\ldots508,5.01113]\quad y=[4.99063,4.99551]\quad \theta=[0,0.00613592]
\end{eqnarray*}

which still do not contain the root (5,5,0) (but contain one of the roots, which shows the importance of the ordering). Let's look at what is happening by setting the debug flag Debug_Level_Solve_General_Interval to 2 (see section 2.3.4.9). At some point of the process the algorithm has determined four different solution intervals:

\begin{eqnarray*}
&&{\rm Solution~1:}\quad [4.9990234375,5.005078125]\quad [4.990625\ldots]\quad \ldots\\
&&\ldots\\
&&{\rm Solution~4:}\quad \ldots\quad [6.211328125,6.2162109375]\quad [-0.809941856,-0.803805932852]
\end{eqnarray*}

the criterion $C$ for the ordering being:

    \begin{eqnarray*}
&&0.300481441333159    0.329822553021982\\
&&0.293359098885522    0.416913915955262
\end{eqnarray*}

Clearly solution 3 has the lowest criterion and will therefore be stored as the first solution. Then solution 1 will be considered: but the distance between the middle points of solutions 3 and 1 is lower than Dist and therefore solution 1 will not be retained. Solution 2 will be considered but, for the same reason as solution 1, will not be retained either. Finally solution 4 will be considered and, in spite of having the worst criterion, will be retained as its distance to solution 3 is greater than Dist.

Note that if the single bisection mode is activated by setting the flag Single_Bisection to 1, we find the two roots with epsilonf=0 and epsilon=0.01 using 650 boxes with the maximum equation ordering.

    We may also illustrate on this example how to deal with inequalities. Assume now that we want to deal with the same system but also with the inequality $xy-22 \le 0$. We modify the IntervalTestFunction as:

     
    INTERVAL_VECTOR IntervalTestFunction (int l1,int l2,INTERVAL_VECTOR & in)
    {
    INTERVAL x,y,teta;
    INTERVAL_VECTOR xx(4);
    
    x=in(1);y=in(2);teta=in(3);
    if(l1==1)xx(1)=x*x+y*y-50.0;
    if(l1<=2 && l2>=2)
      xx(2)=-80.0*Cos(teta)+90.0+(8.0*Sin(teta)+y)*y+(-20.0+8.0*Cos(teta)+x)*x;
    if(l1<=3 && l2>=3)
     xx(3)=92.0-52.0*Cos(teta)-28.0*Sin(teta)+(-20.0+4.0*Sin(teta)+
           4.0*Cos(teta)+y)*y+(-4.0*Sin(teta)-6.0+4.0*Cos(teta)+x)*x;
    if(l2==4)
     xx(4)=x*y-22.;
      return xx;
    }
    
    Part of the main program will be:
     
    Type(1)=0;Type(2)=0;Type(3)=0;Type(4)=-1;
    Num=Solve_General_Interval(3,4,Type,IntervalTestFunction,TestDomain,order,
    	Iterations,Stop_First_Sol,Accuracy_Variable,
    	Accuracy,Diff_Sol,SolutionList,6);
    
    Here Type(4)=-1; indicates that the fourth function is an inequality of the type $F_i(X)\le 0$. If we have to deal with the constraint $xy -22 \ge 0$ then we will use Type(4)=1;.


    Example 3

This example is derived from example 2. We notice that in the three functions of example 2 the second degree terms in $x,y$ are, for all functions, $x^2+y^2$. Thus by subtracting the first function from the second and third we get a linear system in $x,y$. This system is solved and the values of $x,y$ are substituted into the first function. We thus get a system of one equation in the unknown $\theta$ (see section 15.1.2). The roots of this equation are 0 and -0.806783438. The test program is Test_Solve_General2. The IntervalFunction is written as:
     
    INTERVAL_VECTOR IntervalTestFunction (int l1,int l2,INTERVAL_VECTOR & in)
    {
    INTERVAL_VECTOR xx(1);
    xx(1)=11092.0+(-25912.0+(19660.0-4840.0*Cos(in(1)))*Cos(in(1)))*Cos(in(1))+(
    -508.0+(3788.0-1600.0*Cos(in(1)))*Cos(in(1)))*Sin(in(1));
    return xx;
    }
    
    This program is implemented under the name Test_Solve_General2. With epsilonf=0 and epsilon=0.001 we get the solution intervals, using 32 boxes:

    \begin{displaymath}
\theta=[-0.0007669904,0]   
\theta=[-0.8068739,-0.8061069]
\end{displaymath}

whatever the ordering. If we use epsilon=0 and epsilonf=0.1 we get, using 50 boxes:

    \begin{eqnarray*}
&& \theta=[-0.806784012741056,-0.806781016684830]\\
&& \theta=[-4.793689962142628e-05,0]
\end{eqnarray*}

    In both cases the solution intervals contain the roots of the equation.


    Example 4

    In this example (see section 15.1.3) we deal with a complex problem of three equations in three unknowns $\psi,\theta,\phi$. We are looking for a solution in the domain:

    \begin{displaymath}[4.537856054,4.886921908], [1.570796327,1.745329252], [0.6981317008,
0.8726646262]
\end{displaymath}

    The system has a solution which is approximately:

    \begin{displaymath}
4.6616603883, 1.70089818026, 0.86938888189
\end{displaymath}

This problem is extremely ill-conditioned: on the TestDomain the function intervals are:

    \begin{displaymath}[-1.45096e+08,1.32527e+08]; [-38293.3,29151.5] ; [-36389.1,27705.7]
\end{displaymath}

This program is implemented under the name Test_Solve_General. With epsilonf=0 and epsilon=0.001, stopping at the first solution, we find with the maximum equation ordering:

    \begin{displaymath}
\psi=[4.664665,4.665347]    
\theta=[1.7034,1.703741]    
\phi=[0.8706193,0.8709602]
\end{displaymath}

with 531 boxes. With the maximum middle-point equation ordering we find:

    \begin{displaymath}
\psi=[4.665347,4.666029]    
\theta=[1.701355,1.701696]    
\phi=[0.8709602,0.8713011]    
\end{displaymath}

    with 203 boxes. The importance of normalizing the functions appears if we use epsilonf=0.1 and epsilon=0. If we stop at the first solution we find:

    \begin{eqnarray*}
&&\psi=[4.661660388259656,4.661660388340929]\\
&&\theta=[1.70...
...00898180284073]\\
&&\phi=[0.869388881899751,0.869388881940387]
\end{eqnarray*}

    while if we divide the first function by 1000 we find:

    \begin{eqnarray*}
&&\psi=[4.661658091884636,4.661658424779772]\\
&&\theta=[1.70...
...00898570395561]\\
&&\phi=[0.869388105618527,0.869388272066095]
\end{eqnarray*}

in about a quarter of the computation time.

    General comments

The advantage of the proposed algorithm is that it is easy to use and implement for a fast check. For well-conditioned systems it may quickly provide solutions with a reasonable accuracy. The drawback is that it may provide solution intervals which do not contain roots or, worse, miss some roots if Dist is not set to 0 (see section 2.3.5.2).

This algorithm may also be used for analysis: if we have to solve numerous systems we may use this algorithm with a low M to quickly check whether the current system may have some real roots, in which case we may consider using a more sophisticated algorithm.


    General purpose solving algorithm with Jacobian

    Mathematical background

Assume now that we are able to compute the Jacobian matrix of the system of functions. We will use this Jacobian to improve the evaluation of the function intervals using two approaches:

    1. use the monotonicity of the function
    2. use the gradient for the evaluation of the function intervals
Note also that this method may be used to solve a system composed of $n_1$ equations $F_1(X)=0,\ldots,F_{n_1}(X)=0$, $n_2$ inequalities $G_1(X) \ge 0,\ldots,G_{n_2}(X) \ge 0$ and $n_3$ inequalities $H_1(X)\le 0, \ldots,H_{n_3}(X)\le 0$. Note also that not all the functions need be differentiable to use this procedure: a single differentiable one is sufficient. In that case you will however have to set special values in the gradient function (see 2.4.2.2).

A notable difference with the previous procedure is that we use Moore's theorem (see section 3.1.1) to determine whether a unique solution exists in a given box, in which case we use Krawczyk's method to determine this solution (see section 2.10). Therefore if the algorithm proposes as solution a point instead of a range, this implies that the solution has been obtained as the result of Moore's theorem. Note however that getting a range for a solution instead of a point does not always imply that we have a singular solution. For example it may happen that the solution is exactly at one extremity of a box (see the example in section 2.4.3.2), which is a case that our algorithm does not handle very well. A local analysis of the solution should however confirm quickly whether the solution is indeed singular.

    In addition we use also the inflation method presented in section 3.1.6 to increase the width of the box in which we may guarantee that there is a unique solution.

Hence this algorithm allows one to determine exact solutions, in the sense that we determine boxes that contain a unique solution, and provides a guaranteed numerical scheme for the calculation of that solution.

In the same way we use the Hansen-Sengupta version of the interval Newton method to improve the boxes (see [21]). Note that an improved interval Newton method that may benefit from the structure of the system is available (see section 3.1.4).

This algorithm also allows determining the solutions of non zero-dimensional systems, see section 2.2.


    Using the monotonicity

For a given box we will compute the Jacobian matrix using interval analysis. Each row $j$ of this interval matrix enables us to get some information on the corresponding function $F_j$.

In the first case the minimal and maximal values of $F_j$ will be obtained either for $x_i = \underline{x_i}$ or $x_i =
\overline{x_i}$, and we are able to define the value of $x_i$ yielding successively the minimal and maximal values since we know the sign of the gradient. But this procedure has to be applied recursively. Indeed we have previously computed the Jacobian matrix for $x_i = (\underline{x_i},\overline{x_i})$ but now $x_i$ has a fixed value: hence a component $J_{jk}$ of the j-th row which for $x_i = (\underline{x_i},\overline{x_i})$ was such that $\underline{J_{jk}}<0$ and $\overline{J_{jk}}>0$ may now be a strictly positive or negative interval. Consequently the minimal and maximal values will be obtained for some combination of $x_i, x_k$ in the two sets $\{\underline{x_i},\overline{x_i}\}$ and $\{
\underline{x_k},\overline{x_k}\}$. But as $x_k$ now has a fixed value, some other component of $J_j$ may become strictly negative or positive...

    The algorithm for computing a sharper evaluation of $F_j$ is:


    $(\underline{F_j},\overline{F_j})$=Evaluate$F_j$( $(\underline{x_1},\overline{x_1}),\ldots,(\underline{x_m},\overline{x_m})$)

    1. compute $J_j((\underline{x_1},\overline{x_1}),\ldots,(\underline{x_m},\overline{x_m}))$
    2. let $l_1$ be the number of components of $J_j$ such that $\underline{J_{jk}}>0$ or $\overline{J_{jk}}<0$ and let $x_l,\ldots,x_p$ be the variables for which this occur
    3. if $l_1>0$
      loop: for all combination of $x_l,\ldots,x_p$ in the set $\{
\underline{x_l}, \overline{x_l},\ldots,
\underline{x_p},\overline{x_p}\}$:
      • if $l_1< m$
        • compute $J_j((\underline{x_1},\overline{x_1}),\ldots, x_l,
\ldots,x_p,\ldots,(\underline{x_m},\overline{x_m}))$
        • let $l_2$ be the number of components of $J_j$ such that $\underline{J_{jk}}>0$ or $\overline{J_{jk}}<0$
        • if $l_2 > l_1$, then $(\underline{F_u},\overline{F_u})$=Evaluate$F_j$( $(\underline{x_1},\overline{x_1}),\ldots,x_l,\ldots,x_p,\ldots,(\underline{x_m},\overline{x_m})$)
        • otherwise $(\underline{F_u},\overline{F_u})=F_j((\underline{x_1},\overline{x_1}),\ldots,x_l,\ldots,x_p,\ldots,(\underline{x_m},\overline{x_m}))$
        • if this is the first estimation of $F_j$ then $F_j=F_u$
        • otherwise
          • if $\underline{F_u}<\underline{F_j}$, then $\underline{F_j}=\underline{F_u}$
          • if $\overline{F_u}>\overline{F_j}$, then $\overline{F_j}=\overline{F_u}$
      • otherwise
        • $(\underline{F_u},\overline{F_u})=F_j((\underline{x_1},\overline{x_1}),\ldots,x_l,\ldots,x_p,\ldots,(\underline{x_m},\overline{x_m}))$
        • if this is the first estimation of $F_j$ then $F_j=F_u$
        • otherwise
          • if $\underline{F_u}<\underline{F_j}$, then $\underline{F_j}=\underline{F_u}$
          • if $\overline{F_u}>\overline{F_j}$, then $\overline{F_j}=\overline{F_u}$
    4. end loop:
    5. otherwise $(\underline{F_j},\overline{F_j})=F_j((\underline{x_1},\overline{x_1}),\ldots,(\underline{x_m},\overline{x_m}))$
    6. return $(\underline{F_j},\overline{F_j})$
    This procedure has to be repeated for each $F_j$.

    Improving the evaluation using the Jacobian and centered form

    Let $x_j^m$ be the middle point of $(\underline{x_j},\overline{x_j})$ and $X=\{(\underline{x_1},\overline{x_1}),\ldots,(\underline{x_m},\overline{x_m})\}$ be the box. Then:

    \begin{displaymath}
(\underline{F_j(X)},\overline{F_j(X)}) \in F_j(x_1^m,\ldots,...
...ldots,(\underline{x_i},\overline{x_i}),x_{i+1}^m,\ldots,x_m^m)
\end{displaymath} (2.4)

see [5], p. 52. This expression enables one to get in some cases a sharper bound on $F_j$.

    This evaluation is embedded into the evaluation procedure of the solving algorithms using the Jacobian. It is also available in its general form as

     
    INTERVAL_VECTOR Centered_Form(int DimVar,int DimEq,
              INTERVAL_VECTOR (* TheIntervalFunction)(int,int,INTERVAL_VECTOR &), 
              INTERVAL_MATRIX (* Gradient)(int, int, INTERVAL_VECTOR &), 
              VECTOR &Center,
              INTERVAL_VECTOR &Input)
    
where the arguments follow the conventions described above. A variant of this procedure is
     
    INTERVAL Centered_Form(int k,int DimVar,int DimEq,
              INTERVAL_VECTOR (* TheIntervalFunction)(int,int,INTERVAL_VECTOR &), 
              INTERVAL_MATRIX (* Gradient)(int, int, INTERVAL_VECTOR &), 
              VECTOR &Center,
              INTERVAL_VECTOR &Input)
    
    which is used to evaluate only expression number $k$.

A more sophisticated evaluation of the centered form is based on Baumann's theorem [18]. First we define the procedure cut(double x,INTERVAL X) as:

\begin{displaymath}
cut(x,X)=\left\{
\begin{array}{ll}
\overline{X} & {\rm if~} x\ge \overline{X}\\
\underline{X} & {\rm if~} x\le \underline{X}\\
x & {\rm otherwise}
\end{array} \right.
\end{displaymath}

    For a system of $m$ equations $F$ in $n$ unknowns $X$ we define

    \begin{displaymath}
p^l_k = cut(\frac{Mid((\partial F^\prime_l/\partial
X_k)(X))}{Diam((\partial F^\prime_l/\partial X_k)(X))},[-1,1])
\end{displaymath}

    For a given equation $l$ we use the centered form with as center

\begin{displaymath}
x^1_k=Mid(X_k)+p^l_k\,\frac{Diam(X_k)}{2} \qquad x^2_k=Mid(X_k)-p^l_k\,\frac{Diam(X_k)}{2}
\end{displaymath}

with $k$ in $[1,n]$. The choice of $x^1, x^2$ is based on the property that the lower end-point of the centered form $F(x,X)$ has a maximum at $x^2$ while its upper end-point has a minimum at $x^1$. The interval evaluation of $F_l$ is obtained as $F_l(x^1,X) \cap F_l(x^2,X)$. Although $2nm$ centered forms are used to compute the interval evaluation of the $m$ equations, the calculation is in fact not that expensive as the interval evaluation of the Jacobian matrix has to be done only once.

    The implementation is:

     
    INTERVAL_VECTOR BiCentered_Form(int DimVar,
              int DimEq,
              INTERVAL_VECTOR (* TheIntervalFunction)(int,int,INTERVAL_VECTOR &), 
              INTERVAL_MATRIX (* Gradient)(int, int, INTERVAL_VECTOR &), 
              INTERVAL_VECTOR &Input,
              int Exact)
    
where the arguments follow the conventions described above. A variant for evaluating only equation number $k$ is
     
    INTERVAL_VECTOR BiCentered_Form(int k,int DimVar,
              int DimEq,
              INTERVAL_VECTOR (* TheIntervalFunction)(int,int,INTERVAL_VECTOR &), 
              INTERVAL_MATRIX (* Gradient)(int, int, INTERVAL_VECTOR &), 
              INTERVAL_VECTOR &Input,
              int Exact)
    

Another variant is based on the property that the numerical interval evaluation of the product J(Input)(Input-Center) may be overestimated as there may be several occurrences of the same variable in this product. We may assume that this product has been computed symbolically and then re-arranged to reduce the number of occurrences of the same variable, leading to a procedure in MakeF format that computes the product directly. The syntax of the bicentered form procedure is

     
    INTERVAL_VECTOR BiCentered_Form(int DimVar,
              int DimEq,
              INTERVAL_VECTOR (* TheIntervalFunction)(int,int,INTERVAL_VECTOR &), 
              INTERVAL_MATRIX (* Gradient)(int, int, INTERVAL_VECTOR &), 
              INTERVAL_VECTOR (* ProdGradient)(int, int, INTERVAL_VECTOR &), 
              INTERVAL_VECTOR &Input,
              int Exact)
    
    where ProdGradient is the procedure that computes the product J(Input)(Input-Center), it being understood that the Center is available in the global variable ALIAS_Center_CenteredForm.


    Single bisection mode

    We may use the single bisection mode, i.e. bisect only one variable at a time. Five modes exist for determining the variable to be bisected, the choice being made by setting Single_Bisection to a value from 1 to 8

    The smear mode leads in general to better results than the other modes (but there are exceptions, see the example in section 2.4.3.4). There is another mode called the mixed bisection: among the $n$ variables we bisect $m_1 <n$ variables, which leads to $2^{m_1}$ new boxes. This mode is obtained by setting the global integer variable ALIAS_Mixed_Bisection to $m_1-1$. Here the variables are ordered according to the value of their smear function (if the flag Single_Bisection is 2 or 3) or according to their width (for 1, 4, 5).

    Implementation

    The algorithm is implemented in generic form as:
     
    int Solve_General_Gradient_Interval(int m,int n, 
    	  INTEGER_VECTOR Type_Eq,
              INTERVAL_VECTOR (* IntervalFunction)(int,int,INTERVAL_VECTOR &), 
              INTERVAL_MATRIX (* IntervalGradient)(int, int, INTERVAL_VECTOR &), 
              INTERVAL_VECTOR & TheDomain, 
              int Order,int M,int Stop,
              double epsilon,double epsilonf,double Dist,
              INTERVAL_MATRIX & Solution,int Nb,INTEGER_MATRIX &GI,
              int (* Simp_Proc)(INTERVAL_VECTOR &))
    
    the arguments being: Note that the following arguments may be omitted:

    Return code

    The procedure will return an integer $k$. The following variables also play a role in the computation:


    Jacobian matrix

    The user must provide a function which computes the interval evaluation of the Jacobian matrix of the functions at hand for a given box. As for the function evaluation procedure, we have chosen a syntax which offers the best compromise between the number of program calls and the amount of interval calculation. The syntax of this function is:
     
    INTERVAL_MATRIX IntervalGradient (int l1,int l2,INTERVAL_VECTOR & x)
    
    This procedure returns an interval matrix grad of size m $\times$ n in which grad(l1,l2) has been computed. This function should be written using the BIAS/Profil rules.

    Note that if ALIAS_Store_Gradient has not been set to 0 we store the simplified Jacobian matrix for each box. Indeed, if for a given box B the interval evaluation of one element of the gradient has a constant sign (indicating a monotonic behavior of the function), setting the simplified Jacobian matrix element to -1 or 1 avoids unnecessary evaluations of that element of the Jacobian for the boxes resulting from a bisection of B, as they will exhibit the same monotonic behavior. Although this idea may sound quite simple it has a very positive effect on the computation time. The name of the storage variable for the simplified Jacobian is:

     
    INTEGER_MATRIX Gradient_Solve_General_Interval
    

    If a function ${\tt i}$ is not differentiable you just set the value of grad(i,..) to the interval [-1e30,1e30]. It is important to define a large interval here because the program Compute_Interval_Function_Gradient, which computes the interval evaluation of the functions by using their derivatives, also uses the Taylor evaluation based on the value of the derivatives: a small interval value for the derivatives may lead to a wrong function evaluation.

    Note also that a convenient way to write the Gradient procedure is to use the procedure MakeJ offered by ALIAS-Maple (see the ALIAS-Maple manual). Take care also of the interval-valuation problems of the elements of the Jacobian (see section 2.1.1.3), which may differ from those of the functions: for example if a function is $\sqrt{x}F(x,y,\ldots)$ it is interval-valuable as soon as the lower bound of $x$ is greater than or equal to 0, although the gradient will involve $1/\sqrt{x}$, which is not interval-valuable if the lower bound of $x$ is equal to 0. ALIAS-Maple also offers the possibility to treat automatically the interval-valuation problems of the Jacobian elements.


    Evaluation procedure using the Jacobian

    A better evaluation of the function intervals than the one provided by IntervalFunction can be obtained using the Jacobian matrix. A specific procedure can be used to obtain this evaluation:

     
    INTERVAL_VECTOR Compute_Interval_Function_Gradient(int m,int n, 
            INTEGER_VECTOR &Type_Eq,
            INTERVAL_VECTOR (* IntervalFunction)(int,int,INTERVAL_VECTOR &), 
            INTERVAL_MATRIX (* IntervalGradient)(int, int, INTERVAL_VECTOR &), 
            INTERVAL_VECTOR & Input, int Exact,
            INTEGER_VECTOR &AG,INTEGER_MATRIX &AR)
    
    This procedure computes the function intervals for the box Input. The parameters Type_Eq, AG, AR may be omitted. This procedure uses the derivatives for improving the interval evaluation of the functions in two different ways:

    The best evaluation of the l-th equation may be computed with

     
    INTERVAL Compute_Interval_Function_Gradient_Line(int l,int Dim_Var,
    int Dimension_Eq,
    INTERVAL_VECTOR (* IntervalFunction)(int,int,INTERVAL_VECTOR &), 
    INTERVAL_MATRIX (* Gradient)(int, int, INTERVAL_VECTOR &), 
    INTERVAL_VECTOR &Input,int Exact,INTEGER_VECTOR &AG)
    


    Storage

    The boxes generated by the bisection process are stored in an interval matrix:

     
    Box_Solve_General_Interval(M,m)
    
    while the corresponding simplified Jacobian matrix is stored in the integer matrix of size (M, m $\times$ n):
     
    Gradient_Solve_General_Interval
    
    called the simplified Jacobian: the entry $i,j$ of this matrix indicates for function $i$ and variable $j$ whether the gradient is always positive (1), always negative (-1), the function does not depend upon the variable (0), or the gradient may not have a constant sign within the range of the variable (2). The purpose of storing the simplified gradient for each box is to avoid re-computing a gradient as soon as it has been determined that a parent of the box already has a gradient with a constant sign. This has the drawback that for large problems this storage will also be large: hence it is possible to avoid this storage by setting the variable ALIAS_Store_Gradient to 0 (its default value is 1).

    The algorithm tries to manage the storage in order to solve the problem with the given number M. As seen in section 2.3.1.2 two storage modes are available, the Direct Storage and the Reverse Storage modes, which are obtained by setting the global variable Reverse_Storage to 0 (the default value) or 1.

    For both modes the algorithm will first run until the bisection of the current box leads to a total number of boxes which exceeds the allowed total number. It will then delete the boxes in the list which have already been bisected, thereby freeing some storage space (usually more for the reverse mode than for the direct mode), and will start again.

    Examples


    Example 1

    This problem has been presented in section 2.3.5.1.

    The IntervalGradient function is defined as:

     
    INTERVAL_MATRIX IntervalGradient (int l1,int l2,INTERVAL_VECTOR & x)
    {
    INTERVAL_MATRIX Grad(3,3);
    if(l1==1)
          {
          if(l2==1){Grad(1,1)=2*x(1)+2;return Grad;}
          if(l2==2){Grad(1,2)=0;return Grad;}
          if(l2==3){Grad(1,3)=0;return Grad;}
          }
    if(l1==2)
          {
          if(l2==1){Grad(2,1)=0;return Grad;}
          if(l2==2){Grad(2,2)=2*x(2)+2;return Grad;}
          if(l2==3){Grad(2,3)=0;return Grad;}
          }
    if(l1==3)
          {
          if(l2==1){Grad(3,1)=0;return Grad;}
          if(l2==2){Grad(3,2)=0;return Grad;}
          if(l2==3){Grad(3,3)=2*x(3)+2;return Grad;}
          }
    return Grad; // fall-through safety: avoids undefined behavior for out-of-range (l1,l2)
    }
    
    A test main program may now be written as:
     
    INT main()
    {
    int Num,i,j,order,precision,Stop;
    // accuracy of the solution either on the function  or on the variable
    double Accuracy,Accuracy_Variable,Diff_Sol; 
    INTERVAL_MATRIX SolutionList(200,3);//the list of solutions
    INTERVAL_VECTOR TestDomain;//the input intervals for the variable
    INTERVAL_VECTOR F(3);
    
    //We set the value of the variable intervals
    SetTestDomain (TestDomain);
    
    cerr << "Accuracy on Function = "; cin >> Accuracy;
    cerr << "Accuracy on Variable = "; cin >> Accuracy_Variable;
    cerr << "Order (0,1)"; cin >>order;
    cerr << "Stop at first solutions (0,1,2):";cin>>Stop;
    cerr << "Separation between distinct solutions:";cin>> Diff_Sol;
    
    //let's solve....
    Num=Solve_General_Gradient_Interval(3,3,IntervalTestFunction,
            IntervalGradient,TestDomain,order,10000,Stop,
          Accuracy_Variable,Accuracy,Diff_Sol,SolutionList,1);
    
    //too many boxes have been created, this is a failure
    if(Num== -1)cout<<"Procedure has failed (too many iterations)"<<endl;
    
    //otherwise print the solution intervals
    for(i=1;i<=Num;i++)
      {
        cout << "solution " << i <<endl;
        cout << "x(1)=" << SolutionList(i,1) << endl;
        cout << "x(2)=" << SolutionList(i,2) << endl;
        cout << "x(3)=" << SolutionList(i,3) << endl;
        cout << "Function value at this point" <<endl;
        for(j=1;j<=3;j++)F(j)=SolutionList(i,j);
    cout << Compute_Interval_Function_Gradient(3,3,
    				     IntervalTestFunction,
    				     IntervalGradient,
    				     F,1) << endl;
      }
      return 0;
    }
    
    A property of this problem is that the Jacobian of the system is singular at the solution. Hence the unicity test cannot be verified, as it needs to evaluate the inverse of the Jacobian matrix (the classical Newton scheme, see section 2.9, will fail as well since it also needs the inverse of the Jacobian). But even with epsilon=epsilonf=1e-6 the algorithm is able to find an approximation of the solution with only 16 boxes. Interestingly this is a case where the 3B method is not efficient at all: with the 3B method the number of boxes increases to over 100 000. This is quite normal: as the 3B method is used before the interval Newton method, it reduces the ranges for the unknowns toward a region where the interval Newton method will fail because the Jacobian is close to a singularity. We therefore end up with a solving process that is only based on bisection, and we have seen that this process behaves poorly for this system. But if we mix the 3B method and the 2B filtering of section 2.17 then the solving needs only 1 box.


    Example 2

    The problem we want to solve has been presented in sections 2.3.5.2 and 15.1.1.

    The IntervalGradient procedure is:

     
    INTERVAL_MATRIX IntervalGradient (int l1,int l2,INTERVAL_VECTOR & in)
    {
    INTERVAL_MATRIX Grad(3,3);
    INTERVAL x,y,teta;
    x=in(1);y=in(2);teta=in(3);
    if(l1==1)
      {
        if(l2==1){Grad(1,1)=2*x;return Grad;}
        if(l2==2){Grad(1,2)=2*y;return Grad;}
        if(l2==3){Grad(1,3)=0;return Grad;}
      }
    if(l1==2)
      {
        if(l2==1){Grad(2,1)=2.0*x-20.0+8.0*Cos(teta);return Grad;}
        if(l2==2){Grad(2,2)=2.0*y+8.0*Sin(teta);return Grad;}
        if(l2==3){Grad(2,3)=-8.0*x*Sin(teta)+80.0*Sin(teta)+8.0*y*Cos(teta);
          return Grad;}
      }
    if(l1==3)
      {
        if(l2==1){Grad(3,1)=2.0*x-6.0+4.0*Cos(teta)-4.0*Sin(teta);
          return Grad;}
        if(l2==2){Grad(3,2)=2.0*y-20.0+4.0*Sin(teta)+4.0*Cos(teta);return Grad;}
        if(l2==3){
          Grad(3,3)=52.0*Sin(teta)-28.0*Cos(teta)+(4.0*Cos(teta)-
                  4.0*Sin(teta))*y+(-4.0*Sin(teta)-4.0*Cos(teta))*x;
          return Grad;
                 }
      }
    return Grad; // fall-through safety: avoids undefined behavior for out-of-range (l1,l2)
    }
    
    We may use the same main program as in the previous example (the name of this program is
    Test_Solve_General1_Gradient).

    Let us assume that we set epsilonf to 0 and epsilon to 0.01 while looking for all the solutions (Stop=0), using the maximum equation ordering and setting Dist to 0.1. The algorithm provides the following solutions after using 55 boxes:

    \begin{eqnarray*}
&&x=[3.36607,3.37306] \quad y=[6.21468,6.21856] \quad \theta =[-0.808\ldots]\\
&&x=[4.99845,5.00165] \quad y=[4.99845,5.00146] \quad \theta =[-0.000536641,0]
\end{eqnarray*}

    We notice that all of the roots are contained in the solution intervals.

    If we use the maximum middle-point equation ordering the algorithm provides the same solution intervals.

    With epsilonf=0.001, epsilon=0 the algorithm still finds exactly the roots with 55 boxes and a computation time of 7010ms. Here the Moore test may have failed as the solution in $\theta$ is 0, which corresponds exactly to a split point in the bisection process: it may be useful to break the symmetry of the test domain.

    Using the single bisection mode and setting the flag Single_Bisection to 2 enables reducing the number of boxes to 33 and the computation time to 3580ms for epsilonf=0.00001.

    Note that we may improve the efficiency of the procedure by using simplification procedures such as the 2B (section 2.17) and the 3B methods. In that case, for epsilonf=1e-6, epsilon=1e-6, the number of boxes is reduced to 7. Note that the solution [5,5,0] is still not guaranteed. But using a search space of $[-\pi,\pi+1]$ for $\theta$ allows the Moore test to guarantee both solutions.


    Example 3

    This example is derived from example 2. We notice that in the three equations of example 2 the second degree terms in $x,y$ are $x^2+y^2$ for all functions. Thus by subtracting the first function from the second and third we get a linear system in $x,y$. This system is solved and the values of $x,y$ are substituted into the first function. We thus get a system of one equation in the unknown $\theta$ (see section 15.1.2). The roots of this equation are 0, -0.8067834380. The test program is Test_Solve_Gradient_General2. The IntervalGradient function is written as:
     
    INTERVAL_MATRIX IntervalGradient (int l1,int l2,INTERVAL_VECTOR & in)
    {
    INTERVAL_MATRIX Grad(1,1);
    Grad(1,1)=-3788.0+(2692.0+(7576.0-4800.0*Cos(in(1)))*Cos(in(1)))*Cos(in(1))+(
    25912.0+(-39320.0+14520.0*Cos(in(1)))*Cos(in(1)))*Sin(in(1));
    return Grad;
    }
    
    With epsilonf=0 and epsilon=0.001 we get the solutions by using 8 boxes:

    \begin{displaymath}
\theta=[-0.0005316396,0]     
\theta=[-0.8071015,-0.8063247]
\end{displaymath}

    whatever the order. The solution intervals contain the roots of the equation. If we use epsilon=0 and epsilonf=0.1 we get, using 8 boxes:

    \begin{displaymath}
\theta=[-0.8067839277]    
\theta=[-5.4136579e-16]
\end{displaymath}

    Here we get a unique solution and a range solution. But we notice that the solution 0 is exactly the middle point of the test domain: the Moore test will fail as 0 will always be an end-point of a range. If we break the symmetry of the test domain we will get exactly both solutions.


    Example 4

    In this example (see section 15.1.3) we deal with a complex problem of three equations in three unknowns $\psi,\theta,\phi$. We are looking for a solution in the domain:

    \begin{displaymath}[4.537856054,4.886921908], [1.570796327,1.745329252], [0.6981317008,
0.8726646262]
\end{displaymath}

    The system has a solution which is approximately:

    \begin{displaymath}
4.6616603883, 1.70089818026, 0.86938888189
\end{displaymath}

    This problem is extremely ill conditioned: for the TestDomain the function intervals are:

    \begin{displaymath}[-1.45096e+08,1.32527e+08]; [-38293.3,29151.5] ; [-36389.1,27705.7]
\end{displaymath}

    The name of the test program is Test_Solve_General_Gradient. With epsilonf=0 and epsilon=0.001, and if we stop at the first solution, we find with the maximum equation ordering:

    \begin{displaymath}
\psi=[4.661209,4.661906]     
\theta=[1.700332,1.701014]     
\phi=[0.8693552,0.8696447]
\end{displaymath}

    with 73 boxes using the direct storage mode (with the reverse storage mode only 37 boxes are needed). With the maximum middle-point equation ordering we find the same intervals with 67 boxes in the direct storage mode (41 for the reverse storage mode).

    The importance of normalizing the functions has been mentioned in section 2.3.5.4. But in this example the use of the Jacobian matrix enables a drastic reduction of the computation time. If we use epsilonf=0.1 and epsilon=0 and if we stop at the first solution we find an exact solution using 73 boxes:

    \begin{displaymath}
\psi=[4.66166]   
\theta=[1.700898]    
\phi=[0.869388]
\end{displaymath}

    in a time which is about 1/100 of the time necessary when we don't use the Jacobian.

    In the single bisection smear mode (i.e. only one variable is bisected in the process) the same root is obtained in 21080ms (about 50% less time than when using the full bisection mode) with only 40 boxes in the direct storage mode.

    Note that for epsilonf=0.1 and epsilon=0 we find the only root with 73 boxes in 39760ms (41 boxes and 24650ms in the single bisection smear mode).

    Note that we may improve the efficiency of the procedure by using simplification procedures such as the 2B (section 2.17) and the 3B methods. An interesting point in this example is that the bisection mode 1 (bisecting along the variable whose interval has the largest diameter) is more effective than the bisection mode 2 (using the smear function), with 53 boxes against 108 for mode 2 for epsilonf=1e-6 and epsilon=1e-6. This can easily be explained by the complexity of the Jacobian matrix elements, which leads to a large overestimation of their values when using intervals: in that case the smear function is not very efficient at determining which variable has the most influence on the equations.

    General comments

    Depending on the system, this procedure may not be especially faster than the general purpose algorithm, but the number of necessary boxes is in general drastically reduced. Furthermore the use of the Moore test and of the interval Newton method enables in many cases the exact determination of the solutions.


    General purpose solving algorithm with Jacobian and Hessian

    Mathematical background

    In this new algorithm we will try to improve the evaluation of the function intervals by using the Hessian of the functions. This improvement is based on a sharper analysis of the monotonicity of the functions which in turn is based on a sharper evaluation of the Jacobian matrix of the system. The element $J_{ij}$ of the Jacobian matrix at row $i$ and column $j$ is:

    \begin{displaymath}
\frac{\partial F_i}{\partial x_j}
\end{displaymath}

    Now consider the corresponding line of the Hessian matrix which is:

    \begin{displaymath}
H_{ijk}=\frac{\partial^2 F_i}{\partial x_j\partial x_k} \quad {\rm with\ } k\in[1,m]
\end{displaymath}

    The elements of this line indicate the monotonic behavior of the elements of the Jacobian matrix. If some elements in this line have a constant sign, then the elements of the Jacobian are monotonic with respect to some of the unknowns, and using this monotonicity we may improve the evaluation of the elements of the Jacobian matrix. This improvement has to be applied recursively: indeed, as we will evaluate the Jacobian elements for boxes in which some components now have a fixed value, the evaluation of the Hessian matrix for these boxes may lead to a larger number of components of the Hessian having a constant sign. The recursion stops when all the components of the Hessian line have a constant sign or when the number of components with a constant sign does not increase.

    Note also that not all the functions must be differentiable to use this procedure: it is sufficient that one of them is. In that case you will however have to set special values in the gradient and hessian functions (see 2.4.2.2).

    We will also use the Hessian in order to try to sharpen the evaluation of $J_{ij}$. Let the box intervals be $\{ (\underline{x_1},\overline{x_1}),\ldots,(\underline{x_n},\overline{x_n})\}$. Let $x_j^m$ be the middle point of $(\underline{x_j},\overline{x_j})$ and $X=\{(\underline{x_1},\overline{x_1}),\ldots,(\underline{x_n},\overline{x_n})\}$ be the box. Then:

    \begin{displaymath}
(\underline{J_{ij}(X)},\overline{J_{ij}(X)}) \in
J_{ij}(x_1^m,\ldots,x_{i-1}^m,(\underline{x_i},\overline{x_i}),x_{i+1}^m,\ldots,x_n^m)
\end{displaymath} (2.5)

    see [5], p. 52. This expression enables us to get in some cases a sharper bound on $J_{ij}$.

    The improvement of the evaluation of the function intervals is due to the fact that a sharper evaluation of the Jacobian matrix may lead to a larger number of Jacobian elements with constant sign than the direct evaluation of the Jacobian matrix. To speed up the algorithm we store the Jacobian matrix of each box: this enables us to avoid the evaluation of the components of the Jacobian matrix which have a constant sign when bisecting the box.

    Another interest of the Hessian is that it enables the use of the Kantorovitch theorem. This theorem (see section 3.1.2) can be applied if the number of unknowns is equal to the number of equations and enables the determination of boxes in which there is a unique solution, which can be found using the Newton method (see section 2.9) with any point within the box as an estimate of the solution.

    We will use this theorem at three possible levels:

    Furthermore we use the inflation method presented in section 3.1.6 to increase the width of the box in which we may guarantee that there is a unique solution.

    As for the method using only the gradient we use the Hansen-Sengupta version of the interval Newton method to improve the boxes (see [21]).


    Single bisection mode

    Instead of bisecting all the variables we may bisect only one of them. The criterion for determining which variable will be bisected is identical to the one presented in section 2.4.1.3 for the modes up to 5. Mode 6 is different: basically we bisect the variable having the largest diameter, except that a weight is assigned to each variable and the diameter used by the bisection process is the diameter of the range of the variable multiplied by its weight. The weights must be indicated in the vector ALIAS_Bisection_Weight.

    Implementation

    The generic implementation of this solving procedure is:
     
    int Solve_General_JH_Interval(int Dimension_Var,int Dimension_Eq,
    			      INTEGER_VECTOR &Type_Eq,
    	INTERVAL_VECTOR (* TheIntervalFunction)(int,int,INTERVAL_VECTOR &), 
    		INTERVAL_MATRIX (* Gradient)(int, int,INTERVAL_VECTOR &), 
    		INTERVAL_MATRIX (* Hessian)(int, int, INTERVAL_VECTOR &), 
    		INTERVAL_VECTOR & TheDomain, 
    		int Order,
    		int Iteration,
    		int Stop_First_Sol,
    		double Accuracy_Variable,
    		double Accuracy,
    		INTERVAL_MATRIX & Solution,
                    INTEGER_VECTOR & Is_Kanto,
    		int Apply_Kanto,
    			      int Nb_Max_Solution,INTERVAL_MATRIX &Grad_Init,
    			      int (* Simp_Proc)(INTERVAL_VECTOR &),
    			      int (* Local_Newton)(int Dimension,int Dimension_Eq,
    	  INTERVAL_VECTOR (* TheIntervalFunction)(int,int,INTERVAL_VECTOR &), 
    	  INTERVAL_MATRIX (* Gradient)(int, int, INTERVAL_VECTOR &),
    	  VECTOR &Input,double Accuracy,int Max_Iter, VECTOR &Residu,INTERVAL_VECTOR &In))
    
    the arguments being:

    Note that the following arguments may be omitted:

    The following variables play also a role in the computation:


    Hessian procedure

    The syntax of this function is:
     
    INTERVAL_MATRIX IntervalHessian (int l1,int l2,INTERVAL_VECTOR & in)
    
    This procedure should return an interval matrix Hess of size (m $\times$ n, n) in which the Hessian of the functions numbered l1 to l2 has been updated (function numbers start at 1). The Hessian matrix of function $i$ (which is of size n $\times$ n) is stored at location Hess(($i$-1)n+1$\ldots$$i$n,1$\ldots$n). Remember that for each function the Hessian matrix is symmetric: this fact should be used in order to speed up the evaluation of this matrix. If a function in the system is not $C^2$ you set all the elements of its hessian matrix to the interval [-1e30,1e30]. Remember also to verify that each element of the Hessian is interval-valuable (see section 2.1.1.3).


    Storage

    The boxes generated by the bisection process are stored in an interval matrix:
     
    Box_Solve_General_Interval(M,m)
    
    while the corresponding Jacobian matrix is stored in the interval matrix of size (M, m $\times$ n):
     
    Gradient_Solve_JH_Interval
    
    The purpose of storing the gradient for each box is to avoid re-computing a gradient as soon as it has been determined that a parent of the box already has a gradient with a constant sign. This has the drawback that for large problems this storage will also be large: hence it is possible to avoid this storage by setting the variable ALIAS_Store_Gradient to 0 (its default value is 1). Note that here we store the interval gradient matrix and not the simplified gradient matrix as in the solving procedure involving only the Jacobian.

    The algorithm tries to manage the storage in order to solve the problem with the given number M (see section 2.3.1.2). As seen in section 2.3.1.2 two storage modes are available, the Direct Storage and the Reverse Storage modes, which are obtained by setting the global variable Reverse_Storage to 0 (the default value) or to the number of unknowns+1.

    For both modes the algorithm will first run until the bisection of the current box leads to a total number of boxes which exceeds the allowed total number. It will then delete the boxes in the list which have already been bisected, thereby freeing some storage space (usually more for the reverse mode than for the direct mode), and will start again.

    If the procedure has to be used more than once it is possible to speed up the computation by allocating the storage space before calling the procedure. Then you may indicate that the storage space has been allocated beforehand by indicating a negative value for M, the number of boxes being given by the absolute value of M.

    Improvement of the function evaluation and of the Jacobian

    An improved value of the Jacobian is obtained by taking into account its derivatives in the procedure:

     
    INTERVAL_MATRIX Compute_Best_Gradient_Interval(int Dimension,
                        int Dimension_Eq,
                        INTERVAL_MATRIX (* Gradient)(int, int, INTERVAL_VECTOR &), 
                        INTERVAL_MATRIX (* Hessian)(int, int, INTERVAL_VECTOR &), 
                        INTERVAL_VECTOR &Input,
                        int Exact,INTERVAL_MATRIX &InGrad)
    
    where

    To compute only the best value of the Jacobian element at the l-th row and j-th column you may use:

     
    INTERVAL Compute_Best_Gradient_Interval_line(int l,int j,int Dim,
                        int Dimension_Eq,
                        INTERVAL_MATRIX (* Gradient)(int, int, INTERVAL_VECTOR &), 
                        INTERVAL_MATRIX (* Hessian)(int, int, INTERVAL_VECTOR &), 
                        INTERVAL_VECTOR &Input,int Exact)
    

    We may also obtain the best interval evaluation of the equations through the procedure

     
    INTERVAL_VECTOR Compute_Interval_Function_Gradient(int Dimension,
                        int Dimension_Eq,
                        INTERVAL_VECTOR (* TheIntervalFunction)(int,int,INTERVAL_VECTOR &), 
                        INTERVAL_MATRIX (* Gradient)(int, int, INTERVAL_VECTOR &), 
                        INTERVAL_MATRIX (* Hessian)(int, int, INTERVAL_VECTOR &), 
                        INTERVAL_VECTOR &Input, int Exact)
    

    Return code and debug

    The procedure will return an integer $k$. As for the debug options, they have been presented in section 2.3.4.9.

    Examples


    Example 2

    The problem we want to solve has been presented in section 2.3.5.2 and section 15.1.1. The name of the test program is Test_Solve_JH_General1.

    If epsilonf=0.01 the program using the maximum equation ordering finds the two solutions using 50 boxes.

    If we use the single bisection smear mode the program using the maximum equation ordering finds the two solutions in 4260ms using 32 boxes.

    Note that we may improve the efficiency of the procedure by using simplification procedures such as the 2B (section 2.17) and the 3B methods. In that case for epsilonf=1e-6, epsilon=1e-6 the number of boxes is reduced to 2 and both solutions are guaranteed, even with a search space of $[-\pi,\pi]$ for $\theta$.

    Example 3

    This example is derived from example 2 and has been presented in section 2.3.5.3. The test program is Test_Solve_JH_General2. The IntervalFunction and IntervalGradient have been presented before. The IntervalHessian is:
     
    INTERVAL_MATRIX IntervalHessian (int l1,int l2,INTERVAL_VECTOR & in)
    {
    INTERVAL_MATRIX Hess(1,1);
    Hess(1,1)=39320.0+(-3128.0+(-78640.0+43560.0*Cos(in(1)))*
          Cos(in(1)))*Cos(in(1))+(-2692.0+(-15152.0+14400.0*Cos(in(1)))*
          Cos(in(1)))*Sin(in(1));
    return Hess;
    }
    
    A test main program may be written as:
     
    INT main()
    {
    int Iterations,Dimension,Dimension_Eq,Apply_Kanto;
    int Num,i,j,order,Stop_First_Sol,precision,niter,nn;
    double Accuracy,Accuracy_Variable,eps;
    INTERVAL_MATRIX SolutionList(200,3);
    INTERVAL_VECTOR TestDomain,F(1),P(1),H(3);
    VECTOR TR(1),Residu(1);
    INTEGER_VECTOR Is_Kanto(6);
    
    Dimension_Eq=Dimension=1;SetTestDomain (TestDomain);
    
    cerr << "Number of iteration = "; cin >> Iterations;
    cerr << "Accuracy on Function = "; cin >> Accuracy;
    cerr << "Accuracy on Variable = "; cin >> Accuracy_Variable;
    cerr << "Debug Level (0,1,2)="; cin >> Debug_Level_Solve_General_Interval;
    cerr << "Order (0,1)="; cin >>order;
    cerr << "Stop at first solutions (0,1,2)=";cin>>Stop_First_Sol;
    cerr << "Apply Kanto (0,1)=";cin>>Apply_Kanto;
    
    Num=Solve_General_JH_Interval(Dimension,Dimension_Eq,
          IntervalTestFunction,IntervalGradient,IntervalHessian,
          TestDomain,order,Iterations,Stop_First_Sol,Accuracy_Variable,
          Accuracy,SolutionList,Is_Kanto,Apply_Kanto,6);
    
    if(Num== -1){cout << "The procedure has failed (too many iterations)"<<endl;return 0;}
    cout << Num << " solution(s)" << endl;
    for(i=1;i<=Num;i++)
      {
        cout<<"solution "<<i<<endl;cout<<"teta="<<SolutionList(i,1)<<endl;
        cout << "Function value at this point" <<endl;F(1)=SolutionList(i,1);
        cout << Compute_Interval_Function_Gradient(Dimension,Dimension_Eq,
    				     IntervalTestFunction,IntervalGradient,
          			     IntervalHessian,F,1) << endl;
        cout << "Function value at middle interval" <<endl;
        P(1)=Mid(SolutionList(i,1));    F=IntervalTestFunction(1,Dimension_Eq,P);
        cout << Sup(F(1)) << endl;    TR(1)=Mid(SolutionList(i,1));
        if(Is_Kanto(i)==1)cout << "This solution is Kanto" <<endl;
        else cout << "This solution is not Kanto" << endl;
        if(Kantorovitch(Dimension,IntervalTestFunction,IntervalGradient,
                                 IntervalHessian,TR,&eps)==1)
          {
    	P(1)=INTERVAL(TR(1)-eps,TR(1)+eps);
    	cout << "Unique solution in: " <<P << endl;
          }
        if(Is_Kanto(i)==1)
          {
          nn=Newton(Dimension,IntervalTestFunction,IntervalGradient,TR,Accuracy,1000,Residu);
          if(nn>0)
            {
          cout << "Newton iteration converges toward: " << endl;
          cout << TR << " with residu= " << Residu << endl;
            }
          else
            {
              if(nn==0)cout << "Newton does not converge" << endl;
              if(nn== -1)cout<<"Newton has encountered a singular matrix"<<endl;
            }
          }
      }
      return 0;
    }
    
    With epsilonf=0.0001 and epsilon=0.001, using Kantorovitch at level 1, we get the following solution intervals using 4 boxes:

    \begin{displaymath}
\theta=[-0.04244333,0.1295874] \quad
\theta=[-0.8376338,-0.7968275]
\end{displaymath}

    The Newton method initialized with the centers of these boxes converges toward 4.08282e-15 and -0.8067834 respectively.

    Example 4

    In this example (see section 15.1.3) we deal with a complex problem of three equations in three unknowns $\psi,\theta,\phi$. We are looking for a solution in the domain:

    \begin{displaymath}[4.537856054,4.886921908], [1.570796327,1.745329252], [0.6981317008,
0.8726646262]
\end{displaymath}

    The system has a solution which is approximately:

    \begin{displaymath}
4.6616603883, 1.70089818026, 0.86938888189
\end{displaymath}

    This problem is extremely ill-conditioned: over the TestDomain the interval evaluations of the equations are:

    \begin{displaymath}[-1.45096e+08,1.32527e+08]; [-38293.3,29151.5] ; [-36389.1,27705.7]
\end{displaymath}

    The name of the test program is Test_Solve_JH_General.

    If we use epsilonf=0.1 and epsilon=0, we get the first solution with the following number of boxes and computation time:

                        full bisection     smear bisection
    direct storage      71, 31200ms        38, 20220ms
    reverse storage     35, 14710ms        38, 20280ms

    We need 71 boxes in the full bisection mode to determine that this solution is unique and 38 boxes in the smear bisection mode.

    Note that we may improve the efficiency of the procedure by using simplification procedures such as the 2B (section 2.17) and the 3B methods. An interesting point in this example is that bisection mode 1 (bisecting along the variable whose interval has the largest diameter) is more effective than bisection mode 2 (using the smear function): 33 boxes against 38 for mode 2, with epsilonf=1e-6 and epsilon=1e-6. This is easily explained by the complexity of the Jacobian matrix elements, which leads to a large overestimation of their values when using interval arithmetic: in that case the smear function is not very efficient at determining which variable has the most influence on the equations. Note, however, that the use of the Hessian reduces this overestimation, and consequently the difference in terms of boxes used between the two bisection modes is slightly smaller than the one observed when using only the Jacobian (see section 2.4.3.4).

    Stopping the general solving procedures

    It may be interesting to stop the solving procedures before the algorithm has completed. We have already seen that one way to do so is to specify a number of roots, such that the procedure exits as soon as it has found that many roots.

    Another possible way to stop the computation is to use a time-out mechanism. For that purpose you may set the double ALIAS_TimeOut to the maximum number of minutes allowed for the computation. If this limit is reached (approximately), the procedure exits and sets the flag ALIAS_TimeOut_Activated to 1.

    The solutions that have been found by the algorithms are stored in the interval matrix ALIAS_Solution and their number in ALIAS_Nb_Solution. Note that for the procedures involving the Jacobian this matrix will usually describe boxes each of which includes a unique solution.


    Ridder method for solving one equation

    Mathematical background

    Ridder's method is an iterative scheme used to obtain one root of the equation $F(x)=0$ within an interval $[x_1,x_2]$. It assumes that $F(x_1)F(x_2)<0$. Let $x_3$ be the mid-point of the interval $[x_1,x_2]$. A new estimate of the root is $x_4$ with:

    \begin{displaymath}
x_4=x_3+(x_3-x_1)\frac{{\rm sign}(F(x_1)-F(x_2))\,F(x_3)}{\sqrt{F(x_3)^2-F(x_1)F(x_2)}}
\end{displaymath}

    Under the assumption $F(x_1)F(x_2)<0$ it may be seen that $x_4$ is guaranteed to lie within the interval $[x_1,x_2]$. As soon as $x_4$ has been determined we choose as new $[x_1,x_2]$ the interval $[x_1,x_4]$ if $F(x_1)F(x_4)<0$ or $[x_4,x_2]$ if $F(x_2)F(x_4)<0$. The convergence of this algorithm is quadratic.

    Implementation

    Ridder's method enables one to find a root of an equation $F(x)=0$ as soon as the root is bracketed in an interval $[x_1,x_2]$ such that $F(x_1)F(x_2)<0$. It is implemented as:
     
    int Ridder(REAL (* TheFunction)(REAL),INTERVAL &Input,
          double AccuracyV,double Accuracy,int Max_Iter,double *Sol, double *Residu)
    
    with: The procedure returns: The test program Test_Ridder2 presents a program to solve the trigonometric equation presented as example 2 (see section 15.1.1).


    Brent method for solving one equation

    Mathematical background

    Brent's method is an iterative scheme used to obtain one root of the equation $F(x)=0$ within an interval $[x_1,x_2]$. It assumes that $F(x_1)F(x_2)<0$. Let $x_3$ be the mid-point of the interval $[x_1,x_2]$. A new estimate of the root is $x_4$ with:

    \begin{displaymath}
x_4=x_3+\frac{P}{Q}
\end{displaymath}

    with:

    \begin{eqnarray*}
R&=&\frac{F(x_3)}{F(x_2)}\quad S=\frac{F(x_3)}{F(x_1)}\quad T=\frac{F(x_1)}{F(x_2)}\\
P&=&S[T(R-T)(x_2-x_3)-(1-R)(x_3-x_1)]\\
Q&=&(T-1)(R-1)(S-1)
\end{eqnarray*}

    In this method $x_3$ is considered to be the current estimate of the solution. The term $P/Q$ is a correction factor: when this factor leads to a new estimate of the solution outside the interval, we use a bisection method to compute a new interval $[x_1,x_2]$. In other words, if $F(x_1)F(x_3)<0$ the new interval is $[x_1,x_3]$ and if $F(x_2)F(x_3)<0$ the new interval is $[x_3,x_2]$. Brent's method is therefore a cross between a bisection method and a super-linear method, which ensures that the estimate of the solution always lies within the interval $[x_1,x_2]$.

    Implementation

    Brent's method enables one to find a root of an equation $F(x)=0$ as soon as the root is bracketed in an interval $[x_1,x_2]$ such that $F(x_1)F(x_2)<0$. It is implemented as:
     
    int Brent(REAL (* TheFunction)(REAL),INTERVAL &Input,
          double AccuracyV,double Accuracy,int Max_Iter,double *Sol, double *Residu)
    
    with: The procedure returns: The test program Test_Ridder2 presents a program to solve the trigonometric equation presented as example 2 (see section 15.1.1).


    Newton method for solving systems of equations

    Mathematical background

    Let ${\cal F}$ be a system of $n$ equations in the $n$ unknowns ${\bf x}$ and let ${\bf x_0}$ be an estimate of the solution of the system. Let $J$ be the Jacobian matrix of the system of equations. Then the iterative scheme defined by:

    \begin{displaymath}
{\bf x_{k+1}}= {\bf x_k}- J^{-1}({\bf x_k}){\cal F}({\bf x_k})
\end{displaymath} (2.6)

    starting with ${\bf x_0}$ may converge toward a solution of the system.

    A simplified Newton method consists in using a constant matrix in the classical Newton scheme, for example the inverse of the Jacobian matrix at some point such as ${\bf x_0}$. The iterative scheme becomes:

    \begin{displaymath}
{\bf x_{k+1}}= {\bf x_k}- J^{-1}({\bf x_0}){\cal F}({\bf x_k})
\end{displaymath} (2.7)

    Although the simplified method may need a larger number of iterations to converge than the classical scheme, each iteration has a lower computation time as the inverse of the Jacobian matrix is not recomputed. This method may also encounter convergence problems, as its convergence ball is smaller than that of the classical Newton method.

    Newton method has advantages and drawbacks that need to be known in order to use it in the best way:

    Implementation

    The procedure for using Newton method is:
     
    int Newton(int n,VECTOR (* TheFunction)(VECTOR &), 
            MATRIX (* Gradient)(VECTOR &),
            VECTOR &Input,double Accuracy,int MaxIter,VECTOR &Residu)
    
    with

    Note that it is also possible to use in the Newton method the interval evaluations of the equations and of the Jacobian matrix that are necessary for the general purpose solving algorithm with Jacobian (see section 2.4). The syntax of this implementation is:

     
    int Newton(int Dimension,
            INTERVAL_VECTOR (* IntervalFunction)(int,int,INTERVAL_VECTOR &), 
            INTERVAL_MATRIX (* IntervalGradient)(int, int, INTERVAL_VECTOR &),
            VECTOR &Input,double Accuracy,int MaxIter,VECTOR &Residu)
    

    To avoid overflow problems it is possible to use the vectors ALIAS_Newton_Max_Dim (ALIAS_Newton_Min_Dim), which must be resized to the number of unknowns and which indicate the maximal (minimal) allowed value of each variable after each Newton step. If one of these values is exceeded Newton will return 0.

    The version of Newton method with constant $J^{-1}$ matrix is implemented as:

     
    int Newton(int n,VECTOR (* TheFunction)(VECTOR &), 
            MATRIX &InvGrad,VECTOR &Input,double Accuracy,int MaxIter,VECTOR &Residu)
    
    There is a special implementation of the Newton method for a univariate polynomial $P$:
     
    int Newton(int Degree,REAL *Input,VECTOR &Coeff,double Accuracy,int Max_Iter,REAL *Residu)
    
    with: In that case we may have a problem if the Accuracy cannot be reached due to numerical errors. If you have determined that Newton should converge (using for example the Kantorovitch theorem, see section 3.1.2) then you may use the procedure Newton_Safe with the same arguments: this procedure will return the solution that has led to the lowest Residu during the Newton scheme.

    An example of use of the Newton method is presented in section 15.1.1, where it is compared to alternative methods.

    Return value


    Functions

    The procedures TheFunction and IntervalFunction should be user-written. They return the values of the equations (either as a vector of REAL or as a vector of INTERVAL, see 2.3.4.3) for given values or intervals of the unknowns. They take as argument the vector describing the unknowns.

    In the same way the procedures Gradient and IntervalGradient should be user-written. They return the Jacobian matrix of the system of equations (either as a matrix of REAL or as a matrix of INTERVAL) for given values of the unknowns. The Gradient procedure takes one argument, the vector of REAL describing the unknowns. The IntervalGradient procedure has three arguments and is described in section 2.4.2.2.

    Systematic use of Newton

    It may be interesting to systematically use the Newton scheme in a solving procedure in order to quickly determine the solutions of a system of equations.

    For that purpose we may use the TryNewton procedure whose purpose is to run a few iterations of the Newton scheme for a given box. The syntax of this procedure is:

     
    int TryNewton(int DimensionEq,int DimVar,
                  INTERVAL_VECTOR (* TheIntervalFunction)(int,int,INTERVAL_VECTOR &),
                  INTERVAL_MATRIX (* Gradient)(int, int, INTERVAL_VECTOR &), 
                  INTERVAL_MATRIX (* Hessian)(int, int, INTERVAL_VECTOR &), 
                  double Accuracy,
                  int MaxIter,
                  INTERVAL_VECTOR &Input,
                  INTERVAL_VECTOR &Domain,
                  INTERVAL_VECTOR &UnicityBox)
    
    where The mid-point of Input is used as the initial guess of the Newton scheme. The parameter Accuracy is used in the Newton scheme to determine whether Newton has converged, i.e. whether the residues are lower than Accuracy. A maximum of MaxIter iterations are performed.

    If the Newton scheme converges, the presence of a single solution in the neighborhood of the approximate solution is checked by using the Kantorovitch theorem (see section 3.1.2). If this check is positive, then a ball that includes this single solution is determined and returned in UnicityBox. If the flag ALIAS_Epsilon_Inflation is set to 1, then the inflation scheme is used to try to enlarge this unicity box.

    This procedure returns 11 if a unicity box has been determined, and 0 otherwise. Note that this procedure is already embedded in HessianSolve.


    Krawczyk method for solving systems of equations

    Mathematical background

    Let ${\cal F}$ be a system of $n$ equations in the $n$ unknowns ${\bf x}$. Let ${\bf X}$ be a range vector for ${\bf x}$ and $y_0=Mid({\bf X})$. Let $r_0$ be the norm of the matrix $I-Y_0F^\prime({\bf X})$. Consider the following iterative scheme for $k\ge 1$:

    \begin{eqnarray*}
&&y_k=Mid({\bf X}_k)\\
&&Y_k=\left\{ \begin{array}{ll}
(Mid(F^\prime({\bf X}_k)))^{-1} & {\rm if\ this\ matrix\ is\ regular}\\
Y_{k-1} & {\rm otherwise}
\end{array} \right.\\
&&r_k=\vert\vert I-Y_kF^\prime({\bf X}_k)\vert\vert
\end{eqnarray*}

    Define the operator $K$ as:

    \begin{displaymath}
K({\bf X})=y-YF(y)+\{I-YF^\prime({\bf X})\}({\bf X}-y)
\end{displaymath}

    If

    \begin{displaymath}
K({\bf X}_0) \subseteq {\bf X}_0 \quad {\rm and} \quad r_0<1
\end{displaymath}

    then the previous iterative scheme will converge to the unique solution of $F$ in ${\bf X}$ [8]. The procedure described in section 3.1.1 enables one to verify whether the scheme will be convergent.

    Implementation

    The procedure is implemented as:
     
    int Krawczyk_Solver(int m,int n,
            INTERVAL_VECTOR (* IntervalFunction)(int,int,INTERVAL_VECTOR &), 
            INTERVAL_MATRIX (* IntervalGradient)(int, int, INTERVAL_VECTOR &),
            INTERVAL_VECTOR &Input,double Acc , VECTOR &Result)
    
    with The procedure will return 1 if it has converged to a solution, and 0 or -1 otherwise.


    Solving univariate polynomial with interval analysis

    Mathematical background

    Clearly interval analysis may be used for solving univariate polynomials, especially if we are looking for roots within a specific interval (note that for a generic polynomial we may always obtain intervals containing the positive and negative roots using the algorithm described in section 5.2).

    Implementation

    The algorithm we have implemented is a direct derivation of the general purpose solving algorithm with Jacobian and Hessian in which these values are automatically derived. To isolate the roots we use the Kantorovitch theorem (which may also optionally be used during the resolution, see section 3.1.2). To eliminate boxes during the bisection process we use the safe Budan-Fourier method (see section 5.5.2).
     
    int Solve_UP_JH_Interval(int Degree,VECTOR Coeff, 
                INTERVAL & TheDomain, 
                int Order,int M,int Stop,
                double epsilon,double epsilonf,
                INTERVAL_VECTOR & Solution,
                INTEGER_VECTOR & IsKanto,int NbSolution);
    
    with: The procedure will return the number of solution(s), or -3 if the order is not 0 or 1, -2 if the number of equations or unknowns is equal to or lower than 0, and -1 if the number of iterations is too low. There is an alternate form of this procedure for the case where we are looking for all the roots of the polynomial.
     
    int Solve_UP_JH_Interval(int Degree,VECTOR Coeff,
                int Order,int M,int Stop,
                double epsilon,double epsilonf,
                INTERVAL_VECTOR & Solution,
                INTEGER_VECTOR & Is_Kanto,int NbSolution);
    
    There are two alternate forms of this procedure in the case where we are looking for the positive or negative roots of the polynomial.
     
    int Solve_UP_JH_Positive_Interval
    int Solve_UP_JH_Negative_Interval
    
    In the three previous procedures there is no TestDomain, as it is automatically determined by the procedure. If the determination of the domain fails (for the reasons explained in section 5.2) the procedures will return -1.

    The previous procedures are numerically safe in the sense that rounding errors are taken into account in the evaluation of the polynomial and its gradient. For well-conditioned polynomials you may use faster procedures whose names have the prefix Fast. For example Fast_Solve_UP_JH_Interval is the general procedure for finding the roots of a polynomial.

    Clearly this procedure is not intended to be used as a substitute for more classical algorithms.

    It makes use of a specific Krawczyk procedure for polynomials:

     
    int Krawczyk_UP(int Degree,INTERVAL_VECTOR &Coeff, 
                 INTERVAL_VECTOR &CoeffG,INTERVAL &Input)
    


    Example

    The program Test_Solve_UP is a general test program which enables one to solve a univariate polynomial whose coefficients are given in a file by increasing power of the unknown.

    We use as example the Wilkinson polynomial $P$ of degree $n$:

    \begin{displaymath}
P= \prod_{i =1}^{i =n} (x-i)
\end{displaymath}

    It is well known that this polynomial is extremely ill-conditioned. For $n=12$ the coefficient of $x^{11}$ is -78, but modifying this coefficient by $10^{-5}$ changes the roots drastically, 4 of them becoming complex [4]. The general procedure leads to reasonably accurate results up to $n=18$. At $n=19$, although the Kantorovitch theorem has determined interval solutions that indeed contain all the solutions, the Newton method is unable to provide an accurate estimate of these roots due to numerical errors.

    For $n=10$, if we are looking for the roots in the interval [0,2], the computation time is 90ms; it is 190ms for $n=15$ and 330ms for $n=20$. For the fast algorithm these times are 10ms, 20ms and 30ms. Note that the best classical solving algorithms start to give inaccurate results for $n=22$ (between 12.5 and 18.5 the interval analysis algorithm finds the roots 13.424830, 13.538691, 15.477653, 15.498664, 17.554518, 17.553513) and give imaginary roots for $n=23$.

    Solving univariate polynomial numerically

    As an alternative to interval solving ALIAS proposes a numerical algorithm

     
    int ALIAS_Solve_Poly(double *C, int *Degree, double *Sol)
    
    The arguments are the coefficients C of the polynomial, a pointer to the integer Degree that initially holds the degree of the polynomial, and Sol, which will be used to store the real roots. This procedure returns the number of real roots, or -1 if the computation has failed. The procedure ALIAS_Solve_Poly_PR takes the same arguments but returns the real parts of the roots.

    There is also a version of the Newton scheme for univariate polynomial:

     
    int Fast_Newton(int Degree,REAL *Input,VECTOR &Coeff,VECTOR &CoeffG,
               double Accuracy,int Max_Iter,REAL *Residu)
    
    where This procedure returns -1 if the derivative polynomial is 0. Otherwise it returns 1 if a solution has been found, or 0 if Max_Iter iterations have been completed.


    Solving trigonometric equation

    Mathematical background

    The purpose of this section is to present an algorithm which enables one to determine the roots of an equation $F$ in the unknown $x$ of the form:
    \begin{displaymath}
F= \sum a_k\sin^m(x)\cos^n(x)
\end{displaymath} (2.8)

    with $m$ in $[0,M]$ and $n$ in $[0,N]$, $m, n$ being integers. We use the half angle tangent substitution. If $\theta$ is the unknown we define $T$ as:

    \begin{displaymath}
T=\tan(\frac{\theta}{2})
\end{displaymath}

    Then we have:

    \begin{displaymath}
\sin(\theta)=\frac{2T}{1+T^2}       \cos(\theta)=\frac{1-T^2}{1+T^2}
\end{displaymath}

    Note that the change of variable is not valid if $\theta =\pm \pi$. In that case it is preferable to define $\alpha = \theta+\pi$ and to transform the initial equation into an equation in $\alpha$. Then the change of variable may be applied.

    Using the above relation any trigonometric equation can be transformed into a polynomial equation which is solved using the tools of section 2.11.

    It remains to define an interval for angles, which we will call an angle interval. The elements of an angle interval are usually defined between 0 and $2\pi$ (although in most of the following procedures any value can be used when not specified: internally the elements of the angle interval are converted into values within this range). A difference between number intervals (INTERVAL) and angle intervals is that the lower bound of an angle interval may be larger than the upper bound. Indeed the order in an angle interval is important: for example the angle intervals [0,$\pi/4$] and [$\pi/4$,0] are not the same.

    Implementation

    The purpose of this procedure is to determine the roots of a trigonometric equation within a given angle interval.
     
    int Solve_Trigo_Interval(int n,VECTOR &A,INTEGER_VECTOR &SSin,
                INTEGER_VECTOR &CCos,double epsilon,double epsilonf,
          int M,int Stop,INTERVAL_VECTOR &Solution,int Nb,REAL Inf,REAL Sup);
    
    with: Note that the returned Solution will always be a range [a,b] included in [0,2$\pi$] and in [Inf,Sup], this angle interval being reduced to an angle interval in the range [0,2$\pi$]. The procedure will return: If you are looking for all the roots of the trigonometric equation you may either specify [Inf,Sup] as [0,2$\pi$] or use the procedure:
     
    int Solve_Trigo_Interval(int n,VECTOR &A,INTEGER_VECTOR &SSin,
                INTEGER_VECTOR &CCos,double epsilon,double epsilonf,
                int M,int Stop,INTERVAL_VECTOR &Solution,int Nb);
    
    This procedure first analyzes the trigonometric equation to find bounds on the roots using the algorithm described in section 4.3, then uses the previous procedure to determine the roots within these bounds. In some cases this procedure may be faster than the general purpose algorithm.

    Examples

    The test program Test_Solve_Trigo enables one to determine the roots of any trigonometric equation described in a file. In this file you indicate, for each term of the equation, first the coefficient of the term, then its sine power and then its cosine power.

    We consider the trigonometric equation derived in section 15.1.2:

    \begin{displaymath}
-508 \sin(\theta)-25912 \cos(\theta)+3788 \cos(\theta)
\s...
...ght )^{2}\sin(\theta)-4840 \left (
\cos(\theta)\right )^{3}=0
\end{displaymath}

    which has 0, 5.47640186917958647 as roots. The general procedure finds the roots 1.743205553711625e-11, 5.476401869153828, while the procedure using the determination of the bounds (which are [0,1.570796326794896], [5.235987755982987,6.283185307179586]) finds the roots 5.974250609803587e-12, 5.476401869153828.


    Solving systems with linear and non-linear terms: the simplex method

    Mathematical background

    Consider a system of $n$ equations $F_1,\ldots,F_n$ in the $m$ unknowns $X=\{x_1,\ldots,x_m\}$ such that the $F_i$ may be written as:

    \begin{displaymath}
F_i = G_i(X) + a_0+\sum a_j x_j
\end{displaymath}

    $F_i$ is the sum of the non-linear function $G_i$ and of linear terms with coefficients $a_j$. Let the interval evaluation of $G_i$ be $[\underline{G_i},\overline{G_i}]$: we define a new variable $Y_i$ as $Y_i = G_i-\underline{G_i}$, which implies $Y_i \ge 0$. $F_i$ may now be written as a sum of linear terms:

    \begin{displaymath}
F_i = Y_i +a_0+\underline{G_i}+\sum a_j x_j
\end{displaymath}

    Hence the system is now a linear system with the additional constraints $Y_1 \ge 0, \ldots, Y_n\ge 0$. We may now apply a well-known method of linear programming: the simplex method, which can be used to find the minimum or maximum of a linear function under the $m_1$ equality constraints $G_1(X)=0,\ldots,G_{m_1}(X)=0$ and the $m_2$ inequality constraints $K_1(X)\ge 0,\ldots, K_{m_2}(X) \ge 0$. There are two phases in the simplex method: phase I verifies that a feasible solution exists and phase II determines the extremum of the linear function.

    In our case we may use only phase I, or phases I and II, by considering the $2m$ optimization problems of determining the minimum and maximum of the $m$ unknowns under the $2n$ constraints $F_i(X)=0, Y_i(X)=0$, and updating the interval for an unknown whenever the simplex applied to minimize or maximize it improves the range. It may be seen that this is a recursive procedure: an improvement on one variable changes the constraint equations and may thus change the result of the simplex method applied for determining the extremum of a variable which has already been considered.

    This procedure, proposed in [25], enables one to correct one of the drawbacks of the general solving procedures: each equation is considered independently, so for given intervals for the unknowns two equations may each have an interval evaluation that contains 0 although the equations cannot cancel at the same time. The previous method enables one to take into account, at least partly, the dependence between the equations. Clearly it will be more efficient if the functions have a large number of linear terms and a "small" non-linear part.

    In all of the following procedures the various storage modes and bisection modes of the general solving procedures may be used, and inequalities are handled.

    Implementation without gradient

    The procedure is implemented as:
     
    int Solve_Simplex(int m,int n,int NbNl,
            INTEGER_VECTOR TypeEq,
            INTERVAL_VECTOR (* IntervalFunction)(int,int,INTERVAL_VECTOR &), 
            void (* NonLinear)(INTERVAL_VECTOR &x,INTERVAL_VECTOR &X),
            void (* CoeffLinear)(MATRIX &U),
            double MaxDiam,
            int FullSimplex,
            INTERVAL_VECTOR & TheDomain, 
            int Order,int Iteration,int Stop,
            double epsilon,double epsilonf,double Dist,
            INTERVAL_MATRIX & Solution,int Nb,
            int (* Simp_Proc)(INTERVAL_VECTOR &))
    
    the arguments being:

    Note that the Simp_Proc argument may be omitted.


    The NonLinear procedure

    The purpose of the procedure is to compute an interval evaluation of the non-linear terms of the equations (the $G_i$ terms in the mathematical background). This procedure takes as arguments a box and the returned evaluation of the non-linear terms. Consider for example the equations:

    \begin{eqnarray*}
&&-0.1 x_1^2+0.25 x_1^3+x_1+0.5 x_2+0.3=0\\
&&-0.1 x_2^2+0.55 x_2^3+2x_1-0.5 x_2+0.2=0
\end{eqnarray*}

    The non-linear terms of these equations are $-0.1 x_1^2+0.25 x_1^3$ and $-0.1 x_2^2+0.55 x_2^3$ and the NonLinear procedure is written as:
     
    void NonLinear(INTERVAL_VECTOR &x,INTERVAL_VECTOR &X)
    {
        X(1) = (-0.1+0.25*x(1))*Sqr(x(1));
        X(2) = (-0.1+0.55*x(2))*Sqr(x(2));
    }
    


    The CoeffLinear procedure

    This procedure returns a matrix $U$ of dimension n $\times$ m+1 with $U(i,j)$ equal to the coefficient of $x_j$ in equation $i$, while $U(i,{\tt m}+1)$ is the constant term of the equation. In the previous example this procedure is:
     
    void CoeffLinear(MATRIX &U)
    {
    U(1,1)=1; U(1,2)=0.5; U(1,3)=0.3;
    U(2,1)=2; U(2,2)=-0.5; U(2,3)=0.2;
    }
    

    Using an expansion

    In some cases it may be interesting to consider an expansion of the function around a given point. For example consider the term $x^2$ with $x$ in the range [1,2]: in the simplex method the range [1,4] for this term will be added to the non-linear part of the equation. But if we substitute the unknown $x$ by a new unknown $x_1$ such that $x=1+x_1$ (hence the range for $x_1$ will be [0,1]) we get $x^2=1+2x_1+x_1^2$: an additional linear term ($2x_1$) appears while the non-linear part becomes $x_1^2$ with the range [0,1]. For each variable $y_i$ in the range $[\underline{y_i},\overline{y_i}]$ we may define a new variable $y^1_i$ such that $y_i =\underline{y_i}+y^1_i$, where $y^1_i$ has the range $[0,\overline{y_i}-\underline{y_i}]$. We may then write the non-linear and linear procedures for the unknowns $y^1_i$, but it is necessary to notify the simplex procedure that such an expansion is used. This is done by setting the flag ALIAS_Simplex_Expanded to 1 (this possibility is available only for the simplex method using the gradient).

    You may also prohibit the use of the simplex method in the procedure (for example to use it only in your own simplification procedure) by setting the flag ALIAS_DONT_USE_SIMPLEX to 1.

    Example

    The main program for the previous system of 2 equations (omitting the procedure F that computes the interval evaluation of the equations) may be written as:
     
    int main()
    {
    int num,i;
    INTERVAL_MATRIX Solution(100,2);
    INTERVAL_VECTOR x(2);
    INTEGER_VECTOR Type(2);
    
    Clear(Type);
    for(i=1;i<=2;i++)x(i)=INTERVAL(-10,10);
    Single_Bisection=1;
    num=Solve_Simplex(2,2,0,Type,F,NonLinear,CoeffLinear,0.001,0,x,
    		  0,40000,0,0.,0.0001,0.0001,Solution,100);
    
    for(i=1;i<=num;i++)
      {
        x(1)=Solution(i,1);x(2)=Solution(i,2);
        cout<<"Solution "<<i<<":"<<x<<endl;
      }
    }
    

    Implementation with gradient

    This procedure may be used if the gradients of the equations are available. A full version is implemented as:
     
    int Solve_Simplex_Gradient(int m,int n,int NbNl,
            INTEGER_VECTOR TypeEq,
            INTERVAL_VECTOR (* IntervalFunction)(int,int,INTERVAL_VECTOR &), 
            INTERVAL_MATRIX (* Gradient)(int, int,INTERVAL_VECTOR &), 
            void (* NonLinear)(INTERVAL_VECTOR &x,INTERVAL_VECTOR &X),
            void (* GradientNonLinear)(INTERVAL_VECTOR &x,INTERVAL_MATRIX &X),
            void (* CoeffLinear)(MATRIX &U),
            double MaxDiam,
            int FullSimplex,
            INTERVAL_VECTOR & TheDomain, 
            int Order,int Iteration,int Stop,
            double epsilon,double epsilonf,double Dist,
            INTERVAL_MATRIX & Solution,
            int Nb,int UseGradNL,
            INTEGER_MATRIX &GI,
            int (* Simp_Proc)(INTERVAL_VECTOR &))
    
    the arguments being: The following variables also play a role in the computation:

    There are several versions of this procedure in which several arguments of the general procedure may be omitted. The following table indicates which arguments may be omitted and the corresponding assumptions (EO=equations only).

    omitted
    NbNl 0 0 0
    TypeEq EO EO EO EO EO EO
    GradientNonLinear not known not known
    UseGradNL 0 0 0 0
    In all cases you may omit the GI argument (the derivatives are assumed to be unknown) and Simp_Proc.


    The GradientNonLinear procedure

    The purpose of this procedure is to compute the Jacobian of the non-linear part of the equations in order to improve their interval evaluation. The syntax of this procedure is:
     
    void GradientNonLinear(INTERVAL_VECTOR &x,INTERVAL_MATRIX &J)
    
    where x is an interval vector which contains the ranges for the unknowns and J is the corresponding interval Jacobian matrix.


    Solving systems with determinants

    In some cases systems may involve determinants: for example in algebraic geometry the resultant of two algebraic equations is defined as the determinant of the Sylvester matrix. To get an analytical form of the equations it is therefore necessary to compute the determinant, but in some cases this may be quite difficult as the expression may be quite large, even for a sparse matrix. A mechanism in ALIAS enables one to get the interval evaluation of an equation which includes a determinant without having to provide the analytical form of the determinant, but only the interval evaluations of the components of the matrix. This mechanism is based on procedures that compute the interval evaluation of the determinant of an interval-valued matrix (see section 7.1).

    Using these procedures it is possible to design the equation evaluation procedures that are used in the general solving procedures of ALIAS, as described in 2.3.4.3. Assume for example that you have to evaluate the expression

    \begin{displaymath}
(x+\vert A\vert)*y+2*(y+\vert B\vert)
\end{displaymath}

    where A, B are two matrices of dimension 6. Assume that V is an interval vector which contains the interval values for $x,y$ and that you have to return the interval evaluation of this equation in an interval vector W. Then you may write the following procedure:
    INTERVAL_MATRIX A(6,6),B(6,6);
    A=Compute_A(V); //compute A for the interval value of x,y
    B=Compute_B(V); //compute B for the interval value of x,y
    W(1)=(V(1)+Medium_Determinant(A))*V(2)+2*(V(2)+Medium_Determinant(B));
    
    You must however be careful when using this procedure in a denominator, as the presence of 0 in the interval evaluation of the determinant is not checked, which will lead to an error when computing the interval evaluation of the equation (see section 2.1.1.3).
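    To see why an interval evaluation of a determinant safely bounds all point determinants, here is a self-contained sketch for the $2 \times 2$ case with a toy interval type (the ALIAS procedures such as Medium_Determinant handle the general $n \times n$ case; this only illustrates the bounding property):

    ```cpp
    #include <algorithm>

    // Toy interval type standing in for the ALIAS interval class.
    struct Itv { double lo, hi; };

    static Itv add(Itv a, Itv b) { return {a.lo + b.lo, a.hi + b.hi}; }
    static Itv sub(Itv a, Itv b) { return {a.lo - b.hi, a.hi - b.lo}; }
    static Itv mul(Itv a, Itv b) {
        double p[4] = {a.lo*b.lo, a.lo*b.hi, a.hi*b.lo, a.hi*b.hi};
        return { *std::min_element(p, p+4), *std::max_element(p, p+4) };
    }

    // Interval evaluation of a 2x2 determinant by cofactor expansion.
    // Since interval arithmetic is inclusion monotonic, the result contains
    // the determinant of every point matrix inside the interval matrix.
    Itv det2(const Itv M[2][2]) {
        return sub(mul(M[0][0], M[1][1]), mul(M[0][1], M[1][0]));
    }
    ```

    Note that the resulting interval may well contain 0 even when every point determinant is nonzero, which is exactly why the caution above about denominators applies.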

    Note that the MakeF procedure of the ALIAS-Maple package is able to produce efficient code for an equation file even if unexpanded determinants are present in the equation.

    There are also procedures to compute the derivatives of a determinant. Note that the MakeJ procedure of the ALIAS-Maple package is able to produce a procedure compatible with the requirements of the gradient procedure required by the library (see section 2.4.2.2) even if determinants are present in the equation.

    There are also procedures to compute the second order derivatives of a determinant. Note that the MakeH procedure of the ALIAS-Maple package is able to produce a procedure compatible with the requirements of the hessian procedure required by the library (see section 2.5.2.1) even if determinants are present in the equation.


    Solving systems of distance equations

    Principle

    We consider here a special occurrence of quadratic equations that describe distances between points in an $n$-dimensional space. Each equation $F_k$ in such a system may be written as:

    \begin{displaymath}
\sum_{i =1}^{i =n}(x_i-x_j)^2+L_k=0
\end{displaymath}

    where $x_i$ are unknowns (representing coordinates of points) and $x_j$ may be unknowns or constants. A special occurrence of unknowns are the virtual points: the coordinates of these points are linear combinations of the coordinates of the points that are defined as the unknowns of the system. To illustrate the concept of virtual points consider 5 fixed points on a rigid body in 3-dimensional space. The coordinates of any of these points may be expressed as a linear combination of the coordinates of the 4 other points (as soon as these points are not coplanar). The concept of virtual points has been introduced to allow a decrease in the number of unknowns, but also because without them the distance equations would be redundant and consequently the jacobian of the system of equations would be singular at a solution, thereby preventing us from using the tests (such as Moore or Kantorovitch) that allow one to determine that there is a unique solution in a given box.

    Equations involving virtual points may be written as:

    \begin{displaymath}
\sum_{i =1}^{i =n}(\sum \lambda_j x_j -X_k)^2
\end{displaymath}

    where the $x_j$ are unknowns and the $X_k$ unknowns or constants. Clearly systems involving distance equations are of great practical interest and ALIAS offers a specific algorithm to deal with this type of system.

    The method proposed in ALIAS to solve this type of system is based on the general procedure using the gradient and hessian. A first difference is that it is not necessary to provide the gradient and hessian functions as they are easily derived from the system of equations. Note also that due to the particular structure of the distance equations the interval evaluation leads to exact bounds. Furthermore the algorithm we propose uses a special version of Kantorovitch theorem (i.e. a version that produces a larger ball with a unique solution in it compared to the general version of the theorem), an interval Newton method, a specific version of the simplex method described in section 2.14 and a specific version of the inflation method described in section 3.1.6 (i.e. a method that allows one to compute directly the radius of a ball around a solution that will contain only this solution). In addition two simplification rules are used:

    The algorithm returns as solutions either boxes that satisfy Kantorovitch theorem and are therefore reduced to a point, or boxes whose function evaluations include 0 and whose width is lower than a given threshold.

    Implementation

    A system of $m$ distance functions has a specific description. First consider distance functions that involve only points or constants. Such a function may be written as:

    \begin{displaymath}
\sum (x_i-x_j)^2+L_i =0
\end{displaymath}

    where the maximal number of square terms is $n$. Each function is described by a row of two matrices APOW and ACONS and a vector LI. APOW is an integer matrix with $2n$ columns and $m$ rows that contains the unknown number of each operand in the function, in which a 0 means that the operand is a constant. The values of these constants are given in the real matrix ACONS of size $m \times 2n$. Finally the value of the constant $L_i$ is given in the vector LI of size $m$. For example consider a system involving the 4 unknowns $x_1, x_2, y_1, y_2$, numbered from 1 to 4:

    \begin{eqnarray*}
&&x_1^2+(y_1-4)^2-6 =0\\
&&(x_1-x_2)^2+(y_1-y_2)^2-12=0
\end{eqnarray*}

    In that case APOW, ACONS, LI will be:

    \begin{displaymath}
{\tt APOW}= \left( \begin{tabular}{cccc}
1& 0& 3& 0\\
1& 2& 3& 4
\end{tabular} \right)
\qquad
{\tt ACONS}= \left( \begin{tabular}{cccc}
0& 0& 0& 4\\
0& 0& 0& 0
\end{tabular} \right)
\qquad
{\tt LI}= \left( \begin{tabular}{c} -6 \\ -12 \end{tabular} \right)
\end{displaymath}

    The first two elements of the first row of APOW (1, 0) describe the first square term of the first equation and state that it is $(x_1-a)^2$, the value of $a$ being given in ACONS(1,2). Note that each square term must be written as (unknown $-$ unknown or constant$)^2$ and not as (constant $-$ unknown$)^2$.
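    As a sanity check of this encoding, here is a small point-evaluation routine (a hypothetical helper written for this manual's example, not part of ALIAS; the solver itself evaluates these expressions over intervals):

    ```cpp
    // Point evaluation of one distance equation encoded by a row of APOW,
    // a row of ACONS and an entry of LI, following the convention above:
    // a positive entry in APOW is a (1-based) unknown number, a 0 means the
    // operand is the constant stored at the same position in ACONS.
    // Virtual points (negative entries) are not handled in this sketch.
    double eval_distance_eq(int n, const int apow[], const double acons[],
                            double li, const double x[]) {
        double s = li;
        for (int t = 0; t < n; t++) {
            double a = apow[2*t]     > 0 ? x[apow[2*t] - 1]     : acons[2*t];
            double b = apow[2*t + 1] > 0 ? x[apow[2*t + 1] - 1] : acons[2*t + 1];
            double d = a - b;
            s += d * d;
        }
        return s;
    }
    ```

    For the first example equation, the row (1, 0, 3, 0) of APOW with the row (0, 0, 0, 4) of ACONS and LI entry $-6$ evaluates $x_1^2+(y_1-4)^2-6$, which vanishes for example at $x_1=\sqrt{2}$, $y_1=2$.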

    Consider now functions involving virtual points. Each square term may thus be written as:

    \begin{displaymath}
(\sum \lambda_j x_j -X_k)^2
\end{displaymath}

    where $X_k$ may be either a constant, an unknown or the coordinate of a virtual point. Let $k$ be the number of terms of the form $\sum \lambda_j x_j$ existing in the system. Such an equation is described by a matrix AVARV with $k$ rows and a number of columns equal to the number of unknowns. Each term $\sum \lambda_j x_j$ is numbered from 1 to $k$ and row $i$ of AVARV contains in position $j$ the value of $\lambda_j$. The presence of the coordinate of a virtual point in a distance function is indicated in APOW by a negative number whose opposite is the number of the virtual coordinate. Hence if we add to the previous system the third equation:

    \begin{displaymath}
(0.1 x_1-0.2 x_2-3)^2+(0.1 y_1-0.2y_2-1)^2-10 =0
\end{displaymath}

    the third row of APOW will be (-1 0 -2 0), the third row of ACONS will be (0 3 0 1) and the third element of LI will be -10. AVARV will have 2 rows and 4 columns and is given as:

    \begin{displaymath}
{\tt AVARV}= \left( \begin{tabular}{cccc}
0.1 & -0.2 & 0 & 0\\
0 & 0 & 0.1 & -0.2\\
\end{tabular} \right)
\end{displaymath}
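    Decoding a virtual coordinate is then just a dot product between the selected row of AVARV and the vector of unknowns; the following sketch (a hypothetical helper, for point values only) checks the third equation of the example:

    ```cpp
    // A negative entry -v in APOW selects row v of AVARV; the virtual
    // coordinate's value is sum_j AVARV(v,j) * x_j. Point evaluation only,
    // as an illustration of the encoding (the solver works with intervals).
    double virtual_coord(const double avarv_row[], int nvar, const double x[]) {
        double s = 0;
        for (int j = 0; j < nvar; j++) s += avarv_row[j] * x[j];
        return s;
    }
    ```

    For example with $x_1=60$, $x_2=0$, $y_1=20$, $y_2=0$ the two virtual coordinates evaluate to $0.1 \times 60 = 6$ and $0.1 \times 20 = 2$, so the third equation gives $(6-3)^2+(2-1)^2-10=0$.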

    Among the $m$ equations there will be $p$ equations involving virtual points. The system must be written in such a way that the $m-p$ equations not involving virtual points are defined first, followed by the $p$ equations. Note also that in the current implementation inequalities are handled, although with less efficiency than equations. The algorithm is implemented as:
     
    int Solve_Distance(int DimVar,int DimEq,
                       INTEGER_VECTOR &Type_Eq,
                       INTEGER_MATRIX &APOW,MATRIX &ACONS,VECTOR &LI,
                       int p,int k,MATRIX &AVARV,
                       INTERVAL_VECTOR & TheDomain, 
                       int M,
                       double epsilonf,
                       int Stop,
                       INTERVAL_MATRIX & Solution,int Nb,
                       int (* Simp_Proc)(INTERVAL_VECTOR &))
    
    The arguments are described below. The argument Simp_Proc in this procedure may be omitted.

    The following variables also play a role in the computation:

    Return code

    This procedure will return: Note that in this procedure we use the single bisection mode (we select the variable with the largest range as the one that will be bisected) and the reverse storage mode.

    Inflation and Newton scheme

    For the distance equations we use a specific procedure for the inflation method (see section 3.1.6). Indeed in that case it is possible to calculate directly the diameter of the largest ball centered at an approximation of a solution that contains one and only one solution. The procedure is:

     
    int ALIAS_Epsilon_Inflation(int Dimension,int Dimension_Eq,
          INTEGER_MATRIX &APOW,MATRIX &ACONS,
          VECTOR &LI,
          int NB_EQV,int NB_VARV,MATRIX &AVARV,
          VECTOR &Amid,
          INTERVAL_VECTOR &P,
          INTERVAL_VECTOR &PP1,
          int type,double epsilon,
          int (* Simp_Proc)(INTERVAL_VECTOR &))
    

    We have also a specific implementation of the Newton scheme:

     
    int Newton_Fast(int Dimension,int Dimension_Eq,
                       INTEGER_MATRIX &APOW,MATRIX &ACONS,VECTOR &LI,
                       int NB_EQV,int NB_VARV,MATRIX &AVARV,
                       VECTOR &Input,double Accuracy,int Max_Iter,VECTOR &Residu)
    

    Choosing the right set of equations and variables

    For the best efficiency the unknowns have to be well chosen, especially so that the jacobian of the system has full rank at a solution. The use of virtual points should be systematic to avoid having a system that is singular at a solution.


    Initial domain and simplification procedures

    A specific procedure Bound_Distance for finding an initial estimate of the search domain has been implemented in the ALIAS-Maple package (see the ALIAS-Maple manual): this procedure may provide an initial guess for the solution ranges or improve a given guess.

    Note also that the choice of the right coordinate system (for example its origin) and which points should be defined as virtual points may have a large influence on the computation time (see an example in [10]).


    Filtering a system of equations

    As mentioned previously, the 2B heuristic to improve the solving is to rewrite each equation as the equality of two different terms, to determine whether the interval evaluations of both terms are consistent and, if not, to adjust the interval of one term and, using the inverse function for this term, to reduce the width of the range of one or more unknowns.

    For example imagine that one of the equations is $x^2-2x+1=0$. The procedure will introduce a new variable $X=x^2$ such that $X=2x-1$ and compute its interval evaluation. If $X$ has a negative upper bound, the equation has no solution for the current range for $x$. If the upper bound $U$ of $X$ is positive, then the inverse function of $X$ indicates that $x$ should lie in $[-\sqrt{U},\sqrt{U}]$: we may then update the interval for $x$ if this is not the case. If the lower bound $V$ of $X$ is positive, then the inverse function of $X$ indicates that $x$ should lie outside the open interval $(-\sqrt{V},\sqrt{V})$. If the range for $x$ is included in this interval, then there is no solution to the equation for this range for $x$.
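    The upper-bound rule of this example can be sketched in a few lines (a toy illustration of the heuristic, not the ALIAS implementation; the lower-bound rule is omitted for brevity):

    ```cpp
    #include <algorithm>
    #include <cmath>

    // One 2B step for x^2 - 2x + 1 = 0, rewritten as X = x^2 with X = 2x - 1.
    // The range [xl, xh] for x is tightened in place; returns false when the
    // range is proved to contain no root.
    bool filter_2b(double &xl, double &xh) {
        double Xh = 2*xh - 1;              // upper bound of the interval 2x-1
        if (Xh < 0) return false;          // X = x^2 >= 0: no solution here
        double U = std::sqrt(Xh);          // inverse of the square:
        xl = std::max(xl, -U);             // x must lie in [-sqrt(Xh), sqrt(Xh)]
        xh = std::min(xh,  U);
        return xl <= xh;
    }
    ```

    Starting for instance from the range $[0,10]$, one step shrinks the upper bound to $\sqrt{19} \approx 4.36$ while keeping the root $x=1$, and a range such as $[-5,-1]$ is rejected outright since $2x-1$ is then negative.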

    Note:

    These procedures must not be used with distance equations as the 2B filtering is already implemented in the solver.



    jean-pierre merlet
    2018-07-25