[std-interval] Suggestions for the 2006-09 draft

Wed Sep 20 15:18:30 PDT 2006

The 2006-09 (revision 1) draft is much better than earlier versions.  See: <http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2067.pdf>  The authors have done a lot of work and should be commended for it.  Sections II through IV are the only ones on which I am competent to comment.  They are clear and well written.

Mike Schulte was kind enough to make some valuable suggestions, nevertheless, I am happy to take full responsibility for any lapses in what follows.

There are still some areas where further improvements can, and in my opinion, should be made.  They follow from four fundamental principles that I believe any interval language standard must not violate:

1) Containment.  That is, the set of all possible values resulting from the evaluation of any expression over given interval arguments *must* be contained in computed interval results.  This implies, of course, that the set of possible values, the "containment set", or cset, must be defined.  As John Pryce and George Corliss point out, there are different ways to define mathematically consistent csets.  Generally, smaller csets are more complicated to define and implement.  Which is chosen can (and should) be a quality of implementation (QOI) choice, as first defined in:  <http://docs.sun.com/app/docs/doc/819-3695>.

2) Opaqueness.  The only possible reason I can think of is for specifying how interval values are internally represented in a given compiler or class library is to provide binary file compatibility.  Source code compatibility from release to release and implementation to implementation can be guaranteed with opaque data types.  The huge advantage of opaqueness is that different platforms can choose to do what makes sense for them *and* different implementers can choose to use different internal representation schemes, depending, for example on the chosen cset system.  For example, on systems that do not support infinities or NaNs, one could internally define [+1, -1] to be the representation for empty and [+2, -2] to be the whole real line.  Opaque data types would also not preclude a given implementation from supporting exterior intervals and/or various tags.

3) Intrinsic Compiler Support.  A proposed language standard should not preclude interval data types from being made intrinsic in a compiler.  Many run-time performance and interval width optimizations can best be performed at the abstract syntax tree level within a compiler.  Surely, a proposed language standard should not be written in such a way that it can *only* be implemented in a class library.  Two examples of the kind of optimizations that can best be performed inside a compiler are: symbolic expression manipulation and derivative/slope derivation, and width/performance optimizations such as the implementation of "crude range tests".  See: Walster and Hansen, Using pillow functions to efficiently compute crude range tests, "Numerical Algorithms" V37: 401-415, 2004.

4) Support Narrow Width.  A proposed interval language standard should not preclude doing *anything* to gain speed and reduce computed interval width.  The *only* constraint should be cset satisfaction.  As developments continue to be made defining narrower width cset systems, and as better algorithms become available for automating the ability to use dependence information, the language standard should not prohibit them from being implemented.  There should be no narrow width requirement, just as there are not speed requirements.  Returning "whole" should be a standard-conforming option.

5) Ease Of Use.  To the extent possible, syntax and features should be included in an interval language standard that make it easy for algorithm developer *and* end users.  An example of the latter is support for single-number interval input and output.  See:
G. W. Walster (1988) "Philosophy and practicalities of interval arithmetic" in "Reliability in computing: the role of interval methods in scientific computing", R. E. Moore (ed), Pages: 309 - 323 ISBN:0-12-505630-3
and
M. J. Schulte et al "Single-Number Interval I/O," Developments in Reliable Computing (T. Csendes ed.) pp.
141-148, Kluwer Academic Publishers, 1999.  An electronic version is available at: <http://citeseer.ist.psu.edu/schulte97singlenumber.html>

Keeping the above five principles in mind, here are some specific illustrative example in the current draft that appear to be violations of them:

1) Last 4 lines on page 4:  Not all operations on an empty interval produce an empty result.  The hull operator is the one exception. This is a good reason why un-initialized intervals should be set to the whole interval.  They will propagate (except through the intersection operator with a finite interval) and be noticed quickly when debugging.  They cannot result in a containment failure.

2) Page 5 second paragraph: The goal of not requiring language changes is fine.  However, consistent with principles 3) and 4), above, do not preclude an intrinsic compiler implementation.

3) Page 5 - lower(): With opaquely defined interval data types, there is no need for this function.  Indeed, it should not be included.  There is no need for it.

4) Page 6 - Empty Interval: Relative to the discussion of implementations on platforms that do not support NaNs, opaqueness solves the difficulty.

5) Page 7 - [a;b] notation:  Why not let this be a localization option, as is the use of , or . for a decimal point?  This is consistent with principle 5), above.  For example, in countries that use a comma for a decimal place, they should be free to adopt the semicolon delimiter.  Doing so will not conflict with the use of the comma for an interval delimiter by other countries.  Thus, [2.3, 4.5] can be unambiguously localized to [2,3; 4,5], or [2,3: 4,5].  This need not be specified in the proposed standard.

6) Page 7 - order relations on empty intervals and - Why do "certainly" comparisons return "true" for empty intervals?: There are good reasons why *all* relations tests on empty intervals should return false.  First, it is the conservative thing to do.  One would be hard put to argue that any order
relation between an interval and an empty interval can be said to be "true".  Second, if code for a sequence of relational tests is written so it is logically exhaustive, then the final else branch can *only* be taken if there is an unanticipated empty interval in one or more relational tests.  This mechanism provides a convenient way to catch unanticipated empty intervals. In the following Fortran example .CLT. and .PGE. are the certainly less-than and possibly greater-than-or-equal relational operators, respectively.

 IF(A .CLT. B) THEN

     ! Code for A .CLT. B

 ELSE IF(A .PGE. B) THEN

     ! Code for A .PGE. B

 ELSE

     ! Either A, or B, or both A and B is empty.

 END IF

A compiler might even check for violation of this principle and provide a warning diagnostic if exhaustive sequences of relational tests are not used.

7) Page 7 - Should std::set<interval> work?:  By this, I presume is meant that sets or lists of intervals could be defined and used.  The issue raised, if I understand it, is that because intervals are not naturally unambiguously ordered, this cannot be done.  On the contrary, when lists (sets) of interval and/or interval vectors (or boxes) are used, as they often are ordered using some scalar function of them, such as width, relative error upper bound, or the upper bound on an objective function in an optimization algorithm.  So, as long as set ordering can be performed a result of computing a scalar function of the elements of the set, there should be no reason why this possibility should not be supported.

8) Page 8 - Memory layout.  This issue vanishes if interval data types are opaque.

9) Page 8 - Optimization expectations: An interval language standardization goal can be to make library implementations *close* to compilers.  However, nothing should be in the standard that precludes an intrinsic compiler implementation.

The following typo was noticed by Mike Schulte:

Page 3 - The specification states "Those optimizations, however, are necessary for implementing the proposal." I believe this should be "Those optimizations, however, are *not* necessary for implementing the proposal."