[std-interval] Interval comparison operators

first.i.last at comcast.net first.i.last at comcast.net
Sun May 28 17:55:44 PDT 2006


 -------------- Original message ----------------------
From: Guillaume Melquiond <guillaume.melquiond at ens-lyon.fr>
> On Sunday, May 28, 2006 at 05:02 +0000, first.i.last at comcast.net
> wrote:
> > >  Now we are interested in what the opinions of
> > > other interval developers are. Should the user be allowed to
> > > compare intervals in C++? And if so, what should the semantic
> > > of these comparisons be?
> > 
> > The attached file contains our detailed response to the issue of
> > comparison of intervals.  Readers interested in specifics without
> > explanation or rationale should not read it.
> 
> I have taken a quick look at your document and I have a
> few questions.
> 
> First, I'm a bit confused. You are speaking of "subsets of the
> machine number line" while the proposal is clearly aimed at
> representing (convex) subsets of the real number line. I hope
> this was only an oversight.

It was not an oversight.  It may have been a poor choice of adjective in the document and a copy/paste omission on your part.  In fact it says "CONTIGUOUS subsets of the machine number line" [emphasis added].  Perhaps it should have said convex rather than contiguous.  In this context I think they mean the same thing (denotation), except that the term contiguous suggests (connotation) continuity, which is a fundamental property that line segments and typical intervals do have, but that sets in general do not have.

> 
> I do not understand what you mean by "Present day STL
> experts may be mislead by the set-based description of
> intervals, but would not be confused by a line-segment-
> based description of intervals". According to the previous
> paragraph of your document, I thought this was supposed
> to be the exact same notion. Also how would it achieve
> not to confuse them?

The set and segment implementations are identical.  It is the method of describing them that is different.  IMHO the term segment is better than the term set precisely because it indicates a fundamental property of typical intervals -- convexity.

Describing intervals as containing sets would suggest to an STL expert that they can hold any subset, a suggestion that is false and thus misleading.  Describing intervals as containing line segments would suggest that they can hold only convex/contiguous subsets, which is accurate and thus not misleading.

> Later you also say that "they are also fully compatible with
> the STL". This does not strike me as obvious since STL
> algorithms usually require this property to hold:
> !(x < y) && !(y < x) implies that x and y are equivalent.

The sentence following the fragment you quoted states that they lack general applicability because they are often too strict.  In those two sentences I meant that they meet the general requirements but are not always useful, because they may fail specific requirements such as the one you stated above about deducible equivalence.
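To make the concern concrete, here is a minimal sketch (a bare hypothetical interval struct and a certain-less-than helper, not the proposal's actual API) showing two overlapping intervals for which !(x < y) && !(y < x) holds under certain comparison even though the intervals are plainly not equivalent:

    #include <cassert>

    // Hypothetical minimal interval type; the proposed std interval differs.
    struct ival { double lo, hi; };

    // "Certain" less-than: every point of x is below every point of y.
    bool cert_lt(const ival& x, const ival& y) { return x.hi < y.lo; }

    int main() {
        ival x{1.0, 3.0};
        ival y{2.0, 4.0};   // overlaps x, so neither is certainly less
        assert(!cert_lt(x, y) && !cert_lt(y, x));
        // An STL algorithm would conclude that x and y are equivalent, yet
        // they are different intervals; the strict-weak-ordering requirement
        // quoted above is not met.
    }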

> 
> Concerning your neighborhood model, I disagree. You are
> saying that "the certain and possible comparison functions
> are based on this model". But somebody performing global
> optimization or value-range propagation will use them, while
> the intervals are nothing like a neighborhood.  So certain and
> possible comparisons should not be associated to a "point
> model" in my opinion.

The sticking point here appears to be that "intervals are nothing like a neighborhood".  If they are not, what are they like?  I.e., what conceptual model do you believe the certain/possible comparisons to be based upon?  Note that the conceptual models are not used to describe the high levels of the software such as functions, algorithms, programs, or applications.  They are only useful to describe the rationale for having multiple sets of comparisons.  So they are properly applied only to distinguish the kinds of low-level operations available.

But we have a fundamental disagreement.  The reason certain and possible comparisons are characterized as based upon points is that the results of the comparisons describe the presence or absence of points.  Possible comparisons indicate that there exists some point in each object that satisfies the relation in question.  Certain comparisons indicate that there is one (and only one) point that satisfies the relation in question (except of course for cne(), which indicates that there is no point that satisfies the relation).
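For reference, a bounds-based sketch of the usual definitions (hypothetical names and a bare struct, not the proposal's interface) may make the point-existence reading clearer:

    #include <iostream>

    // Hypothetical minimal interval type with inclusive bounds.
    struct ival { double lo, hi; };

    // Possible less-than: some point of x is below some point of y.
    bool plt(const ival& x, const ival& y) { return x.lo < y.hi; }

    // Certain less-than: whichever points x and y actually denote, x < y holds.
    bool clt(const ival& x, const ival& y) { return x.hi < y.lo; }

    // Certain inequality: no single point could satisfy x == y.
    bool cne(const ival& x, const ival& y) { return x.hi < y.lo || y.hi < x.lo; }

    int main() {
        ival x{1.0, 2.0}, y{1.5, 3.0};
        std::cout << plt(x, y) << ' ' << clt(x, y) << ' ' << cne(x, y) << '\n';  // "1 0 0"
    }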

> 
> Do you have some bibliographic references for the principal value model?

Not yet.  That is one of the reasons why it says that it would be understandable if they are not part of the standard.  The reason for mentioning them at all is that it may benefit the team making the proposal to the committee to be aware that there is on-going research and development in this area.  It may be that there should be no operator notation for comparison until the various candidates have been evaluated in the field.  The candidates that have been suggested include set/segment, certainty, and principal value.

> This looks like a center-radius model of intervals, but this
> isn't it, so I'm a bit lost. For now, I will just ask how different
> is the "loose" comparaison from the "possible" comparison.
> Since the "loose" comparison is using "the implicit epsilon
> provided by the radius of the actual intervals", it seems to
> me the results will always be the same.

The results will not be the same.  The authoritative definitions of the principal value comparisons can be found in Knuth, D.E. /The Art of Computer Programming/ (vol. II).  A brief description of its usage and utility can be found in the beginning of: <http://www.boost.org/libs/test/doc/components/test_tools/floating_point_comparison.html>

The difference between loose and possible is that peq() is true whenever one of the intervals overlaps any portion of the other, but leq() is true only when one of the intervals overlaps the center of the other.
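As an illustration under assumed definitions (the names peq/leq follow the usage above; the bare struct and median helper exist only for the sketch), the difference shows up for intervals that overlap only near their edges:

    #include <iostream>

    // Hypothetical minimal interval type; median() stands in for the midpoint.
    struct ival { double lo, hi; };
    double median(const ival& x) { return (x.lo + x.hi) / 2.0; }

    // Possible equality: the intervals share at least one point.
    bool peq(const ival& x, const ival& y) { return x.lo <= y.hi && y.lo <= x.hi; }

    // Loose equality: one interval covers the other's midpoint.
    bool leq(const ival& x, const ival& y) {
        return (x.lo <= median(y) && median(y) <= x.hi)
            || (y.lo <= median(x) && median(x) <= y.hi);
    }

    int main() {
        ival x{0.0, 2.0}, y{1.9, 6.0};              // overlap only near the edges
        std::cout << peq(x, y) << ' ' << leq(x, y) << '\n';   // prints "1 0"
    }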

> Also you are saying this model appeals to a "much wider
> audience"; could you give some examples where it is used?

At present it is not used AFAIK.  But one of the claims is that the PV model makes intervals suitable for direct replacement of built-in floating point types (in the sense of the Liskov substitution principle).   If that claim is valid then the PV model makes intervals appeal to every C++ programmer who uses built-in floating point types.

> Later, you say that "empty interval arguments should always
> yield false". This statement contradicts your line-segment
> model, since the empty set is included in every set.

What is the contradiction?  Equivalence and ordering comparisons do not deal with subsets.  The functions in() and contains() should affirm that an empty interval is included in every other interval.  But the empty interval is neither less than nor greater than any other interval.  And given that the empty interval is not part of the machine number line, one can claim that it is Not an Interval (NaI).  So, just as NaN is not equal to any other floating point value, not even an identical NaN, NaI should not be equal to any other interval, not even an identical NaI.

In fact this conflict illustrates some of the problems caused by the interval-as-set terminology.  It suggests that an empty interval is an empty set, and so seq(empty, empty) should return true.  But the interval-as-segment terminology suggests that an empty interval is not a segment.  (Nor is it a point: its width is not zero, it is NaN.)  So based upon the interval-as-segment terminology, seq(empty, empty) should return false.
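The NaN analogy can be stated in one runnable line; the seq(empty, empty) behaviour mentioned in the comment is only the proposed convention, not an existing API:

    #include <cmath>
    #include <cassert>

    int main() {
        double x = std::nan("");        // a quiet NaN
        assert(!(x == x));              // NaN is not equal even to itself
        // By the same token, a hypothetical seq(empty, empty) would be
        // expected to return false: the empty interval is Not an Interval,
        // just as NaN is Not a Number.
    }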

Is there a case to be made that empty intervals are less than or greater than non-empty intervals?

> It also goes against your neighborhood model, since the
> classical interpretation is: if something is not possible,
> then its contrary is certain (and reciprocally).

The classical interpretation fails any time a NaN is encountered, because NaNs are not part of the number line.  Similarly, empty intervals are not intervals (they need have no sensible bounds nor center/radius).  Moreover, the classical interpretation cannot be applied to intervals at all, because there are more relationships among intervals than there are among points on the machine number line.

> Finally, you are proposing that the interval operators compare their
> midpoint. This kind of goes against the whole point of providing
> interval arithmetic so that people can do certified computations.

The reason the conceptual models are worth discussing is that there is no single model for interval usage.  There can be a single model for the computation of interval values (i.e., outward rounding), but there cannot be a single model for interval applications.  C++ is often mentioned as a multi-paradigm language.  If conceptual models are paradigms then there are definitely multiple interval paradigms.

This is only a problem when one must choose a single paradigm to use in a linguistic feature such as operator notation.  The problem is that one must choose zero or one paradigm.  But selecting a single conceptual model for comparison operators does not "go against" the other models.  They still exist and are exactly as useful as they were prior to the selection.

In fact several accomplished users of intervals have written that they prefer explicit bounds tests rather than comparison functions.  From this one can deduce that they can do certified computations without comparison functions, which certainly implies that they can do without an operator notation for those functions.

The point of operator notation is to make it easier to express simple concepts.  It may be that there is no simple concept related to intervals, in which case there should be no operator notation for comparison of intervals.

But there is an opposing viewpoint.  When adding up a column of figures, say measurements taken during an experiment, one might want to know how accurate the total is.  Adding up intervals constructed from the measurements and their associated errors makes it trivial to find the error for the total of the measurements.  So the utility of intervals for commonplace calculations is extremely high.  One need not be doing a global optimization to prefer intervals over built-in types.
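A minimal sketch of that use case (a bare interval struct with naive addition; a real interval library would round the bounds outward, which this sketch omits):

    #include <iostream>
    #include <vector>

    // Hypothetical minimal interval type; real libraries round outward.
    struct ival { double lo, hi; };
    ival operator+(const ival& a, const ival& b) { return {a.lo + b.lo, a.hi + b.hi}; }

    int main() {
        // Measurements of 10.0, 12.5, 9.8 units, each accurate to +/- 0.1.
        std::vector<ival> readings = {{9.9, 10.1}, {12.4, 12.6}, {9.7, 9.9}};
        ival total{0.0, 0.0};
        for (const ival& r : readings) total = total + r;
        std::cout << "total in [" << total.lo << ", " << total.hi << "]\n";
        // The half-width of the result is the accumulated error of the sum.
    }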

Comparison operators are quite useful in commonplace calculations, and they can be used quite naturally without danger of misleading the programmer.  For example, given the need to sort some numbers, the precise details of the sort mechanism are probably irrelevant.  The only universal concern is that the result of the sort operation be reasonably reproducible (e.g., swapping of identical items is usually well tolerated).  Comparison of principal values meets that need for simple and reasonably reproducible results.  None of the other conceptual models can meet that need.
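For instance, a sort keyed on the principal value (taken here as the midpoint; the struct and comparator are illustrative, not the proposed operator semantics) yields a total order that std::sort can consume directly:

    #include <algorithm>
    #include <vector>

    // Hypothetical minimal interval type; its principal value is the midpoint.
    struct ival { double lo, hi; };
    double pv(const ival& x) { return (x.lo + x.hi) / 2.0; }

    int main() {
        std::vector<ival> v = {{2.0, 4.0}, {0.5, 1.5}, {1.0, 3.0}};
        // Midpoint comparison is a strict weak ordering, so std::sort accepts
        // it; certain/possible comparisons could not be used here directly.
        std::sort(v.begin(), v.end(),
                  [](const ival& a, const ival& b) { return pv(a) < pv(b); });
    }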

> So I am a bit reluctant to consider your proposal.

I quite understand.  If you believe that the inclusion of PV comparisons would inhibit the development of certified computations you should definitely not add them to your standardization proposal.

-- Lee Winter

