PR 25512: pointer overflow defined?


PR 25512: pointer overflow defined?

by Richard Guenther-3


The problem in this PR is that code like in the testcase (from OpenOffice)
assumes that pointer overflow is defined.  As the standard does not talk
about wrapping pointer semantics at all (at least I couldn't find anything
about that), how should we treat this?

Thanks for any advice,
Richard.

Re: PR 25512: pointer overflow defined?

by Andrew Haley

Richard Guenther writes:
 >
 > The problem in this PR is that code like in the testcase (from
 > OpenOffice) assumes that pointer overflow is defined.  As the
 > standard does not talk about wrapping pointer semantics at all (at
 > least I couldn't find anything about that), how should we treat
 > this?

Look at Section 6.5.6, Para 8.  The code is undefined.

Andrew.

Re: PR 25512: pointer overflow defined?

by Richard Guenther-3

On Wed, 21 Dec 2005, Andrew Haley wrote:

> Richard Guenther writes:
>  >
>  > The problem in this PR is that code like in the testcase (from
>  > OpenOffice) assumes that pointer overflow is defined.  As the
>  > standard does not talk about wrapping pointer semantics at all (at
>  > least I couldn't find anything about that), how should we treat
>  > this?
>
> Look at Section 6.5.6, Para 8.  The code is undefined.

This talks about pointers that point to elements of an array object.
It does not talk about doing arithmetic on arbitrary pointer (constants),
which is what the code does.  Or is a pointer always pointing to elements
of some array object (being it the global heap "array object")?

Richard.

Re: PR 25512: pointer overflow defined?

by Andrew Haley

Richard Guenther writes:
 > On Wed, 21 Dec 2005, Andrew Haley wrote:
 >
 > > Richard Guenther writes:
 > >  >
 > >  > The problem in this PR is that code like in the testcase (from
 > >  > OpenOffice) assumes that pointer overflow is defined.  As the
 > >  > standard does not talk about wrapping pointer semantics at all (at
 > >  > least I couldn't find anything about that), how should we treat
 > >  > this?
 > >
 > > Look at Section 6.5.6, Para 8.  The code is undefined.
 >
 > This talks about pointers that point to elements of an array
 > object.  It does not talk about doing arithmetic on arbitrary
 > pointer (constants), which is what the code does.  Or is a pointer
 > always pointing to elements of some array object (being it the
 > global heap "array object")?

Section 6.5.6, Para 8 always holds.  If p1 is a valid pointer, then it
points to a byte within an object or to an element just past the end
of an array.
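
A minimal sketch of the range that 6.5.6 para 8 permits, assuming a hypothetical helper `sum_ints` (not taken from the PR testcase). The pointer only ever takes values from the start of the array up to one past its last element:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: sums buf[0..n-1]. The walking pointer never
   leaves the range [buf, buf + n], which is what 6.5.6p8 allows. */
int sum_ints(const int *buf, size_t n)
{
    assert(buf != NULL);
    const int *end = buf + n;   /* one past the last element: valid to form */
    int total = 0;
    for (const int *p = buf; p != end; ++p)
        total += *p;            /* p points inside the array here */
    return total;
    /* Forming buf - 1 or buf + n + 1 would be undefined, even if the
       result were never dereferenced. */
}
```

Calling `sum_ints(a, 4)` on `int a[4] = {1, 2, 3, 4}` stays entirely inside that range.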

Andrew.

Re: PR 25512: pointer overflow defined?

by Robert Dewar

Richard Guenther wrote:

>The problem in this PR is that code like in the testcase (from OpenOffice)
>assumes that pointer overflow is defined.  As the standard does not talk
>about wrapping pointer semantics at all (at least I couldn't find anything
>about that), how should we treat this?
>  
>
How could pointer arithmetic overflow? The result must be within the
same allocated object (or just past it in the array case, and if necessary
the compiler must be careful not to allocate an array at the very top end
of the address space to avoid any problems -- this is unlikely to happen
in practice -- but was an issue on large model 286 programs). At least
that's my understanding; it would be surprising if things have changed
in this area.
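
As an illustration only (this is not the OpenOffice testcase), here is a sketch of a bounds check that never forms a pointer outside the object, so it does not depend on wrapping. The name `fits` and its parameters are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical check: do `len` bytes starting at `cur` fit before `end`?
   Assumes cur and end point into the same buffer, with cur <= end. */
bool fits(const char *cur, const char *end, size_t len)
{
    assert(cur != NULL && end != NULL && cur <= end);
    /* end - cur is defined because both point into the same array. */
    size_t remaining = (size_t)(end - cur);
    return len <= remaining;
    /* The tempting alternative, `cur + len < cur`, forms an out-of-range
       pointer and relies on wrapping that the standard does not define. */
}
```

The comparison is done on sizes rather than on pointers, so a huge `len` is simply rejected.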

>Thanks for any advice,
>Richard.
>  
>




Re: PR 25512: pointer overflow defined?

by Robert Dewar

Richard Guenther wrote:

>On Wed, 21 Dec 2005, Andrew Haley wrote:
>
>  
>
>>Richard Guenther writes:
>> >
>> > The problem in this PR is that code like in the testcase (from
>> > OpenOffice) assumes that pointer overflow is defined.  As the
>> > standard does not talk about wrapping pointer semantics at all (at
>> > least I couldn't find anything about that), how should we treat
>> > this?
>>
>>Look at Section 6.5.6, Para 8.  The code is undefined.
>>    
>>
>
>This talks about pointers that point to elements of an array object.
>It does not talk about doing arithmetic on arbitrary pointer (constants),
>which is what the code does.
>
Right, but that's the point. "doing arithmetic on arbitrary pointer"
values is not defined, it is not even defined to compare two pointers
pointing to two different objects.

Alex Stepanov noted to me once that he preferred Ada to C, since in Ada
general pointer arithmetic was available and it is not in C (in Ada you can
use the type Integer_Address which works as intended).

Of course in practice general pointer arithmetic works in C, but gcc takes
a very aggressive attitude to undefined code, and when code like this fails
is content to cite chapter and verse of the standard saying the code is
undefined (personally I think gcc is too aggressive in this regard, but
I guess you have the freedom to take that attitude if you are not in the
commercial business of satisfying paying customers :-)

>  Or is a pointer always pointing to elements
>of some array object (being it the global heap "array object")?
>  
>
There is no such thing in C as the "global heap array object", you can
only compare or do arithmetic on pointers that are within a single
array object.

One way to think about the semantic model is to consider pointers
in C to consist of a base/offset pair, where the base points to the
start of the object (some debugging checkout C compilers even
use such a format). Then operations on pointers need ONLY
reference the offset.

Note that this was more than a theory in some C compilers operating
in large (multi-segment) mode on the 286, where indeed pointers
were in base offset (well segment/offset) form, and pointer
arithmetic could just deal with the offset.
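
A rough sketch of that base/offset model, under stated assumptions: `struct fat_ptr`, `fat_add`, and `fat_less` are hypothetical names, and the `size` field is an addition so the sketch can check bounds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical "fat pointer": base identifies the object, offset the
   position inside it. The size field is an assumption added for checking. */
struct fat_ptr {
    char  *base;    /* start of the object */
    size_t size;    /* object size in bytes */
    size_t offset;  /* position within the object, 0..size */
};

/* Arithmetic touches only the offset. Returns false, leaving *p unchanged,
   if the result would fall outside [0, size]. */
bool fat_add(struct fat_ptr *p, ptrdiff_t delta)
{
    assert(p != NULL && p->offset <= p->size);
    if (delta >= 0) {
        if ((size_t)delta > p->size - p->offset)
            return false;
        p->offset += (size_t)delta;
    } else {
        /* -(delta + 1) + 1 avoids overflow when delta == PTRDIFF_MIN */
        size_t back = (size_t)(-(delta + 1)) + 1;
        if (back > p->offset)
            return false;
        p->offset -= back;
    }
    return true;
}

/* Ordering is meaningful only when both pointers share a base;
   *ok reports whether the comparison was meaningful. */
bool fat_less(struct fat_ptr a, struct fat_ptr b, bool *ok)
{
    *ok = (a.base == b.base);
    return *ok && a.offset < b.offset;
}
```

In this model there is nothing for arithmetic to overflow into: an out-of-range result is rejected rather than wrapped.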

>Richard.
>  
>




Re: PR 25512: pointer overflow defined?

by Gabriel Dos Reis

Robert Dewar <dewar@...> writes:

| Richard Guenther wrote:
|
| >On Wed, 21 Dec 2005, Andrew Haley wrote:
| >
| >
| >>Richard Guenther writes:
| >> >
| >> > The problem in this PR is that code like in the testcase (from
| >> > OpenOffice) assumes that pointer overflow is defined.  As the
| >> > standard does not talk about wrapping pointer semantics at all (at
| >> > least I couldn't find anything about that), how should we treat
| >> > this?
| >>
| >>Look at Section 6.5.6, Para 8.  The code is undefined.
| >>
| >
| >This talks about pointers that point to elements of an array object.
| >It does not talk about doing arithmetic on arbitrary pointer (constants),
| >which is what the code does.
| >
| Right, but that's the point. "doing arithmetic on arbitrary pointer"
| values is
| not defined,

I think that needs qualification, given the semantics of

  pointer -> integer type
  integer type -> pointer

conversions.

|it is not even defined to compare two pointers pointing to two
| different objects.

you can (equality) compare a pointer to NULL -- which does not even
happen to designate an object.

[...]

| One way to think about the semantic model is to consider pointers
| in C to consist of a base/offset pair, where the base points to the
| start of the object (some debugging checkout C compilers even
| use such a format). Then operations on pointers need ONLY
| reference the offset.

that model is too simplistic -- hint: null pointers.

-- Gaby

Re: PR 25512: pointer overflow defined?

by Gabriel Dos Reis

Robert Dewar <dewar@...> writes:

| Richard Guenther wrote:
|
| >The problem in this PR is that code like in the testcase (from OpenOffice)
| >assumes that pointer overflow is defined.  As the standard does not talk
| >about wrapping pointer semantics at all (at least I couldn't find anything
| >about that), how should we treat this?
| >
| How could pointer arithmetic overflow, the result must be within the
| same allocated object (or just past it in the array case, and if necessary

It highly depends on what you define to be pointer arithmetic.

Given the conversions

   pointer -> integer type
   integer type -> pointer
   T* -> U*
 
I think your sentence is way too restrictive and does not capture C
models.

Richard, to resolve this issue, we need to be more precise about our
mappings for

   pointer -> integer type
   integer type -> pointer
   T* -> U*

conversions.  This is not an issue to be resolved in isolation, piecemeal.

-- Gaby

Re: PR 25512: pointer overflow defined?

by Robert Dewar

Gabriel Dos Reis wrote:

>you can (equality) compare a pointer to NULL -- which does not even
>happen to designate an object.
>  
>
Well accurately, you can compare pointers, it's not illegal, but the
result of comparing pointers to separately allocated objects is undefined.

>[...]
>
>| One way to think about the semantic model is to consider pointers
>| in C to consist of a base/offset pair, where the base points to the
>| start of the object (some debugging checkout C compilers even
>| use such a format). Then operations on pointers need ONLY
>| reference the offset.
>
>that model is too simplistic -- hint: null pointers.
>  
>
null is a special case indeed, which can be represented using a
distinguished offset.

>-- Gaby
>  
>




Re: PR 25512: pointer overflow defined?

by Robert Dewar

Gabriel Dos Reis wrote:

>It highly depends on what you define to be pointer arithmetic.
>
>Given the conversions
>
>   pointer -> integer type
>   integer type -> pointer
>   T* -> U*
>  
>
Yes, but the only defined semantics of these conversions is that you get
the same pointer back, you cannot say anything else about the values. If
you add one to the integer, you have no assurance that it has anything to
do with adding one to the pointer. Yes, it probably will, but there is no
guarantee in the standard.

>I think your sentence is way too restrictive and does not capture C
>models.
>  
>
Well part of the trouble is that there are really two models here, the
one in the standard, and the one that everyone expects and which is by
and large the one that is implemented.

>Richard, to resolve this issue, we need to be more precise about our
>mappings for
>
>   pointer -> integer type
>   integer type -> pointer
>   T* -> U*
>
>conversions.  This is not an issue to be resolved in isolation, piecemeal.
>  
>
There is no obligation standard-wise to say anything at all about these
conversions, other than they are value preserving, i.e. if you convert a
pointer to an integer that is large enough, and back to a pointer, you get
the pointer back. As far as I know nothing more can be said.
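
A small sketch of that round-trip property, assuming the implementation provides the optional C99 type `uintptr_t`. The function name `via_integer` is hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: converts a pointer to an integer and back.
   Assumes uintptr_t exists (it is optional in C99). */
void *via_integer(void *p)
{
    uintptr_t bits = (uintptr_t)p;  /* the value itself is implementation-defined */
    void *back = (void *)bits;
    assert(back == p);              /* the round trip is the one guarantee */
    return back;
    /* Nothing in the standard says (void *)(bits + 1) has any relation
       to (char *)p + 1, even though it usually does in practice. */
}
```

The assertion documents the only thing the sketch relies on. It says nothing about what the integer value looks like.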

>-- Gaby
>  
>




Re: PR 25512: pointer overflow defined?

by Gabriel Dos Reis

Robert Dewar <dewar@...> writes:

| Gabriel Dos Reis wrote:
|
| >you can (equality) compare a pointer to NULL -- which does not even
| >happen to designate an object.
| >
| Well accurately, you can compare pointers, its not illegal, but the
| result of
| comparing pointers to separately allocated objects is undefined.

Well, that is not what the C standard says. E.g., the exact wording is
more convoluted and given the wording for what is "object" in the C
standard

       3.14
       [#1] object
       region of data storage in  the  execution  environment,  the
       contents of which can represent values

with no more qualifications (this definition is slightly different in
some nearby languages), it is slippery to found optimizations on
"pointer overflows."

And even more so, the existing practice in widely used programs is to
compare pointers to things like (void *)-1.  And this even in GCC's own
source code.
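
To illustrate the practice being described, here is a sketch of a sentinel in the same style as the `(void *)-1` failure value that POSIX `sbrk` returns. The names `ALLOC_FAILED`, `pool_alloc`, and `alloc_ok` are hypothetical and not from GCC's sources:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sentinel: an integer converted to a pointer. The result of
   the conversion is implementation-defined and designates no object. */
#define ALLOC_FAILED ((void *)-1)

static char   pool[64];
static size_t used;

/* Hypothetical bump allocator that reports failure with the sentinel. */
void *pool_alloc(size_t n)
{
    assert(used <= sizeof pool);
    if (n > sizeof pool - used)
        return ALLOC_FAILED;
    void *p = pool + used;
    used += n;
    return p;
}

/* Only equality is used here; no ordering and no arithmetic on the sentinel. */
bool alloc_ok(void *p)
{
    return p != ALLOC_FAILED;
}
```

The sketch only ever tests the sentinel for equality, which is the usage the message refers to.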

-- Gaby

Re: PR 25512: pointer overflow defined?

by Robert Dewar

Gabriel Dos Reis wrote:

>with no more qualifications (this definition is slightly different in
>some nearby languages), it is slippery to found optimizations on
>"pointer overflows."
>  
>
Well I think unfortunately the standard does allow such "optimizations",
but that does not mean it is a good thing to take advantage of this.

>And even more so, the existing practice in widely used programs is to
>compare pointers to things like (void *)-1.  And this even in GCC's own
>source code.
>  
>
Indeed! And as I said in an earlier message, I think gcc is far too ready
to hide bad behavior behind the justification of wording in the standard.
When things are undefined, there is every argument for doing what people
expect and what will cause least surprise, unless a *VERY* strong argument
is made that an optimization is really important. Since almost no
individual optimizations are in themselves really important, this is a
very hard burden to meet.

We had a similar case in GNAT recently:

   X : Xtype;

   ...

  if X in Xtype then

The Ada standard clearly justifies removing this test, but equally clearly
the user is doing this for a reason, and the likely reason is almost
certainly that a test is wanted for the representation of X being in range
of its type (e.g. after an unchecked conversion).

GCC 3.4 optimization did indeed remove this test, due to over-enthusiastic
treatment of range information, and many users complained. So we changed
GNAT to treat this case as:

   if X'Valid then

and issue a warning

(I still don't think that's good enough, it catches only some cases, and
for example,

   if X > Xtype'Last

will still malfunction. I would like to just eliminate this range
optimization for Ada, I just don't believe it is worth the aggravation
of expectations).

My favorite example of hiding behind the standard is that Fortran 66
carefully allowed either stack or static allocation of local variables.
All Fortran compilers did static allocation, and most (almost all) Fortran
programs relied on this. Burroughs 5500 Fortran translated into Algol-60
(the only assembly language for the machine, there was no separate
assembler), and used stack allocation. Frightfully standard, but
completely useless, and Burroughs lost sales because of the failure of its
Fortran compiler to compile and run standard Fortran codes.

So please don't take my comments as supporting dubious optimizations
of pointer arithmetic.

For me the only practical acceptable implementation of pointers in C
is to use linear addresses as integers, and implement wrap around
arithmetic on these values.
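
A sketch of what wrap-around arithmetic on linear addresses looks like when it is done on integers rather than on pointers. It assumes `uintptr_t` is available, and the names `addr_add` and `addr_add_wraps` are hypothetical. Unsigned integer arithmetic is defined to wrap, so this part is portable, unlike the same computation on pointer types:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: adds a byte offset to an address held as an integer.
   Unsigned arithmetic wraps modulo UINTPTR_MAX + 1 by definition. */
uintptr_t addr_add(uintptr_t addr, uintptr_t delta)
{
    return (uintptr_t)(addr + delta);
}

/* Reports whether addr + delta wrapped past the top of the address range. */
bool addr_add_wraps(uintptr_t addr, uintptr_t delta)
{
    return (uintptr_t)(addr + delta) < addr;
}
```

Whether converting the wrapped integer back to a pointer gives anything usable is still up to the implementation.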




Re: PR 25512: pointer overflow defined?

by chris jefferson

Robert Dewar wrote:

> Richard Guenther wrote:
>
>> On Wed, 21 Dec 2005, Andrew Haley wrote:
>>
>>  
>>
>>> Richard Guenther writes:
>>> >
>>> > The problem in this PR is that code like in the testcase (from
>>> > OpenOffice) assumes that pointer overflow is defined.  As the
>>> > standard does not talk about wrapping pointer semantics at all (at
>>> > least I couldn't find anything about that), how should we treat
>>> > this?
>>>
>>> Look at Section 6.5.6, Para 8.  The code is undefined.
>>>  
>>
>> This talks about pointers that point to elements of an array object.
>> It does not talk about doing arithmetic on arbitrary pointer
>> (constants),
>> which is what the code does.
>>
> Right, but that's the point. "doing arithmetic on arbitrary pointer"
> values is
> not defined, it is not even defined to compare two pointers pointing
> to two
> different objects.
>
While that is true according to the standard, I believe that on most
systems you can compare any two pointers. In particular, the C++
standard does require a total ordering on pointers, and at the moment
that is implemented for all systems by just doing "a < b" on the two
pointers.

Chris

Re: PR 25512: pointer overflow defined?

by Paolo Carlini

chris jefferson wrote:

>>Right, but that's the point. "doing arithmetic on arbitrary pointer"
>>values is
>>not defined, it is not even defined to compare two pointers pointing
>>to two
>>different objects.
>>
>While that is true according to the standard, I believe that on most
>systems you can compare any two pointers. In particular, the C++
>standard does require a total ordering on pointers, and at the moment
>that is implemented for all systems by just doing "a < b" on the two
>pointers.
>  
>
Humpf! Can people please cite exact paragraphs of the relevant
Standards? Otherwise, I think we are just adding to the confusion. For
example, in my reading of C99 6.5.9 and C++03 5.10 pointers *can* be
compared for equality and discussing separately and correctly relational
operators and equality operators is not a language-lawyer-ism, is *very*
important for its real world implications. But this is only an example...

Paolo.

Re: PR 25512: pointer overflow defined?

by Gabriel Dos Reis

Robert Dewar <dewar@...> writes:

| Gabriel Dos Reis wrote:
|
| >It highly depends on what you define to be pointer arithmetic.
| >
| >Given the conversions
| >
| >   pointer -> integer type
| >   integer type -> pointer
| >   T* -> U*
| >
| Yes, but the only defined semantics of these conversions is that you
| get the same
| pointer back, you cannot say anything else about the values. If you

You get the same pointer back, *when* you've done a round trip.
For

    pointer type -> integer type

or

    integer type -> pointer type

or

    T* -> U*

there is no "round trip" requirement for the values to be ligt.
However, we have an obligation to define what those mappings are.

| add one to
| the integer, you have no assurance that it has anything to do with
| adding one
| to the pointer.

No dispute on that.

| Yes, it probably will, but there is no guarantee in
| the standard.
|
| >I think your sentence is way too restrictive and does not capture C
| >models.
| >
| Well part of the trouble is that there are really two models here, the
| one in the
| standard, and the one that everyone expects and which is by and large
| the one
| that is implemented.

yes, and I should have been more precise: even the standard model is
not captured.

|
| >Richard, to resolve this issue, we need to be more precise about our
| >mappings for
| >
| >   pointer -> integer type
| >   integer type -> pointer
| >   T* -> U*
| >
| >conversions.  This is not an issue to be resolved in isolation, piecemeal.
| >
| There is no obligation standard-wise to say anything at all about
| these conversions,

There is:


       [#5] An integer  may  be  converted  to  any  pointer  type.
       Except    as    previously    specified,   the   result   is
       implementation-defined,  might  not  be  correctly  aligned,
       might  not  point  to  an entity of the referenced type, and
       might be a trap representation.56)

       [#6]  Any  pointer type may be converted to an integer type.
       Except   as   previously   specified,    the    result    is
       implementation-defined.  If the result cannot be represented
       in the integer type, the behavior is undefined.  The  result
       need not be in the range of values of any integer type.

and
       3.4.1
       [#1] implementation-defined behavior
       unspecified behavior where each implementation documents how
       the choice is made

| other than they are value preserving, i.e. if you convert a pointer to
| an integer
| that is large enough, and back to a pointer, you get the pointer
| back. As far as I
| know nothing more can be said.

What we need is to document what mapping is used to compute the result of
the conversion. Till now, people have assumed that GCC would use the
"obvious" model and have written code based on that.  On the other hand,
GCC has been getting more "aggressive" (I don't quite like the word)
transformations and things start breaking.  That asks for more precise
documentation.

As for the comparison, lots of C programs do things like comparing for
equality to (void *)-1; it is a question as whether you'll declare
such programs as having undefined behaviour (if yes, I don't see how)
or have implementation-defined semantics (if yes, then we need to say
what can be done to such pointers).

-- Gaby

Re: PR 25512: pointer overflow defined?

by Gabriel Dos Reis

Robert Dewar <dewar@...> writes:

[...]

| So please don't take my comments as supporting dubious optimizations
| of pointer arithmetic.
|
| For me the only practical acceptable implementation of pointers in C
| is to use linear addresses as integers, and implement wrap around
| arithmetic on these values.

This is more reason for us (in addition of the standard requirement)
to document what the mapping for T* -> integer is.

-- Gaby

Re: PR 25512: pointer overflow defined?

by Robert Dewar

Paolo Carlini wrote:

>chris jefferson wrote:
>
>  
>
>>>Right, but that's the point. "doing arithmetic on arbitrary pointer"
>>>values is
>>>not defined, it is not even defined to compare two pointers pointing
>>>to two
>>>different objects.
>>>
>>>      
>>>
>>While that is true according to the standard, I believe that on most
>>systems you can compare any two pointers. In particular, the C++
>>standard does require a total ordering on pointers, and at the moment
>>that is implemented for all systems by just doing "a < b" on the two
>>pointers.
>>
>>
>>    
>>
>Humpf! Can people please cite exact paragraphs of the relevant
>Standards? Otherwise, I think we are just adding to the confusion. For
>example, in my reading of C99 6.5.9 and C++03 5.10 pointers *can* be
>compared for equality and discussing separately and correctly relational
>operators and equality operators is not a language-lawyer-ism, is *very*
>important for its real world implications. But this is only an example...
>
>Paolo.
>  
>
Surely pointers can be compared for equality (it is fine to see if a
pointer is pointing to something). The discussion about pointer comparison
across objects is wrt expecting any kind of ordering relationship.
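
A sketch of the distinction, with hypothetical names `same_address` and `addr_less`, and assuming `uintptr_t` is available. Equality works for any two valid pointers. For an ordering across unrelated objects, one option is to compare the integer images, which is implementation-defined rather than undefined:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Equality (C99 6.5.9) is defined for any two valid pointers, including
   pointers to separately allocated objects. */
bool same_address(const void *a, const void *b)
{
    return a == b;
}

/* Hypothetical ordering helper: compares the integer images instead of
   using `a < b` directly. The order is whatever the implementation's
   pointer-to-integer mapping yields; it need not mean anything portable. */
bool addr_less(const void *a, const void *b)
{
    return (uintptr_t)a < (uintptr_t)b;
}
```

Writing `a < b` on the pointers themselves is only defined by the standard when both point into the same object (6.5.8).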



Re: PR 25512: pointer overflow defined?

by Robert Dewar


>| Yes, but the only defined semantics of these conversions is that you
>| get the same
>| pointer back, you cannot say anything else about the values. If you
>
>You get the same pointer back, *when* you've done a round trip.
>For
>
>    pointer type -> integer type
>
>or
>
>    integer type -> pointer type
>
>or
>
>    T* -> U*
>  
>
Yes, that's right, that's what I meant, I just was using shorthand, since
I assume this requirement is familiar to everyone (it is explicitly
discussed in the original K&R for instance).

>there is no "round trip" requirement for the values to be ligt.
>  
>
Sorry, the typo at the end has me failing to guess what you meant :-(

>However, we have an obligation to define what those mappings are.
>  
>
Why?

>
>  
>




Re: PR 25512: pointer overflow defined?

by Gabriel Dos Reis

Paolo Carlini <pcarlini@...> writes:

| chris jefferson wrote:
|
| >>Right, but that's the point. "doing arithmetic on arbitrary pointer"
| >>values is
| >>not defined, it is not even defined to compare two pointers pointing
| >>to two
| >>different objects.
| >>
| >While that is true according to the standard, I believe that on most
| >systems you can compare any two pointers. In particular, the C++
| >standard does require a total ordering on pointers, and at the moment
| >that is implemented for all systems by just doing "a < b" on the two
| >pointers.
| >  
| >
| Humpf! Can people please cite exact paragraphs of the relevant
| Standards? Otherwise, I think we are just adding to the confusion. For
| example, in my reading of C99 6.5.9 and C++03 5.10 pointers *can* be
| compared for equality and discussing separately and correctly relational
| operators and equality operators is not a language-lawyer-ism, is *very*
| important for its real world implications. But this is only an example...

I don't understand your query.
I understood Chris' comment as having to do with the implementation of
std::less<T*> (and friends) as required by C++.  Our implementation is just
a forwarding function to operator< (and friends) on the assumption
that the compiler uses the "obvious" model.

-- Gaby

Re: PR 25512: pointer overflow defined?

by Gabriel Dos Reis

Robert Dewar <dewar@...> writes:

| >However, we have an obligation to define what those mappings are.
| >
| Why?

Because it is an implementation-defined behaviour and we have to
document how the choice is made.

-- Gaby