Crash during cheney-copy on Windows

by Nicolas Bertolotti-2

Hello,

 

My product crashes during the Cheney-copy operation on Windows. The crash is very hard to reproduce because it is very volatile: slight changes in the SML code make it disappear, and it also depends on the amount of memory, among other things.

 

I was finally able to enable some debug messages and assertions (not the full debugging mode, because enabling that makes the issue disappear):

[GC: Starting gc #73; requesting 512 nursery bytes and 0 old-gen bytes,]
[GC:    heap at 0x31880000 of size 710,967,296 bytes,]
[GC:    with nursery of size 617,405,820 bytes (86.8% of heap),]
[GC:    and old-gen of size 93,561,476 bytes (13.2% of heap),]
…
[GC: Starting major Cheney-copy;]
[GC:    from heap at 0x31880000 of size 710,967,296 bytes,]
[GC:    to heap at 0x08fc0000 of size 710,967,296 bytes.]
…
[GC: Finished major Cheney-copy; copied 97,279,788 bytes.]
…
[GC: Starting gc #77; requesting 512 nursery bytes and 0 old-gen bytes,]
[GC:    heap at 0x08fc0000 of size 710,967,296 bytes,]
[GC:    with nursery of size 612,833,480 bytes (86.2% of heap),]
[GC:    and old-gen of size 98,133,816 bytes (13.8% of heap),]
…
[GC: Starting major Cheney-copy;]
[GC:    from heap at 0x08fc0000 of size 710,967,296 bytes,]
[GC:    to heap at 0x31880000 of size 710,967,296 bytes.]
…
foreachObjptrInObject (0x318c2318)  header = 000004c7  tag = ARRAY  bytesNonObjptrs = 0  numObjptrs = 1
forwardObjptr  opp = 0x318c2318  op = 0x00000000091b7714  p = 0x091b7714
forwardObjptr --> *opp = 0x319f57dc
…
forwardObjptr  opp = 0x318d1354  op = 0x00000000091d78c4  p = 0x091d78c4
forwardObjptr --> *opp = 0x31a173ac
forwardObjptr  opp = 0x318d1358  op = 0x00000000318da094  p = 0x318da094
Assertion failed at line 58 of file gc/object.c

===> corresponds to the “assert (1 == (header & GC_VALID_HEADER_MASK));” at the beginning of the splitHeader() function
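
For readers unfamiliar with the check, here is a minimal sketch of what the quoted assertion tests. The mask value 0x1 is an assumption made for illustration (the message only names GC_VALID_HEADER_MASK), and header_looks_valid is a hypothetical helper, not a runtime function.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the mask selects the lowest bit of the header word. */
#define VALID_HEADER_MASK ((uint32_t)0x1)

/* Returns 1 when the header word passes the check quoted above, else 0. */
static int header_looks_valid(uint32_t header) {
  return 1 == (header & VALID_HEADER_MASK);
}
```

Under that assumption, the header 000004c7 printed in the log passes the check, while a zeroed word does not.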

 

These debug messages show that every call to forwardObjptr() is made on an object whose address lies, as expected, in the “from” heap, except the last one: the call that leads to the crash receives an invalid pointer whose address lies in the “to” heap.
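
The observation above amounts to a range check on each pointer before it is forwarded. A minimal sketch, assuming a heap can be described by a start address and a size in bytes; struct heap_range, in_heap and ok_to_forward are hypothetical names, not part of the MLton runtime:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical heap descriptor: start address and size in bytes. */
struct heap_range {
  uintptr_t start;
  size_t size;
};

/* True when p lies in the half-open range [start, start + size). */
static bool in_heap(const struct heap_range *h, uintptr_t p) {
  return p >= h->start && p - h->start < h->size;
}

/* A pointer about to be forwarded should still point into from-space. */
static bool ok_to_forward(const struct heap_range *from, uintptr_t p) {
  return in_heap(from, p);
}
```

Such a check can only tell the two spaces apart when their address ranges are disjoint.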

 

We can also see that both heaps were allocated during an earlier call to the garbage collector, and that a Cheney copy had already been performed between them.

I suspect that an earlier GC operation left some stale pointers in one of the heaps and that they were not properly cleared, for instance during an object allocation.

In any case, I do not know how to identify the root cause of the issue. Any hints?

 

Best regards

 

Nicolas Bertolotti

_______________________________________________
MLton mailing list
MLton@...
http://mlton.org/mailman/listinfo/mlton

Re: Crash during cheney-copy on Windows

by Daniel Spoonhower-3

Hi, Nicolas.  Here are a couple of ideas.

It's not clear to me exactly what debugging you are able to enable and
still observe the problem.  I believe the most important check would be
"invariantForGC" which is run at the beginning and end of each
collection.  Are you able to run this function and still observe the
problem?  (It is in the debug version of the runtime and is also enabled
by -DASSERT=1.)

If you are suspicious of old data in the heap, you could try explicitly
clearing the new heap at the beginning of a Cheney copy (i.e. in
majorCheneyCopyGC).
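
As an illustration of that second idea, here is a minimal sketch of zeroing the destination space before copying. struct heap_model and clear_to_space are hypothetical stand-ins; where exactly such a call would go inside majorCheneyCopyGC is not shown and would need to be checked against the runtime source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a heap descriptor. */
struct heap_model {
  unsigned char *start;
  size_t size;
};

/* Zero the whole to-space so that no stale word survives from an earlier GC. */
static void clear_to_space(struct heap_model *to) {
  memset(to->start, 0, to->size);
}
```

Zeroing a heap of several hundred megabytes on every major collection is slow, so this is only meant as a debugging aid.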


--djs


RE: Crash during cheney-copy on Windows

by Nicolas Bertolotti-2

> It's not clear to me exactly what debugging you are able to enable and
> still observe the problem.  I believe the most important check would be
> "invariantForGC" which is run at the beginning and end of each
> collection.  Are you able to run this function and still observe the
> problem?  (It is in the debug version of the runtime and is also enabled
> by -DASSERT=1.)

Assertions are enabled, so invariantForGC() is called, and it does not reveal anything.

>
> If you are suspicious of old data in the heap, you could try explicitly
> clearing the new heap at the beginning of a Cheney copy (i.e. in
> majorCheneyCopyGC).

I examined the array allocation routine, and it does reset the contents of every cell.

It is not a boundary case either (such as a '<' instead of a '<=' somewhere), because the crash appears to occur around cell 15000 of a 32768-cell array.

Still investigating ...


RE: Crash during cheney-copy on Windows

by Nicolas Bertolotti-2

I am also seeing a sporadic signal 11 (segmentation fault) on Linux, which could be caused by the same bug.

After enabling assertions, I found a sporadic assertion failure in updateCrossMap():

void updateCrossMap (GC_state s) {
  GC_cardMapIndex cardIndex;
  pointer cardStart, cardEnd;
  ...
  cardEnd = cardStart + CARD_SIZE;
loopObjects:
  assert (objectStart < oldGenEnd);     /* <= this assertion may sporadically fail */
  assert ((objectStart == s->heap.start or cardStart < objectStart)
          and objectStart <= cardEnd);
  ...

 

The assertion fails during a minor Cheney copy that occurs after two calls to GC_pack(); the heap size did not change during the second GC_pack().

 

I eventually found that SVN revision r6776 introduced a change motivated by the need to clear the cross map after every major GC. Looking at the code, however, the cross map is only cleared when the ‘mayResize’ flag is set, and GC_pack() does not set this flag:

void majorGC (GC_state s, size_t bytesRequested, bool mayResize) {
  ...
  if (mayResize) {
    resizeHeap (s, s->lastMajorStatistics.bytesLive + bytesRequested);
    setCardMapAndCrossMap (s);
  }
  ...
}

Since r6776 also removed some calls to clearCrossMap() that used to be made unconditionally at the end of a major Cheney-copy or major mark-compact, it seems to me that setCardMapAndCrossMap(s) should always be called (or perhaps adding an else { clearCrossMap(s); } would be enough).

I moved the call to setCardMapAndCrossMap(s) after the if, and that seems to solve the issue (although, since the failure was sporadic, I cannot be sure).
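
To make the difference concrete, here is a toy model of the two control flows. It is a sketch under simplifying assumptions, not the runtime code: struct gc_model and all function names are hypothetical, the only state tracked is whether the cross map is stale, and every major collection is treated as invalidating it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: the only state tracked is whether the cross map is stale. */
struct gc_model {
  bool cross_map_stale;
};

/* Stand-in for setCardMapAndCrossMap: leaves the maps in a clean state. */
static void set_card_map_and_cross_map(struct gc_model *s) {
  s->cross_map_stale = false;
}

/* Shape of the excerpt: the maps are reset only when resizing is allowed. */
static void major_gc_conditional(struct gc_model *s, bool may_resize) {
  s->cross_map_stale = true; /* objects moved; old entries no longer apply */
  if (may_resize) {
    /* heap resizing would happen here */
    set_card_map_and_cross_map(s);
  }
}

/* Shape of the proposed change: the reset is moved after the if. */
static void major_gc_unconditional(struct gc_model *s, bool may_resize) {
  s->cross_map_stale = true;
  if (may_resize) {
    /* heap resizing would happen here */
  }
  set_card_map_and_cross_map(s);
}
```

In this model, a collection with may_resize set to false (as for GC_pack()) leaves the cross map stale in the first variant and clean in the second.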

 

What do you think?

 

Nicolas

 

From: mlton-bounces@... [mailto:mlton-bounces@...] On Behalf Of Nicolas Bertolotti
Sent: Thursday, February 12, 2009 12:02 PM
To: mlton@...
Subject: [MLton] Crash during cheney-copy on Windows

 



Re: RE: Crash during cheney-copy on Windows

by Matthew Fluet-5

On Tue, 1 Sep 2009, Nicolas Bertolotti wrote:
> I am facing a sporadic signal 11 (segmentation fault) on Linux which
> could be caused by the same bug.

I'm not sure this bug would cause the previous failure.  An invalid
card/cross map would cause problems for a minor gc, but a major gc ignores
the card/cross map.  The previous failure occurred during a major gc.  Of
course, an invalid card/cross map might cause the minor gc to leave the
heap in a bad state that triggers the failure in the major gc, but I would
expect the failure to come during the minor gc.

> I finally identified that the SVN revision r6776 introduced a change
> that was motivated by the fact we need to clear the cross map after
> every major GC. But, if we look at the code, we can see that the cross
> map is only cleared when the 'mayResize' flag is set (and, as a matter
> of fact, this flag is not set by GC_pack()) :
> void majorGC (GC_state s, size_t bytesRequested, bool mayResize) {
>   ...
>   if (mayResize) {
>     resizeHeap (s, s->lastMajorStatistics.bytesLive + bytesRequested);
>     setCardMapAndCrossMap (s);
>   }
>   ...
> }
> As the revision r6776 also introduces the removal of some calls to
> clearCrossMap() which were performed systematically at the end of a
> major cheney-copy or major mark-compact, it seems to me that the call to
> setCardMapAndCrossMap(s) should actually always be performed (or maybe
> adding a else { clearCrossMap(s); } would be enough).

Your analysis looks completely right.  I will note that a major Cheney
Copy GC does clear the card/cross map (via swapHeapsForCheneyCopy), so it
is only a GC_pack that induces a Mark Compact GC that would leave
the card/cross maps in an invalid state.  Performing GC_packs in
close succession is likely to induce a Mark Compact GC (since the
packed heap isn't big enough for a Cheney Copy).

Further investigation might find a way to avoid unnecessarily setting and
clearing the maps, but your solution to unconditionally
setCardMapAndCrossMap at the end of a majorGC is certainly correct and
expedient, so I've committed it (SVN r7228).  It simply performs
setCardMapAndCrossMap twice in the case of a major Cheney Copy; it might
be possible to eliminate the setCardMapAndCrossMap call from
swapHeapsForCheneyCopy, if there are no assertions that the current heap
and the current card/cross maps agree between the swapHeapsForCheneyCopy
and the unconditional setCardMapAndCrossMap at the end of majorGC; there
don't appear to be any.  SVN r6776 was attempting to systematically
setCardMapAndCrossMap immediately after any change to the heap pointer or the
heap size, but that can probably be relaxed with respect to the
swapHeapsForCheneyCopy.
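
The "performed twice" remark can be illustrated by counting resets in a toy model. This is a sketch only: the functions below are hypothetical stand-ins named after the runtime functions mentioned above, and the mark-compact path is assumed to perform no reset of its own.

```c
#include <assert.h>
#include <stdbool.h>

/* Counts calls to the stand-in for setCardMapAndCrossMap. */
static int map_resets = 0;

static void set_card_map_and_cross_map(void) { map_resets++; }

/* Stand-in for swapHeapsForCheneyCopy, which resets the maps itself. */
static void swap_heaps_for_cheney_copy(void) { set_card_map_and_cross_map(); }

/* Model of a major GC after the change: one unconditional reset at the end.
   Returns how many times the maps were reset during this collection. */
static int major_gc_resets(bool cheney_copy) {
  map_resets = 0;
  if (cheney_copy) {
    swap_heaps_for_cheney_copy();
  }
  /* the mark-compact path performs no reset of its own in this model */
  set_card_map_and_cross_map();
  return map_resets;
}
```

In this model a Cheney copy resets the maps twice and a mark-compact collection once, which is the redundancy that could be removed by dropping the call from swapHeapsForCheneyCopy.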
