pv 2.6.31 (kernel.org) and save/migrate


pv 2.6.31 (kernel.org) and save/migrate

by Dan Magenheimer-2

Sorry for another possibly stupid question:

I've observed that for a pv domain that's been updated
to a 2.6.31 kernel (straight from kernel.org), "xm save"
never completes.  When the older kernel (2.6.18)
is booted, "xm save" works fine.  Is this a known problem...
or perhaps xm save has never worked with an upstream pv
kernel and I've never noticed?

I'd assume migrate and live migrate would fail also but
haven't tried them.

Thanks,
Dan

P.S. This is with very recent xen-unstable, c/s 20399.

_______________________________________________
Xen-devel mailing list
Xen-devel@...
http://lists.xensource.com/xen-devel

Re: pv 2.6.31 (kernel.org) and save/migrate

by Pasi Kärkkäinen

On Fri, Nov 06, 2009 at 10:37:49AM -0800, Dan Magenheimer wrote:

> Sorry for another possibly stupid question:
>
> I've observed that for a pv domain that's been updated
> to a 2.6.31 kernel (straight from kernel.org), "xm save"
> never completes.  When the older kernel (2.6.18)
> is booted, "xm save" works fine.  Is this a known problem...
> or perhaps xm save has never worked with an upstream pv
> kernel and I've never noticed?
>
> I'd assume migrate and live migrate would fail also but
> haven't tried them.
>

Just checking: are you running the latest 2.6.31.5? I think there have
been multiple Xen-related bugfixes in the 2.6.31.x releases.

-- Pasi



RE: pv 2.6.31 (kernel.org) and save/migrate

by Dan Magenheimer-2

> On Fri, Nov 06, 2009 at 10:37:49AM -0800, Dan Magenheimer wrote:
> > Sorry for another possibly stupid question:
> >
> > I've observed that for a pv domain that's been updated
> > to a 2.6.31 kernel (straight from kernel.org), "xm save"
> > never completes.  When the older kernel (2.6.18)
> > is booted, "xm save" works fine.  Is this a known problem...
> > or perhaps xm save has never worked with an upstream pv
> > kernel and I've never noticed?
> >
> > I'd assume migrate and live migrate would fail also but
> > haven't tried them.
> >
>
> Just checking.. are you running the latest 2.6.31.5 ? I think
> there has
> been multiple xen related bugfixes in the 2.6.31.X releases.
>
> -- Pasi

No, it was plain 2.6.31.  But I downloaded and built 2.6.31.5 and
can't even get it to boot (there is no console or VNC output at
all).  Are CONFIG changes required between 2.6.31 and 2.6.31.5
for Xen?  (I checked, and I am using the same .config.)

Trying to reproduce on a different machine, just to verify.

Dan


Re: pv 2.6.31 (kernel.org) and save/migrate

by Pasi Kärkkäinen

On Fri, Nov 06, 2009 at 02:27:27PM -0800, Dan Magenheimer wrote:

> > On Fri, Nov 06, 2009 at 10:37:49AM -0800, Dan Magenheimer wrote:
> > > Sorry for another possibly stupid question:
> > >
> > > I've observed that for a pv domain that's been updated
> > > to a 2.6.31 kernel (straight from kernel.org), "xm save"
> > > never completes.  When the older kernel (2.6.18)
> > > is booted, "xm save" works fine.  Is this a known problem...
> > > or perhaps xm save has never worked with an upstream pv
> > > kernel and I've never noticed?
> > >
> > > I'd assume migrate and live migrate would fail also but
> > > haven't tried them.
> > >
> >
> > Just checking.. are you running the latest 2.6.31.5 ? I think
> > there has
> > been multiple xen related bugfixes in the 2.6.31.X releases.
> >
> > -- Pasi
>
> No it was plain 2.6.31.  But I downloaded/built 2.6.31.5 and
> can't even get it to boot (and no console or VNC output at
> all).  Are CONFIG changes required between 2.6.31 and 2.6.31.5
> for Xen?  (I checked and I am using the same .config.)
>
> Trying to reproduce on a different machine, just to verify.
>

There shouldn't be any .config changes needed.

Can you paste the full domU console output? Does it crash, or what happens?

-- Pasi



RE: pv 2.6.31 (kernel.org) and save/migrate

by Dan Magenheimer-2

> On Fri, Nov 06, 2009 at 02:27:27PM -0800, Dan Magenheimer wrote:
> > > On Fri, Nov 06, 2009 at 10:37:49AM -0800, Dan Magenheimer wrote:
> > > > Sorry for another possibly stupid question:
> > > >
> > > > I've observed that for a pv domain that's been updated
> > > > to a 2.6.31 kernel (straight from kernel.org), "xm save"
> > > > never completes.  When the older kernel (2.6.18)
> > > > is booted, "xm save" works fine.  Is this a known problem...
> > > > or perhaps xm save has never worked with an upstream pv
> > > > kernel and I've never noticed?
> > > >
> > > > I'd assume migrate and live migrate would fail also but
> > > > haven't tried them.
> > > >
> > >
> > > Just checking.. are you running the latest 2.6.31.5 ? I think
> > > there has
> > > been multiple xen related bugfixes in the 2.6.31.X releases.
> > >
> > > -- Pasi
> >
> > No it was plain 2.6.31.  But I downloaded/built 2.6.31.5 and
> > can't even get it to boot (and no console or VNC output at
> > all).  Are CONFIG changes required between 2.6.31 and 2.6.31.5
> > for Xen?  (I checked and I am using the same .config.)
> >
> > Trying to reproduce on a different machine, just to verify.
>
> There shouldn't be any .config changes needed.
>
> Can you paste the full domU console output? Does it crash or?
>
> -- Pasi

Well, first, I got 2.6.31.5 to boot in a PV guest on another
machine, and it fails to save there too.  Are you able to save
2.6.31{,.5} successfully?  On the latest xen-unstable?
(NOTE: Yes, I do have CONFIG_XEN_SAVE_RESTORE=y... I don't
know if that is important.)

(On the machine where I couldn't boot 2.6.31.5 as a PV guest, there
was absolutely no console output.  However, I think the tools
are out of date on that machine, so ignore that.)


Re: pv 2.6.31 (kernel.org) and save/migrate

by Jeremy Fitzhardinge

On 11/06/09 12:37, Pasi Kärkkäinen wrote:

> On Fri, Nov 06, 2009 at 10:37:49AM -0800, Dan Magenheimer wrote:
>  
>> Sorry for another possibly stupid question:
>>
>> I've observed that for a pv domain that's been updated
>> to a 2.6.31 kernel (straight from kernel.org), "xm save"
>> never completes.  When the older kernel (2.6.18)
>> is booted, "xm save" works fine.  Is this a known problem...
>> or perhaps xm save has never worked with an upstream pv
>> kernel and I've never noticed?
>>
>> I'd assume migrate and live migrate would fail also but
>> haven't tried them.
>>
>>    
> Just checking.. are you running the latest 2.6.31.5 ? I think there has
> been multiple xen related bugfixes in the 2.6.31.X releases.
>  

Nothing relating to save/restore.  Does it work for you?

    J


Re: pv 2.6.31 (kernel.org) and save/migrate

by Pasi Kärkkäinen

On Fri, Nov 06, 2009 at 04:08:26PM -0800, Dan Magenheimer wrote:

> > On Fri, Nov 06, 2009 at 02:27:27PM -0800, Dan Magenheimer wrote:
> > > > On Fri, Nov 06, 2009 at 10:37:49AM -0800, Dan Magenheimer wrote:
> > > > > Sorry for another possibly stupid question:
> > > > >
> > > > > I've observed that for a pv domain that's been updated
> > > > > to a 2.6.31 kernel (straight from kernel.org), "xm save"
> > > > > never completes.  When the older kernel (2.6.18)
> > > > > is booted, "xm save" works fine.  Is this a known problem...
> > > > > or perhaps xm save has never worked with an upstream pv
> > > > > kernel and I've never noticed?
> > > > >
> > > > > I'd assume migrate and live migrate would fail also but
> > > > > haven't tried them.
> > > > >
> > > >
> > > > Just checking.. are you running the latest 2.6.31.5 ? I think
> > > > there has
> > > > been multiple xen related bugfixes in the 2.6.31.X releases.
> > > >
> > > > -- Pasi
> > >
> > > No it was plain 2.6.31.  But I downloaded/built 2.6.31.5 and
> > > can't even get it to boot (and no console or VNC output at
> > > all).  Are CONFIG changes required between 2.6.31 and 2.6.31.5
> > > for Xen?  (I checked and I am using the same .config.)
> > >
> > > Trying to reproduce on a different machine, just to verify.
> >
> > There shouldn't be any .config changes needed.
> >
> > Can you paste the full domU console output? Does it crash or?
> >
> > -- Pasi
>
> Well, first, I got 2.6.31.5 to boot in a PV guest in another
> machine and it fails to save also.  Are you able to save
> 2.6.31{,.5} successfully?  On latest xen-unstable?
> (NOTE: Yes, I do have CONFIG_XEN_SAVE_RESTORE=y... don't
> know if that is important.)
>

I'll have to try it later today..

> (On the machine I couldn't boot 2.6.31.5 as a PV guest, there
> was absolutely no console output.  However, I think tools
> are out-of-date on that machine so ignore that.)

Did you have "console=hvc0 earlyprintk=xen" in the domU kernel
parameters?

You might also set on_crash=preserve in the Xen guest config file,
and then, once the PV guest has crashed, run this:

/usr/lib/xen/bin/xenctx -s System.map-domUkernelversion <domid>

(on a 64-bit host the xenctx binary might be under /usr/lib64/)

to get a stack trace.

-- Pasi
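
For reference, the two suggestions above (early console output and
preserving the crashed domain) might look roughly like this in an xm
guest config file, which uses Python syntax. This is only a sketch: the
guest name, kernel path, memory size and vcpu count are placeholders
and are not taken from this thread; only the "extra" and "on_crash"
values come from the advice above.

```python
# Hypothetical xm guest config fragment (xm config files use Python syntax).
# name, kernel, memory and vcpus are placeholders, not values from this thread.
name = "pv-test"                    # hypothetical guest name
kernel = "/boot/vmlinuz-2.6.31.5"   # hypothetical path to the domU kernel
memory = 512                        # MiB, placeholder
vcpus = 1                           # UP guest; raise this to test SMP

# Send kernel messages to the Xen PV console as early as possible.
extra = "console=hvc0 earlyprintk=xen"

# Keep the crashed domain around so that xenctx can inspect it.
on_crash = "preserve"
```

With on_crash set this way the domain should stay listed after a crash,
so its domid can be passed to xenctx as shown above.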



RE: pv 2.6.31 (kernel.org) and save/migrate

by Dan Magenheimer-2

> > Well, first, I got 2.6.31.5 to boot in a PV guest in another
> > machine and it fails to save also.  Are you able to save
> > 2.6.31{,.5} successfully?  On latest xen-unstable?
> > (NOTE: Yes, I do have CONFIG_XEN_SAVE_RESTORE=y... don't
> > know if that is important.)
>
> I'll have to try it later today..

Let me know.

> > (On the machine I couldn't boot 2.6.31.5 as a PV guest, there
> > was absolutely no console output.  However, I think tools
> > are out-of-date on that machine so ignore that.)
>
> Did you have "console=hvc0 earlyprintk=xen" in the domU kernel
> parameters?

No, but adding them didn't help either.

> You might also change the xen guest cfgfile so that you have
> on_crash=preserve and then when the PV guest is crashed run this:
>
> /usr/lib/xen/bin/xenctx -s System.map-domUkernelversion <domid>
>
> (if you have 64b host the xenctx binary might be under /usr/lib64/)
>
> to get a stack trace..

Very interesting and useful!  I was completely unaware of
xenctx and could have used it many times in tmem development!

The results explain why I can get it to run on
one machine (an older laptop) and not run on another
machine (a Nehalem system)... looks like this is maybe
related to the cpuid-extended-topology-leaf bug that Jeremy
sent a fix for upstream recently.

cs:eip: e019:c040342d xen_cpuid+0x46
flags: 00001206 i nz p
ss:esp: e021:c0779ee4
eax: 00000001 ebx: 00000002 ecx: 00000100 edx: 00000001
esi: c0779f1c edi: c0779f18 ebp: c0779f24
 ds:     e021 es:     e021 fs:     00d8 gs:     0000
Code (instr addr c040342d)
24 04 8b 15 a4 02 7c c0 89 54 24 08 8b 0e 0f 0b 78 65 6e 0f a2 <89> 45 00 8b 04 24 89 18 89 0e 89


Stack:
 c0779f20 ffffffff ffffffff c07c0360 c0779f18 c0779f1c c0779f20 c066fd0f
 c0779f18 c0779f24 00000002 16aee301 00000001 00000001 16aee301 00000002
 0000000b c07c03cc c07c0360 c07c0360 c07c03d8 c0670ed8 c0779f58 00000001
 c07c0360 c0779f60 c066fe6a c0779f60 c0779f60 00000003 00000001 00000000

Call Trace:
  [<c040342d>] xen_cpuid+0x46  <--
  [<c066fd0f>] detect_extended_topology+0xae
  [<c0670ed8>] init_intel+0x140
  [<c066fe6a>] init_scattered_cpuid_features+0x82
  [<c06705e2>] identify_cpu+0x22d
  [<c040584c>] xen_force_evtchn_callback+0xc
  [<c0405e78>] check_events+0x8
  [<c07c9dec>] identify_boot_cpu+0xa
  [<c07c9e9a>] check_bugs+0x8
  [<c07c27bd>] start_kernel+0x2a0
  [<c07c5206>] xen_start_kernel+0x340
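
When comparing traces like the one above across machines, it can help
to reduce the "Call Trace" section to a list of symbols. The following
is a rough sketch, assuming the frame lines keep the
"[<address>] symbol+0xoffset" shape shown here; the function and
variable names are made up for illustration and are not part of xenctx.

```python
import re

# Matches frame lines such as "  [<c040342d>] xen_cpuid+0x46  <--".
# An optional "? " before the symbol (as in kernel oops output) is skipped.
FRAME_RE = re.compile(r"\[<([0-9a-fA-F]+)>\]\s+(?:\?\s+)?([\w.]+)\+0x([0-9a-fA-F]+)")


def parse_call_trace(text):
    """Return (address, symbol, offset) tuples for each frame line in text."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            addr, symbol, offset = match.groups()
            frames.append((int(addr, 16), symbol, int(offset, 16)))
    return frames


# First three frames of the trace above, used as sample input.
SAMPLE_TRACE = """\
Call Trace:
  [<c040342d>] xen_cpuid+0x46  <--
  [<c066fd0f>] detect_extended_topology+0xae
  [<c0670ed8>] init_intel+0x140
"""

if __name__ == "__main__":
    for addr, symbol, offset in parse_call_trace(SAMPLE_TRACE):
        print(f"{symbol} (+{offset:#x}) at {addr:#010x}")
```

The innermost frame comes first, so the first tuple names the function
that was executing when the guest stopped (xen_cpuid here).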


Re: pv 2.6.31 (kernel.org) and save/migrate, domU BUG()

by Pasi Kärkkäinen

On Sat, Nov 07, 2009 at 07:32:49AM -0800, Dan Magenheimer wrote:

> > > Well, first, I got 2.6.31.5 to boot in a PV guest in another
> > > machine and it fails to save also.  Are you able to save
> > > 2.6.31{,.5} successfully?  On latest xen-unstable?
> > > (NOTE: Yes, I do have CONFIG_XEN_SAVE_RESTORE=y... don't
> > > know if that is important.)
> >
> > I'll have to try it later today..
>
> Let me know.
>

Ok. I just tried with a Fedora 12 (rawhide) PV guest. I was able to
"xm save" and "xm restore" it without problems.

But I noticed there was a BUG printed on the guest console:
http://pasik.reaktio.net/xen/debug/dmesg-2.6.31.5-122.fc12.x86_64-saverestore.txt

BUG: sleeping function called from invalid context at kernel/mutex.c:94
in_atomic(): 0, irqs_disabled(): 1, pid: 1052, name: kstop/0
Pid: 1052, comm: kstop/0 Not tainted 2.6.31.5-122.fc12.x86_64 #1
Call Trace:
 [<ffffffff8104021f>] __might_sleep+0xe6/0xe8
 [<ffffffff81419c84>] mutex_lock+0x22/0x4e
 [<ffffffff812afdce>] dpm_resume_noirq+0x21/0x11f
 [<ffffffff81272b05>] xen_suspend+0xca/0xd1
 [<ffffffff8108c172>] stop_cpu+0x8c/0xd2
 [<ffffffff8106350c>] worker_thread+0x18a/0x224
 [<ffffffff81067ae7>] ? autoremove_wake_function+0x0/0x39
 [<ffffffff8141ab29>] ? _spin_unlock_irqrestore+0x19/0x1b
 [<ffffffff81063382>] ? worker_thread+0x0/0x224
 [<ffffffff81067765>] kthread+0x91/0x99
 [<ffffffff81012daa>] child_rip+0xa/0x20
 [<ffffffff81011f97>] ? int_ret_from_sys_call+0x7/0x1b
 [<ffffffff8101271d>] ? retint_restore_args+0x5/0x6
 [<ffffffff81012da0>] ? child_rip+0x0/0x20


More information about my setup:

Host/dom0: Fedora 12 (latest rawhide) with included Xen 3.4.1-5 and
custom 2.6.31.5 x86_64 pv_ops dom0 kernel (a couple of days old).

Guest/domU: Fedora 12 (latest rawhide) with the included/default
2.6.31.5-122.fc12.x86_64 kernel.

> > > (On the machine I couldn't boot 2.6.31.5 as a PV guest, there
> > > was absolutely no console output.  However, I think tools
> > > are out-of-date on that machine so ignore that.)
> >
> > Did you have "console=hvc0 earlyprintk=xen" in the domU kernel
> > parameters?
>
> No, but that didn't work either.
>

Ok.. then it crashes really early.

> > You might also change the xen guest cfgfile so that you have
> > on_crash=preserve and then when the PV guest is crashed run this:
> >
> > /usr/lib/xen/bin/xenctx -s System.map-domUkernelversion <domid>
> >
> > (if you have 64b host the xenctx binary might be under /usr/lib64/)
> >
> > to get a stack trace..
>
> Very interesting and useful!  I was completely unaware of
> xenctx and could have used it many times in tmem development!
>
> The results explain why I can get it to run on
> one machine (an older laptop) and not run on another
> machine (a Nehalem system)... looks like this is maybe
> related to the cpuid-extended-topology-leaf bug that Jeremy
> sent a fix for upstream recently.
>

Did you try with that patch applied?

-- Pasi






Re: pv 2.6.31 (kernel.org) and save/migrate, domU BUG

by Pasi Kärkkäinen

On Sun, Nov 08, 2009 at 04:17:43PM +0200, Pasi Kärkkäinen wrote:

> On Sat, Nov 07, 2009 at 07:32:49AM -0800, Dan Magenheimer wrote:
> > > > Well, first, I got 2.6.31.5 to boot in a PV guest in another
> > > > machine and it fails to save also.  Are you able to save
> > > > 2.6.31{,.5} successfully?  On latest xen-unstable?
> > > > (NOTE: Yes, I do have CONFIG_XEN_SAVE_RESTORE=y... don't
> > > > know if that is important.)
> > >
> > > I'll have to try it later today..
> >
> > Let me know.
> >
>
> Ok. I just tried with a Fedora 12 (rawhide) PV guest. I was able to
> "xm save" and "xm restore" it without problems.
>
> But I noticed there was a BUG printed on the guest console:
> http://pasik.reaktio.net/xen/debug/dmesg-2.6.31.5-122.fc12.x86_64-saverestore.txt
>
> BUG: sleeping function called from invalid context at kernel/mutex.c:94
> in_atomic(): 0, irqs_disabled(): 1, pid: 1052, name: kstop/0
> Pid: 1052, comm: kstop/0 Not tainted 2.6.31.5-122.fc12.x86_64 #1
> Call Trace:
>  [<ffffffff8104021f>] __might_sleep+0xe6/0xe8
>  [<ffffffff81419c84>] mutex_lock+0x22/0x4e
>  [<ffffffff812afdce>] dpm_resume_noirq+0x21/0x11f
>  [<ffffffff81272b05>] xen_suspend+0xca/0xd1
>  [<ffffffff8108c172>] stop_cpu+0x8c/0xd2
>  [<ffffffff8106350c>] worker_thread+0x18a/0x224
>  [<ffffffff81067ae7>] ? autoremove_wake_function+0x0/0x39
>  [<ffffffff8141ab29>] ? _spin_unlock_irqrestore+0x19/0x1b
>  [<ffffffff81063382>] ? worker_thread+0x0/0x224
>  [<ffffffff81067765>] kthread+0x91/0x99
>  [<ffffffff81012daa>] child_rip+0xa/0x20
>  [<ffffffff81011f97>] ? int_ret_from_sys_call+0x7/0x1b
>  [<ffffffff8101271d>] ? retint_restore_args+0x5/0x6
>  [<ffffffff81012da0>] ? child_rip+0x0/0x20
>

Oh, I forgot to mention that this BUG is non-fatal. The guest still
works after that..

-- Pasi



RE: pv 2.6.31 (kernel.org) and save/migrate, domU BUG()

by Dan Magenheimer-2

> > > > machine and it fails to save also.  Are you able to save
> > > > 2.6.31{,.5} successfully?  On latest xen-unstable?
> > > > (NOTE: Yes, I do have CONFIG_XEN_SAVE_RESTORE=y... don't
> > > > know if that is important.)
>
> Ok. I just tried with a Fedora 12 (rawhide) PV guest. I was able to
> "xm save" and "xm restore" it without problems.
>
> But I noticed there was a BUG printed on the guest console:
> http://pasik.reaktio.net/xen/debug/dmesg-2.6.31.5-122.fc12.x86
> _64-saverestore.txt
> BUG: sleeping function called from invalid context at
> kernel/mutex.c:94
> in_atomic(): 0, irqs_disabled(): 1, pid: 1052, name: kstop/0
> Pid: 1052, comm: kstop/0 Not tainted 2.6.31.5-122.fc12.x86_64 #1

Ok, so it appears there is something problematic with
saving an upstream kernel.  It might be (partially) fixed
in Fedora 12 or maybe there is some other environmental
difference which makes save fail entirely on my system.

> > The results explain why I can get it to run on
> > one machine (an older laptop) and not run on another
> > machine (a Nehalem system)... looks like this is maybe
> > related to the cpuid-extended-topology-leaf bug that Jeremy
> > sent a fix for upstream recently.
>
> Did you try with that patch applied?

No, the patch wasn't posted, just a pull request to Linus,
so I don't have the patch (and am not a git expert so
am not sure how to get it).

http://lists.xensource.com/archives/html/xen-devel/2009-11/msg00182.html

So I'll try it again when .6 or .7 is available.


Re: pv 2.6.31 (kernel.org) and save/migrate, domU BUG()

by Pasi Kärkkäinen

On Sun, Nov 08, 2009 at 07:29:58AM -0800, Dan Magenheimer wrote:

> > > > > machine and it fails to save also.  Are you able to save
> > > > > 2.6.31{,.5} successfully?  On latest xen-unstable?
> > > > > (NOTE: Yes, I do have CONFIG_XEN_SAVE_RESTORE=y... don't
> > > > > know if that is important.)
> >
> > Ok. I just tried with a Fedora 12 (rawhide) PV guest. I was able to
> > "xm save" and "xm restore" it without problems.
> >
> > But I noticed there was a BUG printed on the guest console:
> > http://pasik.reaktio.net/xen/debug/dmesg-2.6.31.5-122.fc12.x86
> > _64-saverestore.txt
> > BUG: sleeping function called from invalid context at
> > kernel/mutex.c:94
> > in_atomic(): 0, irqs_disabled(): 1, pid: 1052, name: kstop/0
> > Pid: 1052, comm: kstop/0 Not tainted 2.6.31.5-122.fc12.x86_64 #1
>
> Ok, so it appears there is something problematic with
> saving an upstream kernel.  It might be (partially) fixed
> in Fedora 12 or maybe there is some other environmental
> difference which makes save fail entirely on my system.
>

Yeah, the Fedora kernel has some patches, but it should be pretty
close to the upstream kernel.

By the way, was your guest UP or SMP? Mine was UP.

> > > The results explain why I can get it to run on
> > > one machine (an older laptop) and not run on another
> > > machine (a Nehalem system)... looks like this is maybe
> > > related to the cpuid-extended-topology-leaf bug that Jeremy
> > > sent a fix for upstream recently.
> >
> > Did you try with that patch applied?
>
> No, the patch wasn't posted, just a pull request to Linus,
> so I don't have the patch (and am not a git expert so
> am not sure how to get it).
>
> http://lists.xensource.com/archives/html/xen-devel/2009-11/msg00182.html
>
> So I'll try it again when .6 or .7 is available.

See here for changelog:
http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=shortlog;h=bugfix

You can get the diffs/patches from there using the links..

-- Pasi



Re: pv 2.6.31 (kernel.org) and save/migrate, domU BUG()

by Pasi Kärkkäinen

On Sun, Nov 08, 2009 at 05:41:53PM +0200, Pasi Kärkkäinen wrote:

> On Sun, Nov 08, 2009 at 07:29:58AM -0800, Dan Magenheimer wrote:
> > > > > > machine and it fails to save also.  Are you able to save
> > > > > > 2.6.31{,.5} successfully?  On latest xen-unstable?
> > > > > > (NOTE: Yes, I do have CONFIG_XEN_SAVE_RESTORE=y... don't
> > > > > > know if that is important.)
> > >
> > > Ok. I just tried with a Fedora 12 (rawhide) PV guest. I was able to
> > > "xm save" and "xm restore" it without problems.
> > >
> > > But I noticed there was a BUG printed on the guest console:
> > > http://pasik.reaktio.net/xen/debug/dmesg-2.6.31.5-122.fc12.x86
> > > _64-saverestore.txt
> > > BUG: sleeping function called from invalid context at
> > > kernel/mutex.c:94
> > > in_atomic(): 0, irqs_disabled(): 1, pid: 1052, name: kstop/0
> > > Pid: 1052, comm: kstop/0 Not tainted 2.6.31.5-122.fc12.x86_64 #1
> >
> > Ok, so it appears there is something problematic with
> > saving an upstream kernel.  It might be (partially) fixed
> > in Fedora 12 or maybe there is some other environmental
> > difference which makes save fail entirely on my system.
> >
>
> Yeah, fedora kernel has some patches, but it should be pretty
> close to upstream kernel..
>
> btw was your guest UP or SMP? Mine was UP..
>

OK, saving an SMP guest fails for me too:

[2009-11-09 23:44:38 1353] DEBUG (XendCheckpoint:110) [xc_save]: /usr/lib64/xen/bin/xc_save 28 2 0 0 0
[2009-11-09 23:44:38 1353] INFO (XendCheckpoint:417) xc_save: failed to get the suspend evtchn port

Jeremy: any ideas what's causing that? "xm save" for a UP 2.6.31.5 guest
works OK, but for an SMP guest it fails with the error above.
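
To pick failures like this out of a longer "xm log" dump, a small filter
along these lines might be enough. It is a sketch that assumes every
entry has the same shape as the two lines quoted above (timestamp, a
number that is presumably the xend PID, level, source module and line,
message); the helper names are hypothetical.

```python
import re
from collections import namedtuple

LogEntry = namedtuple("LogEntry", "timestamp pid level source lineno message")

# Assumed shape: "[YYYY-MM-DD HH:MM:SS <pid>] LEVEL (Module:line) message"
ENTRY_RE = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (\d+)\] "
    r"(\w+) \((\w+):(\d+)\) (.*)$"
)


def parse_xend_log(lines):
    """Parse log lines into LogEntry tuples, skipping lines that do not match."""
    entries = []
    for raw in lines:
        match = ENTRY_RE.match(raw.strip())
        if match:
            ts, pid, level, source, lineno, message = match.groups()
            entries.append(LogEntry(ts, int(pid), level, source, int(lineno), message))
    return entries


def save_failures(entries):
    """Return XendCheckpoint entries whose message mentions a failure."""
    return [e for e in entries
            if e.source == "XendCheckpoint" and "failed" in e.message]


# The two log lines quoted in this message, used as sample input.
SAMPLE_LOG = [
    "[2009-11-09 23:44:38 1353] DEBUG (XendCheckpoint:110) [xc_save]: /usr/lib64/xen/bin/xc_save 28 2 0 0 0",
    "[2009-11-09 23:44:38 1353] INFO (XendCheckpoint:417) xc_save: failed to get the suspend evtchn port",
]

if __name__ == "__main__":
    for entry in save_failures(parse_xend_log(SAMPLE_LOG)):
        print(entry.timestamp, entry.message)
```

Note that the failure is logged at INFO level, so filtering on the level
alone would miss it; matching on the message text is the safer choice.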

-- Pasi



RE: pv 2.6.31 (kernel.org) and save/migrate, domU BUG()

by Dan Magenheimer-2

> > Ok, so it appears there is something problematic with
> > saving an upstream kernel.  It might be (partially) fixed
> > in Fedora 12 or maybe there is some other environmental
> > difference which makes save fail entirely on my system.
> >
>
> Yeah, fedora kernel has some patches, but it should be pretty
> close to upstream kernel..
>
> btw was your guest UP or SMP? Mine was UP..

Mine was SMP... switching to UP I can now save.  BUT...
restore doesn't seem to quite work.  The restore completes
but I get no response from the VNC console.  When I
use a tty console, after restore, I am getting
an infinite dump of

WARNING: at arch/x86/time.c:180 xen_sched_clock+0x2b

(see attached).

Did you try restore on Fedora 12?
 

> > > > The results explain why I can get it to run on
> > > > one machine (an older laptop) and not run on another
> > > > machine (a Nehalem system)... looks like this is maybe
> > > > related to the cpuid-extended-topology-leaf bug that Jeremy
> > > > sent a fix for upstream recently.
> > >
> > > Did you try with that patch applied?
> >
> > No, the patch wasn't posted, just a pull request to Linus,
> > so I don't have the patch (and am not a git expert so
> > am not sure how to get it).
> >
> > http://lists.xensource.com/archives/html/xen-devel/2009-11/msg00182.html
> >
> > So I'll try it again when .6 or .7 is available.
>
> See here for changelog:
> http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=shortlog;h=bugfix
>
> You can get the diffs/patches from there using the links..

Thanks.  Yes, Jeremy's patch allows 2.6.31.5 (in a PV domain)
to completely boot on my Nehalem box.


Attachment: restore.out (12K)

Re: pv 2.6.31 (kernel.org) and save/migrate, domU BUG

by Pasi Kärkkäinen

On Sun, Nov 08, 2009 at 08:54:23AM -0800, Dan Magenheimer wrote:

> > > Ok, so it appears there is something problematic with
> > > saving an upstream kernel.  It might be (partially) fixed
> > > in Fedora 12 or maybe there is some other environmental
> > > difference which makes save fail entirely on my system.
> > >
> >
> > Yeah, fedora kernel has some patches, but it should be pretty
> > close to upstream kernel..
> >
> > btw was your guest UP or SMP? Mine was UP..
>
> Mine was SMP... switching to UP I can now save.  BUT...
> restore doesn't seem to quite work.  The restore completes
> but I get no response from the VNC console.  When I
> use a tty console, after restore, I am getting
> an infinite dump of
>
> WARNING: at arch/x86/time.c:180 xen_sched_clock+0x2b
>
> (see attached).
>
> Did you try restore on Fedora 12?
>  

Yeah. save+restore for a UP F12 guest works for me
(except that I get the non-fatal BUG in the guest).

An SMP guest doesn't work: save crashes it.

> > > > > The results explain why I can get it to run on
> > > > > one machine (an older laptop) and not run on another
> > > > > machine (a Nehalem system)... looks like this is maybe
> > > > > related to the cpuid-extended-topology-leaf bug that Jeremy
> > > > > sent a fix for upstream recently.
> > > >
> > > > Did you try with that patch applied?
> > >
> > > No, the patch wasn't posted, just a pull request to Linus,
> > > so I don't have the patch (and am not a git expert so
> > > am not sure how to get it).
> > >
> > > http://lists.xensource.com/archives/html/xen-devel/2009-11/msg00182.html
> > >
> > > So I'll try it again when .6 or .7 is available.
> >
> > See here for changelog:
> > http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=shortlog;h=bugfix
> >
> > You can get the diffs/patches from there using the links..
>
> Thanks.  Yes, Jeremy's patch allows 2.6.31.5 (in a PV domain)
> to completely boot on my Nehalem box.

OK. But I guess those don't help with the save+restore problem..

-- Pasi




Re: pv 2.6.31 (kernel.org) and save/migrate fails, domU BUG

by Pasi Kärkkäinen

Hello,

Jeremy: Here's a summary of these save/restore problems
using an upstream Linux 2.6.31.5 PV guest.

For me:
        - I can "xm save" + "xm restore" a UP guest, but I get a non-fatal
          BUG in the guest kernel, see [1].
        - "xm save" fails for an SMP guest with "failed to get the suspend evtchn port", see [2].

For Dan:
        - "xm save" works for a UP guest, but "xm restore" doesn't, giving
          infinite xen_sched_clock related dumps in the guest kernel, see [3].
        - "xm save" for an SMP guest fails; it never ends. I suspect this
          is the same problem I'm seeing.


[1] non-fatal BUG on the guest kernel after "xm restore":
http://pasik.reaktio.net/xen/debug/dmesg-2.6.31.5-122.fc12.x86_64-saverestore.txt

[2] "xm log" contains:
[2009-11-09 23:44:38 1353] DEBUG (XendCheckpoint:110) [xc_save]: /usr/lib64/xen/bin/xc_save 28 2 0 0 0
[2009-11-09 23:44:38 1353] INFO (XendCheckpoint:417) xc_save: failed to get the suspend evtchn port

[3] See the attachment in this email:
http://lists.xensource.com/archives/html/xen-devel/2009-11/msg00391.html


Any tips on how to debug these?

-- Pasi


On Sun, Nov 08, 2009 at 07:27:47PM +0200, Pasi Kärkkäinen wrote:

> On Sun, Nov 08, 2009 at 08:54:23AM -0800, Dan Magenheimer wrote:
> > > > Ok, so it appears there is something problematic with
> > > > saving an upstream kernel.  It might be (partially) fixed
> > > > in Fedora 12 or maybe there is some other environmental
> > > > difference which makes save fail entirely on my system.
> > > >
> > >
> > > Yeah, fedora kernel has some patches, but it should be pretty
> > > close to upstream kernel..
> > >
> > > btw was your guest UP or SMP? Mine was UP..
> >
> > Mine was SMP... switching to UP I can now save.  BUT...
> > restore doesn't seem to quite work.  The restore completes
> > but I get no response from the VNC console.  When I
> > use a tty console, after restore, I am getting
> > an infinite dump of
> >
> > WARNING: at arch/x86/time.c:180 xen_sched_clock+0x2b
> >
> > (see attached).
> >
> > Did you try restore on Fedora 12?
> >  
>
> Yeah. save+restore for UP F12 guest works for me
> (except I get that non-fatal BUG on the guest).
>
> SMP guest doesn't work.. save crashes it.
>
> > > > > > The results explain why I can get it to run on
> > > > > > one machine (an older laptop) and not run on another
> > > > > > machine (a Nehalem system)... looks like this is maybe
> > > > > > related to the cpuid-extended-topology-leaf bug that Jeremy
> > > > > > sent a fix for upstream recently.
> > > > >
> > > > > Did you try with that patch applied?
> > > >
> > > > No, the patch wasn't posted, just a pull request to Linus,
> > > > so I don't have the patch (and am not a git expert so
> > > > am not sure how to get it).
> > > >
> > > > http://lists.xensource.com/archives/html/xen-devel/2009-11/msg00182.html
> > > >
> > > > So I'll try it again when .6 or .7 is available.
> > >
> > > See here for changelog:
> > > http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=shortlog;h=bugfix
> > >
> > > You can get the diffs/patches from there using the links..
> >
> > Thanks.  Yes, Jeremy's patch allows 2.6.31.5 (in a PV domain)
> > to completely boot on my Nehalem box.
>
> OK. But I guess those don't help with the save+restore problem..
>
> -- Pasi
>



Re: pv 2.6.31 (kernel.org) and save/migrate, domU BUG()

by Jeremy Fitzhardinge

On 11/08/09 08:48, Pasi Kärkkäinen wrote:
> Ok.. saving SMP guest fails for me too:
>
> [2009-11-09 23:44:38 1353] DEBUG (XendCheckpoint:110) [xc_save]: /usr/lib64/xen/bin/xc_save 28 2 0 0 0
> [2009-11-09 23:44:38 1353] INFO (XendCheckpoint:417) xc_save: failed to get the suspend evtchn port
>
> Jeremy: Ideas what's causing that? "xm save" for UP 2.6.31.5 guest works
> OK, but for SMP guest it fails with the error above.

There's no "suspend evtchn port" in a pvops kernel.  That looks like a
Remus thing.  I think.

    J



Re: pv 2.6.31 (kernel.org) and save/migrate, domU BUG()

by Jeremy Fitzhardinge

On 11/08/09 08:54, Dan Magenheimer wrote:
> Mine was SMP... switching to UP I can now save.  BUT...
> restore doesn't seem to quite work.  The restore completes
> but I get no response from the VNC console.  When I
> use a tty console, after restore, I am getting
> an infinite dump of
>
> WARNING: at arch/x86/time.c:180 xen_sched_clock+0x2b
>  

That warning means the check that the CPU it's currently running on is
marked as running, according to Xen, has failed: Xen reports the CPU as
not currently running.  It's hard to imagine how it got into that state...

    J
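
A minimal sketch of the kind of check described above, assuming the warning fires whenever Xen's runstate record for the current VCPU is anything other than "running". The RUNSTATE_* values follow Xen's public VCPU interface; `runstate_snapshot` and `sched_clock_check_would_warn` are hypothetical names used only for illustration and are not the kernel's actual code.

```c
#include <assert.h>
#include <stdbool.h>

/* Run-state codes as defined in Xen's public VCPU interface. */
enum runstate {
    RUNSTATE_running  = 0,  /* VCPU is currently running on a physical CPU */
    RUNSTATE_runnable = 1,  /* runnable, but not currently scheduled */
    RUNSTATE_blocked  = 2,  /* blocked, waiting for an event */
    RUNSTATE_offline  = 3   /* offline: neither runnable nor blocked */
};

/* Hypothetical stand-in for the per-VCPU runstate record shared by Xen. */
struct runstate_snapshot {
    enum runstate state;
};

/*
 * Returns true when the sanity check would warn. The code asking for the
 * scheduler clock is executing, so its own VCPU is expected to be reported
 * as running; any other state means the record is stale or inconsistent.
 */
static bool sched_clock_check_would_warn(const struct runstate_snapshot *rs)
{
    assert(rs);  /* caller must pass a valid snapshot */
    return rs->state != RUNSTATE_running;
}
```

If the runstate record were left stale or unregistered after a restore, a check of this shape would fail on every call, which would be consistent with the endless stream of warnings reported above; that explanation is a guess, not a diagnosis.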


Re: pv 2.6.31 (kernel.org) and save/migrate, domU BUG()

by Brendan Cully-4

On Thursday, 12 November 2009 at 15:16, Jeremy Fitzhardinge wrote:

> On 11/08/09 08:48, Pasi Kärkkäinen wrote:
> > Ok.. saving SMP guest fails for me too:
> >
> > [2009-11-09 23:44:38 1353] DEBUG (XendCheckpoint:110) [xc_save]: /usr/lib64/xen/bin/xc_save 28 2 0 0 0
> > [2009-11-09 23:44:38 1353] INFO (XendCheckpoint:417) xc_save: failed to get the suspend evtchn port
> >
> > Jeremy: Ideas what's causing that? "xm save" for UP 2.6.31.5 guest works
> > OK, but for SMP guest it fails with the error above.
>
> There's no "suspend evtchn port" in a pvops kernel.  That looks like a
> Remus thing.  I think.

This is only an INFO-level message, because xc_save falls back to the
old xenstore method if it can't find a suspend event channel. I don't
know the context here, but this particular message ought to be
harmless.

The event channel was made for Remus, but regular xc_save also uses it
to reduce the downtime at the end of live migration.
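
The fallback described above can be sketched roughly as follows. This is an illustration only, not the xc_save source: `choose_suspend_method`, the `suspend_method` enum, and the convention that a negative port means "no suspend event channel" are all assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>

/* The two ways of asking the guest to suspend mentioned in this thread. */
enum suspend_method {
    SUSPEND_VIA_EVTCHN,    /* newer path: dedicated suspend event channel */
    SUSPEND_VIA_XENSTORE   /* older path: request written through xenstore */
};

/*
 * Hypothetical helper: 'port' is the suspend event channel port, or a
 * negative value when the guest does not offer one (assumed convention).
 */
static enum suspend_method choose_suspend_method(int port, FILE *log)
{
    assert(log);  /* caller must supply a log stream */

    if (port < 0) {
        /* Informational only: the save continues via the older path. */
        fprintf(log, "INFO: failed to get the suspend evtchn port\n");
        return SUSPEND_VIA_XENSTORE;
    }
    return SUSPEND_VIA_EVTCHN;
}
```

Calling `choose_suspend_method(-1, stderr)` logs the message and selects the xenstore path, so under this reading the INFO line by itself does not indicate that the save has failed.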


Re: pv 2.6.31 (kernel.org) and save/migrate fails, domU BUG

by Jeremy Fitzhardinge

On 11/10/09 02:08, Pasi Kärkkäinen wrote:

> Hello,
>
> Jeremy: Here's a summary of these save/restore problems
> using an upstream Linux 2.6.31.5 PV guest.
>
> For me:
> - I can "xm save" + "xm restore" a UP guest, but I get a non-fatal
>  BUG in the guest kernel, see [1].
> - "xm save" fails for an SMP guest with "failed to get the suspend evtchn port", see [2].
>
> For Dan:
> - "xm save" works for a UP guest, but "xm restore" doesn't, giving
>  infinite xen_sched_clock related dumps in the guest kernel, see [3].
> - "xm save" for an SMP guest fails; it never ends. I suspect this
>  is the same problem I'm seeing.
>
>
> [1] non-fatal BUG on the guest kernel after "xm restore":
> http://pasik.reaktio.net/xen/debug/dmesg-2.6.31.5-122.fc12.x86_64-saverestore.txt
>  

Does this help:

diff --git a/drivers/xen/manage.c b/drivers/xen/manage.c
index 10d03d7..da57ea1 100644
--- a/drivers/xen/manage.c
+++ b/drivers/xen/manage.c
@@ -43,7 +43,6 @@ static int xen_suspend(void *data)
 	if (err) {
 		printk(KERN_ERR "xen_suspend: sysdev_suspend failed: %d\n",
 			err);
-		dpm_resume_noirq(PMSG_RESUME);
 		return err;
 	}
 
@@ -69,7 +68,6 @@ static int xen_suspend(void *data)
 	}
 
 	sysdev_resume();
-	dpm_resume_noirq(PMSG_RESUME);
 
 	return 0;
 }
@@ -108,6 +106,9 @@ static void do_suspend(void)
 	}
 
 	err = stop_machine(xen_suspend, &cancelled, cpumask_of(0));
+
+	dpm_resume_noirq(PMSG_RESUME);
+
 	if (err) {
 		printk(KERN_ERR "failed to start xen_suspend: %d\n", err);
 		goto out;


> [2] "xm log" contains:
> [2009-11-09 23:44:38 1353] DEBUG (XendCheckpoint:110) [xc_save]: /usr/lib64/xen/bin/xc_save 28 2 0 0 0
> [2009-11-09 23:44:38 1353] INFO (XendCheckpoint:417) xc_save: failed to get the suspend evtchn port
>  

I think this may be a Remus side-effect.

> [3] See the attachment in this email:
> http://lists.xensource.com/archives/html/xen-devel/2009-11/msg00391.html
>  

No idea about this one.  Needs a closer look.

    J
