8.0RC2 amd64 - kernel panic running make buildworld

Hi.

I installed 8.0RC2-amd64 on an 8-core opteron server a few days ago.

When I try to do a make buildworld or make buildkernel the server reboots without any message left in the logs. The same happens when building bigger ports (for example ruby18 or perl58)

With 8.0-RC2 debug flags and witness seem to be disabled in the standard GENERIC kernel, so unfortunately it is not possible for me to build a debug kernel without my server crashing..

Now my idea was to install the old 8.0-BETA4 and upgrade to RC2 through makeworld + buildkernel (gdb+witness). But no luck. When trying to upgrade to RC2 the 8.0-BETA4 also crashes. At least 8.0-BETA4 has debug + witness active in the GENERIC kernel..

So below some debug output of 8.0-BETA4 crashing. Has a vfs/ffs LOR problem with the BETA4 already been fixed?

Does it make sense to send in a pr with the old 8.0-BETA4?

BTW. I installed 7.2-STABLE on this same server and did a "make buildworld" and "make buildkernel" which completed without any problem.

Cheers,
--Kai

----- make buildworld -j7 crash, freebsd 8.0-amd64-beta4 -----

lock order reversal:
 1st 0xffffff00073d5ba8 ufs (ufs) @ /usr/src/sys/ufs/ffs/ffs_snapshot.c:423
 2nd 0xffffff819d921558 bufwait (bufwait) @ /usr/src/sys/kern/vfs_bio.c:2559
 3rd 0xffffff00070c19d0 ufs (ufs) @ /usr/src/sys/ufs/ffs/ffs_snapshot.c:544
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
_witness_debugger() at _witness_debugger+0x2e
witness_checkorder() at witness_checkorder+0x81e
__lockmgr_args() at __lockmgr_args+0xcf3
ffs_lock() at ffs_lock+0x8c
VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b
_vn_lock() at _vn_lock+0x47
ffs_snapshot() at ffs_snapshot+0x1b9d
ffs_mount() at ffs_mount+0x666
vfs_donmount() at vfs_donmount+0xcde
nmount() at nmount+0x63
syscall() at syscall+0x1af
Xfast_syscall() at Xfast_syscall+0xe1
--- syscall (378, FreeBSD ELF64, nmount), rip = 0x8007b14fc, rsp = 0x7fffffffe9b8, rbp = 0x800902530 ---

lock order reversal:
 1st 0xffffff819d921558 bufwait (bufwait) @ /usr/src/sys/kern/vfs_bio.c:2559
 2nd 0xffffff0007d9fa30 snaplk (snaplk) @ /usr/src/sys/ufs/ffs/ffs_snapshot.c:793
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
_witness_debugger() at _witness_debugger+0x2e
witness_checkorder() at witness_checkorder+0x81e
__lockmgr_args() at __lockmgr_args+0xcf3
ffs_lock() at ffs_lock+0x8c
VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b
_vn_lock() at _vn_lock+0x47
ffs_snapshot() at ffs_snapshot+0x1a6a
ffs_mount() at ffs_mount+0x666
vfs_donmount() at vfs_donmount+0xcde
nmount() at nmount+0x63
syscall() at syscall+0x1af
Xfast_syscall() at Xfast_syscall+0xe1
--- syscall (378, FreeBSD ELF64, nmount), rip = 0x8007b14fc, rsp = 0x7fffffffe9b8, rbp = 0x800902530 ---

lock order reversal:
 1st 0xffffff0007d9fa30 snaplk (snaplk) @ /usr/src/sys/kern/vfs_vnops.c:296
 2nd 0xffffff00073d5ba8 ufs (ufs) @ /usr/src/sys/ufs/ffs/ffs_snapshot.c:1587
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
_witness_debugger() at _witness_debugger+0x2e
witness_checkorder() at witness_checkorder+0x81e
__lockmgr_args() at __lockmgr_args+0xcf3
ffs_snapremove() at ffs_snapremove+0xe7
softdep_releasefile() at softdep_releasefile+0x139
ufs_inactive() at ufs_inactive+0x1a5
vinactive() at vinactive+0x72
vput() at vput+0x230
vn_close() at vn_close+0x118
vn_closefile() at vn_closefile+0x5a
_fdrop() at _fdrop+0x23
closef() at closef+0x5b
kern_close() at kern_close+0x110
syscall() at syscall+0x1af
Xfast_syscall() at Xfast_syscall+0xe1
--- syscall (6, FreeBSD ELF64, close), rip = 0x80084cf9c, rsp = 0x7fffffffe9b8, rbp = 0 ---

_______________________________________________
freebsd-current@... mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscribe@..."
|
|
Re: 8.0RC2 amd64 - kernel panic running make buildworld

On Sat, 2009-10-31 at 23:15 +0100, Kai Gallasch wrote:
> Hi.
>
> I installed 8.0RC2-amd64 on an 8-core opteron server a few days ago.
>
> When I try to do a make buildworld or make buildkernel the server
> reboots without any message left in the logs. The same happens
> when building bigger ports (for example ruby18 or perl58)
>
> With 8.0-RC2 debug flags and witness seem to be disabled in the
> standard GENERIC kernel, so unfortunately it is not possible for me to
> build a debug kernel without my server crashing..

The first place I'd start is by running memtest86 on the machine overnight. This sounds like a possible hardware issue to me; it would be good to see if we can confirm that that is the case.

> Now my idea was to install the old 8.0-BETA4 and upgrade to RC2 through
> makeworld + buildkernel (gdb+witness). But no luck. When trying to
> upgrade to RC2 the 8.0-BETA4 also crashes. At least 8.0-BETA4 has debug
> + witness active in the GENERIC kernel..
>
> So below some debug output of 8.0-BETA4 crashing. Has a vfs/ffs LOR
> problem with the BETA4 already been fixed?

The debug output you included was just lock order reversals, and doesn't seem to be related to your crash.

I think 8.0-BETA4 still had the debugger compiled in (you can test by pressing ctrl-alt-escape on the console; if you do drop to the debugger, give the "c" command to continue).

If the debugger is compiled in, then the spontaneous reboot without dropping to the debugger suggests even more that it may be hardware related. If you do get to the debugger, a copy of all of the messages on screen and the output of the "bt" command would be very useful. When you do your kernel recompile, please include full debugging, including WITNESS, INVARIANTS, KDB, DDB etc.

FWIW, don't worry about building world now, a BETA4 world should work fine with an RC2 kernel. You may be able to get a kernel built even though it keeps crashing by clearing out /usr/obj to start with and then just repeating

cd /usr/src && make buildkernel -DKERNFAST

after every crash.

> Does it make sense to send in a pr with the old 8.0-BETA4?

It depends what the bug is, to be honest. So far there isn't really enough information to determine the cause, and therefore there isn't really enough info for a PR.

Gavin
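For readers following along, the debugging options Gavin names could be collected into a small custom kernel config. The following is only a sketch: the config name DEBUGKERN and the scratch directory are assumptions for illustration, and INVARIANT_SUPPORT is added because INVARIANTS depends on it.

```shell
#!/bin/sh
# Sketch: write a kernel config that layers the debugging options named in
# the thread on top of GENERIC. DEBUGKERN and the scratch directory are
# hypothetical; on a real system the file would live in
# /usr/src/sys/amd64/conf/.
conf_dir=$(mktemp -d)
conf_file="$conf_dir/DEBUGKERN"

cat > "$conf_file" <<'EOF'
include GENERIC
ident   DEBUGKERN

options KDB                 # kernel debugger framework
options DDB                 # interactive in-kernel debugger
options INVARIANTS          # extra run-time consistency checks
options INVARIANT_SUPPORT   # support code required by INVARIANTS
options WITNESS             # lock order verification
EOF

echo "wrote $conf_file"
```

With the file in place under the source tree, the kernel would be built with something like `cd /usr/src && make buildkernel KERNCONF=DEBUGKERN`.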
|
|
Re: 8.0RC2 amd64 - kernel panic running make buildworld

Kai Gallasch wrote:
> Hi.
>
> I installed 8.0RC2-amd64 on an 8-core opteron server a few days ago.
>
> When I try to do a make buildworld or make buildkernel the server
> reboots without any message left in the logs. The same happens
> when building bigger ports (for example ruby18 or perl58)
>
> With 8.0-RC2 debug flags and witness seem to be disabled in the
> standard GENERIC kernel, so unfortunately it is not possible for me to
> build a debug kernel without my server crashing..
>
> Now my idea was to install the old 8.0-BETA4 and upgrade to RC2 through
> makeworld + buildkernel (gdb+witness). But no luck. When trying to
> upgrade to RC2 the 8.0-BETA4 also crashes. At least 8.0-BETA4 has debug
> + witness active in the GENERIC kernel..
>
> So below some debug output of 8.0-BETA4 crashing. Has a vfs/ffs LOR
> problem with the BETA4 already been fixed?
>
> Does it make sense to send in a pr with the old 8.0-BETA4?
>
> BTW. I installed 7.2-STABLE on this same server and did a "make
> buildworld" and "make buildkernel" which completed without any problem.
>
> Cheers,
> --Kai
>
>
> ----- make buildworld -j7 crash, freebsd 8.0-amd64-beta4 -----

Definitely try the usual memory testing, power supply testing, etc.

I had a similar problem, but with an HP DL385G5 that has some sort of "memory issue," and it would just silently reboot (which turned out to be a machine check exception). I could never pin down whether the problem was the BIOS, the actual memory, or the fact that there's only one 4-core CPU on a two-socket board and only the associated memory bank filled.

I did various memory swaps to no avail, it would run memtest86 all day with no errors, and in the end I just turned superpages off and it works like a champ.

If vm.pmap.pg_ps_enabled is 1 in 8.0-rc2, you might try rebooting with vm.pmap.pg_ps_enabled="0" in /boot/loader.conf and try another buildworld.
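The loader.conf change suggested above can be made idempotent with a small helper. This is a sketch only: `set_tunable` is a hypothetical function name, and the demonstration edits a scratch file rather than the real /boot/loader.conf.

```shell
#!/bin/sh
# set_tunable FILE NAME VALUE: make FILE contain exactly one active
# NAME="VALUE" line. Hypothetical helper, not part of FreeBSD.
set_tunable() {
    file=$1
    name=$2
    value=$3
    tmp=$(mktemp)
    # Drop any existing assignment of NAME (commented-out lines are kept).
    if [ -f "$file" ]; then
        grep -v "^${name}=" "$file" > "$tmp" || true
    fi
    printf '%s="%s"\n' "$name" "$value" >> "$tmp"
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Demonstration on a scratch file that starts with superpages enabled.
demo=$(mktemp)
printf 'vm.pmap.pg_ps_enabled="1"\n' > "$demo"
set_tunable "$demo" vm.pmap.pg_ps_enabled 0
cat "$demo"
```

After a reboot with the real file changed, `sysctl vm.pmap.pg_ps_enabled` should report the value the loader passed in.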
|
|
Re: 8.0RC2 amd64 - kernel panic running make buildworld

On Tue, 03 Nov 2009 10:42:40 +0000, Gavin Atkinson <gavin@...> wrote:
> On Sat, 2009-10-31 at 23:15 +0100, Kai Gallasch wrote:
> > Hi.
> >
> > I installed 8.0RC2-amd64 on an 8-core opteron server a few days ago.
> >
> > When I try to do a make buildworld or make buildkernel the server
> > reboots without any message left in the logs. The same happens
> > when building bigger ports (for example ruby18 or perl58)
>
> The first place I'd start is by running memtest86 on the machine
> overnight. This sounds like a possible hardware issue to me; it would
> be good to see if we can confirm that that is the case.

I will do so tomorrow.

I have already taken the following actions to rule out a hardware problem:

- ran several passes with diagnostic software from the manufacturer
- reset BIOS settings to default
- upgraded BIOS to newest release
- booted server from 2 year old backup BIOS
- took out the only pair of RAM modules that was different from the rest of the modules
- installed freebsd 7.2-STABLE on the server to repeat the kernel panic (no panic with 7.2)
- installed 8.0-BETA4 (crash)

Besides: The server was in production with 7.2 for some time, without showing any such problems.

> > Now my idea was to install the old 8.0-BETA4 and upgrade to RC2
> > through makeworld + buildkernel (gdb+witness). But no luck. When
> > trying to upgrade to RC2 the 8.0-BETA4 also crashes. At least
> > 8.0-BETA4 has debug
> > + witness active in the GENERIC kernel..
> >
> > So below some debug output of 8.0-BETA4 crashing. Has a vfs/ffs LOR
> > problem with the BETA4 already been fixed?
>
> The debug output you included was just lock order reversals, and
> doesn't seem to be related to your crash.

Sorry for causing possible confusion about this. I realized this after my mail was already out.

> I think 8.0-BETA4 still had the debugger compiled in (you can test by
> pressing ctrl-alt-escape on the console; if you do drop to the
> debugger, give the "c" command to continue).
>
> If the debugger is compiled in, then the spontaneous reboot without
> dropping to the debugger suggests even more that it may be hardware
> related. If you do get to the debugger, a copy of all of the messages
> on screen and the output of the "bt" command would be very useful.
> When you do your kernel recompile, please include full debugging,
> including WITNESS, INVARIANTS, KDB, DDB etc.

In the meantime I managed to install a RELENG_8 world + GENERIC kernel with all debug options enabled on the crashing server. (I mounted /usr/src and /usr/obj on another server running 8.0RC1 through NFS and did buildworld + buildkernel over there.)

So now I have a debug kernel available with dumpdev + dumpdir defined.

Here are my latest findings on this issue:

- In about 80% of the cases, running makeworld leads to a server crash without the server writing a crashdump to dumpdir. The server just reboots.
- In about 20% of the cases makeworld gets stuck in a non-terminating process that eats up 100% cpu. This process cannot be killed. When restarting makeworld the server then reboots again.
- It makes no difference whether I run makeworld with -j1 or -j8; the result is the same.

> It depends what the bug is, to be honest. So far there isn't really
> enough information to determine the cause, and therefore there isn't
> really enough info for a PR.

Mark Atkinson also commented on my mail and he gave the hint:

"If vm.pmap.pg_ps_enabled is 1 in 8.0-rc2, you might try rebooting with vm.pmap.pg_ps_enabled="0" in /boot/loader.conf and try another buildworld."

So I thought why not and just tried it - and surprise: Disabling superpages with vm.pmap.pg_ps_enabled=0 in loader.conf resolves my problem with 8.0RC2 crashing when doing a makeworld.

After a successful buildworld and buildkernel I rebooted the server again with the vm.pmap.pg_ps_enabled=0 line commented out and the problem was there again.

And then I disabled the option again in loader.conf, rebooted + make buildworld .. no problem.

Seems to be deterministic. With vm.pmap.pg_ps_enabled=1 the server crashes without being able to write crashdumps to dumpdev. (at least on this specific Proliant DL385G2 server)

--Kai.

--
You need more time; and you probably always will.
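Kai mentions having dumpdev and dumpdir defined. As an illustration, a typical /etc/rc.conf fragment for crash dumps might look like the sketch below; the values shown are common defaults and are an assumption, not Kai's actual settings.

```shell
# /etc/rc.conf fragment. rc.conf is read as sh, so these are plain
# variable assignments. Values are assumed, not taken from the thread.
dumpdev="AUTO"          # let the rc scripts pick a swap device for dumps
dumpdir="/var/crash"    # where savecore(8) stores the dump at next boot
```

With settings like these, a kernel panic would normally leave a vmcore file in dumpdir after the next boot, which is why a reboot with no dump at all is notable in this thread.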
|
|
|
|
|
Re: 8.0RC2 amd64 - kernel panic running make buildworld

Kai Gallasch <gallasch@...> writes:
> I installed 8.0RC2-amd64 on an 8-core opteron server a few days ago.
>
> When I try to do a make buildworld or make buildkernel the server
> reboots without any message left in the logs. The same happens
> when building bigger ports (for example ruby18 or perl58)

Could it be related to this? What's your CPUID?

Author: attilio
Date: Wed Nov 4 01:32:59 2009
New Revision: 198868
URL: http://svn.freebsd.org/changeset/base/198868

Log:
  Opteron rev E family of processor expose a bug where, in very rare
  ocassions, memory barriers semantic is not honoured by the hardware
  itself. As a result, some random breakage can happen in uninvestigable
  ways (for further explanation see at the content of the commit itself).

  As long as just a specific familly is bugged of an entire architecture
  is broken, a complete fix-up is impratical without harming to some
  extents the other correct cases.
  Considering that (and considering the frequency of the bug exposure)
  just print out a warning message if the affected machine is identified.

  Pointed out by: Samy Al Bahra <sbahra at repnop dot org>
  Help on wordings by: jeff
  MFC: 3 days

Modified:
  head/sys/amd64/amd64/identcpu.c
  head/sys/i386/i386/identcpu.c

DES
--
Dag-Erling Smørgrav - des@...
|
|
Re: 8.0RC2 amd64 - kernel panic running make buildworld

On Wed, 04 Nov 2009 16:24:01 +0100, Dag-Erling Smørgrav <des@...> wrote:
> Kai Gallasch <gallasch@...> writes:
> > I installed 8.0RC2-amd64 on an 8-core opteron server a few days ago.
> >
> > When I try to do a make buildworld or make buildkernel the server
> > reboots without any message left in the logs. The same happens
> > when building bigger ports (for example ruby18 or perl58)
>
> Could it be related to this? What's your CPUID?

Found this in dmesg. Is this the CPUID? "Id = 0x100f23"

--Kai.

CPU: Quad-Core AMD Opteron(tm) Processor 2352 (2100.09-MHz K8-class CPU)
  Origin = "AuthenticAMD" Id = 0x100f23 Stepping = 3
  Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
  Features2=0x802009<SSE3,MON,CX16,POPCNT>
  AMD Features=0xee400800<SYSCALL,MMX+,FFXSR,Page1GB,RDTSCP,LM,3DNow!+,3DNow!>
  AMD Features2=0x7ff<LAHF,CMP,SVM,ExtAPIC,CR8,ABM,SSE4A,MAS,Prefetch,OSVW,IBS>
  TSC: P-state invariant
real memory = 21474836480 (20480 MB)
avail memory = 20701110272 (19742 MB)
ACPI APIC Table: <HP ProLiant>
FreeBSD/SMP: Multiprocessor System Detected: 8 CPUs
FreeBSD/SMP: 2 package(s) x 4 core(s)
 cpu0 (BSP): APIC ID: 0
 cpu1 (AP): APIC ID: 1
 cpu2 (AP): APIC ID: 2
 cpu3 (AP): APIC ID: 3
 cpu4 (AP): APIC ID: 4
 cpu5 (AP): APIC ID: 5
 cpu6 (AP): APIC ID: 6
 cpu7 (AP): APIC ID: 7

--
If it wasn't for the last minute, nothing would get done.
|
|
Re: 8.0RC2 amd64 - kernel panic running make buildworld

Kai Gallasch wrote:
> On Wed, 04 Nov 2009 16:24:01 +0100, Dag-Erling Smørgrav <des@...> wrote:
>
>> Kai Gallasch <gallasch@...> writes:
>>> I installed 8.0RC2-amd64 on an 8-core opteron server a few days ago.
>>>
>>> When I try to do a make buildworld or make buildkernel the server
>>> reboots without any message left in the logs. The same happens
>>> when building bigger ports (for example ruby18 or perl58)
>> Could it be related to this? What's your CPUID?
>
> Found this in dmesg. Is this the CPUID? "Id = 0x100f23"

That's generation 16 model 2 stepping 3. This erratum only affects generation 0xe or 15.

BTW, I have the same processor/stepping/MHz in my system, but only a single physical processor.

> --Kai.
>
>
> CPU: Quad-Core AMD Opteron(tm) Processor 2352 (2100.09-MHz K8-class CPU)
>   Origin = "AuthenticAMD" Id = 0x100f23 Stepping = 3
>   Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
>   Features2=0x802009<SSE3,MON,CX16,POPCNT>
>   AMD Features=0xee400800<SYSCALL,MMX+,FFXSR,Page1GB,RDTSCP,LM,3DNow!+,3DNow!>
>   AMD Features2=0x7ff<LAHF,CMP,SVM,ExtAPIC,CR8,ABM,SSE4A,MAS,Prefetch,OSVW,IBS>
>   TSC: P-state invariant
> real memory = 21474836480 (20480 MB)
> avail memory = 20701110272 (19742 MB)
> ACPI APIC Table: <HP ProLiant>
> FreeBSD/SMP: Multiprocessor System Detected: 8 CPUs
> FreeBSD/SMP: 2 package(s) x 4 core(s)
>  cpu0 (BSP): APIC ID: 0
>  cpu1 (AP): APIC ID: 1
>  cpu2 (AP): APIC ID: 2
>  cpu3 (AP): APIC ID: 3
>  cpu4 (AP): APIC ID: 4
>  cpu5 (AP): APIC ID: 5
>  cpu6 (AP): APIC ID: 6
>  cpu7 (AP): APIC ID: 7
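The decoding of "Id = 0x100f23" into family, model and stepping can be reproduced with plain shell arithmetic. The sketch below follows the standard CPUID signature layout (stepping in bits 0-3, base model in bits 4-7, base family in bits 8-11, extended model in bits 16-19, extended family in bits 20-27); `decode_cpuid` is a hypothetical helper name.

```shell
#!/bin/sh
# decode_cpuid ID: split a CPUID signature (the "Id = ..." value in dmesg)
# into display family, model and stepping. Hypothetical helper.
decode_cpuid() {
    id=$(( $1 ))
    stepping=$(( id & 0xf ))
    base_model=$(( (id >> 4) & 0xf ))
    base_family=$(( (id >> 8) & 0xf ))
    ext_model=$(( (id >> 16) & 0xf ))
    ext_family=$(( (id >> 20) & 0xff ))

    family=$base_family
    model=$base_model
    # The extended fields only count when the base family is 0xf
    # (AMD convention; Intel additionally applies ext_model to family 6).
    if [ "$base_family" -eq 15 ]; then
        family=$(( base_family + ext_family ))
        model=$(( (ext_model << 4) | base_model ))
    fi
    echo "family=$family model=$model stepping=$stepping"
}

decode_cpuid 0x100f23    # prints: family=16 model=2 stepping=3
```

For Kai's CPU the base family is 0xf and the extended family is 0x01, which is where family 16 (0x10) comes from.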
|
|
Re: 8.0RC2 amd64 - kernel panic running make buildworld

Kai Gallasch wrote:
> On Wed, 04 Nov 2009 16:24:01 +0100, Dag-Erling Smørgrav <des@...> wrote:
>
>> Kai Gallasch <gallasch@...> writes:
>>> I installed 8.0RC2-amd64 on an 8-core opteron server a few days ago.
>>>
>>> When I try to do a make buildworld or make buildkernel the server
>>> reboots without any message left in the logs. The same happens
>>> when building bigger ports (for example ruby18 or perl58)
>> Could it be related to this? What's your CPUID?
>
> Found this in dmesg. Is this the CPUID? "Id = 0x100f23"

That's generation 16 (0xf) model 2, stepping 3. This erratum apparently only affects gen 15 (0xe) and some pre-release -- never released to public (0xf). I have the same processor in my system btw.

>
> --Kai.
>
>
> CPU: Quad-Core AMD Opteron(tm) Processor 2352 (2100.09-MHz K8-class CPU)
>   Origin = "AuthenticAMD" Id = 0x100f23 Stepping = 3
>   Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
>   Features2=0x802009<SSE3,MON,CX16,POPCNT>
>   AMD Features=0xee400800<SYSCALL,MMX+,FFXSR,Page1GB,RDTSCP,LM,3DNow!+,3DNow!>
>   AMD Features2=0x7ff<LAHF,CMP,SVM,ExtAPIC,CR8,ABM,SSE4A,MAS,Prefetch,OSVW,IBS>
>   TSC: P-state invariant
> real memory = 21474836480 (20480 MB)
> avail memory = 20701110272 (19742 MB)
> ACPI APIC Table: <HP ProLiant>
> FreeBSD/SMP: Multiprocessor System Detected: 8 CPUs
> FreeBSD/SMP: 2 package(s) x 4 core(s)
>  cpu0 (BSP): APIC ID: 0
>  cpu1 (AP): APIC ID: 1
>  cpu2 (AP): APIC ID: 2
>  cpu3 (AP): APIC ID: 3
>  cpu4 (AP): APIC ID: 4
>  cpu5 (AP): APIC ID: 5
>  cpu6 (AP): APIC ID: 6
>  cpu7 (AP): APIC ID: 7
|
|
Re: 8.0RC2 amd64 - kernel panic running make buildworld

Mark Atkinson wrote:
> Kai Gallasch wrote:
>> On Wed, 04 Nov 2009 16:24:01 +0100, Dag-Erling Smørgrav <des@...> wrote:
>>
>>> Kai Gallasch <gallasch@...> writes:
>>>> I installed 8.0RC2-amd64 on an 8-core opteron server a few days ago.
>>>>
>>>> When I try to do a make buildworld or make buildkernel the server
>>>> reboots without any message left in the logs. The same happens
>>>> when building bigger ports (for example ruby18 or perl58)
>>> Could it be related to this? What's your CPUID?
>
>> Found this in dmesg. Is this the CPUID? "Id = 0x100f23"
>
> That's generation 16 (0xf) model 2, stepping 3. This erratum apparently
> only affects gen 15 (0xe) and some pre-release -- never released to
> public (0xf). I have the same processor in my system btw.

Sorry for the double wrong posting. I see several webpages refer to 15 as f and 16 as f. /usr/ports/misc/cpuid refers to it as 15.

The pages referenced via the bugzilla entry in the commit refer to it as 0xf but between 32 and 63. Does model 2 correctly put us in the commit's range between 0x20 and 0x3f? (i.e. is stepping included?)
|
|
Re: 8.0RC2 amd64 - kernel panic running make buildworld

On Wed, 2009-11-04 at 13:44 +0300, S.N.Grigoriev wrote:
> Hi list,
>
> I can confirm I've seen the same problem. After upgrading from 7-stable
> to 8.0-RC2 my machine just reboots during 'make buildworld' without
> diagnostics. But switching vm.pmap.pg_ps_enabled on/off does not
> work for me. My machine reboots every time I try to build world.
> I don't think I have a hardware problem: under 7-stable I can build
> world/kernel for both 7-stable and 8.0-RC2 without problems.
>

Is it by any chance possible that you have 'debug.debugger_on_panic' set to '0' and no valid dump device configured?

--
Alexandre Kovalenko (Олександр Коваленко)
|
|
Re: 8.0RC2 amd64 - kernel panic running make buildworld

Mark Atkinson wrote:
> Mark Atkinson wrote:
>> Kai Gallasch wrote:
>>> On Wed, 04 Nov 2009 16:24:01 +0100, Dag-Erling Smørgrav <des@...> wrote:
>>>
>>>> Kai Gallasch <gallasch@...> writes:
>>>>> I installed 8.0RC2-amd64 on an 8-core opteron server a few days ago.
>>>>>
>>>>> When I try to do a make buildworld or make buildkernel the server
>>>>> reboots without any message left in the logs. The same happens
>>>>> when building bigger ports (for example ruby18 or perl58)
>>>> Could it be related to this? What's your CPUID?
>>> Found this in dmesg. Is this the CPUID? "Id = 0x100f23"
>> That's generation 16 (0xf) model 2, stepping 3. This erratum apparently
>> only affects gen 15 (0xe) and some pre-release -- never released to
>> public (0xf). I have the same processor in my system btw.
>
> Sorry for the double wrong posting. I see several webpages refer to 15
> as f and 16 as f. /usr/ports/misc/cpuid refers to it as 15.
>
> The pages referenced via the bugzilla entry in the commit refer to it as
> 0xf but between 32 and 63. Does model 2 correctly put us in the
> commit's range between 0x20 and 0x3f? (i.e. is stepping included?)

I'll answer my own question, no:

http://support.amd.com/us/Processor_TechDocs/25481.pdf

Although some of the posts in

http://bugzilla.kernel.org/show_bug.cgi?id=11305

indicate any model < 0x40. Someone must have actually narrowed the range.
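The range question above can be checked mechanically. The sketch below tests whether a CPUID signature falls into "family 0xf, model between 0x20 and 0x3f", which is the range as described in this thread; it has not been verified against the commit itself. `in_discussed_range` is a hypothetical helper, and 0x20f12 is a made-up signature chosen only to exercise the positive case.

```shell
#!/bin/sh
# in_discussed_range ID: succeed if the CPUID signature is family 0xf with
# a model in 0x20..0x3f (the range as discussed in the thread; assumption).
in_discussed_range() {
    id=$(( $1 ))
    base_family=$(( (id >> 8) & 0xf ))
    ext_family=$(( (id >> 20) & 0xff ))
    family=$base_family
    if [ "$base_family" -eq 15 ]; then
        family=$(( base_family + ext_family ))
    fi
    # Combined model; only meaningful here because we require family 15,
    # which implies base family 0xf.
    model=$(( (((id >> 16) & 0xf) << 4) | ((id >> 4) & 0xf) ))
    [ "$family" -eq 15 ] && [ "$model" -ge 32 ] && [ "$model" -le 63 ]
}

for id in 0x100f23 0x20f12; do
    if in_discussed_range "$id"; then
        echo "$id: inside the discussed range"
    else
        echo "$id: outside the discussed range"
    fi
done
```

Kai's 0x100f23 decodes to family 16, so it falls outside the range regardless of its model number, which matches Mark's "no".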
|
|
Re: 8.0RC2 amd64 - kernel panic running make buildworld

2009/11/5 Mark Atkinson <atkin901@...>:
>
> I'll answer my own question, no:
>
> http://support.amd.com/us/Processor_TechDocs/25481.pdf
>
> Although some of the posts in
>
> http://bugzilla.kernel.org/show_bug.cgi?id=11305
>
> indicate any model < 0x40. Someone must have actually narrowed the range.

Is there a FreeBSD PR or errata URL which can be linked to instead, complete with copies of the above in it?

Adrian
|
|
Re: 8.0RC2 amd64 - kernel panic running make buildworld

On Tue, 03 Nov 2009 10:42:40 +0000, Gavin Atkinson <gavin@...> wrote:
> On Sat, 2009-10-31 at 23:15 +0100, Kai Gallasch wrote:
> > Hi.
> >
> > I installed 8.0RC2-amd64 on an 8-core opteron server a few days ago.
> >
> > When I try to do a make buildworld or make buildkernel the server
> > reboots without any message left in the logs. The same happens
> > when building bigger ports (for example ruby18 or perl58)
> >
> > With 8.0-RC2 debug flags and witness seem to be disabled in the
> > standard GENERIC kernel, so unfortunately it is not possible for me
> > to build a debug kernel without my server crashing..
>
> The first place I'd start is by running memtest86 on the machine
> overnight. This sounds like a possible hardware issue to me; it would
> be good to see if we can confirm that that is the case.

Gavin.

memtest86 ran for 18 hours and showed no problem with RAM.

--Kai.
|
|
|
|
|
Re: 8.0RC2 amd64 - kernel panic running make buildworld

Adrian Chadd wrote:
> 2009/11/5 Mark Atkinson <atkin901@...>:
>
>> I'll answer my own question, no:
>>
>> http://support.amd.com/us/Processor_TechDocs/25481.pdf
>>
>> Although some of the posts in
>>
>> http://bugzilla.kernel.org/show_bug.cgi?id=11305
>>
>> indicate any model < 0x40. Someone must have actually narrowed the range.
>
> Is there a FreeBSD PR or errata URL which can be linked to instead,
> complete with copies of the above in it?

If you read the mysql related blog post on it:

http://timetobleed.com/mysql-doesnt-always-suck-this-time-its-amd/

Someone in the comments suggests this is AMD errata 147 and quotes the text. I'll include a copy of the comment here below for the mail archives (and since urls tend to disappear).

# silverjam

The kernel bug: http://bugzilla.kernel.org/show_bug.cgi?id=11305

Which references an AMD "errata 147" from "Revision Guide for AMD Athlon™ 64 and AMD Opteron™ Processors."
http://support.amd.com/us/Processor_TechDocs/25759.pdf

Which says:

"""
Potential Violation of Read Ordering Rules Between Semaphore Operations and Unlocked Read-Modify-Write Instructions

Description

Under a highly specific set of internal timing circumstances, the memory read ordering between a semaphore operation and a subsequent read-modify-write instruction (an instruction that uses the same memory location as both a source and destination) may be incorrect and allow the read-modify-write instruction to operate on the memory location ahead of the completion of the semaphore operation. The erratum will not occur if there is a LOCK prefix on the read-modify-write instruction.

This erratum does not apply if the read-only value in MSRC001_1023h[33] is 1b.

Potential Effect on System

In the unlikely event that the condition described above occurs, the read-modify-write instruction (in the critical section) may operate on data that existed prior to the semaphore operation. This erratum can only occur in multiprocessor or multicore configurations.

Suggested Workaround

To provide a workaround for this unlikely event, software can perform any of the following actions for multiprocessor or multicore systems:

• Place a LFENCE instruction between the semaphore operation and any subsequent read-modify-write instruction(s) in the critical section.
• Use a LOCK prefix with the read-modify-write instruction.
• Decompose the read-modify-write instruction into separate instructions.

No workaround is necessary if software checks that MSRC001_1023h[33] is set on all processors that may execute the code. The value in MSRC001_1023h[33] may not be the same on all processors in a multi-processor system.

Fix Planned: Yes
"""
|
|
Re: 8.0RC2 amd64 - kernel panic running make buildworld

On Thu, 05 Nov 2009 19:40:03 +0300
S.N.Grigoriev <serguey-grigoriev@...> wrote:
>
> 04.11.09, 16:51, "Alexandre \"Sunny\" Kovalenko" <gaijin.k@...>
> wrote:
>
> > On Wed, 2009-11-04 at 13:44 +0300, S.N.Grigoriev wrote:
> > > Hi list,
> > >
> > > I can confirm I've seen the same problem. After upgrading from 7-stable
> > > to 8.0-RC2 my machine just reboots during 'make buildworld' without
> > > diagnostics. But switching vm.pmap.pg_ps_enabled on/off does not
> > > work for me. My machine reboots every time I try to build world.
> > > I don't think I have a hardware problem: under 7-stable I can build
> > > world/kernel for both 7-stable and 8.0-RC2 without problems.
> > >
> > Is it by any chance possible that you have 'debug.debugger_on_panic' set
> > to '0' and no valid dump device configured?
>
> Hi Alexandre,
>
> I've not found 'debug.debugger_on_panic' variable in 'sysctl -a'
> output. Where can I find it? All my sysctl variables are set by
> default.

Do you have "options DDB" in your kernel config file?

---
Gary Jennejohn
|
|
Re: 8.0RC2 amd64 - kernel panic running make buildworld

05.11.09, 18:49, "Gary Jennejohn" <gary.jennejohn@...>:
> On Thu, 05 Nov 2009 19:40:03 +0300
> S.N.Grigoriev wrote:
> >
> > 04.11.09, 16:51, "Alexandre \"Sunny\" Kovalenko"
> > wrote:
> >
> > > On Wed, 2009-11-04 at 13:44 +0300, S.N.Grigoriev wrote:
> > > > Hi list,
> > > >
> > > > I can confirm I've seen the same problem. After upgrading from 7-stable
> > > > to 8.0-RC2 my machine just reboots during 'make buildworld' without
> > > > diagnostics. But switching vm.pmap.pg_ps_enabled on/off does not
> > > > work for me. My machine reboots every time I try to build world.
> > > > I don't think I have a hardware problem: under 7-stable I can build
> > > > world/kernel for both 7-stable and 8.0-RC2 without problems.
> > > >
> > > Is it by any chance possible that you have 'debug.debugger_on_panic' set
> > > to '0' and no valid dump device configured?
> >
> > Hi Alexandre,
> >
> > I've not found 'debug.debugger_on_panic' variable in 'sysctl -a'
> > output. Where can I find it? All my sysctl variables are set by
> > default.
> Do you have "options DDB" in your kernel config file?
> ---
> Gary Jennejohn

Hi Gary,

my current kernel is GENERIC, so I don't have "options DDB".

--
Regards,
S.Grigoriev.
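The exchange above comes down to whether debug.debugger_on_panic exists and what it is set to. A minimal sketch of that check follows; `describe_debugger_on_panic` is a hypothetical helper, and reading a missing sysctl as "no debugger compiled in" is the interpretation suggested in the thread, not a guarantee.

```shell
#!/bin/sh
# describe_debugger_on_panic VALUE: interpret the sysctl value. An empty
# VALUE stands for "sysctl not present". Hypothetical helper.
describe_debugger_on_panic() {
    case "$1" in
        "") echo "sysctl absent: kernel likely built without the debugger" ;;
        0)  echo "debugger present, but not entered on panic" ;;
        *)  echo "debugger entered on panic" ;;
    esac
}

# Query the running kernel; a missing sysctl yields an empty string.
value=$(sysctl -n debug.debugger_on_panic 2>/dev/null || true)
describe_debugger_on_panic "$value"
```

On a stock 8.0-RC2 GENERIC kernel as described by S.N.Grigoriev, the first branch would be the expected result.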
|
|
Re: 8.0RC2 amd64 - kernel panic running make buildworld

S.N.Grigoriev wrote:
>
> 05.11.09, 18:49, "Gary Jennejohn" <gary.jennejohn@...>:
>
>> On Thu, 05 Nov 2009 19:40:03 +0300
>> S.N.Grigoriev wrote:
>>> 04.11.09, 16:51, "Alexandre \"Sunny\" Kovalenko"
>>> wrote:
>>>
>>>> On Wed, 2009-11-04 at 13:44 +0300, S.N.Grigoriev wrote:
>>>>> Hi list,
>>>>>
>>>>> I can confirm I've seen the same problem. After upgrading from 7-stable
>>>>> to 8.0-RC2 my machine just reboots during 'make buildworld' without
>>>>> diagnostics. But switching vm.pmap.pg_ps_enabled on/off does not
>>>>> work for me. My machine reboots every time I try to build world.
>>>>> I don't think I have a hardware problem: under 7-stable I can build
>>>>> world/kernel for both 7-stable and 8.0-RC2 without problems.
>>>>>
>>>> Is it by any chance possible that you have 'debug.debugger_on_panic' set
>>>> to '0' and no valid dump device configured?
>>> Hi Alexandre,
>>>
>>> I've not found 'debug.debugger_on_panic' variable in 'sysctl -a'
>>> output. Where can I find it? All my sysctl variables are set by
>>> default.
>> Do you have "options DDB" in your kernel config file?
>> ---
>> Gary Jennejohn
>
> Hi Gary,
>
> my current kernel is GENERIC, so I don't have "options DDB".

I have RC2 with amd64 and buildworld/installworld runs fine.

Maybe you have memory (RAM) problems? I had to remove one 512mb clib in order to boot... ;-)

Hope this helps,

Etienne

--
Etienne Robillard <robillard.etienne@...>
Green Tea Hackers Club <http://gthc.org/>
Blog: <http://gthc.org/blog/>
PGP Fingerprint: 178A BF04 23F0 2BF5 535D 4A57 FD53 FD31 98DC 4E57
|
|
Re: 8.0RC2 amd64 - kernel panic running make buildworld

05.11.09, 13:46, "Etienne Robillard" <robillard.etienne@...>:
> S.N.Grigoriev wrote:
> >
> > 05.11.09, 18:49, "Gary Jennejohn" :
> >
> >> On Thu, 05 Nov 2009 19:40:03 +0300
> >> S.N.Grigoriev wrote:
> >>> 04.11.09, 16:51, "Alexandre \"Sunny\" Kovalenko"
> >>> wrote:
> >>>
> >>>> On Wed, 2009-11-04 at 13:44 +0300, S.N.Grigoriev wrote:
> >>>>> Hi list,
> >>>>>
> >>>>> I can confirm I've seen the same problem. After upgrading from 7-stable
> >>>>> to 8.0-RC2 my machine just reboots during 'make buildworld' without
> >>>>> diagnostics. But switching vm.pmap.pg_ps_enabled on/off does not
> >>>>> work for me. My machine reboots every time I try to build world.
> >>>>> I don't think I have a hardware problem: under 7-stable I can build
> >>>>> world/kernel for both 7-stable and 8.0-RC2 without problems.
> >>>>>
> >>>> Is it by any chance possible that you have 'debug.debugger_on_panic' set
> >>>> to '0' and no valid dump device configured?
> >>> Hi Alexandre,
> >>>
> >>> I've not found 'debug.debugger_on_panic' variable in 'sysctl -a'
> >>> output. Where can I find it? All my sysctl variables are set by
> >>> default.
> >> Do you have "options DDB" in your kernel config file?
> >> ---
> >> Gary Jennejohn
> >
> > Hi Gary,
> >
> > my current kernel is GENERIC, so I don't have "options DDB".
> I have RC2 with amd64 and buildworld/installworld runs fine.
> Maybe you have memory (RAM) problems? I had to remove one 512mb clib
> in order to boot... ;-)
> Hope this helps,
> Etienne

Hi Etienne,

I think it is unlikely. I've done a great many compilations on this machine (under FreeBSD 7.1 and 7.2 and some Linux versions) without issues.

--
Regards,
S.Grigoriev.