Sparcstation 20 - Dude, we're getting the band back together!
by Sanford Barton
Guys, I've been having a blast (sorta), refurbishing an old Sparcstation 20. So, finally, I can afford to put together that ultimate SS20 that I lusted after back in 1995!! But I've run into some issues, and I'm not sure whether they are normal or whether I need to keep digging, so here's the story:

I built this machine in an Aurora 2 chassis (the one with the better cooling and full-height CD-ROM). I'm using a later rev motherboard that has the newer SBus controller. In the drive bay I have 1 x 72GB Seagate Cheetah (actually runs cooler than the older, smaller drives of the era). I have an 8MB VSIMM for the video and run Xsun in 24bpp, and 448MB RAM. The PROM is 2.25R. For processors I have a pair of SM71s and a pair of SM81s at my disposal. I also have a pair of ROSS 200MHz and a pair of 150MHz, but I really, REALLY, would rather use the SuperSparcs in this machine because they wipe the floor with the ROSS in general desktop/multitasking usage. I've loaded Solaris 9 and the latest recommended patch bundle as well as several other patches for gtk apps, etc.

So here's the problem. With the ROSS processors the above system is rock solid, everything works as expected. With either the 2 x SM71s or 2 x SM81s, the system runs fine until I start to stress it with a few modern apps like Seamonkey, Thunderbird, Pidgin, etc. These newer apps put a lot of load on the CPUs in the form of context switches and cache (which the SuperSparcs excel at). But after about 10 minutes of usage like this, the machine locks up solid and must be power cycled. An easy way to reproduce this is to scroll a large webpage up and down continuously for about 20 seconds. That really spikes the CPU usage and will always lead to a lockup. Like I said, with the ROSS processors, I could do that all day long with no issues.

For testing purposes, the top of the case is removed and there is supplemental cooling on the CPU modules to produce a best-case baseline for heat management. I'm very familiar with the heat challenges in the Aurora cases, especially with the faster processors, but I would say as a general finger test, the SuperSparcs are generally running a tad cooler than the Hypersparcs, even after I reconditioned the heat sinks with new thermal compound. I have also applied new thermal compound to the memory and SBus controller heat sinks.

Things I think I can rule out:
- Motherboard (tested with spare, no improvement)
- Power supply (tested with spare, no improvement)
- Memory (tested with spares, no improvement)

So where I'm at is basically 4 possibilities:

1.) I'm taxing the SuperSparcs in such a way with the newer software that they will never be stable in this system.
2.) I have a batch of bad processor modules.
3.) The CG14 bits are being pushed beyond their limits with the SuperSparcs, especially now that it's not using much acceleration for 2D (speculation). If so, why not the same behavior with the ROSS processors? Perhaps they are not able to feed/stress the memory/graphics controller the way the SuperSparcs are?
4.) Not a heat or component stress issue at all, but some sort of multitasking, OS, or cache/memory controller bug.

Any ideas, experiences, hope, discouragement, anything??

Thanks!
_______________________________________________
rescue list - http://www.sunhelp.org/mailman/listinfo/rescue
Re: Sparcstation 20 - Dude, we're getting the band back together!
by velociraptor
On Aug 26, 2009, at 9:33 AM, Sanford Barton wrote:
> Guys, I've been having a blast (sorta), refurbishing an old
> Sparcstation 20.

Nice, old school system.

> 1.) I'm taxing the SuperSparcs in such a way with the newer software
> that they will never be stable in this system.
> 2.) I have a batch of bad processor modules.
> 3.) The CG14 bits are being pushed beyond their limits with the
> SuperSparcs, especially now that it's not using much acceleration for
> 2D (speculation). If so, why not the same behavior with the ROSS
> processors? Perhaps they are not able to feed/stress the
> memory/graphics controller they way the SuperSparcs are?
> 4.) Not a heat or component stress issue at all, but some sort of
> multitasking, OS, or cache/memory controller bug.
>
> Any ideas, experiences, hope, discouragement, anything??

I'd suggest testing headless. E.g. use a remote X session running your browser and try your lock-up trick. This will at least narrow it down to OS bugs vs graphics sub-system.

If it is stable w/out the graphics, I'd grab Sun VTS (if possible--it requires a contract to download) and see what is in there for stress-testing the graphics subsystem. Maybe you have something marginal in there? Maybe try to track down a later HW rev of your graphics card?

Good luck--
=Nadine=
Re: Sparcstation 20 - Dude, we're getting the band back together!
by Lionel Peterson-2
I believe VTS was free and included on install media, at least on more mature releases of Solaris... Has that changed?

Lionel

On Sep 5, 2009, at 12:09 PM, Nadine Miller <velociraptor@...> wrote:
> I'd grab Sun VTS (if possible--it requires a contract to download)
Re: Sparcstation 20 - Dude, we're getting the band back together!
by der Mouse-3
>> Sparcstation 20.

> Maybe try to track down a later HW rev of your graphics card?

The message said cg14, so this isn't really very possible. There is no graphics card as such; much of the cg14 is on the motherboard - and what isn't is physically conflated with the VSIMM and I would guess doesn't have all that many revs. (Of course, I could well be wrong, especially about the "I would guess...".)

My guess would be 2 (bad processors) or 4 (software issues, eg with cache). I've had cache issues with the SS20 myself; I have pseudo-disk drivers for NetBSD that simply don't work on the 20 unless I add code to "manually" force everything out of the cache at critical points. Yes, this indicates bugs somewhere - but the same code "works fine" on other sparc32s, so it indicates that the SS20 is relatively demanding in regard of caches and such.

And as for the Ross difference, I've seen it said that the Ross processors have tiny caches but more muscle; if true, perhaps the same underlying problem exists but everything is being pushed out of the cache before the trouble has a chance to manifest?

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mouse@...
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B
Re: Sparcstation 20 - Dude, we're getting the band back together!
by Sanford Barton
Guys thank you for the help. I posted this to sunhelp a couple of weeks ago and have been diligently working ever since. This is the current status:

I wanted to see if the issue persisted with Solaris 8, so I loaded a fresh Solaris 8 2/02 + latest recommended. I had a hell of a time getting the X11 install stable. There is a MAJOR Xsun memory leak with Solaris 8 on a SS20 using the CG14/SX graphics. Basically even with all the available/latest Xsun/kernel patches, the Xsun process will nibble your RAM by about 2-3MB every few seconds until it starts swapping, and then it's only a matter of time before you're hosed. The solution turned out to be using the /usr/openwin/server/module/ddxSUNWcg14.so.1 from my Solaris 9 install CD. That fixed the memory leak.

So, everything is stable now using both SM81s on Solaris 8. Running all the programs I want - Thunderbird, SeaMonkey (make sure you turn off java/javascript), XChat, XMMS streaming a 56k mp3 stream, a few xterms - for a few hours now and no problems at all. Amazed I can run all that and still be 85% idle on 2 85MHz processors and STILL have 245M out of 448M free physical RAM.

So unless something else pops up, looks like mystery solved. Maybe one day I can try Solaris 9 again, but I don't think I will unless I can find a later HW release than the one I have.

On Sat, Sep 5, 2009 at 8:04 PM, der Mouse <mouse@...> wrote:
>>> Sparcstation 20.
>
>> Maybe try to track down a later HW rev of your graphics card?
>
> The message said cg14, so this isn't really very possible. There is no
> graphics card as such; much of the cg14 is on the motherboard - and
> what isn't is physically conflated with the VSIMM and I would guess
> doesn't have all that many revs. (Of course, I could well be wrong,
> especially about the "I would guess...".)
>
> My guess would be 2 (bad processors) or 4 (software issues, eg with
> cache). I've had cache issues with the SS20 myself; I have pseudo-disk
> drivers for NetBSD that simply don't work on the 20 unless I add code
> to "manually" force everything out of the cache at critical points.
> Yes, this indicates bugs somewhere - but the same code "works fine" on
> other sparc32s, so it indicates that the SS20 is relatively demanding
> in regard of caches and such. And as for the Ross difference, I've
> seen it said that the Ross processors have tiny caches but more muscle;
> if true, perhaps the same underlying problem exists but everything is
> being pushed out of the cache before the trouble has a chance to
> manifest?
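A leak like the Xsun one described in this message (roughly 2-3MB every few seconds) can be confirmed, and a replacement module checked, by sampling the server's resident size over time and fitting a line to it. The sketch below is only an illustration: the function names are made up, and it assumes the samples are (seconds, RSS in KB) pairs gathered by hand, for example from repeated `ps -o rss= -p <pid>` runs.

```python
def leak_rate_kb_per_s(samples):
    """Least-squares slope of RSS (KB) against time (s).

    `samples` is a list of (seconds, rss_kb) pairs. The names and the
    input format are assumptions made for this sketch.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / float(n)
    mean_r = sum(r for _, r in samples) / float(n)
    # Covariance of time and RSS divided by the variance of time.
    num = sum((t - mean_t) * (r - mean_r) for t, r in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    if den == 0:
        raise ValueError("samples must span more than one instant")
    return num / den


def seconds_until_exhausted(free_kb, rate_kb_per_s):
    """Time until `free_kb` is used up at a steady leak rate."""
    if rate_kb_per_s <= 0:
        return float("inf")  # not growing, so never exhausted
    return free_kb / float(rate_kb_per_s)
```

With invented samples that grow by 2.5MB every 5 seconds, such as `[(0, 40000), (5, 42500), (10, 45000), (15, 47500)]`, the slope is 500 KB/s, and at that rate 448MB of RAM would be consumed in roughly fifteen minutes. A slope near zero after swapping in the Solaris 9 module would be consistent with the leak being fixed.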
Re: Sparcstation 20 - Dude, we're getting the band back together!
by Jonathan Katz-2
Wow.

I wonder if this has to do with the various kernel changes between S8 and S9. Not the MPO stuff (I don't think that applies to sun4m systems) but the kernel's threading model changed from many-to-many to one-to-one(?) to eke out more performance.

This shows how to enable one-to-one on Solaris 8. It may be worth a try to do this to see if you can "break" your Solaris 8 setup.
http://www-01.ibm.com/support/docview.wss?rs=180&uid=swg21107291

More background info:
http://lkml.indiana.edu/hypermail/linux/kernel/0001.3/0238.html
http://www.j2ee.me/docs/hotspot/threads/threads.html
http://www.northco.net/chenke/project/solaris.html

On Sat, Sep 5, 2009 at 10:36 PM, Sanford Barton <xc68000@...> wrote:
> Guys thank you for the help. I posted this to sunhelp a couple of
> weeks ago and have been diligently working ever since. This is the
> current status:
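The one-to-one experiment suggested in this message can be tried per process rather than system-wide. A minimal sketch, assuming the linked note describes the usual Solaris 8 mechanism: the alternate one-to-one libthread is installed under /usr/lib/lwp and is picked up by putting that directory first in LD_LIBRARY_PATH. The helper names are hypothetical, and any command passed in is only an example.

```python
import os
import subprocess

# Assumed location of the Solaris 8 alternate (one-to-one) libthread.
ALT_LIBTHREAD_DIR = "/usr/lib/lwp"


def env_with_alt_libthread(base_env=None, lib_dir=ALT_LIBTHREAD_DIR):
    """Return a copy of the environment with lib_dir searched first."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_LIBRARY_PATH", "")
    # Prepend so the runtime linker finds the alternate library first.
    env["LD_LIBRARY_PATH"] = lib_dir + (":" + existing if existing else "")
    return env


def run_with_alt_libthread(argv):
    """Run one command (hypothetical helper) under the alternate library."""
    return subprocess.call(argv, env=env_with_alt_libthread())
```

Launching only the browser this way, e.g. `run_with_alt_libthread(["seamonkey"])`, keeps the rest of the desktop on the default threading library, so a lockup that appears only then would point at the threading change rather than at the graphics path.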
Re: Sparcstation 20 - Dude, we're getting the band back together!
by Mike Shields-2
> So where I'm at is basically 4 possibilities:
>
> 1.) I'm taxing the SuperSparcs in such a way with the newer software
> that they will never be stable in this system.
> 2.) I have a batch of bad processor modules.
> 3.) The CG14 bits are being pushed beyond their limits with the
> SuperSparcs, especially now that it's not using much acceleration for
> 2D (speculation). If so, why not the same behavior with the ROSS
> processors? Perhaps they are not able to feed/stress the
> memory/graphics controller they way the SuperSparcs are?
> 4.) Not a heat or component stress issue at all, but some sort of
> multitasking, OS, or cache/memory controller bug.

Is it possible that it's the Ross PROM? I've got several dual SM71 SS20's that I've never had issues with. They all use PROM rev 2.22. I have two, specifically, that run Solaris 9. One runs 24/7, although that one is maxed out to 512MB and is headless (well, TGX on SBus). I've got another one that I built for desktop purposes, with 448MB RAM, 8MB VSIMM, 36GB 10krpm drive, etc. Again, I've never had any issues, running Gnome, web browsers, etc on the CG14.

Since you've found a working system in Solaris 8, just consider this another data point. If you're interested in further investigation, I could collect revision numbers from various parts for comparison.
Re: Sparcstation 20 - Dude, we're getting the band back together!
by Sanford Barton
@Jonathan - That's a good idea and I'll give it a shot.

@Mike - it feels like a software issue at this point, but I'd be interested in what release of Solaris 9 you started with... I only have the FCS Media Kit... I'm not convinced that the latest recommended patch bundle brought everything forward that needed updating. I doubt Sun was doing much patch regression testing on the Sparc20 when 9 was getting updated ;)

As an aside - I got some great steals on some ROSS modules. So pretty soon I'll be able to test and benchmark the following modules against each other:

- 2 x ROSS 626D 200MHz 512KB half-speed cache
- 2 x ROSS 626C 150MHz 512KB full-speed cache
- 2 x ROSS 626x 142MHz 1024KB full-speed cache (really interested in how these perform)
- 2 x SM71's
- 2 x SM81's

The SPECint/SPECfp values you see for all these processors are really whacked when you bounce them against real world tasks. I'd like to find a benchmarking package that would give a better representation of various performance aspects of the above. It's all for fun and games at this stage of course :)

Thanks for all the input from everyone btw.
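Short of a full benchmarking package, one aspect raised earlier in the thread, load "in the form of context switches", can be probed with a pipe ping-pong between two processes. This is a rough sketch, not an established benchmark: the function name and the default iteration count are made up, and the number it returns is only meaningful for comparing CPU modules in the same machine under the same OS install.

```python
import os
import time


def pingpong_rate(round_trips=10000):
    """Bounce one byte between two processes over a pair of pipes.

    Every round trip makes each process block and be woken once, so
    the result (round trips per second) is a rough proxy for
    context-switch cost, not a SPEC-style throughput score.
    """
    parent_r, child_w = os.pipe()   # child -> parent
    child_r, parent_w = os.pipe()   # parent -> child

    pid = os.fork()
    if pid == 0:
        # Child: echo every byte straight back until the pipe closes.
        os.close(parent_r)
        os.close(parent_w)
        while True:
            b = os.read(child_r, 1)
            if not b:
                break
            os.write(child_w, b)
        os._exit(0)

    # Parent: drop the child's ends, then time the ping-pong loop.
    os.close(child_r)
    os.close(child_w)
    start = time.time()
    for _ in range(round_trips):
        os.write(parent_w, b"x")
        os.read(parent_r, 1)
    elapsed = time.time() - start

    os.close(parent_w)      # EOF tells the child to exit
    os.close(parent_r)
    os.waitpid(pid, 0)
    return round_trips / elapsed if elapsed > 0 else float("inf")
```

Running `pingpong_rate()` a few times per module set and comparing the figures says something about hand-off latency, which SPEC-style scores do not capture; it says nothing about cache size or floating-point speed, so it would be one measurement among several.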