> I wanted to share with the community some of the issues we have
> endured over the past week and what we have learned and changed as a
> result. As a short background, we have been running our Sakai-based
> system (Oncourse) in parallel with the legacy system for 2+ years.
> The beginning of our fall semester marked the first time that all of
> the legacy user base transitioned to the Sakai-based system. This
> additional user base represents an overall 2x increase in load on our
> production system (16x app servers and 1x Oracle 10.0.1.x server).
>
> The first sign of trouble came from the fact that DBCP was having
> trouble obtaining and maintaining connections to Oracle. DBCP has a
> nasty bug that can be triggered in these kinds of conditions that
> results in a deadlock situation. To resolve the DBCP bugs, we
> switched to the c3p0 connection pool. C3p0 behaves much more stably
> and predictably under heavy load and can recover better from
> connection issues with Oracle. This is a drop-in replacement for
> DBCP and I am going to recommend that Sakai switch to this connection
> pool as a default in the 2.5 release.
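
For anyone who wants to experiment with the switch, here is a minimal sketch of the kind of pool settings involved. The keys are standard c3p0 configuration property names (as used in a c3p0.properties file), but every value, the class name, and the idea of splitting a database session budget evenly across app servers are assumptions for illustration, not the settings used on Oncourse.

```java
import java.util.Properties;

// Hypothetical helper: builds illustrative c3p0 settings for one app server.
final class C3p0SettingsSketch {

    // appServers and dbSessionLimit are assumed inputs; tune for your own site.
    static Properties build(int appServers, int dbSessionLimit) {
        if (appServers <= 0 || dbSessionLimit <= 0) {
            throw new IllegalArgumentException("inputs must be positive");
        }
        // Split the database session budget evenly so the pools on all
        // app servers together cannot exceed what the database allows.
        int maxPerServer = Math.max(1, dbSessionLimit / appServers);
        int minPerServer = Math.min(5, maxPerServer);

        Properties p = new Properties();
        p.setProperty("c3p0.minPoolSize", Integer.toString(minPerServer));
        p.setProperty("c3p0.maxPoolSize", Integer.toString(maxPerServer));
        // Fail a checkout after 30 s instead of blocking forever (value in ms).
        p.setProperty("c3p0.checkoutTimeout", "30000");
        // Test idle connections every 60 s so dead ones are replaced.
        p.setProperty("c3p0.idleConnectionTestPeriod", "60");
        p.setProperty("c3p0.preferredTestQuery", "SELECT 1 FROM DUAL");
        // Retry a few times when the database briefly refuses connections.
        p.setProperty("c3p0.acquireRetryAttempts", "5");
        return p;
    }

    public static void main(String[] args) {
        // Example: 16 app servers sharing an assumed budget of 800 sessions.
        Properties p = build(16, 800);
        for (String key : p.stringPropertyNames()) {
            System.out.println(key + "=" + p.getProperty(key));
        }
    }
}
```

A finite checkout timeout is the setting most relevant to the hang described above: a request that cannot get a connection fails after a bounded wait rather than waiting indefinitely.
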
>
> Next we started troubleshooting Oracle settings to get the instance
> sized to handle the additional load being thrown at it. The one
> thing that we think made more difference than anything was turning OFF
> Oracle's automatic memory management (AMM). With AMM turned off, we
> then went through some iterations of increasing db_cache_size,
> shared_pool_size, large_pool_size, and sga_max_size. We eventually
> over-tuned these settings and started causing swapping in the OS. We
> now have backed those down to a reasonable number and Oracle seems to
> be performing well.
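
The swapping problem comes down to simple arithmetic, so a sanity check before each tuning round may help. The sketch below is only that: the class and method names are made up, the PGA target and OS reserve inputs are assumptions not mentioned in the message, and the first check is necessary but not sufficient because the SGA also holds components not listed here.

```java
// Hypothetical sanity checks for manual Oracle memory sizing (all values in MB).
final class SgaBudgetSketch {

    // The manually sized pools must fit inside sga_max_size.
    // Other SGA components (for example the log buffer) also need room,
    // so passing this check does not guarantee the instance will start.
    static boolean componentsFitInSga(long dbCacheSize, long sharedPoolSize,
                                      long largePoolSize, long sgaMaxSize) {
        return dbCacheSize + sharedPoolSize + largePoolSize <= sgaMaxSize;
    }

    // The SGA plus process memory plus an OS reserve must fit in physical RAM,
    // otherwise the OS starts swapping. pgaTarget and osReserve are assumptions.
    static boolean fitsInPhysicalRam(long sgaMaxSize, long pgaTarget,
                                     long osReserve, long physicalRam) {
        return sgaMaxSize + pgaTarget + osReserve <= physicalRam;
    }

    public static void main(String[] args) {
        // Illustrative numbers only, not the Oncourse settings.
        long dbCache = 8192, sharedPool = 2048, largePool = 512, sgaMax = 12288;
        long pga = 2048, osReserve = 2048, ram = 16384;
        System.out.println("pools fit in SGA: "
                + componentsFitInSga(dbCache, sharedPool, largePool, sgaMax));
        System.out.println("fits in RAM: "
                + fitsInPhysicalRam(sgaMax, pga, osReserve, ram));
    }
}
```
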
>
> The areas of the application that are still giving us trouble are
> related to running out of heap space in the JVM. We have run the app
> servers with 1GB of heap for 2+ years with no issues, but with the
> current load we are seeing we bumped up heap to 2GB (the max for
> 32-bit architecture). We have now returned service to a level of
> stability, but we are still running dangerously close on max heap.
> The next steps, from a software perspective, are to replace the
> XML-based storage mechanisms with normalized relational database tables.
> There are a few code paths that we are aware of that consume extreme
> amounts of memory due to the loading of XML documents - Resources
> (especially quota calculation), Assignments (especially download zip
> file), and Calendar. We plan on pursuing a migration to 64-bit app
> servers as an insurance plan (more max heap), but we (Sakai) need to
> put some concerted focus on removing XML-based storage.
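
Until the XML storage is replaced, it may be worth watching how close each app server runs to its heap ceiling. A minimal sketch using only the standard `Runtime` API follows; the class name and the 85% warning threshold are assumptions, and a real deployment would more likely feed this into existing monitoring than print to the console.

```java
// Hypothetical heap headroom check using only the standard Runtime API.
final class HeapHeadroomSketch {

    // Fraction of the maximum heap currently in use, given explicit byte counts.
    static double usedFraction(long usedBytes, long maxBytes) {
        if (maxBytes <= 0) {
            throw new IllegalArgumentException("maxBytes must be positive");
        }
        return (double) usedBytes / maxBytes;
    }

    // Same calculation for the running JVM. maxMemory() reflects -Xmx when set.
    static double currentUsedFraction() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        return usedFraction(used, rt.maxMemory());
    }

    public static void main(String[] args) {
        double fraction = currentUsedFraction();
        System.out.printf("heap used: %.1f%% of max%n", fraction * 100.0);
        // 0.85 is an arbitrary illustrative threshold, not a recommendation.
        if (fraction > 0.85) {
            System.out.println("warning: running close to max heap");
        }
    }
}
```

The used figure includes garbage that has not been collected yet, so a single high reading is less meaningful than a value that stays high across several collections.
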
>
> L
>
> You can see a complete change log here:
>
> https://oncourse.iu.edu/access/wiki/site/3001b886-1069-4fb7-00d5-8db4b3a85f74/home.html
>
>
> Lance Speelmon +1 (317) 278-9053
> Manager Online Development / Sakai Release Manager
>
> ----------------------
> This automatic notification message was sent by Sakai Collab
> (https://collab.sakaiproject.org/portal) from the WG: Production site.
> You can modify how you receive notifications at My Workspace > Preferences.