ColdFusion MX7 JRun Crashes

Recently we started having several crashes and "unable to create new native thread" error messages on one of our servers at work. Oddly, this was one of our most stable machines and the problem started very suddenly.

Now I know what you're thinking, and no, we didn't modify the codebase of any applications on that server (there are only two). However, the volume of requests can vary from day to day, so we assumed memory problems were the cause. Every couple of hours, the JRun service would hang and we'd need to restart it. The server in question hosts two critical applications, and any downtime impacts our customers greatly. Scrambling for a solution, we focused our attention on Java's garbage collection and memory settings. We tried several scenarios, options and combinations, and although the amount of time between crashes would vary, we'd eventually have to restart the services.

Out of ideas, we realized we had never searched for the "unable to create new native thread" error message itself. A quick "googling" found this page, indicating that when the cfquery tag is used with the timeout attribute, monitor threads accumulate and eventually lead to an "out of memory" message (and subsequent crash). Updater 2 for CFMX7 resolves this issue. The link to the Updater on that page doesn't work, but you can get all available downloads for CFMX7 here.
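For anyone wondering what "threads accumulating" looks like from the JVM's point of view, here is a minimal sketch. It is not what we ran on the server and it is not how ColdFusion creates its monitor threads; the class name, the thread names and the count of 20 are all made up for illustration. It assumes a standalone JVM on Java 8 or later (the JVM bundled with CFMX7 is older and likely predates this API). It parks a handful of threads that never finish, the way leaked monitor threads would, and reads the live thread count before and after.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

// Hypothetical illustration only: simulates threads that pile up and never exit.
public class ThreadGrowthProbe {

    /** Returns the number of live threads (daemon and non-daemon) in this JVM. */
    public static int liveThreads() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        return bean.getThreadCount();
    }

    /** Starts `count` daemon threads that block until the latch is released. */
    public static void startParkedThreads(int count, CountDownLatch release) {
        for (int i = 0; i < count; i++) {
            Thread t = new Thread(() -> {
                try {
                    release.await(); // stays alive until someone counts the latch down
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, "hypothetical-monitor-" + i);
            t.setDaemon(true); // do not keep the JVM alive on exit
            t.start();
        }
    }

    public static void main(String[] args) {
        int before = liveThreads();
        CountDownLatch release = new CountDownLatch(1);
        startParkedThreads(20, release);
        int during = liveThreads();
        System.out.println("live threads before: " + before
                + ", with parked threads: " + during);
        release.countDown(); // let the parked threads finish
    }
}
```

If the count only ever goes up between requests, something is leaking threads, and no amount of heap tuning will help.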

We applied the Updater one morning, and crossed our fingers. We're now on day 3, and still no crashes. I'm not saying this is the miracle cure for your server problems (if you have any), as several other factors might apply (amongst other things, Java's GC and memory settings greatly affect your server's performance).

D'oh!

Comments
Francois, if you do experience any further crashes, check on your database performance. Look for any queries that are taking longer than they should or have too many disk reads. Such queries can slow down your db server and cause strange problems with the CF application server. I was getting "null null" errors and crashes until I created additional indexes to help speed up certain queries.
# Posted By Tom Mollerus | 2/15/08 12:42 PM
Thanks for the tip, Tom. I'll make sure to keep an eye on that as well from now on.
# Posted By Francois Levesque | 2/15/08 12:46 PM