Thursday, November 24, 2005

Thread dumps...

Many a times our customers complain of Coldfusion Server(CF) hanging, becoming non-responsive, taking 100% cpu or some request taking too much time. In order to understand the reasons why this happens, only way is to analyze the "Thread dumps". Not many of our CF users are java geeks and therefore when we ask them to send a thread dump, they have a very little clue about it. I hope they wouldn't be clueless anymore ;)

What exactly is "Thread dump"
"Thread dump" basically gives you information on what each of the thread in the VM is doing at any given point of time. This makes "Thread dump" an excellent debugging tool. It can tell you the states of each of the thread in the VM, where exactly each thread is in the execution path at that point, which thread is waiting and where is it waiting and lot more. It also shows you the stack of each thread and can help you track the execution path.

How do I take a Thread dump?
For any Java VM, if you are running it from a command prompt, you can get the thread dump by pressing Ctrl+Break (for windows) or Ctrl+\ (for unix machines) on the terminal running the server.
In case you are running the VM in background or as services, you can send a "SIGQUIT" signal to the process to get the thread dump. To send this signal to VM on unix machine, you can use "kill -SIGQUIT [pid]" where 'pid' is the process id of the server process. This will send the signal to the VM to dump the thread stack on the error stream (On Sun JVM) or to a new file (IBM's JVM).

You can also use the Stacktrace tool to get the thread dump. It is a nice java webstart application that you can run on the machine running the server. All you need is to specify the "process id" and it will nicely show the threaddump in its own window.

To be more specific for CF, if the standalone CF server or CF on J2EE server has a terminal, use Ctrl+Break or Ctrl+\ depending on the platform. If its running as background process or as services, use Stacktrace tool I mentioned earlier.

Here is how the thread dump would look.

Full thread dump Java HotSpot(TM) Server VM (1.5.0_04-b05 mixed mode):

"web-4" prio=5 tid=0x28999418 nid=0x1460 runnable [0x2fbaf000..0x2fbafd18]
at java.io.WinNTFileSystem.canonicalize0(Native Method)
at java.io.Win32FileSystem.canonicalize(Win32FileSystem.java:374)
at java.io.File.getCanonicalPath(File.java:531)
at java.io.FilePermission$1.run(FilePermission.java:218)
at java.security.AccessController.doPrivileged(Native Method)
at java.io.FilePermission.init(FilePermission.java:212)
at java.io.FilePermission.(FilePermission.java:264)
at java.lang.SecurityManager.checkRead(SecurityManager.java:871)
at java.io.File.exists(File.java:700)
at jrunx.resource.FileResource.exists(FileResource.java:98)
at jrunx.resource.FileResource.getURL(FileResource.java:140)
at jrun.servlet.JRunServletContext.getResource(JRunServletContext.java:192)
at jrun.servlet.file.FileServlet.service(FileServlet.java:148)
at jrun.servlet.ServletInvoker.invoke(ServletInvoker.java:91)
at jrun.servlet.JRunInvokerChain.invokeNext(JRunInvokerChain.java:42)
at jrun.servlet.JRunRequestDispatcher.invoke(JRunRequestDispatcher.java:257)
at jrun.servlet.ServletEngineService.dispatch(ServletEngineService.java:541)
at jrun.servlet.http.WebService.invokeRunnable(WebService.java:172)
at jrunx.scheduler.ThreadPool$DownstreamMetrics.invokeRunnable(ThreadPool.java:318)
at jrunx.scheduler.ThreadPool$ThreadThrottle.invokeRunnable(ThreadPool.java:426)
at jrunx.scheduler.ThreadPool$UpstreamMetrics.invokeRunnable(ThreadPool.java:264)
at jrunx.scheduler.WorkerThread.run(WorkerThread.java:66)

"web-3" prio=5 tid=0x2814e470 nid=0xd70 in Object.wait() [0x2faaf000..0x2faafd98]
at java.lang.Object.wait(Native Method)
- waiting on <0x064b5a90> (a jrunx.scheduler.JSemaphore)
at jrunx.scheduler.JSemaphore.acquire(JSemaphore.java:74)
- locked <0x064b5a90> (a jrunx.scheduler.JSemaphore)
at jrun.servlet.network.NetworkService.accept(NetworkService.java:348)
at jrun.servlet.http.WebService.createRunnable(WebService.java:104)
at jrunx.scheduler.ThreadPool$DownstreamMetrics.createRunnable(ThreadPool.java:285)
at jrunx.scheduler.ThreadPool$ThreadThrottle.createRunnable(ThreadPool.java:347)
at jrunx.scheduler.ThreadPool$UpstreamMetrics.createRunnable(ThreadPool.java:239)
at jrunx.scheduler.WorkerThread.run(WorkerThread.java:62)

"web-2" prio=5 tid=0x271954c0 nid=0x170c in Object.wait() [0x2ef6e000..0x2ef6fa18]
at java.lang.Object.wait(Native Method)
- waiting on <0x02af8eb8> (a java.awt.image.PixelGrabber)
at java.awt.image.PixelGrabber.grabPixels(PixelGrabber.java:254)
- locked <0x02af8eb8> (a java.awt.image.PixelGrabber)
at java.awt.image.PixelGrabber.grabPixels(PixelGrabber.java:209)
at com.lowagie.text.Image.getInstance(Unknown Source)
at com.lowagie.text.Image.getInstance(Unknown Source)
at com.lowagie.text.pdf.PdfGraphics2D.addAltText(Unknown Source)
at ice.pilots.html4.MacromediaCSSExtension.addonImageProcessing(MacromediaCSSExtension.java:186)
at ice.pilots.html4.ObjectBox.paintChunk(OEAB)
at ice.pilots.html4.InlineBox.paintChildren(OEAB)
at ice.pilots.html4.InlineBox.paintChunk(OEAB)
at ice.pilots.html4.InlineBox.paintChildren(OEAB)
at ice.pilots.html4.BlockBox.paint(OEAB)
at ice.pilots.html4.OutlinePainter.drawBox(OEAB)
at ice.pilots.html4.BlockBox.paint(OEAB)
at ice.pilots.html4.OutlinePainter.drawBox(OEAB)
at ice.pilots.html4.BlockBox.paint(OEAB)
at ice.pilots.html4.CSSLayout.paint(OEAB)
- locked <0x02cebb10> (a java.lang.Object)
at ice.pilots.html4.ThePrinter.printPage(Unknown Source)
at coldfusion.document.DocumentSection.process(DocumentSection.java:230)
at coldfusion.document.DocumentSection.print(DocumentSection.java:108)
at coldfusion.document.DocumentExporter.export(DocumentExporter.java:237)
at coldfusion.document.DocumentFrame.exportContent(DocumentFrame.java:118)
at coldfusion.document.DocumentProcessor.processContent(DocumentProcessor.java:130)
at coldfusion.document.DocumentProcessor.ProcessContent(DocumentProcessor.java:59)
at coldfusion.tagext.lang.DocumentTag.processContent(DocumentTag.java:1218)
at coldfusion.tagext.lang.DocumentTag.access$100(DocumentTag.java:84)
at coldfusion.tagext.lang.DocumentTag$3.run(DocumentTag.java:1179)
at java.security.AccessController.doPrivileged(Native Method)
at coldfusion.tagext.lang.DocumentTag.doAfterBody(DocumentTag.java:1174)
at cfcmyk2ecfm937582361.runPage(E:\CFusionMX7\wwwroot\testfolder\doctest\cmyk.cfm:3)
at coldfusion.runtime.CfJspPage.invoke(CfJspPage.java:152)
at coldfusion.tagext.lang.IncludeTag.doStartTag(IncludeTag.java:349)
at coldfusion.filter.CfincludeFilter.invoke(CfincludeFilter.java:65)
at coldfusion.filter.ApplicationFilter.invoke(ApplicationFilter.java:209)
at coldfusion.filter.RequestMonitorFilter.invoke(RequestMonitorFilter.java:51)
at coldfusion.filter.PathFilter.invoke(PathFilter.java:86)
at coldfusion.filter.ExceptionFilter.invoke(ExceptionFilter.java:69)
at coldfusion.filter.ClientScopePersistenceFilter.invoke(ClientScopePersistenceFilter.java:28)
at coldfusion.filter.BrowserFilter.invoke(BrowserFilter.java:38)
at coldfusion.filter.GlobalsFilter.invoke(GlobalsFilter.java:38)
at coldfusion.filter.DatasourceFilter.invoke(DatasourceFilter.java:22)
at coldfusion.filter.RequestThrottleFilter.invoke(RequestThrottleFilter.java:115)
at coldfusion.CfmServlet.service(CfmServlet.java:107)
at coldfusion.bootstrap.BootstrapServlet.service(BootstrapServlet.java:78)
at jrun.servlet.ServletInvoker.invoke(ServletInvoker.java:91)
at jrun.servlet.JRunInvokerChain.invokeNext(JRunInvokerChain.java:42)
at jrun.servlet.JRunRequestDispatcher.invoke(JRunRequestDispatcher.java:257)
at jrun.servlet.ServletEngineService.dispatch(ServletEngineService.java:541)
at jrun.servlet.http.WebService.invokeRunnable(WebService.java:172)
at jrunx.scheduler.ThreadPool$DownstreamMetrics.invokeRunnable(ThreadPool.java:318)
at jrunx.scheduler.ThreadPool$ThreadThrottle.invokeRunnable(ThreadPool.java:426)
at jrunx.scheduler.ThreadPool$UpstreamMetrics.invokeRunnable(ThreadPool.java:264)
at jrunx.scheduler.WorkerThread.run(WorkerThread.java:66)
.
.
.

If you anayze this thread dump, you can see that "web-2" thread is handling request for "cmyk.cfm" and is waiting for the images to be loaded i.e (actually at Image.getInstance()). At this point, cfdocument has sent another http request for the image file that it has to render and that request is handled by another thread "web-4" which is running and is currently checking if the image file requested exists on the file system. Isn't it very powerful ?

You can checkout a very nice article Thread dump and stack traces written by Rajiv (my ex-colleague and mentor in Pramati) which I feel is a must read for Java developers.

No comments: