本文介绍了JVM在未指定帧的情况下崩溃,仅“计时器已过期,中止".的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在Hadoop下运行Java作业,这使JVM崩溃了.我怀疑这是由于某些JNI代码(它使用具有多线程本机BLAS实现的JBLAS)所致.但是,虽然我希望崩溃日志为调试提供问题框架",但日志看起来像:

I am running a Java job under Hadoop which is crashing the JVM. I suspect this is due to some JNI code (it uses JBLAS with a multithreaded native BLAS implementation). However, while I expect the crash log to supply the "problematic frame" for debugging, instead the log looks like:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f204dd6fb27, pid=19570, tid=139776470402816
#
# JRE version: 6.0_38-b05
# Java VM: Java HotSpot(TM) 64-Bit Server VM (20.13-b02 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# # [ timer expired, abort... ]

JVM是否有一些计时器可以在生成故障转储输出时等待多长时间?如果可以,是否有增加时间的方法,这样我可以获得更多有用的信息?我认为引用的计时器不是来自Hadoop,因为我在许多未提及Hadoop的地方看到(无用的)对该错误的引用.

Does the JVM have some timer for how long it will wait when producing this crash dump output? If so, is there a way to increase the time so I can get more helpful information? I don't think the timer referred to is coming from Hadoop, since I see (unhelpful) references to this error in many places which do not mention Hadoop.

Google搜索似乎表明字符串"timer expired,abort"仅出现在这些JVM错误消息中,因此它不太可能来自操作系统.

Googling appears to show that the string "timer expired, abort" only shows up in these JVM error messages, so it is unlikely to come from the OS.

编辑:看来我可能不走运.从JVM源的OpenJDK版本中的./hotspot/src/share/vm/runtime/thread.cpp:

It looks like I am probably out of luck. From ./hotspot/src/share/vm/runtime/thread.cpp in the OpenJDK version of the JVM source:

 if (is_error_reported()) {
   // A fatal error has happened, the error handler(VMError::report_and_die)
   // should abort JVM after creating an error log file. However in some
   // rare cases, the error handler itself might deadlock. Here we try to
   // kill JVM if the fatal error handler fails to abort in 2 minutes.
   //
   // This code is in WatcherThread because WatcherThread wakes up
   // periodically so the fatal error handler doesn't need to do anything;
   // also because the WatcherThread is less likely to crash than other
   // threads.

   for (;;) {
     if (!ShowMessageBoxOnError
      && (OnError == NULL || OnError[0] == '\0')
      && Arguments::abort_hook() == NULL) {
          os::sleep(this, 2 * 60 * 1000, false);
          fdStream err(defaultStream::output_fd());
          err.print_raw_cr("# [ timer expired, abort... ]");
          // skip atexit/vm_exit/vm_abort hooks
          os::die();
     }

     // Wake up 5 seconds later, the fatal handler may reset OnError or
     // ShowMessageBoxOnError when it is ready to abort.
     os::sleep(this, 5 * 1000, false);
   }
 }

它似乎被硬编码为等待两分钟.我不知道为什么为什么要为我的工作报告崩溃需要更长的时间,但我认为至少这个问题已经得到了答案.

It appears to be hard-coded to wait two minutes. Why crash reporting for my job is taking longer than that, I don't know, but I think this question at least has been answered.

推荐答案

看来我可能不走运.从JVM源的OpenJDK版本中的./hotspot/src/share/vm/runtime/thread.cpp中:

It looks like I am probably out of luck. From ./hotspot/src/share/vm/runtime/thread.cpp in the OpenJDK version of the JVM source:

 if (is_error_reported()) {
   // A fatal error has happened, the error handler(VMError::report_and_die)
   // should abort JVM after creating an error log file. However in some
   // rare cases, the error handler itself might deadlock. Here we try to
   // kill JVM if the fatal error handler fails to abort in 2 minutes.
   //
   // This code is in WatcherThread because WatcherThread wakes up
   // periodically so the fatal error handler doesn't need to do anything;
   // also because the WatcherThread is less likely to crash than other
   // threads.

   for (;;) {
     if (!ShowMessageBoxOnError
      && (OnError == NULL || OnError[0] == '\0')
      && Arguments::abort_hook() == NULL) {
          os::sleep(this, 2 * 60 * 1000, false);
          fdStream err(defaultStream::output_fd());
          err.print_raw_cr("# [ timer expired, abort... ]");
          // skip atexit/vm_exit/vm_abort hooks
          os::die();
     }

     // Wake up 5 seconds later, the fatal handler may reset OnError or
     // ShowMessageBoxOnError when it is ready to abort.
     os::sleep(this, 5 * 1000, false);
   }
 }

它似乎被硬编码为等待两分钟.我不知道为什么为什么要为我的工作报告崩溃需要更长的时间,但我认为至少这个问题已经得到了答案.

It appears to be hard-coded to wait two minutes. Why crash reporting for my job is taking longer than that, I don't know, but I think this question at least has been answered.

这篇关于JVM在未指定帧的情况下崩溃,仅“计时器已过期,中止".的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!

10-29 23:46