Why is our MonoTouch app breaking in the garbage collector? It is not out of memory.

Problem description

We have a simple question, but the cause is complicated. We are experienced developers, and have done a lot of research into what may be causing it. We are hoping that MonoTouch developers can work with us to identify what appears to be a common problem that people are having and for which no solution appears to exist yet. We have been working on this for over two weeks, and have not been able to resolve it.

The question is: Why is our MonoTouch app breaking in the garbage collector? It is not out of memory.

The situation is that we have an app that checks a web service regularly (perhaps every 5 seconds). After a period of time it fails with a memory management abort. This typically happens after about an hour and a half, but can be anywhere from ten minutes to overnight. This happens on all of our test devices (we have 7 in total, covering iOS3 and iOS4, iPod Touch, iPhones and iPads (1 & 2)). After looking on StackOverflow, we added a System.GC.Collect call in a timer, before we take any action. This improved things a little (it takes longer to fail), but the problem did not go away. It is also worth adding that the memory log from the iPad shows that there are 777 free blocks, and 2041 in use by our app, with a total of 26488 wired pages. Since we have just garbage collected, and are not doing anything different to what we did 5 seconds before, it seems odd to run out of memory.

We upgraded to MonoTouch 4.0.1 but that has not fixed it.

StackOverflow questions that might concern the same issue, but that do not answer it: 5666905 / 4545383 / 5492469 / 5426733

The stack at failure on an iPad2 is below. The failure can happen in the main thread or an http thread, but always goes in this GC_ sequence. I have included the code for the memory manager GC_remap below, with discussion.

Thread 10 Crashed:
0   libsystem_kernel.dylib  0x34b4da1c __pthread_kill + 8
1   libsystem_c.dylib       0x3646a3b4 pthread_kill + 52
2   libsystem_c.dylib       0x36462bf8 abort + 72
3   MyApp                   0x004ca92c mono_handle_native_sigsegv (mini-exceptions.c:2249)
4   MyApp                   0x004f2208 sigabrt_signal_handler (mini-posix.c:195)
5   libsystem_c.dylib       0x36475728 _sigtramp + 36
6   libsystem_c.dylib       0x3646a3b4 pthread_kill + 52
7   libsystem_c.dylib       0x36462bf8 abort + 72
8   MyApp                   0x0061dc94 GC_remap (os_dep.c:2092)
9   MyApp                   0x00611678 GC_allochblk_nth (allchblk.c:730)
10  MyApp                   0x00611028 GC_allochblk (allchblk.c:561)
11  MyApp                   0x0061d0e0 GC_new_hblk (new_hblk.c:253)
12  MyApp                   0x006133d0 GC_allocobj (alloc.c:1116)
13  MyApp                   0x00617d30 GC_generic_malloc_inner (malloc.c:136)
14  MyApp                   0x00617f40 GC_generic_malloc (malloc.c:192)
15  MyApp                   0x00618264 GC_malloc_atomic (malloc.c:262)
16  MyApp                   0x005a46d4 mono_object_allocate_ptrfree (object.c:4221)
17  MyApp                   0x005a4aa0 mono_string_new_size (object.c:4848)
18  MyApp                   0x005c1b14 ves_icall_System_String_InternalAllocateStr (string-icalls.c:213)
19  MyApp                   0x002d34c4 wrapper_managed_to_native_string_InternalAllocateStr_int + 52
20  MyApp                   0x002cff5c string_ToLower_System_Globalization_CultureInfo + 56
21  MyApp                   0x003e6ac0 System_Net_WebRequest_GetCreator_string + 40
22  MyApp                   0x003e694c System_Net_WebRequest_Create_System_Uri + 48
23  MyApp                   0x003e68d8 System_Net_WebRequest_Create_string + 64
24  MyApp                   0x004489c4 MyApp_Services_Client_GetResponseContent_string + 152
25  MyApp                   0x00446288 MyApp_Services_Client_GetCurrentQuestion_long_long + 916
26  MyApp                   0x00196fcc MyApp_Iphone_RootViewController_RetrieveCurrentQuestion + 868
27  MyApp                   0x002e6368 System_Threading_Thread_StartUnsafe + 168
28  MyApp                   0x00306890 wrapper_runtime_invoke_object_runtime_invoke_dynamic_intptr_intptr_intptr_intptr + 192
29  MyApp                   0x004b0274 mono_jit_runtime_invoke (mini.c:5746)
30  MyApp                   0x0059f924 mono_runtime_invoke (object.c:2756)
31  MyApp                   0x005a1350 mono_runtime_delegate_invoke (object.c:3421)
32  MyApp                   0x005ca884 start_wrapper_internal (threads.c:788)
33  MyApp                   0x005ca924 start_wrapper (threads.c:830)
34  MyApp                   0x005ef4b8 thread_start_routine (wthreads.c:285)
35  MyApp                   0x0061f1d0 GC_start_routine (pthread_support.c:1468)
36  libsystem_c.dylib       0x3646a30a _pthread_start + 242
37  libsystem_c.dylib       0x3646bbb4 thread_start + 0

This is the GC_remap code that appears to be the point of failure, from https://github.com/mono/mono/blob/master/libgc/os_dep.c

#ifdef NACL
      {
    /* NaCl doesn't expose mprotect, but mmap should work fine */
    void * mmap_result;
        mmap_result = mmap(start_addr, len, PROT_READ | PROT_WRITE | OPT_PROT_EXEC,
              MAP_PRIVATE | MAP_FIXED | OPT_MAP_ANON,
              zero_fd, 0/* offset */);
        if (mmap_result != (void *)start_addr) ABORT("mmap as mprotect failed");
        /* Fake the return value as if mprotect succeeded. */
        result = 0;
      }
#else /* NACL */
      result = mprotect(start_addr, len,
                PROT_READ | PROT_WRITE | OPT_PROT_EXEC);
#endif /* NACL */
      if (result != 0) {
      GC_err_printf3(
        "Mprotect failed at 0x%lx (length %ld) with errno %ld\n",
            start_addr, len, errno);
      ABORT("Mprotect remapping failed");
      }
      GC_unmapped_bytes -= len;

It would appear that the ABORT is caused by the mprotect function failing. We have been unable to get the failure code as the problem does not manifest itself on the simulator. The mprotect function appears to just mark the memory as accessible for read/write/execute. How is the memory manager passing parameters that cause it to fail? Could it be passing an incorrect pointer, or an incorrect length? Or are certain areas or boundaries handled differently on iOS?
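One way to narrow down the pointer-or-length question is to check the arguments just before the call. The following is only a sketch and is not part of libgc: `mprotect_args_look_sane` and `mprotect_args_look_sane_here` are hypothetical helpers, and the checks assume the usual POSIX expectation that the start address is a multiple of the page size. Whether iOS applies additional rules is not something this sketch can tell us.

```c
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper, not part of libgc. Returns 1 when the range
 * [addr, addr + len) passes some basic plausibility checks for an
 * mprotect call, and 0 otherwise. page_size is a parameter so the
 * check can be exercised without a device. */
int mprotect_args_look_sane(uintptr_t addr, size_t len, size_t page_size)
{
    /* An empty range is treated as suspicious here, even though some
     * kernels accept it. */
    if (page_size == 0 || len == 0)
        return 0;
    if (addr % page_size != 0)      /* start address is not page-aligned */
        return 0;
    if (addr > UINTPTR_MAX - len)   /* addr + len would wrap around */
        return 0;
    return 1;
}

/* Convenience wrapper that uses the page size of the running system. */
int mprotect_args_look_sane_here(const void *addr, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return 0;
    return mprotect_args_look_sane((uintptr_t)addr, len, (size_t)page);
}
```

If a check like this were called with `start_addr` and `len` just before the `mprotect` call in GC_remap and it passed, that would suggest the arguments are well-formed and that the failure depends on the state of the address range instead.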

The code at https://github.com/mono/mono/blob/master/libgc/allchblk.c for GC_allochblk_nth implies that the GC_remap function is only called if the memory block found was valid. (This file doesn't quite match the line numbers of the stack trace, so presumably it is not exactly the same file.)

http://developer.apple.com/library/ios/#documentation/System/Conceptual/ManPages_iPhoneOS/man2/mprotect.2.html says that it might fail with EACCES, EINVAL, ENOTSUP which are 13, 22, & 45 respectively. One of the reports on SO says that they get an error 12 (ENOMEM). I'm not sure what that means, as mprotect shouldn't be allocating memory, and the documentation doesn't say that is valid.
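Since the numeric value is all that GC_err_printf3 would print, it may help to translate it into a symbolic name wherever it is logged. This is a minimal sketch under our own assumptions: `mprotect_errno_name` and `format_mprotect_failure` are hypothetical names and not part of libgc or MonoTouch. It compares against the `<errno.h>` constants instead of literal numbers, because the numbers are platform-specific (ENOTSUP is 45 on Darwin but has a different value on Linux).

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Map the errno values discussed above to their symbolic names.
 * Anything else is reported as "other". */
const char *mprotect_errno_name(int err)
{
    if (err == EACCES)  return "EACCES";
    if (err == EINVAL)  return "EINVAL";
    if (err == ENOMEM)  return "ENOMEM";
    if (err == ENOTSUP) return "ENOTSUP";
    return "other";
}

/* Hypothetical logging helper: writes a one-line failure report into
 * buf and returns the value produced by snprintf. */
int format_mprotect_failure(char *buf, size_t size,
                            const void *addr, size_t len, int err)
{
    return snprintf(buf, size, "mprotect(%p, %zu) failed: %s (%d: %s)",
                    addr, len, mprotect_errno_name(err), err, strerror(err));
}
```

A line formatted this way could be written to a file in the app's own storage, so that it survives the app being killed.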

The more generic documentation at http://linux.die.net/man/2/mprotect indicates that ENOMEM can be caused by "Internal kernel structures could not be allocated. Or: addresses in the range [addr, addr+len] are invalid for the address space of the process, or specify one or more pages that are not mapped." How could this be?
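The "pages that are not mapped" case quoted above can be provoked deliberately on a desktop system. The sketch below is a standalone experiment, not taken from libgc: `probe_mprotect_on_hole` is a hypothetical name. It maps two pages, unmaps the second, then calls `mprotect` across both. On Linux the man page quoted above leads us to expect ENOMEM; whether iOS reports the same errno is an assumption we have not verified.

```c
#define _DEFAULT_SOURCE   /* asks glibc to expose MAP_ANON under strict -std modes */
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Maps two anonymous pages, unmaps the second, then asks mprotect to
 * cover both. Returns the errno from the failing mprotect call, 0 if
 * the call unexpectedly succeeded, or -1 if the setup itself failed. */
int probe_mprotect_on_hole(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    size_t len = 2 * (size_t)page;
    char *base = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANON, -1, 0);
    if (base == MAP_FAILED)
        return -1;

    /* Punch a hole: the second page no longer belongs to the process. */
    if (munmap(base + page, (size_t)page) != 0) {
        munmap(base, len);
        return -1;
    }

    int err = 0;
    if (mprotect(base, len, PROT_READ | PROT_WRITE) != 0)
        err = errno;

    munmap(base, (size_t)page);   /* release the page that is still mapped */
    return err;
}
```

If the collector's record of which blocks are unmapped ever disagreed with what the kernel actually has mapped, a call like the one in GC_remap could fail in the same way without any real shortage of memory. That is a hypothesis to test, not a diagnosis.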

We would much appreciate any suggestions on how we might move this forward. Our app is pure C# code, and does nothing other than a periodic https read. What can we do to improve debugging? (We can't trace anything, as the app is killed by iOS.) We have tried creating a simpler demonstration, but it does not fail fast enough to be worth using. If a Novell MonoTouch developer wants our source, we can provide it subject to the obvious confidentiality.

Solution

Thanks to your reproduction we have found and corrected a very obscure issue in the garbage collector. It will be included in MonoTouch 4.0.2.

