This article looks at how to approach an aggressive garbage collector strategy; it should be a useful reference for anyone facing a similar problem, so read on if you are interested.

Problem Description



I am running an application that creates and forgets large numbers of objects. The number of long-lived objects does grow slowly, but it is very small compared to the number of short-lived objects. This is a desktop application with high availability requirements; it needs to be on 24 hours per day. Most of the work is done on a single thread, and this thread will use all the CPU it can get its hands on.

In the past we have seen the following under heavy load: the used heap space slowly goes up as the garbage collector collects less than the amount of memory newly allocated, and the used heap size eventually comes near the specified max heap. At that point the garbage collector kicks in heavily and starts using a huge amount of resources to prevent going over the max heap size. This slows the application down (easily 10x as slow), and at that point the GC will most of the time either succeed in cleaning up the garbage after a few minutes or fail and throw an OutOfMemoryError; neither outcome is really acceptable.

The hardware used is a quad-core processor with at least 4GB of memory running 64-bit Linux, all of which we can use if needed. Currently the application heavily uses a single core, spending most of its time running that single busy thread. The other cores are mostly idle and could be used for garbage collection.

I have a feeling the garbage collector should be collecting more aggressively at an early stage, well before it runs out of memory. Our application does not have any throughput issues, low pause time requirements are a bit more important than throughput, but far less important than not getting near the max heap size. It is acceptable if the single busy thread runs at only 75% of the current speed, as long as it means the garbage collector can keep up with the creation. So in short, a steady decrease of performance is better than the sudden drop we see now.

I have read Java SE 6 HotSpot[tm] Virtual Machine Garbage Collection Tuning thoroughly, which means I understand the options well; however, I still find it hard to choose the right settings, as my requirements are a bit different from what is discussed in that paper.

Currently I am using the ParallelGC with the option -XX:GCTimeRatio=4. This works a bit better than the default setting for time ratio, but I have a feeling the GC is allowed to run more by that setting than it does.
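(For reference, -XX:GCTimeRatio=n asks the throughput collector to spend at most roughly 1/(1+n) of total time in GC, so a value of 4 allows up to about 20% of time in GC, versus about 1% with the default of 99.) A minimal sketch of the launch command this setup implies; the jar name is just a placeholder, and the parallel-old flag is optional and simply makes full collections parallel as well:

    java -Xmx1024M \
         -XX:+UseParallelGC -XX:+UseParallelOldGC \
         -XX:GCTimeRatio=4 \
         -jar myapp.jar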

For monitoring I am using jconsole and jvisualvm mostly.

I would like to know what garbage collection options you recommend for the above situation. Also, which GC debug output can I look at to understand the bottleneck better?
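(For reference, the standard HotSpot GC logging flags are the usual starting point for this kind of debug output; a minimal sketch, where gc.log is just an example file name:

    -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:gc.log

For a live view of the generations, jstat can sample the running JVM, e.g. jstat -gcutil <pid> 1000 prints eden, survivor, and old-gen occupancy plus GC counts and times once per second.)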

EDIT: I understand a very good option here is to create less garbage, and this is something we are really considering; however, I would like to know how we can tackle this with GC tuning, as that is something we can do much more easily, and roll out more quickly, than changing large amounts of the source code. Also, I have run the different memory profilers, I understand what the garbage is used by, and thereby I know it consists of objects that could be collected.

I am using:

java version "1.6.0_27-ea"
Java(TM) SE Runtime Environment (build 1.6.0_27-ea-b03)
Java HotSpot(TM) 64-Bit Server VM (build 20.2-b03, mixed mode)

With JVM parameters:

-Xmx1024M and -XX:GCTimeRatio=4 

Edit in reply to Matt's comments: Most memory (and CPU) goes towards constructing objects that represent the current situation. Some of these will be discarded right away as the situation changes rapidly, while others will have a medium lifetime if no updates come in for a while.

Solution

You don't mention which build of the JVM you're running; this is crucial info. You also don't mention how long the app tends to run for (e.g. is it for the length of a working day? a week? less?).

A few other points

  1. If you are continually leaking objects into tenured because you're allocating at a rate faster than your young gen can be swept, then your generations are incorrectly sized. You will need to do some proper analysis of the behaviour of your app to be able to size them correctly; you can use visualgc for this.
  2. the throughput collector is designed to accept a single, large pause as opposed to many smaller pauses; the benefit is that it is a compacting collector and it enables higher total throughput
  3. CMS exists to serve the other end of the spectrum, i.e. many more, much smaller pauses but lower total throughput. The downside is that it is not compacting, so fragmentation can be a problem. The fragmentation issue was improved in 6u26, so if you're not on that build then it may be upgrade time. Note that the "bleeding into tenured" effect you have remarked on exacerbates the fragmentation issue and, given time, this will lead to promotion failures (aka unscheduled full GCs and their associated STW pauses). I have previously written an answer about this on this question. (A minimal flag sketch for this point follows this list.)

    1. If you're running a 64-bit JVM with >4GB RAM and a recent enough JVM, make sure you use -XX:+UseCompressedOops, otherwise you're simply wasting space, as without it a 64-bit JVM occupies ~1.5x the space of a 32-bit JVM for the same workload (and if you're not on a 64-bit JVM, upgrade to get access to more RAM)
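As a rough sketch of the flags that point 3 and the sub-point above are talking about (the occupancy fraction value is purely illustrative, not a recommendation):

    -XX:+UseConcMarkSweepGC -XX:+UseParNewGC \
    -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly \
    -XX:+UseCompressedOops

The first two switch the old and young generation collectors to CMS and ParNew, the next two fix the old-gen occupancy at which concurrent cycles start, and the last enables compressed object pointers on a 64-bit JVM.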

You may also want to read another answer I've written on this subject which goes into sizing your survivor spaces & eden appropriately. Basically what you want to achieve is:

  • eden big enough that it is not collected too often
  • survivor spaces sized to match the tenuring threshold
  • a tenuring threshold set to ensure, as much as possible, that only truly long lived objects make it into tenured

Therefore say you had a 6G heap, you might do something like 5G eden + 16M survivor spaces + a tenuring threshold of 1.
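A sketch of how that example layout could be expressed as flags (the arithmetic is illustrative: 5G eden plus two 16M survivor spaces gives a young generation of roughly 5152M and an eden-to-survivor ratio of 5120/16 = 320; with the throughput collector you would also switch off adaptive sizing so the explicit ratios stick):

    -Xmx6g -Xmn5152m \
    -XX:SurvivorRatio=320 -XX:MaxTenuringThreshold=1 \
    -XX:-UseAdaptiveSizePolicy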

The basic process is

  1. allocate into eden
  2. eden fills up
  3. live objects are swept into the "to" survivor space
  4. live objects in the "from" survivor space are either copied to the "to" space or promoted to tenured (depending on the tenuring threshold, the space available, and the number of times they've been copied from one to the other; see the note after this list)
  5. anything left in eden is swept away
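To see steps 3 and 4 happening on a real workload, -XX:+PrintTenuringDistribution can be added to the GC logging flags; after each minor collection it prints an age histogram of the objects currently held in the survivor space, which makes it much easier to judge whether the survivor size and tenuring threshold actually match the allocation profile.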

Therefore, given spaces appropriately sized for your application's allocation profile, it's perfectly possible to configure the system such that it handles the load nicely. A few caveats to this:

  1. you need some long running tests to do this properly (e.g. can take days to hit the CMS fragmentation problem)
  2. you need to do each test a few times to get good results
  3. you need to change 1 thing at a time in the GC config
  4. you need to be able to present a reasonably repeatable workload to the app otherwise it will be difficult to objectively compare results from different test runs
  5. this will get really hard to do reliably if the workload is unpredictable and has massive peaks/troughs

Points 1-3 mean this can take ages to get right. On the other hand, you may be able to make it good enough very quickly; it depends how anal you are!

Finally, echoing Peter Lawrey's point, you can save a lot of bother (albeit introducing some other bother) if you are really rigorous about object allocation.
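As a hypothetical illustration of being rigorous about allocation in an app like this one (the class and field names are made up; the idea is simply to mutate one long-lived holder per update instead of allocating a fresh short-lived object for every change of the "current situation"):

    // Reusable holder for the current situation, updated in place
    // rather than re-created on every incoming update.
    final class SituationSnapshot {
        private double value;      // latest value of the situation
        private long   timestamp;  // when it was last updated

        void update(double value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }

        double value()     { return value; }
        long   timestamp() { return timestamp; }
    }

Reusing a mutable holder like this keeps the allocation rate (and therefore the minor GC frequency) down, at the price of having to be careful about how the object is shared between threads.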

This concludes this article on an aggressive garbage collector strategy. We hope the recommended answer is helpful, and thank you for your continued support!
