This article addresses the question "Why do garbage collectors wait before deallocating?" and walks through an answer that may serve as a useful reference for anyone facing the same question.

Problem Description


I have a "why does it work that way?" question about garbage collection (any/all implementations: Java, Python, CLR, etc.). Garbage collectors deallocate an object when it is no longer in any scope; the number of references pointing to it is zero. It seems to me that a framework could deallocate as soon as the number of references reaches zero, but all implementations I've encountered wait a while and then deallocate many objects at a time. My question is, why?

I'm assuming that the framework keeps an integer for each object (which I think Python does, because you have to call Py_INCREF and Py_DECREF when writing extension modules for it in C; presumably these functions modify a real counter somewhere). If so, then it shouldn't take any more CPU time to eliminate the object the moment it goes out of scope. If it takes x nanoseconds per object now, then it would take x nanoseconds per object later, right?
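That counter is indeed real and observable in CPython. A minimal sketch (CPython-specific; note that sys.getrefcount reports one extra reference for its own argument):

```python
import sys

x = [1, 2, 3]
print(sys.getrefcount(x))  # 2: the name x, plus getrefcount's own argument

y = x                      # aliasing increments the counter
print(sys.getrefcount(x))  # 3

del y                      # dropping a reference decrements it
print(sys.getrefcount(x))  # 2
```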

If my assumption is wrong and there is no integer associated with each object, then I understand why garbage collection waits: it would have to walk the graph of references to determine the status of each object, and that calculation takes time. Such a method would consume less memory than the explicit reference-count method, but I'm astonished that it would be quicker, or that it's the preferred method for other reasons. It sounds like a lot of work.

From a programming point of view, it would be nice if objects deallocated immediately after they go out of scope. Not only could we rely on destructors being executed when we want them to be (one of the Python gotchas is that __del__ is not called at a predictable time), but it would become much easier to memory-profile a program. Here's an example of how much confusion this causes. To my mind, the benefits of programming in a deallocate-right-away framework are so great that there must be some good reason why all the implementations I've heard of wait before deallocating. What is that benefit?
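For what it's worth, CPython's refcounting does behave this way for acyclic objects, while tracing implementations such as PyPy make no such promise. A minimal sketch, assuming CPython:

```python
class Tracked:
    def __del__(self):
        print("freed")

t = Tracked()
del t               # in CPython the refcount hits zero here, so "freed"
                    # prints immediately; a tracing GC makes no such promise
print("after del")
```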

Note: if the walk over the graph of references is only needed to identify circular references (a pure reference count can't), then why not a hybrid approach? Deallocate objects as soon as their reference count hits zero and then also do periodic sweeps to look for circular references. Programmers working in such a framework would have a performance/determinism reason to stick to non-circular references as much as is feasible. It's often feasible (e.g. all data are in the form of JSON objects with no pointers to parents). Is this how any popular garbage collectors work?
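CPython is essentially this hybrid: refcounting frees acyclic objects the moment their count hits zero, and the gc module runs a periodic mark/sweep pass to catch cycles. A small sketch (Python 3.4+, where cycles containing __del__ are collectible):

```python
import gc

class Node:
    def __del__(self):
        print("node freed")

gc.disable()                # suspend the periodic cycle detector

a, b = Node(), Node()
a.other, b.other = b, a     # a reference cycle: neither count can reach zero
del a, b
print("cycle still alive")  # refcounting alone never frees these two

gc.collect()                # the backup sweep walks the graph and frees the cycle
gc.enable()
```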

Solution

To start with, a point of terminology: "garbage collection" means different things to different people, and some GC schemes are more sophisticated than others. Some people consider reference counting to be a form of GC, but personally I consider "true GC" to be distinct from reference counting.

With refcounts, there is an integer tracking the number of references, and you can trigger deallocation immediately when the refcount hits zero. This is how the CPython implementation works, and how most varieties of C++ smart pointers work. The CPython implementation adds a mark/sweep GC as a backup, so it's very much like the hybrid design you describe.

But refcounting is actually a pretty terrible solution, since it incurs a (relatively) expensive memory write (plus a memory barrier and/or lock, to ensure thread safety) every time a reference is passed, which happens quite a lot. In imperative languages like C++ it's possible (just difficult) to manage memory ownership through macros and coding conventions, but in functional languages like Lisp it becomes well-nigh impossible, because memory allocation usually happens implicitly due to local variable capture in a closure.
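To make that cost concrete, here is a hypothetical toy handle (the Ref class and its methods are illustrative, not any real runtime's API); notice that every handoff performs a counted memory write, which a real implementation must additionally make atomic:

```python
class Ref:
    """Toy refcounted handle: every copy and drop writes the counter."""
    def __init__(self, obj):
        self.obj, self.count = obj, 1

    def share(self):
        self.count += 1   # a memory write on *every* new reference
        return self

    def drop(self):
        self.count -= 1   # and another write when one goes away
        if self.count == 0:
            print(f"deallocating {self.obj!r}")
            self.obj = None

r = Ref("payload")
alias = r.share()         # passing the reference costs a write each time
r.drop()
alias.drop()              # count hits zero -> immediate deallocation
```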

So it should come as no surprise that the first step toward a modern GC was invented for Lisp. It was called the "twospace allocator" or "twospace collector" and it worked exactly like it sounds: it divided allocatable memory (the "heap") into two spaces. Every new object was allocated out of the first space until it got too full, at which point allocation would stop and the runtime would walk the reference graph and copy only the live (still referenced) objects to the second space. After the live objects were copied, the first space would be marked empty, and allocation would resume, allocating new objects from the second space, until it got too full, at which point the live objects would be copied back to the first space and the process would start all over again.
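A minimal sketch of the idea in Python (names like Obj and TwoSpaceHeap are illustrative; a real collector works on raw memory and uses an iterative Cheney scan rather than recursion):

```python
class Obj:
    """Toy heap object: a payload plus references to other Objs."""
    def __init__(self, payload, refs=()):
        self.payload, self.refs = payload, list(refs)

class TwoSpaceHeap:
    def __init__(self):
        self.from_space, self.to_space = [], []

    def alloc(self, payload, refs=()):
        obj = Obj(payload, refs)
        self.from_space.append(obj)       # allocation is just an append
        return obj

    def collect(self, roots):
        """Copy the live objects into to_space, then swap the spaces."""
        forwarded = {}                    # old object -> its copy

        def copy(obj):
            if obj not in forwarded:
                clone = Obj(obj.payload)
                forwarded[obj] = clone    # record the forwarding pointer first,
                self.to_space.append(clone)
                clone.refs = [copy(r) for r in obj.refs]  # so cycles terminate
            return forwarded[obj]

        new_roots = [copy(r) for r in roots]
        # everything left in from_space is garbage; drop it wholesale
        self.from_space, self.to_space = self.to_space, []
        return new_roots

heap = TwoSpaceHeap()
live = heap.alloc("live", [heap.alloc("child")])
heap.alloc("garbage")                     # unreachable: simply never copied
[live] = heap.collect([live])
print([o.payload for o in heap.from_space])  # ['live', 'child']
```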

The advantage of the twospace collector is that, instead of doing O(N) work, where N is the total number of garbage objects, it would only do O(M) work, where M is the number of objects that were not garbage. Since in practice, most objects are allocated and then deallocated in a short period of time, this can lead to a substantial performance improvement.

Additionally, the twospace collector made it possible to simplify the allocator side as well. Most malloc() implementations maintain what is called a "free list": a list of which blocks are still available to be allocated. To allocate a new object, malloc() must scan the free list looking for a free block that's big enough. But the twospace allocator didn't bother with any of that: it just allocated objects in each space like a stack, by bumping a pointer forward by the desired number of bytes.
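The difference, sketched with two hypothetical toy allocators handing out offsets into a flat arena (illustrative only):

```python
class FreeListAllocator:
    """Toy malloc(): scan a list of (offset, size) holes for the first fit."""
    def __init__(self, size):
        self.free = [(0, size)]

    def alloc(self, n):
        for i, (off, size) in enumerate(self.free):  # O(holes) scan per call
            if size >= n:
                self.free[i] = (off + n, size - n)
                return off
        raise MemoryError

class BumpAllocator:
    """Toy twospace-style allocator: one comparison and one pointer bump."""
    def __init__(self, size):
        self.top, self.size = 0, size

    def alloc(self, n):
        if self.top + n > self.size:
            raise MemoryError                        # time to collect
        off, self.top = self.top, self.top + n
        return off

fl, bump = FreeListAllocator(1024), BumpAllocator(1024)
print(fl.alloc(64), fl.alloc(32))      # 0 64
print(bump.alloc(64), bump.alloc(32))  # 0 64: same result, constant-time work
```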

So the twospace collector was much faster than malloc(), which was great because Lisp programs would do a lot more allocations than C programs would. Or, to put it another way, Lisp programs needed a way to allocate memory like a stack but with a lifetime that was not limited to the execution stack -- in other words, a stack that could grow infinitely without the program running out of memory. And, in fact, Raymond Chen argues that that's exactly how people should think about GC. I highly recommend his series of blog posts starting with Everybody thinks about garbage collection the wrong way.

But the twospace collector had a major flaw, which is that no program could ever use more than half the available RAM: the other half was always wasted. So the history of GC techniques is the history of attempts to improve on the twospace collector, usually by using heuristics of program behavior. However, GC algorithms inevitably involve tradeoffs, usually preferring to deallocate objects in batches instead of individually, which inevitably leads to delays where objects aren't deallocated immediately.

Edit: To answer your follow-up question, modern GCs generally incorporate the idea of generational garbage collection, where objects are grouped into different "generations" based on lifetime, and an object in one generation gets "promoted" to another generation once it's lived long enough. Sometimes a small difference in object lifetime (e.g. in a request-driven server, storing an object for longer than one request) can lead to a large difference in the amount of time it takes before the object gets deallocated, since it causes it to become more "tenured".
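CPython's cycle collector is itself generational, and its gc module exposes the machinery directly (the values shown are typical defaults and may differ by version):

```python
import gc

# Three generations; objects that survive a collection of generation g
# are promoted into generation g + 1.
print(gc.get_threshold())   # e.g. (700, 10, 10): when each generation is swept
print(gc.get_count())       # objects currently tracked in each generation

junk = [[i] for i in range(1000)]
gc.collect(0)               # sweep only the youngest generation; survivors
                            # (everything `junk` still references) get promoted
print(gc.get_count())
```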

You correctly observe that a true GC has to operate "beneath" the level of malloc() and free(). (As a side note, it's worth learning about how malloc() and free() are implemented -- they aren't magic either!) Additionally, for an effective GC, you either need to be conservative (like the Boehm GC), never moving objects and checking anything that might be a pointer, or else you need some kind of "opaque pointer" type -- which Java and C# call "references". Opaque pointers are actually great for an allocation system, since it means you can always move objects by updating the pointers to them. In a language like C, where you interact directly with raw memory addresses, it's never really safe to move objects.

And there are multiple options for GC algorithms. The standard Java runtime contains no less than five collectors (Young, Serial, old CMS, new CMS, and G1, although I think I'm forgetting one) and each has a set of options that are all configurable.

However, GCs aren't magic. Most GCs are just exploiting the time-space tradeoff of batching work, which means that the gains in speed are usually paid for in increased memory usage (compared to manual memory management or refcounting). But the combination of increased program performance and increased programmer performance, versus the low cost of RAM these days, makes the tradeoff usually worth it.

Hopefully that helps make things clearer!
