◐ Shell
clean mode source ↗

GH-108362: Incremental GC implementation by markshannon · Pull Request #108038 · python/cpython

I think including objects from the other space can lead to some problematic behavior. If you have a large object graph which is referenced from smaller more frequently allocated objects this will continuously pull that large object graph in and blow the budget. This is basically what I was concerned with regarding sys.modules - you can have a module which imports sys, you can be creating lambda's in that module which are short lived but become cyclical trash, and to collect the lambda you need to traverse the world.

I also think given this behavior the need for two different lists of objects isn't really necessary, you could instead just move the objects to the end of the collecting space and we'd get back to them when we can, and I think the behavior would be identical to the existing algorithm (except maybe we'd pick up some extra objects when we'd flip the spaces).

I think another potential problem with this behavior is that you're not eagerly adding objects in the current space to this list transitively. That means if we visit a large object graph and blow the budget then we may not get to other transitively referenced objects that are in the space we're collecting from added to the container, and therefore despite collecting a huge object graph we still won't have collected enough to clear the cycle.

Below's a program that seems to grow unbounded, I had briefly experimented with using the _PyGC_PREV_MASK_COLLECTING flag here to mark objects we want to include instead of using the aging space, but that also didn't work (I would have expected on some collections we get a bunch of these little cycles, and then after a flip we need to walk the large object graph once), so I'm not certain what exactly needs to be done to fix this.

class LinkedList:
    def __init__(self, next=None, prev=None):
        self.next = next
        if next is not None:
            next.prev = self
        self.prev = prev
        if prev is not None:
            prev.next = self


def make_ll(depth):
    head = LinkedList()
    for i in range(depth):
        head = LinkedList(head, head.prev)
    return head


head = make_ll(10000)

while True:
    newhead = make_ll(200)
    newhead.surprise = head