在Python 2.7中增加递归限制和堆栈大小。

Question

6 浏览2023年1月13日

匿名的 2023年1月13日

0 Comments

我正在处理大型树结构，并且需要增加Python 2.7的递归限制。使用sys.setrecursionlimit(10000)会导致我的内核崩溃，所以我认为我需要增加堆栈大小。然而，我不知道堆栈大小应该设为多大。我尝试了像100 MiB这样的大小，例如threading.stack_size(104857600)，但是内核仍然会崩溃。给它1 GiB会报错。我还没有使用过threading模块，所以当我将上述语句放在脚本开头时，我使用得是否正确？我没有进行任何形式的并行处理，所有操作都在同一个线程中进行。我的计算机有128 GB的物理内存，运行Windows 10，使用Spyder中的iPython控制台。错误信息只是显示：

内核已死亡，重新启动

没有更多的信息。编辑：以下是重现问题的完整代码。构建树的过程很好，尽管时间很长，但只有在递归执行treeToDict()将整个树读入字典时内核才会崩溃。也许这个函数的代码有问题。树是一个非二叉树：

import pandas as pd
import threading
import sys
import random as rd
import itertools as it
import string
threading.stack_size(104857600)
sys.setrecursionlimit(10000)
class treenode:
    # 用于构建树的类
    def __init__(self,children,name='',weight=0,parent=None,depth=0):
        self.name = name
        self.weight = weight
        self.children = children
        self.parent = parent
        self.depth = depth
        self.parentname = parent.name if parent is not None else ''
def add_child(node,name):
    # 将元素添加到树中
    # 如果在给定节点中已经存在该元素，则增加权重
    # 否则添加一个新的子节点
    for i in range(len(node.children)):
        if node.children[i].name == name:
            node.children[i].weight += 1
            newTree = node.children[i]
            break
    else:
        newTree = treenode([],name=name,weight=1,parent=node,depth=node.depth+1)
        node.children.append(newTree)
    return newTree
def treeToDict(t,data):
    # 将树读入字典
    if t.children != []:
        for i in range(len(t.children)):
            data[str(t.depth)+'_'+t.name] = [t.name, t.children[i].name, t.depth, t.weight, t.parentname]
    else:
        data[str(t.depth)+'_'+t.name] = [t.name, '', t.depth, t.weight, t.parentname]
    for i in range(len(t.children)):
        treeToDict(t.children[i],data)
# 创建导致树分支非常长的随机数据集：
# A是与数据集B的每个集合相关联的索引，成为一条分支
rd.seed(23)
testSet = [''.join(l) for l in it.combinations(string.ascii_uppercase[:20],2)]
A = []
B = []
for i in range(10):
    for j in range(rd.randint(10,6000)):
        A.append(i)
        B.append(rd.choice(testSet))
dd = {"A":A,"B":B}
data = pd.DataFrame(dd)
# 最大长度应该超过5500，如果不是，请使用其他种子：
print data.groupby('A').count().max()
# 创建树
root = treenode([],name='0')
for i in range(len(data.values)):
    if i == 0:
        newTree = add_child(root,data.values[i,1])
        oldses = data.values[i,0]
    else:
        if data.values[i,0] == oldses:
            newTree = add_child(newTree,data.values[i,1])
        else:
            newTree = add_child(root,data.values[i,1])
            oldses = data.values[i,0]
result={}
treeToDict(root,result)

注：我知道treeToDict()函数有一个错误，即它会覆盖条目，因为可能存在重复的键。但对于这个错误，这个bug并不重要。

0

1 答案

匿名的 · Answer 1 · 2023-02-21T13:00:09+00:00

在Python 2.7中，有时候会遇到递归深度限制和堆栈大小的问题。然而，实际上这个问题可能并不是由于堆栈大小的限制，而是由于算法本身存在问题。

解决这个问题的方法是实现一个基于堆栈的深度/广度优先搜索算法来遍历树。以下是一个类似Python的伪代码示例：

stack = []
def traverse_tree(root):
  stack.append(root)
  while stack:
    cur = stack.pop()
    cur.do_some_awesome_stuff()
    stack.append(cur.get_children())

这种方法具有极高的可扩展性，可以处理任意类型的树。如果想进一步了解，可以参考这个问题（链接）和这个问题（链接）。

通过实现这种基于堆栈的遍历算法，你可以避免递归深度限制和堆栈大小的问题。希望对你有所帮助，祝你成功！