Total size of serialized results of 16 tasks (1048.5 MB) is bigger than spark.driver.maxResultSize (1024.0 MB)



After I added --conf spark.driver.maxResultSize=2050 to my spark-submit command, I got the following error:
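For reference, Spark size properties accept values with unit suffixes (k, m, g), and a bare number like 2050 may not be interpreted in the unit you expect. A minimal sketch of the submit command with an explicit unit (the script name and the memory setting are placeholders, not from the original post):

```shell
# Sketch: pass spark.driver.maxResultSize with an explicit unit suffix.
# "2g" is unambiguous, whereas a bare "2050" may be parsed in an
# unexpected unit. The driver also needs enough heap to actually hold
# the collected results, hence the (hypothetical) spark.driver.memory.
spark-submit \
  --conf spark.driver.maxResultSize=2g \
  --conf spark.driver.memory=4g \
  my_job.py   # placeholder script name
```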

17/12/27 18:33:19 ERROR TransportResponseHandler: Still have 1 requests outstanding when connection from /XXX.XX.XXX.XX:36245 is closed
17/12/27 18:33:19 WARN Executor: Issue communicating with driver in heartbeater
org.apache.spark.SparkException: Exception thrown in awaitResult:
        at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205)
        at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
        at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:92)
        at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$reportHeartBeat(Executor.scala:726)
        at org.apache.spark.executor.Executor$$anon$2$$anonfun$run$1.apply$mcV$sp(Executor.scala:755)
        at org.apache.spark.executor.Executor$$anon$2$$anonfun$run$1.apply(Executor.scala:755)
        at org.apache.spark.executor.Executor$$anon$2$$anonfun$run$1.apply(Executor.scala:755)
        at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1954)
        at org.apache.spark.executor.Executor$$anon$2.run(Executor.scala:755)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Connection from /XXX.XX.XXX.XX:36245 closed
        at org.apache.spark.network.client.TransportResponseHandler.channelInactive(TransportResponseHandler.java:146)

The reason I added this configuration in the first place was this error:

py4j.protocol.Py4JJavaError: An error occurred while calling o171.collectToPython.
: org.apache.spark.SparkException: Job aborted due to stage failure: Total size of serialized results of 16 tasks (1048.5 MB) is bigger than spark.driver.maxResultSize (1024.0 MB)

So I increased maxResultSize to 2.5 GB, but the Spark job still fails (with the error shown above).
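One thing worth checking is how the value 2050 is parsed. Spark size strings carry unit suffixes (512m, 2g, ...); the sketch below (my own illustration, not Spark's actual parser) shows how differently a bare number and a suffixed value can come out if an unsuffixed value is treated as bytes:

```python
# Minimal sketch of Spark-style size-string parsing (not Spark's actual code).
# Suffixes k/m/g/t are powers of 1024; a bare number is taken as bytes here.
UNITS = {"b": 1, "k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}

def size_to_bytes(s: str) -> int:
    """Convert a size string like '2g' or '512m' to bytes."""
    s = s.strip().lower()
    if s and s[-1] in UNITS:
        return int(s[:-1]) * UNITS[s[-1]]
    return int(s)  # no suffix: interpreted as plain bytes in this sketch

# Under this interpretation, "2050" would be only 2050 bytes -- far below
# the 1048.5 MB of serialized results -- while "2g" is 2 GiB.
print(size_to_bytes("2050"))  # 2050
print(size_to_bytes("2g"))    # 2147483648
```

If Spark interprets the bare value similarly, setting the limit to an explicit 2g (or larger) rather than 2050 would be the safer form.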

How can I resolve this issue?
