org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 29.1 failed 4 times, most recent failure: Lost task 1.3 in stage 29.1 (TID 466, magnesium, executor 4): java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
    at shade.com.datastax.spark.connector.google.common.base.Throwables.propagate(Throwables.java:160)
    at com.datastax.driver.core.NettyUtil.newEventLoopGroupInstance(NettyUtil.java:136)
    at com.datastax.driver.core.NettyOptions.eventLoopGroup(NettyOptions.java:99)
    at com.datastax.driver.core.Connection$Factory. ...
I hit this error while querying Cassandra from Spark Streaming. The code looked like this:
import com.datastax.driver.core.{Cluster, HostDistance, PoolingOptions}

dataFrame.foreachPartition { part =>
  // Pool sizing for connections to nodes in the local datacenter.
  val poolingOptions = new PoolingOptions
  poolingOptions
    .setCoreConnectionsPerHost(HostDistance.LOCAL, 4)
    .setMaxConnectionsPerHost(HostDistance.LOCAL, 10)
  val cluster = Cluster
    .builder
    .addContactPoints("localhost")
    .withCredentials("cassandra", "cassandra")
    .withPoolingOptions(poolingOptions)
    .build
  val session = cluster.connect("keyspace")
  part.foreach { item =>
    // business logic
  }
  // Close in reverse order of creation: session first, then cluster.
  session.close()
  cluster.close()
}
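Closing in reverse order of creation matters here: the cluster owns the Netty event loop group that the session's connections run on, so the session should be shut down before the cluster that created it.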
For each batch, first check that both the cluster and the session actually get closed when the partition finishes; if either is left open, this error is raised.
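One way to guarantee this is to wrap the per-partition work in try/finally so the close runs even when the business logic throws. A minimal sketch, assuming the same driver API, contact point, and keyspace as above:

dataFrame.foreachPartition { part =>
  val cluster = Cluster.builder
    .addContactPoints("localhost")
    .build
  val session = cluster.connect("keyspace")
  try {
    part.foreach { item =>
      // business logic
    }
  } finally {
    // Runs even if the loop above throws, so no Netty event loop
    // leaks across streaming batches.
    session.close()
    cluster.close()
  }
}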
If that still does not resolve it, check which Netty version is on the classpath. As the NettyUtil.newEventLoopGroupInstance frame in the stack trace shows, the Cassandra driver creates its event loop group reflectively, and a conflicting Netty version makes that reflective call fail with an InvocationTargetException.
I recommend installing the Maven Helper plugin in IntelliJ IDEA to find the conflict, then removing (excluding) the conflicting lower-version Netty dependencies.
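Maven Helper (or mvn dependency:tree on the command line) shows which artifact pulls in the old Netty; you then exclude it so only one version stays on the classpath. A hypothetical exclusion, assuming the conflict arrives through the Spark Cassandra Connector; swap in whatever groupId, artifactId, and version the dependency tree actually shows:

<dependency>
  <groupId>com.datastax.spark</groupId>
  <artifactId>spark-cassandra-connector_2.11</artifactId>
  <version>${connector.version}</version>
  <exclusions>
    <!-- Drop the transitive Netty so the version Spark ships wins. -->
    <exclusion>
      <groupId>io.netty</groupId>
      <artifactId>*</artifactId>
    </exclusion>
  </exclusions>
</dependency>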