/**
 * User: 过往记忆
 * Date: 2015-05-19
 * Time: 23:50
 * blog:
 * Article URL: /archives/1362
 * 过往记忆 blog: a technical blog focused on Hadoop, Hive, Spark, Shark, and Flume, with plenty of practical content
 * 过往记忆 blog WeChat public account: iteblog_hadoop
 */
DeprecatedConfig("spark.cleaner.ttl", "1.4",
  "TTL-based metadata cleaning is no longer necessary in recent Spark versions " +
  "and can lead to confusing errors if metadata is deleted for entities that are still" +
  " in use. Except in extremely special circumstances, you should remove this setting" +
  " and rely on Spark's reference-tracking-based cleanup instead." +
  " See SPARK-7689 for more details.")
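To make the role of such a deprecation entry concrete, here is a minimal, self-contained sketch of how a registry of deprecated keys could be turned into a warning when a user sets one. It is not Spark's actual `SparkConf` code: the `DeprecationSketch` object, the `deprecationWarning` helper, the field names, and the shortened message text are all assumptions made for illustration.

```scala
// Hypothetical sketch; not Spark's actual SparkConf internals.
final case class DeprecatedConfig(key: String, version: String, deprecationMessage: String)

object DeprecationSketch {
  // Registry of deprecated keys, indexed by key for direct lookup.
  // The message is abbreviated from the entry quoted in the article.
  private val deprecatedConfigs: Map[String, DeprecatedConfig] = Seq(
    DeprecatedConfig("spark.cleaner.ttl", "1.4",
      "TTL-based metadata cleaning is no longer necessary; rely on " +
        "reference-tracking-based cleanup instead. See SPARK-7689.")
  ).map(c => c.key -> c).toMap

  /** Returns a warning for a deprecated key, or None if the key is not in the registry. */
  def deprecationWarning(key: String): Option[String] =
    deprecatedConfigs.get(key).map { c =>
      s"The configuration key '${c.key}' has been deprecated as of Spark ${c.version}. " +
        c.deprecationMessage
    }

  def main(args: Array[String]): Unit = {
    Seq("spark.cleaner.ttl", "spark.cleaner.referenceTracking").foreach { key =>
      println(deprecationWarning(key).getOrElse(s"'$key' is not deprecated"))
    }
  }
}
```

In practice the takeaway is simply to drop `spark.cleaner.ttl` from the application's configuration and leave `spark.cleaner.referenceTracking` at its default of `true`.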

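  At the same time, the community redesigned the RDD cleanup logic so that Spark can automatically clean up the metadata and data of persisted RDDs, as well as shuffle and broadcast data. The redesign introduced the ContextCleaner class, which is instantiated in SparkContext: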

  private[spark] val cleaner: Option[ContextCleaner] = {
    if (conf.getBoolean("spark.cleaner.referenceTracking", true)) {
      Some(new ContextCleaner(this))
    } else {
      None
    }
  }
  cleaner.foreach(_.start())
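
The idea behind reference-tracking cleanup can be sketched with the JVM's `WeakReference` and `ReferenceQueue`: the cleaner holds only a weak reference to each tracked object, and once the object becomes unreachable the JVM enqueues that reference, which tells the cleaner the associated resource can be released. The code below is a simplified illustration under that assumption, not Spark's ContextCleaner; `TrackedRef`, `MiniCleaner`, `register`, and `drain` are hypothetical names.

```scala
import java.lang.ref.{ReferenceQueue, WeakReference}
import java.util.concurrent.ConcurrentHashMap

// Hypothetical sketch of reference-tracking cleanup; names are illustrative only.
// `owner` is passed straight to the superclass, so this class keeps no strong reference to it.
final class TrackedRef(owner: AnyRef, val resourceId: Int, queue: ReferenceQueue[AnyRef])
  extends WeakReference[AnyRef](owner, queue)

final class MiniCleaner(cleanup: Int => Unit) {
  private val queue = new ReferenceQueue[AnyRef]
  // Strong references to the wrappers, so they survive until they are enqueued.
  private val pending = ConcurrentHashMap.newKeySet[TrackedRef]()

  /** Start tracking `owner`; its resource is cleaned after `owner` becomes unreachable. */
  def register(owner: AnyRef, resourceId: Int): TrackedRef = {
    val ref = new TrackedRef(owner, resourceId, queue)
    pending.add(ref)
    ref
  }

  /** Clean every resource whose reference has been enqueued; returns the number cleaned. */
  def drain(): Int = {
    var cleaned = 0
    var ref = queue.poll()
    while (ref != null) {
      ref match {
        case t: TrackedRef =>
          if (pending.remove(t)) { cleanup(t.resourceId); cleaned += 1 }
        case _ => // not a reference created by this cleaner; ignore it
      }
      ref = queue.poll()
    }
    cleaned
  }
}

object MiniCleanerDemo {
  def main(args: Array[String]): Unit = {
    val cleanedIds = scala.collection.mutable.ArrayBuffer.empty[Int]
    val cleaner = new MiniCleaner(id => { cleanedIds += id; () })
    val ref = cleaner.register(new Object, resourceId = 42)
    // Normally the garbage collector enqueues the reference at some later time;
    // enqueue() is called by hand here only to make the demo deterministic.
    ref.enqueue()
    cleaner.drain()
    println(s"cleaned: ${cleanedIds.mkString(",")}")
  }
}
```

Unlike a TTL, this approach never releases a resource while the driver program can still reach the object that owns it, which is the failure mode the deprecation message warns about.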

  To clean up persisted RDD data, ContextCleaner calls sc.unpersistRDD(), the same method that RDD.unpersist() delegates to:

  /** Perform RDD cleanup. */
  def doCleanupRDD(rddId: Int, blocking: Boolean) {
    try {
      logDebug("Cleaning RDD " + rddId)
      sc.unpersistRDD(rddId, blocking)
      listeners.foreach(_.rddCleaned(rddId))
      logInfo("Cleaned RDD " + rddId)
    } catch {
      case t: Throwable => logError("Error cleaning RDD " + rddId, t)
    }
  }

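Shuffle and broadcast data are cleaned up by the doCleanupShuffle and doCleanupBroadcast functions, respectively. The cleaner picks the function to call based on the type of data to be cleaned: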

task match {
  case CleanRDD(rddId) =>
    doCleanupRDD(rddId, blocking = blockOnCleanupTasks)
  case CleanShuffle(shuffleId) =>
    doCleanupShuffle(shuffleId, blocking = blockOnCleanupTasks)
  case CleanBroadcast(broadcastId) =>
    doCleanupBroadcast(broadcastId, blocking = blockOnCleanupTasks)
}
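
The dispatch above can be tried in isolation with a small self-contained sketch. The case classes below are hypothetical stand-ins for the cleaner's task types, and `describe` only returns a description of the action it would take instead of touching any Spark state.

```scala
// Hypothetical stand-ins for the cleaner's task types; nothing here calls Spark.
sealed trait CleanupTask
final case class CleanRDD(rddId: Int) extends CleanupTask
final case class CleanShuffle(shuffleId: Int) extends CleanupTask
final case class CleanBroadcast(broadcastId: Long) extends CleanupTask

object DispatchSketch {
  /** Chooses the cleanup action from the task's type and describes it. */
  def describe(task: CleanupTask, blocking: Boolean): String = task match {
    case CleanRDD(rddId) =>
      s"unpersist RDD $rddId (blocking=$blocking)"
    case CleanShuffle(shuffleId) =>
      s"remove shuffle $shuffleId (blocking=$blocking)"
    case CleanBroadcast(broadcastId) =>
      s"remove broadcast $broadcastId (blocking=$blocking)"
  }

  def main(args: Array[String]): Unit = {
    val tasks = Seq(CleanRDD(1), CleanShuffle(2), CleanBroadcast(3L))
    tasks.foreach(t => println(describe(t, blocking = true)))
  }
}
```

Declaring the trait as `sealed` lets the compiler warn when a new task type is added without a matching case.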

  With this logic in place, Spark's RDD cleanup should become smarter. It is something to look forward to.

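Unless otherwise stated, all articles on this blog are original.
Copyright of original articles belongs to 过往记忆大数据 (过往记忆); reproduction without permission is prohibited.
Link to this article: 【spark.cleaner.ttl will be removed in Spark 1.4】(https://www.iteblog.com/archives/1362.html)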