Cluster Mode Overview

This document gives a short overview of how Spark runs on clusters, to make it easier to understand the components involved. Read through the application submission guide to learn about launching applications on a cluster.

Components

Spark applications run as independent sets of processes on a cluster, coordinated by the SparkContext object in your main program (called the driver program).
Note: the "independent sets of processes" here are one driver program process plus multiple executor processes; in other words, a Spark application consists of one driver and many executors.

Specifically, to run on a cluster, the SparkContext can connect to several types of cluster managers (either Spark’s own standalone cluster manager, Mesos or YARN), which allocate resources across applications. Once connected, Spark acquires executors on nodes in the cluster, which are processes that run computations and store data for your application. Next, it sends your application code (defined by JAR or Python files passed to SparkContext) to the executors. Finally, SparkContext sends tasks to the executors to run.
[Figure: Spark runtime architecture]
The driver program contains the SparkContext. To run the job on the cluster, the SparkContext asks the cluster manager for resources, for example acquiring executors on two of the nodes. Once connected and holding those executors spread across the cluster's nodes, the SparkContext sends the application code to them, and finally it sends tasks to the executors for execution.
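To make this flow concrete, here is a minimal sketch of a driver program, assuming a hypothetical standalone master at spark://master-host:7077 (the master URL could equally point at YARN or Mesos):

    import org.apache.spark.{SparkConf, SparkContext}

    object ClusterModeSketch {
      def main(args: Array[String]): Unit = {
        // The driver creates a SparkContext, which connects to the cluster
        // manager named by the master URL and acquires executors on the nodes.
        val conf = new SparkConf()
          .setAppName("cluster-mode-sketch")
          .setMaster("spark://master-host:7077") // hypothetical standalone master
        val sc = new SparkContext(conf)

        // The work below is split into tasks that the SparkContext ships to
        // the executors for execution.
        val evens = sc.parallelize(1 to 1000).filter(_ % 2 == 0).count()
        println(s"even numbers: $evens")

        sc.stop()
      }
    }

In practice the master is usually left out of the code and supplied at launch time with spark-submit's --master option, so the same jar can run against any cluster manager.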

There are several useful things to note about this architecture:

  • 1.Each application gets its own executor processes, which stay up for the duration of the whole application and run tasks in multiple threads. This has the benefit of isolating applications from each other, on both the scheduling side (each driver schedules its own tasks) and executor side (tasks from different applications run in different JVMs). However, it also means that data cannot be shared across different Spark applications (instances of SparkContext) without writing it to an external storage system.
    Note that executors belonging to two different applications can share the same node without interfering with each other, since their tasks run in separate JVMs. An alternative to writing shared data to a traditional external storage system is an in-memory virtual distributed storage layer such as Alluxio.

  • 2.Spark is agnostic to the underlying cluster manager. As long as it can acquire executor processes, and these communicate with each other, it is relatively easy to run it even on a cluster manager that also supports other applications (e.g. Mesos/YARN).
    In other words, Spark can run in many environments, and the application code is the same whether the underlying cluster manager is standalone, Mesos, or YARN; all Spark needs is the ability to acquire executor processes that can communicate with each other.

  • 3.The driver program must listen for and accept incoming connections from its executors throughout its lifetime (e.g., see spark.driver.port in the network config section). As such, the driver program must be network addressable from the worker nodes.
    Throughout its lifetime, the driver listens for and accepts incoming connections from its executors (see spark.driver.port in the network config section), so it must be network-addressable from the worker nodes; a configuration sketch follows this list.

  • 4.Because the driver schedules tasks on the cluster, it should be run close to the worker nodes, preferably on the same local area network. If you’d like to send requests to the cluster remotely, it’s better to open an RPC to the driver and have it submit operations from nearby than to run a driver far away from the worker nodes.
    Because the driver schedules tasks on the cluster, it should run close to the worker nodes, preferably on the same local area network. If you need to send requests to the cluster remotely, it is better to open an RPC to a driver running near the workers and have it submit operations from there than to run the driver far away from the worker nodes.
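A minimal sketch of the network settings mentioned in note 3, as it might appear inside a driver program; the host name and port below are made-up values whose only requirement is that the worker nodes can reach them:

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("driver-address-sketch")
      .set("spark.driver.host", "driver-host.internal") // hypothetical address reachable from the workers
      .set("spark.driver.port", "35000")                // fixed port instead of a random ephemeral one
    val sc = new SparkContext(conf)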

Cluster Manager Types

The system currently supports several cluster managers:

  • Standalone – a simple cluster manager included with Spark that makes it easy to set up a cluster.
  • Apache Mesos – a general cluster manager that can also run Hadoop MapReduce and service applications.
  • Hadoop YARN – the resource manager in Hadoop 2.
  • Kubernetes – an open-source system for automating deployment, scaling, and management of containerized applications.

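The choice among these managers is expressed through the master URL passed to the SparkContext; the hosts, ports, and API server address below are placeholders, shown only to illustrate the URL forms:

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf().setAppName("master-url-sketch")
    // conf.setMaster("local[4]")                       // local mode with 4 threads (no cluster manager)
    // conf.setMaster("spark://master-host:7077")       // Spark standalone
    // conf.setMaster("mesos://mesos-master:5050")      // Apache Mesos
    // conf.setMaster("yarn")                           // Hadoop YARN; the cluster is located via HADOOP_CONF_DIR
    conf.setMaster("k8s://https://k8s-apiserver:6443")  // Kubernetes API server
    val sc = new SparkContext(conf)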

Submitting Applications

Applications can be submitted to a cluster of any type using the spark-submit script. The application submission guide describes how to do this.
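spark-submit itself is a shell script shipped with Spark; as a rough sketch of the same submission expressed in code, Spark also provides the org.apache.spark.launcher.SparkLauncher API, shown here with a hypothetical jar path, main class, and master:

    import org.apache.spark.launcher.SparkLauncher

    object SubmitSketch {
      def main(args: Array[String]): Unit = {
        // Builds and starts the equivalent of a spark-submit invocation.
        val handle = new SparkLauncher()
          .setAppResource("/path/to/my-app.jar") // application jar (hypothetical path)
          .setMainClass("com.example.MyApp")     // entry point inside the jar (hypothetical)
          .setMaster("yarn")                     // any supported cluster manager
          .setDeployMode("cluster")              // run the driver inside the cluster
          .startApplication()                    // asynchronous; returns a SparkAppHandle

        println(s"submitted, state: ${handle.getState}")
      }
    }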

Monitoring

Each driver program has a web UI, typically on port 4040, that displays information about running tasks, executors, and storage usage. Simply go to http://<driver-node>:4040 in a web browser to access this UI. The monitoring guide also describes other monitoring options.
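A small sketch of monitoring-related settings as they might be set in a driver program; the event-log directory is a hypothetical path, and 4040 is simply the default UI port:

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("monitoring-sketch")
      .set("spark.ui.port", "4040")                      // port of the driver's web UI (4040 is the default)
      .set("spark.eventLog.enabled", "true")             // persist events so the UI can be replayed later
      .set("spark.eventLog.dir", "hdfs:///spark-events") // hypothetical event-log directory
    val sc = new SparkContext(conf)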

Job Scheduling

Spark gives control over resource allocation both across applications (at the level of the cluster manager) and within applications (if multiple computations are happening on the same SparkContext). The job scheduling overview describes this in more detail.
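Within a single application, one way this control surfaces is the FAIR scheduler and its pools; a brief sketch, where the pool name "reports" is an arbitrary example:

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("scheduling-sketch")
      .set("spark.scheduler.mode", "FAIR")                 // schedule jobs fairly instead of FIFO
    val sc = new SparkContext(conf)

    sc.setLocalProperty("spark.scheduler.pool", "reports") // jobs from this thread go to the "reports" pool
    val n = sc.parallelize(1 to 100).count()
    sc.setLocalProperty("spark.scheduler.pool", null)      // revert to the default pool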

Glossary

To summarize the glossary:

  • Application – a Spark application consists of one driver plus multiple executors.
  • Driver program – the process that runs the application's main method and creates the SparkContext.
  • Cluster manager – an external service for acquiring resources on the cluster.
  • Deploy mode – client or cluster: in client mode the driver runs locally; in cluster mode the driver runs inside the cluster.
  • Worker node – a node that can run executor processes; in YARN terms, a NodeManager running containers.
  • Executor – a process that runs multiple tasks and keeps data in memory or on disk; each application has its own set of executors.
  • Job – created whenever the application triggers an action; a job consists of many tasks.
  • Task – the smallest unit of computation; tasks are sent to the executors for execution.
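A short sketch of how these terms relate: transformations are lazy, and each action submits a job whose tasks (one per partition) run on the executors. The local[2] master is used here only so the fragment is self-contained:

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf().setAppName("glossary-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)

    val rdd     = sc.parallelize(1 to 8, numSlices = 4)  // 4 partitions -> 4 tasks per stage
    val doubled = rdd.map(_ * 2)                         // transformation: nothing runs yet
    val first   = doubled.count()                        // action: submits job 0
    val second  = doubled.filter(_ > 10).count()         // another action: submits job 1
    println(s"job 0 counted $first, job 1 counted $second")

    sc.stop()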
