Differences between Hadoop 1.x and Hadoop 2.x, Hadoop 1.x Limitations, and Hadoop 2.x YARN Benefits

Before reading this post, please go through my previous posts to get some basic knowledge of BigData Hadoop 1.x and 2.x:

- BigData Hadoop 1.x Architecture and Components
- BigData Hadoop 2.x Architecture and Components

In this post, we discuss in detail the differences between Hadoop 1.x and Hadoop 2.x, the drawbacks and limitations of the Hadoop 1.x architecture, and how the Hadoop 2.x architecture addresses them.

At the time of writing, the latest Apache Hadoop version is 2.7.0.

Hadoop V.1.x Components

Apache Hadoop V.1.x has the following two major components:

- HDFS (HDFS V1)
- MapReduce (MR V1)

In Hadoop V.1.x, these two are also known as the Two Pillars of Hadoop.

Hadoop V.2.x Components

Apache Hadoop V.2.x has the following three major components:

- HDFS V.2
- YARN (MR V2)
- MapReduce (MR V1)

In Hadoop V.2.x, these three are also known as the Three Pillars of Hadoop.

Hadoop 1.x Limitations

Hadoop 1.x has many limitations. The main one is that its architecture is built around the MapReduce component, which means it supports only MapReduce-based batch data processing applications.

Hadoop 1.x has the following limitations:

- It is suitable only for batch processing of large amounts of data that are already stored in the Hadoop system.
- It is not suitable for real-time data processing.
- It is not suitable for data streaming.
- It supports up to 4,000 nodes per cluster.
- A single component, the JobTracker, performs many activities: resource management, job scheduling, job monitoring, re-scheduling of jobs, and so on.
- The JobTracker is a single point of failure.
- It does not support multi-tenancy.
- It supports only one NameNode and one namespace per cluster.
- It has limited horizontal scalability, because the single JobTracker and the single NameNode become bottlenecks as the cluster grows.
- It runs only Map/Reduce jobs.
- It uses the slot concept in MapReduce (not in HDFS) to allocate resources (memory, CPU). Map slots and reduce slots are static: a map slot runs only map tasks and a reduce slot runs only reduce tasks, so an idle slot of one kind cannot be reused for a task of the other kind.

For example, suppose a cluster has 10 map slots and 10 reduce slots. If all map slots are busy while all reduce slots are idle, the idle reduce slots still cannot be used to run the waiting map tasks.
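The slot example above can be sketched as a small simulation. This is only an illustrative model, not Hadoop code: the function names and the task counts (25 waiting map tasks, no ready reduce tasks) are assumptions chosen to make the effect visible.

```python
def run_with_static_slots(map_tasks, reduce_tasks, map_slots, reduce_slots):
    """Count tasks that can run at once when each slot accepts one task type."""
    running_maps = min(map_tasks, map_slots)
    running_reduces = min(reduce_tasks, reduce_slots)
    idle_slots = (map_slots - running_maps) + (reduce_slots - running_reduces)
    return running_maps + running_reduces, idle_slots


def run_with_shared_pool(map_tasks, reduce_tasks, total_slots):
    """Count tasks that can run at once when any free unit can take any task."""
    running = min(map_tasks + reduce_tasks, total_slots)
    return running, total_slots - running


# Assumed workload: 25 map tasks are waiting and no reduce task is ready yet.
static_running, static_idle = run_with_static_slots(25, 0, map_slots=10, reduce_slots=10)
pooled_running, pooled_idle = run_with_shared_pool(25, 0, total_slots=20)
print(static_running, static_idle)  # 10 10
print(pooled_running, pooled_idle)  # 20 0
```

With static slots, half of the capacity sits idle even though work is waiting. A shared pool, which is closer in spirit to what Hadoop 2.x does, keeps every unit busy.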

NOTE: In summary, Hadoop 1.x is a single-purpose system. We can use it only for MapReduce-based applications.

Differences between Hadoop 1.x and Hadoop 2.x

If we compare the components of Hadoop 1.x and 2.x, the Hadoop 2.x architecture has one new component: YARN (Yet Another Resource Negotiator).

It is the game-changing component of the BigData Hadoop system.

New Components and API
As shown in the diagram below, Hadoop 1.x was re-architected, and new components were introduced to solve the Hadoop 1.x limitations.

Hadoop 1.x Job Tracker
As shown in the diagram below, the Hadoop 1.x Job Tracker component is divided into two components:

- Resource Manager: manages the resources in the cluster.
- Application Master: manages applications such as MapReduce and Spark.

- Hadoop 1.x supports only one namespace for managing the HDFS filesystem, whereas Hadoop 2.x supports multiple namespaces.
- Hadoop 1.x supports one and only one programming model: MapReduce. With the YARN component, Hadoop 2.x supports multiple programming models and frameworks, such as MapReduce, interactive, streaming, and graph processing, Spark, and Storm.
- Hadoop 1.x has many scalability limitations. Hadoop 2.x overcomes them with its new architecture.
- Hadoop 2.x supports multi-tenancy, but Hadoop 1.x does not.
- Hadoop 1.x MapReduce uses fixed map and reduce slots to allocate compute resources, whereas Hadoop 2.x YARN uses variable-sized containers.
- Hadoop 1.x supports a maximum of 4,000 nodes per cluster, whereas Hadoop 2.x supports more than 10,000 nodes per cluster.
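
To make the split of the Job Tracker and the move from slots to containers more concrete, here is a minimal sketch in plain Python. It is not the YARN API: the class names mirror the roles described above, while every method name, the memory-only resource model, and the 8192 MB cluster size are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Container:
    app_id: str
    memory_mb: int


@dataclass
class ResourceManager:
    """Tracks cluster-wide free memory and grants containers of any size."""
    free_memory_mb: int
    granted: list = field(default_factory=list)

    def allocate(self, app_id, memory_mb):
        if memory_mb > self.free_memory_mb:
            return None  # not enough capacity right now
        self.free_memory_mb -= memory_mb
        container = Container(app_id, memory_mb)
        self.granted.append(container)
        return container

    def release(self, container):
        self.granted.remove(container)
        self.free_memory_mb += container.memory_mb


@dataclass
class ApplicationMaster:
    """Manages one application and asks the ResourceManager for containers."""
    app_id: str
    rm: ResourceManager
    containers: list = field(default_factory=list)

    def request(self, memory_mb):
        container = self.rm.allocate(self.app_id, memory_mb)
        if container is not None:
            self.containers.append(container)
        return container is not None

    def finish(self):
        for container in self.containers:
            self.rm.release(container)
        self.containers.clear()


rm = ResourceManager(free_memory_mb=8192)
mapreduce_am = ApplicationMaster("mapreduce-job", rm)
spark_am = ApplicationMaster("spark-job", rm)

print(mapreduce_am.request(2048))  # True
print(spark_am.request(4096))      # True
print(spark_am.request(4096))      # False: only 2048 MB is left
mapreduce_am.finish()
print(spark_am.request(4096))      # True: the freed memory is reusable
```

The point of the sketch is that the cluster-wide view of resources lives in one place, the per-application logic lives in another, and capacity released by one kind of application can immediately serve a different kind.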
How Hadoop 2.x Solves Hadoop 1.x Limitations

Hadoop 2.x resolves most of the Hadoop 1.x limitations through its new architecture:
- By splitting the responsibilities of the MapReduce component across separate components.
- By introducing the new YARN component for resource management.
- By decoupling these responsibilities, it supports multiple namespaces, multi-tenancy, higher availability, and higher scalability.
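
As a rough illustration of what multiple namespaces mean for a client, the sketch below routes a file path to the namespace that owns it by using the longest matching path prefix. This is a hypothetical model, not Hadoop's implementation or API: the mount table, the path prefixes, and the NameNode names are all invented for the example.

```python
# Hypothetical mount table: path prefix -> namespace (NameNode) name.
MOUNT_TABLE = {
    "/user": "namenode-1",
    "/data/logs": "namenode-2",
    "/data": "namenode-3",
}


def resolve_namespace(path, mount_table=MOUNT_TABLE):
    """Return the namespace owning `path`, using the longest matching prefix."""
    best = None
    for prefix in mount_table:
        if path == prefix or path.startswith(prefix + "/"):
            if best is None or len(prefix) > len(best):
                best = prefix
    if best is None:
        raise KeyError(f"no namespace is mounted for {path}")
    return mount_table[best]


print(resolve_namespace("/user/alice/input.txt"))    # namenode-1
print(resolve_namespace("/data/logs/2015/app.log"))  # namenode-2
print(resolve_namespace("/data/tables/orders"))      # namenode-3
```

Because each namespace manages only its own part of the directory tree, metadata load is spread across several NameNodes instead of concentrating on one.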
Hadoop 2.x YARN Benefits

Hadoop 2.x YARN has the following benefits:
- High Scalability
- High Availability
- Supports Multiple Programming Models
- Supports Multi-Tenancy
- Supports Multiple Namespaces
- Improved Cluster Utilization
- Supports Horizontal Scalability
That's all about the differences between Hadoop 1.x and Hadoop 2.x. We will discuss more BigData and Hadoop basics in my coming posts.

Please drop me a comment if you like my post or have any issues or suggestions.

Translated from: https://www.journaldev.com/8806/differences-between-hadoop1-and-hadoop2