This article describes how to resolve the Hadoop pseudo-distributed operation error "Protocol message tag had invalid wire type". It should be a useful reference for anyone running into the same problem.

Problem description

I am setting up a Hadoop 2.6.0 Single Node Cluster. I follow the hadoop-common/SingleCluster documentation. I work on Ubuntu 14.04. So far I have managed to run Standalone Operation successfully.

I face an error when trying to perform Pseudo-Distributed Operation. I managed to start the NameNode daemon and the DataNode daemon. jps output:

martakarass@marta-komputer:/usr/local/hadoop$ jps
4963 SecondaryNameNode
4785 DataNode
8400 Jps
martakarass@marta-komputer:/usr/local/hadoop$

But when I try to make the HDFS directories required to execute MapReduce jobs, I receive the following error:

martakarass@marta-komputer:/usr/local/hadoop$ bin/hdfs dfs -mkdir /user
15/05/01 20:36:00 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
mkdir: Failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Protocol message tag had invalid wire type.; Host Details : local host is: "marta-komputer/127.0.0.1"; destination host is: "localhost":9000;
martakarass@marta-komputer:/usr/local/hadoop$

(I believe I can ignore the WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... warning at this point.)


When it comes to Hadoop config files, I changed only the files mentioned in the documentation. I have:

etc/hadoop/core-site.xml:

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>

etc/hadoop/hdfs-site.xml:

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>

I managed to connect to localhost:

martakarass@marta-komputer:~$ ssh localhost
martakarass@localhost's password:
Welcome to Ubuntu 14.04.1 LTS (GNU/Linux 3.13.0-45-generic x86_64)

 * Documentation:  https://help.ubuntu.com/

Last login: Fri May  1 20:28:58 2015 from localhost

I formatted the filesystem:

martakarass@marta-komputer:/usr/local/hadoop$  bin/hdfs namenode -format
15/05/01 20:30:21 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = marta-komputer/127.0.0.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 2.6.0
(...)
15/05/01 20:30:24 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at marta-komputer/127.0.0.1
************************************************************/

/etc/hosts:

127.0.0.1       localhost
127.0.0.1       marta-komputer

# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

etc/hostname:

marta-komputer

Solution

This is a set of steps I followed on Ubuntu when facing exactly the same problem, but with 2.7.1; the steps shouldn't differ much for previous and future versions (I'd believe).
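
One observation before the steps: the jps output in the question lists no NameNode process at all, yet the client received a protobuf parse error rather than a connection-refused one. That suggests something other than the HDFS RPC server was answering on port 9000, so it is worth checking what actually owns that port first:

    sudo lsof -i :9000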

1) Format of my /etc/hosts file:

    127.0.0.1    localhost   <computer-name>
    # 127.0.1.1    <computer-name>
    <ip-address>    <computer-name>

    # Rest of file with no changes
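
For the machine in the question, this would look something like the following (the LAN address here is only an illustrative placeholder; use your machine's actual address):

    127.0.0.1    localhost    marta-komputer
    # 127.0.1.1    marta-komputer
    192.168.1.10    marta-komputer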

2) *.xml configuration files (displaying contents inside the <configuration> tags; a complete core-site.xml is sketched after this list):

  • For core-site.xml:

        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://localhost/</value>
        </property>
        <!-- set value to a directory you want with an absolute path -->
        <property>
            <name>hadoop.tmp.dir</name>
            <value>"set/a/directory/on/your/machine/"</value>
            <description>A base for other temporary directories</description>
        </property>
    

  • For hdfs-site.xml:

        <property>
            <name>dfs.replication</name>
            <value>1</value>
        </property>
    

  • For yarn-site.xml:

        <property>
            <name>yarn.resourcemanager.hostname</name>
            <value>localhost</value>
        </property>
    
        <property>
            <name>yarn.nodemanager.aux-services</name>
            <value>mapreduce_shuffle</value>
        </property>
    

  • For mapred-site.xml:

        <property>
            <name>mapreduce.framework.name</name>
            <value>yarn</value>
        </property>
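
As a concrete sketch, the core-site.xml snippets above assemble into a full file like this (the /usr/local/hadoop/tmp path is only an illustrative choice for hadoop.tmp.dir; any directory your user can write to works):

    <?xml version="1.0"?>
    <configuration>
        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://localhost/</value>
        </property>
        <property>
            <name>hadoop.tmp.dir</name>
            <value>/usr/local/hadoop/tmp</value>
            <description>A base for other temporary directories</description>
        </property>
    </configuration>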
    

3) Verify $HADOOP_CONF_DIR:

This is a good opportunity to verify that you are indeed using this configuration. In the folder where your .xml files reside, view the contents of the hadoop-env.sh script and make sure $HADOOP_CONF_DIR is pointing at the right directory.
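
As a quick sanity check (assuming the /usr/local/hadoop install path used in the question):

    echo $HADOOP_CONF_DIR
    grep HADOOP_CONF_DIR /usr/local/hadoop/etc/hadoop/hadoop-env.sh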

4) Check your PORTS:

NameNode binds ports 50070 and 8020 on my standard distribution, and DataNode binds ports 50010, 50020, 50075 and 43758. Run sudo lsof -i to be certain no other services are using them.
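
For example (the -P and -n flags keep ports and addresses numeric, which makes the listening sockets easy to scan):

    sudo lsof -i -P -n | grep LISTEN
    # or probe one port at a time, e.g. the NameNode RPC port:
    sudo lsof -i :8020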

5) Format if necessary:

At this point, if you have changed the value of hadoop.tmp.dir, you should reformat the NameNode with hdfs namenode -format. If not, remove the temporary files already present in the tmp directory you are using (default /tmp/):
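
A sketch of both branches (hadoop-<user> is the default directory layout under /tmp/; substitute your own user name):

    # after changing hadoop.tmp.dir: reformat (this erases existing HDFS metadata)
    bin/hdfs namenode -format

    # otherwise: clear stale state from the default temporary directory
    rm -rf /tmp/hadoop-<user>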

6) Start Nodes and Yarn:

From /sbin/, start the NameNode and DataNode using the start-dfs.sh script, and YARN with start-yarn.sh, then evaluate the output of jps:

    ./start-dfs.sh
    ./start-yarn.sh
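
With a healthy start, jps should now list all of the daemons, roughly like this (the PIDs are illustrative):

    martakarass@marta-komputer:/usr/local/hadoop$ jps
    5036 NameNode
    5158 DataNode
    5405 SecondaryNameNode
    5561 ResourceManager
    5874 NodeManager
    6089 Jps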


At this point if NameNode, DataNode, NodeManager and ResourceManager are all running you should be set to go!

If any of these hasn't started, share the log output for us to re-evaluate.
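
The daemon logs live under the installation's logs/ directory by default, one file per daemon named after the user, daemon and host, e.g. (paths assume the /usr/local/hadoop install from the question):

    ls /usr/local/hadoop/logs/
    tail -n 50 /usr/local/hadoop/logs/hadoop-<user>-namenode-<host>.log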

That concludes this article on the Hadoop pseudo-distributed operation error "Protocol message tag had invalid wire type". We hope the answer recommended above is helpful.
