08-21-2015 10:34 AM
Hi All,
I am running CDH 5.3.2 on a single Ubuntu box and trying to run the sample MapReduce (mrjob) example below on my Hadoop machine.
I am getting the error below and cannot figure out why. It seems it might be related to the Google protobuf library:
STDERR: mkdir: Failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Protocol message tag had invalid wire type.; Host Details : local host is: "hadoop1.lab.mycompany.com/172.16.67.36"; destination host is: "localhost":9000;
https://github.com/google/protobuf
https://developers.google.com/protocol-buffers/docs/encoding#structure
Can you let me know how to overcome this issue? I tried searching Google, but all the suggestions seem to go nowhere (I tested them all):
http://stackoverflow.com/questions/31849433/hadoop-protocol-message-tag-had-invalid-wire-type
hduser@hadoop1:~$ python mrjob-master/mrjob/examples/mr_word_freq_count.py /home/hduser/mrjob-master/README.rst -r hadoop -v
Deprecated option hdfs_scratch_dir has been renamed to hadoop_tmp_dir
Unexpected option hdfs_tmp_dir
looking for configs in /home/hduser/.mrjob.conf
using configs in /home/hduser/.mrjob.conf
Active configuration:
{'bootstrap_mrjob': None,
'check_input_paths': True,
'cleanup': ['ALL'],
'cleanup_on_failure': ['NONE'],
'cmdenv': {},
'hadoop_bin': None,
'hadoop_extra_args': [],
'hadoop_home': None,
'hadoop_streaming_jar': None,
'hadoop_tmp_dir': 'tmp/mrjob',
'hadoop_version': '0.20',
'interpreter': None,
'jobconf': {},
'label': None,
'local_tmp_dir': '/tmp',
'owner': 'hduser',
'python_archives': [],
'python_bin': None,
'setup': [],
'setup_cmds': [],
'setup_scripts': [],
'sh_bin': ['sh', '-ex'],
'steps_interpreter': None,
'steps_python_bin': None,
'strict_protocols': True,
'upload_archives': [],
'upload_files': []}
Hadoop streaming jar is /home/hduser/Desktop/hadoop-2.5.0-cdh5.3.2/share/hadoop/tools/lib/hadoop-streaming-2.5.0-cdh5.3.2.jar
creating tmp directory /tmp/mr_word_freq_count.hduser.20150820.213521.092743
archiving /usr/local/lib/python2.7/dist-packages/mrjob-0.5.0_dev-py2.7.egg/mrjob -> /tmp/mr_word_freq_count.hduser.20150820.213521.092743/mrjob.tar.gz as mrjob/
writing wrapper script to /tmp/mr_word_freq_count.hduser.20150820.213521.092743/setup-wrapper.sh
WRAPPER: # store $PWD
WRAPPER: __mrjob_PWD=$PWD
WRAPPER:
WRAPPER: # obtain exclusive file lock
WRAPPER: exec 9>/tmp/wrapper.lock.mr_word_freq_count.hduser.20150820.213521.092743
WRAPPER: python -c 'import fcntl; fcntl.flock(9, fcntl.LOCK_EX)'
WRAPPER:
WRAPPER: # setup commands
WRAPPER: {
WRAPPER: export PYTHONPATH=$__mrjob_PWD/mrjob.tar.gz:$PYTHONPATH
WRAPPER: } 0</dev/null 1>&2
WRAPPER:
WRAPPER: # release exclusive file lock
WRAPPER: exec 9>&-
WRAPPER:
WRAPPER: # run task from the original working directory
WRAPPER: cd $__mrjob_PWD
WRAPPER: "$@"
Making directory hdfs:///user/hduser/tmp/mrjob/mr_word_freq_count.hduser.20150820.213521.092743/files/ on HDFS
> /home/hduser/Desktop/hadoop-2.5.0-cdh5.3.2/bin/hadoop version
Using Hadoop version 2.5.0
> /home/hduser/Desktop/hadoop-2.5.0-cdh5.3.2/bin/hadoop fs -mkdir -p
> hdfs:///user/hduser/tmp/mrjob/mr_word_freq_count.hduser.20150820.21352
> 1.092743/files/
STDERR: 15/08/20 14:35:22 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
STDERR: mkdir: Failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Protocol message tag had invalid wire type.; Host Details : local host is: "hadoop1.lab.mycompany.com/172.16.67.36"; destination host is: "localhost":9000;
Traceback (most recent call last):
File "mrjob-master/mrjob/examples/mr_word_freq_count.py", line 37, in <module>
MRWordFreqCount.run()
File "/usr/local/lib/python2.7/dist-packages/mrjob-0.5.0_dev-py2.7.egg/mrjob/job.py", line 433, in run
mr_job.execute()
File "/usr/local/lib/python2.7/dist-packages/mrjob-0.5.0_dev-py2.7.egg/mrjob/job.py", line 451, in execute
super(MRJob, self).execute()
File "/usr/local/lib/python2.7/dist-packages/mrjob-0.5.0_dev-py2.7.egg/mrjob/launch.py", line 160, in execute
self.run_job()
File "/usr/local/lib/python2.7/dist-packages/mrjob-0.5.0_dev-py2.7.egg/mrjob/launch.py", line 227, in run_job
runner.run()
File "/usr/local/lib/python2.7/dist-packages/mrjob-0.5.0_dev-py2.7.egg/mrjob/runner.py", line 452, in run
self._run()
File "/usr/local/lib/python2.7/dist-packages/mrjob-0.5.0_dev-py2.7.egg/mrjob/hadoop.py", line 234, in _run
self._upload_local_files_to_hdfs()
File "/usr/local/lib/python2.7/dist-packages/mrjob-0.5.0_dev-py2.7.egg/mrjob/hadoop.py", line 261, in _upload_local_files_to_hdfs
self._mkdir_on_hdfs(self._upload_mgr.prefix)
File "/usr/local/lib/python2.7/dist-packages/mrjob-0.5.0_dev-py2.7.egg/mrjob/hadoop.py", line 281, in _mkdir_on_hdfs
self.invoke_hadoop(['fs', '-mkdir', '-p', path])
File "/usr/local/lib/python2.7/dist-packages/mrjob-0.5.0_dev-py2.7.egg/mrjob/fs/hadoop.py", line 101, in invoke_hadoop
raise CalledProcessError(proc.returncode, args)
subprocess.CalledProcessError: Command '['/home/hduser/Desktop/hadoop-2.5.0-cdh5.3.2/bin/hadoop', 'fs', '-mkdir', '-p', 'hdfs:///user/hduser/tmp/mrjob/mr_word_freq_count.hduser.20150820.213521.092743/files/']' returned non-zero exit status 1
hduser@hadoop1:~$
hduser@hadoop1:~$ jps
23519 NodeManager
23192 ResourceManager
23667 Jps
23029 SecondaryNameNode
22842 DataNode
hduser@hadoop1:~$
hduser@hadoop1:~$ sudo ufw status | grep 9000
9000 ALLOW Anywhere
9000 (v6) ALLOW Anywhere (v6)
hduser@hadoop1:~$
hduser@hadoop1:~$ telnet localhost 9000
Trying 127.0.0.1...
Connected to localhost.localdomain.
Escape character is '^]'.
SSH-2.0-OpenSSH_6.6.1p1 Ubuntu-2ubuntu2.3
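The telnet output above is telling: whatever is answering on localhost:9000 greets with an SSH banner, not the NameNode's RPC protocol, which would explain the protobuf error. As a toy illustration (pure Python, no Hadoop required; the banner string is copied from the session above), here is why feeding SSH banner bytes to a protobuf parser produces exactly an "invalid wire type" complaint: the low three bits of a protobuf tag byte encode the wire type, and only 0, 1, 2, and 5 are valid in modern parsers (3/4 are the deprecated group markers, 6/7 are always invalid):

```python
# Valid protobuf wire types; 3/4 are deprecated group markers, 6/7 invalid.
VALID_WIRE_TYPES = {0, 1, 2, 5}

def wire_types(data):
    """Interpret each byte as a protobuf tag and return its wire type bits."""
    return [b & 0x07 for b in bytearray(data)]

banner = b"SSH-2.0-OpenSSH_6.6.1p1"
bad = [t for t in wire_types(banner) if t not in VALID_WIRE_TYPES]
# The very first byte ('S') already decodes to the deprecated group
# wire type 3, and '.' decodes to the always-invalid type 6.
print(bad)
```

In other words, the error message is less about protobuf itself and more a symptom that the client reached a server speaking a different protocol entirely.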
hduser@hadoop1:~/Desktop/hadoop-dns-checker-master$ ./run-on-cluster.sh my_hosts
==== hadoop1.lab.mycompany.com ====
hduser@hadoop1.lab.mycompany.com's password:
sending incremental file list
created directory hadoop-dns
a.jar
my_hosts
run.sh
sent 2,449 bytes  received 106 bytes  393.08 bytes/sec
total size is 2,620  speedup is 1.03
hduser@hadoop1.lab.mycompany.com's password:
# self check...
-- host : hadoop1.lab.mycompany.com
host lookup : success (172.16.67.36)
reverse lookup : success (hadoop1.lab.mycompany.com)
is reachable : yes
# end self check
==== Running on : hadoop1.lab.mycompany.com/172.16.67.36 =====
-- host : hadoop1.lab.mycompany.com
host lookup : success (172.16.67.36)
reverse lookup : success (hadoop1.lab.mycompany.com)
is reachable : yes
hduser@hadoop1:~/Desktop/hadoop-dns-checker-master$
hduser@hadoop1:~$ cat hadoop-2.5.0-cdh5.3.2/etc/hadoop/core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/app/hadoop/tmp</value>
</property>
</configuration>
hduser@hadoop1:~$
hduser@hadoop1:~$ cat hadoop-2.5.0-cdh5.3.2/etc/hadoop/hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
hduser@hadoop1:~$
hduser@hadoop1:~$ cat hadoop-2.5.0-cdh5.3.2/etc/hadoop/mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
hduser@hadoop1:~$
hduser@hadoop1:~$ cat hadoop-2.5.0-cdh5.3.2/etc/hadoop/hadoop-env.sh | grep HADOOP_CONF_DIR
export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/etc/hadoop"}
hduser@hadoop1:~$
hduser@hadoop1:~$ hostname --fqdn
hadoop1.lab.mycompany.com
hduser@hadoop1:~$
hduser@hadoop1:~$ cat /etc/hosts
#127.0.0.1 localhost
#127.0.1.1 myhost-1
#127.0.0.1 localhost ubuntu
#172.16.67.36 ubuntu.cisco.com ubuntu
#127.0.0.1 ubuntu.cisco.com ubuntu
#172.16.67.36 myhost-1
#127.0.0.1 myhost-1
127.0.0.1 localhost.localdomain localhost
172.16.67.36 hadoop1.lab.mycompany.com hadoop1
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
hduser@hadoop1:~$
hduser@hadoop1:~$ hostname
hadoop1.lab.mycompany.com
hduser@hadoop1:~$
Appreciate the help.
08-25-2015 10:41 PM
08-26-2015 11:33 AM
Hi Harsh,
Thanks for your reply.
I think you might be correct: I do not see the NameNode process running. I did check my core-site.xml file and it shows localhost:9000, but the process is not there.
I formatted my namenode and restarted the services, but I still get the same results.
root@hadoop1:/home/hduser/hadoop-2.5.0-cdh5.3.2/sbin# netstat -anp | grep NNPID | grep LISTEN
root@hadoop1:/home/hduser/hadoop-2.5.0-cdh5.3.2/sbin# whoami
root
root@hadoop1:/home/hduser/hadoop-2.5.0-cdh5.3.2/sbin# su hduser
hduser@hadoop1:~/hadoop-2.5.0-cdh5.3.2/sbin$ jps
7568 NodeManager
7241 ResourceManager
7078 SecondaryNameNode
6896 DataNode
9842 Jps
hduser@hadoop1:~/hadoop-2.5.0-cdh5.3.2/sbin$
Could you let me know which files/scripts etc. I could use to troubleshoot further?
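For anyone following along, a minimal log-triage sketch: when a daemon is missing from jps, its .log file usually contains a FATAL/ERROR line explaining why it exited. The log directory below is an assumption based on the tarball layout shown earlier in the thread; adjust LOG_DIR to wherever your NameNode actually writes its logs.

```python
import glob
import os
import re

# Assumed log location for a tarball install (hypothetical; adjust as needed).
LOG_DIR = os.path.expanduser('~/Desktop/hadoop-2.5.0-cdh5.3.2/logs')

def fatal_lines(log_text):
    """Return the log lines that usually explain a failed daemon start."""
    pat = re.compile(r'\b(FATAL|ERROR|Exception)\b')
    return [ln for ln in log_text.splitlines() if pat.search(ln)]

if __name__ == '__main__':
    for path in glob.glob(os.path.join(LOG_DIR, '*namenode*.log')):
        with open(path) as f:
            for line in fatal_lines(f.read()):
                print(path, ':', line)
```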
Thanks.
01-23-2019 07:46 AM
Hi, I had the same problem. Like you, I checked my Hadoop version with `hadoop version`; mine is 2.0.0-cdh4.2.1. After that, I changed my Gradle dependencies to:
compile group: 'org.apache.hadoop', name: 'hadoop-core', version: '2.0.0-mr1-cdh4.2.1', ext: 'pom'
compile group: 'org.apache.hadoop', name: 'hadoop-hdfs', version: '2.0.0-cdh4.2.1'
compile group: 'org.apache.hadoop', name: 'hadoop-common', version: '2.0.0-cdh4.2.1'
compile group: 'org.apache.hadoop', name: 'hadoop-mapreduce-client-core', version: '2.0.0-cdh4.2.1'
compile group: 'org.apache.hadoop', name: 'hadoop-client', version: '2.0.0-cdh4.2.1'
It works!
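To make that version check repeatable, here is a small hypothetical helper in the same spirit (the parsing function is the testable part; the subprocess call assumes the `hadoop` binary is on your PATH):

```python
import re
import subprocess

def parse_hadoop_version(version_output):
    """Pull the version string out of `hadoop version` output."""
    m = re.search(r'^Hadoop (\S+)', version_output, re.MULTILINE)
    return m.group(1) if m else None

def hadoop_version(hadoop_bin='hadoop'):
    """Run `hadoop version` and return e.g. '2.0.0-cdh4.2.1'."""
    out = subprocess.check_output([hadoop_bin, 'version']).decode('utf-8', 'replace')
    return parse_hadoop_version(out)
```

Once you know the cluster version, pin every Hadoop client dependency (hadoop-common, hadoop-hdfs, hadoop-client, ...) to that exact version, as in the Gradle snippet above; mismatched client and server versions can produce the same kind of RPC/protobuf errors discussed in this thread.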