Created on 03-12-2015 08:06 PM - edited 09-16-2022 02:24 AM
Hi,
I start my QuickStart VM and enter the following in the terminal:
sqoop import-all-tables \
-m 1 \
--connect jdbc:mysql://quickstart.cloudera:3306/retail_db \
--username=retail_dba \
--password=cloudera \
--compression-codec=snappy \
--as-avrodatafile \
--warehouse-dir=/user/hive/warehouse
but I get the following error:
failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
Streaming Command Failed!
Error in mr(map = map, reduce = reduce, combine = combine, in.folder = if (is.list(input)) { :
hadoop streaming failed with error code 5
Any help is appreciated. I am just a beginner and I don't know anything yet. Thanks.
Created 03-13-2015 08:16 AM
Perhaps see if you can connect directly to the MySQL database from the command line. Here is how it looks for me in the QuickStart VM:
[cloudera@quickstart ~]$ mysql --user=retail_dba --password=cloudera
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 26097
Server version: 5.1.66 Source distribution
Copyright (c) 2000, 2012, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> use retail_db
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> show tables;
+---------------------+
| Tables_in_retail_db |
+---------------------+
| categories |
| customers |
| departments |
| order_items |
| orders |
| products |
+---------------------+
6 rows in set (0.00 sec)
mysql>
If that fails, then perhaps MySQL is not running?
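(If it does fail, you could check and start the service yourself - on the CentOS-based QuickStart VM I believe the service is named mysqld, but that's worth verifying on your copy:)
sudo service mysqld status
sudo service mysqld start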
Morgan
Created 03-13-2015 09:49 AM
Morgan,
Thanks for your reply. I can connect to MySQL - I don't think that is the problem.
The problem is that I can't connect to something, so MapReduce jobs cannot be performed.
In the very first tutorial on Cloudera, it reads: "You should first log in to the Master Node of your cluster using SSH - you can get the credentials using the instructions on Your Cloudera Cluster."
I don't know how to do this. I'm just using the Cloudera QuickStart VM via VirtualBox. I start the VM, open the terminal, and enter the lines
sqoop import-all-tables \
-m 1 \
--connect jdbc:mysql://quickstart.cloudera:3306/retail_db \
--username=retail_dba \
--password=cloudera \
--compression-codec=snappy \
--as-avrodatafile \
--warehouse-dir=/user/hive/warehouse
and I get a connection refused error. The same error happens if I try to use R and Hadoop together. It seems like I can't connect to a server or something? Do I really have to log in to a server after starting my VM? I'm just trying to learn and all of this is new to me. Thanks for your help.
Here is the complete output I get after I run the above commands. I am on Mac OS X 10.7.5, using the Cloudera QuickStart VM via VirtualBox.
Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
15/03/13 09:44:56 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5-cdh5.3.0
15/03/13 09:44:56 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
15/03/13 09:44:56 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
15/03/13 09:44:56 INFO tool.CodeGenTool: Beginning code generation
15/03/13 09:44:56 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
15/03/13 09:44:56 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
15/03/13 09:44:56 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-0.20-mapreduce
Note: /tmp/sqoop-cloudera/compile/47d81c933f89fd992607ae4a35707074/categories.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
15/03/13 09:44:59 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-cloudera/compile/47d81c933f89fd992607ae4a35707074/categories.jar
15/03/13 09:44:59 WARN manager.MySQLManager: It looks like you are importing from mysql.
15/03/13 09:44:59 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
15/03/13 09:44:59 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
15/03/13 09:44:59 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
15/03/13 09:44:59 INFO mapreduce.ImportJobBase: Beginning import of categories
15/03/13 09:45:00 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
15/03/13 09:45:00 INFO mapreduce.DataDrivenImportJob: Writing Avro schema file: /tmp/sqoop-cloudera/compile/47d81c933f89fd992607ae4a35707074/sqoop_import_categories.avsc
15/03/13 09:45:02 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8021. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/03/13 09:45:03 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8021. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/03/13 09:45:04 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8021. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/03/13 09:45:05 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8021. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/03/13 09:45:06 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8021. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/03/13 09:45:07 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8021. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/03/13 09:45:08 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8021. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/03/13 09:45:09 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8021. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/03/13 09:45:10 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8021. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/03/13 09:45:11 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8021. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/03/13 09:45:11 WARN security.UserGroupInformation: PriviledgedActionException as:cloudera (auth:SIMPLE) cause:java.net.ConnectException: Call From quickstart.cloudera/127.0.0.1 to localhost:8021 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
15/03/13 09:45:11 ERROR tool.ImportAllTablesTool: Encountered IOException running import job: java.net.ConnectException: Call From quickstart.cloudera/127.0.0.1 to localhost:8021 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
thanks,
ER
Created 03-13-2015 09:59 AM
>> In the very first tutorial on cloudera, it reads "You should first log in to the Master Node of your cluster using SSH - you can get the credentials using the instructions on Your Cloudera Cluster. "
It's a little confusing whether you're running these commands on your host machine or on the VM. If you're reading the tutorial hosted on a website somewhere, it's written with a fully-distributed cluster in mind, where you'd SSH into the machine. There's a modified copy hosted on the VM itself (just go to localhost in the web browser in the VM, or on your host, since port forwarding should work for VirtualBox) that - in my copy at least - just tells you to click on the terminal icon on the VM's desktop and enter commands there. Which version of the VM are you using, and where do you see that text? It should be possible to SSH into the VM, and even to run these commands from your host machine, but doing so requires a lot of network configuration to be set up correctly. It won't be set up that way by default, and it can be complicated to get working consistently on different hosts, which is why I recommend just using the terminal on the VM's desktop.
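(If you do want to experiment with SSH later: with the VM powered off, VirtualBox NAT port forwarding can be set up from the host roughly like this - "Cloudera QuickStart VM" here is a placeholder for whatever your VM is actually named:)
VBoxManage modifyvm "Cloudera QuickStart VM" --natpf1 "ssh,tcp,,2222,,22"
ssh -p 2222 cloudera@localhost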
The root cause of your connection refused error appears to be that Sqoop is trying to use MR1. The VM is set up to use MR2 / YARN by default, so that is probably why MR1 is not running and you can't connect. Cloudera supports running both MR1 and MR2, but you can't have a machine configured as a client to both at the same time. When I run this on my copy of the VM (and in all recent versions), Sqoop is definitely using MR2 / YARN. Have you changed any other configuration before running Sqoop? Is it possible you've got Sqoop installed on your host machine, and it's configured differently than Sqoop in the VM?
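(One quick way to see which framework your client configuration points at - assuming the default config directory - is:)
grep -r mapreduce.framework.name /etc/hadoop/conf/
A value of yarn means MR2; classic means the old MR1 JobTracker.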
Created 03-13-2015 10:31 AM
Sean, Thank you for your response.
I am running these commands on the VM - just using the terminal on my VM's desktop. I am reading the text that says "log in to the master node of your cluster using SSH" in the web browser that opens automatically upon starting the VM, at the address http://quickstart.cloudera/#/tutorial/ingest_structured_data. I am using Oracle VM VirtualBox Manager 4.3.20.
I don't think I made any configuration changes before running Sqoop. I just opened cloudera-quickstart-vm-5.3.0-0-virtualbox-disk1.vmdk using VirtualBox.
I made some changes to use R and Hadoop together using the blog at
http://blogr-cs.blogspot.com/2012/12/integration-of-r-rstudio-and-hadoop-in.html
but I think those are irrelevant.
I do not have Sqoop on my host machine. I'd really appreciate it if you could suggest some solutions I can understand and implement. Thank you.
ER
Created 03-13-2015 11:07 AM
ER,
I am fairly new to this also. I started with the VirtualBox QuickStart VM running on a Windows host.
FWIW, here is what I get when I run the same command...
[cloudera@quickstart morgan]$ sqoop import-all-tables \
> -m 1 \
> --connect jdbc:mysql://quickstart.cloudera:3306/retail_db \
> --username=retail_dba \
> --password=cloudera \
> --compression-codec=snappy \
> --as-avrodatafile \
> --warehouse-dir=/user/hive/warehouse
Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
15/03/13 13:47:04 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5-cdh5.3.0
15/03/13 13:47:04 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
15/03/13 13:47:04 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
15/03/13 13:47:05 INFO tool.CodeGenTool: Beginning code generation
15/03/13 13:47:05 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
15/03/13 13:47:05 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
15/03/13 13:47:05 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce
Note: /tmp/sqoop-cloudera/compile/034c37aed57826a53538f7603ccaa6c1/categories.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
15/03/13 13:47:09 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-cloudera/compile/034c37aed57826a53538f7603ccaa6c1/categories.jar
15/03/13 13:47:09 WARN manager.MySQLManager: It looks like you are importing from mysql.
15/03/13 13:47:09 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
15/03/13 13:47:09 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
15/03/13 13:47:09 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
15/03/13 13:47:09 INFO mapreduce.ImportJobBase: Beginning import of categories
15/03/13 13:47:09 INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
15/03/13 13:47:09 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
15/03/13 13:47:11 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
15/03/13 13:47:12 INFO mapreduce.DataDrivenImportJob: Writing Avro schema file: /tmp/sqoop-cloudera/compile/034c37aed57826a53538f7603ccaa6c1/sqoop_import_categories.avsc
15/03/13 13:47:12 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
15/03/13 13:47:12 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/03/13 13:47:15 INFO db.DBInputFormat: Using read commited transaction isolation
15/03/13 13:47:15 INFO mapreduce.JobSubmitter: number of splits:1
15/03/13 13:47:15 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1425573450783_0059
15/03/13 13:47:16 INFO impl.YarnClientImpl: Submitted application application_1425573450783_0059
15/03/13 13:47:16 INFO mapreduce.Job: The url to track the job: http://quickstart.cloudera:8088/proxy/application_1425573450783_0059/
15/03/13 13:47:16 INFO mapreduce.Job: Running job: job_1425573450783_0059
15/03/13 13:47:29 INFO mapreduce.Job: Job job_1425573450783_0059 running in uber mode : false
15/03/13 13:47:29 INFO mapreduce.Job: map 0% reduce 0%
15/03/13 13:47:41 INFO mapreduce.Job: map 100% reduce 0%
15/03/13 13:47:41 INFO mapreduce.Job: Job job_1425573450783_0059 completed successfully
15/03/13 13:47:41 INFO mapreduce.Job: Counters: 30
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=131709
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=87
HDFS: Number of bytes written=1344
HDFS: Number of read operations=4
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=1
Other local map tasks=1
Total time spent by all maps in occupied slots (ms)=9535
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=9535
Total vcore-seconds taken by all map tasks=9535
Total megabyte-seconds taken by all map tasks=9763840
Map-Reduce Framework
Map input records=58
Map output records=58
Input split bytes=87
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=118
CPU time spent (ms)=1430
Physical memory (bytes) snapshot=118579200
Virtual memory (bytes) snapshot=856969216
Total committed heap usage (bytes)=60751872
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=1344
15/03/13 13:47:41 INFO mapreduce.ImportJobBase: Transferred 1.3125 KB in 29.6161 seconds (45.3808 bytes/sec)
15/03/13 13:47:41 INFO mapreduce.ImportJobBase: Retrieved 58 records.
I'm not sure why you are getting this error:
Retrying connect to server: localhost/127.0.0.1:8021.
In fact, in my VM, I don't have a listener on port 8021, but do have one on 8020. Maybe someone more knowledgeable can address that?
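(Here is roughly how I checked for listeners on those ports from the VM's terminal - assuming netstat is available on your copy, as it was on mine:)
sudo netstat -tlnp | grep -e ':8020' -e ':8021'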
Have you tried a restart of the VM? If you do that, give it some time for all the background processes to fire up before you try Sqoop again.
Morgan
Created 03-13-2015 11:13 AM
After reviewing the blog post, I noticed that it is written for the CDH 4.1.1 VM. I'm afraid there have been a number of changes since then that might be complicating things. The primary change, and the one that I think is complicating Sqoop for you, is that in CDH 4 we recommended MR1 for production, whereas in CDH 5 YARN has stabilized and we now recommend MR2 for production because of its superior resource management.
I believe the following line is responsible for setting up your environment such that Sqoop is trying to use MR1 when it is not running:
ln -s /etc/default/hadoop-0.20-mapreduce /etc/profile.d/hadoop.sh
You could either try getting rid of that symlink (and anything else that's telling the system to use MR1), or you could stop YARN / MR2 and use MR1 instead. I'll try to post some instructions for doing the latter shortly...
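(If you go the symlink route, removing it should just be the following, then logging out and back in so your shell environment is rebuilt - though check whether the blog added anything else to /etc/profile.d as well:)
sudo rm /etc/profile.d/hadoop.sh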
Created 03-13-2015 11:15 AM
To answer Morgan's question: port 8020 is the HDFS NameNode, and port 8021 is the JobTracker in MR1, which is where you would have submitted jobs in CDH 4. It can still be used in CDH 5, but as it is not the default, you'll need to switch around some configuration and services (and understand that the rest of the tutorial may not work exactly as expected because of the switch - I'd suggest perhaps starting with a fresh copy of the VM to be sure everything in the tutorial will work and not conflict with what you've been doing in R).
Created on 03-13-2015 12:07 PM - edited 03-13-2015 12:08 PM
I believe this procedure should get you switched over from YARN / MR2 to MR1. After running it I was able to compute pi using MR1:
for service in mapreduce-historyserver yarn-nodemanager yarn-proxyserver yarn-resourcemanager; do
    sudo service hadoop-${service} stop
    sudo chkconfig hadoop-${service} off
done
sudo yum remove -y hadoop-conf-pseudo
sudo yum install -y hadoop-0.20-conf-pseudo
for service in 0.20-mapreduce-jobtracker 0.20-mapreduce-tasktracker; do
    sudo service hadoop-${service} start
    sudo chkconfig hadoop-${service} on
done
It stops and disables the MR2 / YARN services, swaps the configuration files, then starts and enables the MR1 services. Again, the tutorial is not written to be used (or tested) with MR1, so it's possible you'll run into some other issues. I can't think of any specific incompatibilities - just recommending that if you want to walk through the tutorial, you do it with an environment as close to the original VM as possible - otherwise who knows what differences may be involved.
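(For reference, the pi job I used to verify was roughly this - the examples jar lives under /usr/lib/hadoop-0.20-mapreduce in my copy, but the path may vary:)
hadoop jar /usr/lib/hadoop-0.20-mapreduce/hadoop-examples.jar pi 10 100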
Created 03-17-2015 09:50 AM
Sean,
Your procedure for stopping MR2/YARN and starting MR1 solved the problem.
I am not sure if you are familiar with R, but my purpose is to set up R and Hadoop together. I did this using the blog I sent the link for. The MapReduce jobs run now, and an output file is created as a result of a very simple 3-line R test code. But when I try to access that file, I get an "output file does not exist" error, which is given below. Any comments here that could help me proceed would be very much appreciated. Thanks. -ER
Exception in thread "main" java.io.FileNotFoundException: File does not exist: hdfs://localhost:8020/user/cloudera/0
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1093)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1085)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1085)
at org.apache.hadoop.streaming.DumpTypedBytes.run(DumpTypedBytes.java:76)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.hadoop.streaming.HadoopStreaming.main(HadoopStreaming.java:41)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Exception in thread "main" java.io.FileNotFoundException: File does not exist: hdfs://localhost:8020/user/cloudera/128432
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1093)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1085)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1085)
at org.apache.hadoop.streaming.DumpTypedBytes.run(DumpTypedBytes.java:76)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.hadoop.streaming.HadoopStreaming.main(HadoopStreaming.java:41)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Exception in thread "main" java.io.FileNotFoundException: File does not exist: hdfs://localhost:8020/user/cloudera/422
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1093)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1085)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1085)
at org.apache.hadoop.streaming.DumpTypedBytes.run(DumpTypedBytes.java:76)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.hadoop.streaming.HadoopStreaming.main(HadoopStreaming.java:41)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Exception in thread "main" java.io.FileNotFoundException: File does not exist: hdfs://localhost:8020/user/cloudera/122
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1093)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1085)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1085)
at org.apache.hadoop.streaming.DumpTypedBytes.run(DumpTypedBytes.java:76)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.hadoop.streaming.HadoopStreaming.main(HadoopStreaming.java:41)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
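(In case it helps with diagnosis, I can list what actually exists under my HDFS home directory with something like this, though I'm not sure what I should expect to see there:)
hdfs dfs -ls /user/cloudera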